This is mostly my code_gen_buffer cleanup, plus a few other random
changes thrown in.  Including a fix for a recent float32_exp2 bug.

r~

The following changes since commit 894fc4fd670aaf04a67dc7507739f914ff4bacf2:

  Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2021-06-11 09:21:48 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20210611

for you to fetch changes up to 60afaddc208d34f6dc86dd974f6e02724fba6eb6:

  docs/devel: Explain in more detail the TB chaining mechanisms (2021-06-11 09:41:25 -0700)

----------------------------------------------------------------
Clean up code_gen_buffer allocation.
Add tcg_remove_ops_after.
Fix tcg_constant_* documentation.
Improve TB chaining documentation.
Fix float32_exp2.

----------------------------------------------------------------
Jose R. Ziviani (1):
      tcg/arm: Fix tcg_out_op function signature

Luis Pires (1):
      docs/devel: Explain in more detail the TB chaining mechanisms

Richard Henderson (32):
      meson: Split out tcg/meson.build
      meson: Split out fpu/meson.build
      tcg: Re-order tcg_region_init vs tcg_prologue_init
      tcg: Remove error return from tcg_region_initial_alloc__locked
      tcg: Split out tcg_region_initial_alloc
      tcg: Split out tcg_region_prologue_set
      tcg: Split out region.c
      accel/tcg: Inline cpu_gen_init
      accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
      accel/tcg: Rename tcg_init to tcg_init_machine
      tcg: Create tcg_init
      accel/tcg: Merge tcg_exec_init into tcg_init_machine
      accel/tcg: Use MiB in tcg_init_machine
      accel/tcg: Pass down max_cpus to tcg_init
      tcg: Introduce tcg_max_ctxs
      tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
      tcg: Replace region.end with region.total_size
      tcg: Rename region.start to region.after_prologue
      tcg: Tidy tcg_n_regions
      tcg: Tidy split_cross_256mb
      tcg: Move in_code_gen_buffer and tests to region.c
      tcg: Allocate code_gen_buffer into struct tcg_region_state
      tcg: Return the map protection from alloc_code_gen_buffer
      tcg: Sink qemu_madvise call to common code
      util/osdep: Add qemu_mprotect_rw
      tcg: Round the tb_size default from qemu_get_host_physmem
      tcg: Merge buffer protection and guard page protection
      tcg: When allocating for !splitwx, begin with PROT_NONE
      tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
      tcg: Introduce tcg_remove_ops_after
      tcg: Fix documentation for tcg_constant_* vs tcg_temp_free_*
      softfloat: Fix tp init in float32_exp2

 docs/devel/tcg.rst        | 101 ++++-
 meson.build               |  12 +-
 accel/tcg/internal.h      |   2 +
 include/qemu/osdep.h      |   1 +
 include/sysemu/tcg.h      |   2 -
 include/tcg/tcg.h         |  28 +-
 tcg/aarch64/tcg-target.h  |   1 +
 tcg/arm/tcg-target.h      |   1 +
 tcg/i386/tcg-target.h     |   2 +
 tcg/mips/tcg-target.h     |   6 +
 tcg/ppc/tcg-target.h      |   2 +
 tcg/riscv/tcg-target.h    |   1 +
 tcg/s390/tcg-target.h     |   3 +
 tcg/sparc/tcg-target.h    |   1 +
 tcg/tcg-internal.h        |  40 ++
 tcg/tci/tcg-target.h      |   1 +
 accel/tcg/tcg-all.c       |  32 +-
 accel/tcg/translate-all.c | 439 +-------------------
 bsd-user/main.c           |   3 +-
 fpu/softfloat.c           |   2 +-
 linux-user/main.c         |   1 -
 tcg/region.c              | 999 ++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c                 | 649 +++---------------------
 util/osdep.c              |   9 +
 tcg/arm/tcg-target.c.inc  |   3 +-
 fpu/meson.build           |   1 +
 tcg/meson.build           |  14 +
 27 files changed, 1266 insertions(+), 1090 deletions(-)
 create mode 100644 tcg/tcg-internal.h
 create mode 100644 tcg/region.c
 create mode 100644 fpu/meson.build
 create mode 100644 tcg/meson.build

----------------------------------------------------------------

Pretty small still, but there are two patches that ought
to get backported to stable, so no point in delaying.

r~

The following changes since commit a5ba0a7e4e150d1350a041f0d0ef9ca6c8d7c307:

  Merge tag 'pull-aspeed-20241211' of https://github.com/legoater/qemu into staging (2024-12-11 15:16:47 +0000)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20241212

for you to fetch changes up to 7ac87b14a92234b6a89b701b4043ad6cf8bdcccf:

  target/sparc: Use memcpy() and remove memcpy32() (2024-12-12 14:28:38 -0600)

----------------------------------------------------------------
tcg: Reset free_temps before tcg_optimize
tcg/riscv: Fix StoreStore barrier generation
include/exec: Introduce fpst alias in helper-head.h.inc
target/sparc: Use memcpy() and remove memcpy32()

----------------------------------------------------------------
Philippe Mathieu-Daudé (1):
      target/sparc: Use memcpy() and remove memcpy32()

Richard Henderson (2):
      tcg: Reset free_temps before tcg_optimize
      include/exec: Introduce fpst alias in helper-head.h.inc

Roman Artemev (1):
      tcg/riscv: Fix StoreStore barrier generation

 include/tcg/tcg-temp-internal.h |  6 ++++++
 accel/tcg/plugin-gen.c          |  2 +-
 target/sparc/win_helper.c       | 26 ++++++++------------------
 tcg/tcg.c                       |  5 ++++-
 include/exec/helper-head.h.inc  |  3 +++
 tcg/riscv/tcg-target.c.inc      |  2 +-
 6 files changed, 23 insertions(+), 21 deletions(-)
Deleted patch
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build     |  8 +-------
 tcg/meson.build | 13 +++++++++++++
 2 files changed, 14 insertions(+), 7 deletions(-)
 create mode 100644 tcg/meson.build

diff --git a/meson.build b/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/meson.build
+++ b/meson.build
@@ -XXX,XX +XXX,XX @@ common_ss.add(capstone)
 specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
 specific_ss.add(when: 'CONFIG_TCG', if_true: files(
   'fpu/softfloat.c',
-  'tcg/optimize.c',
-  'tcg/tcg-common.c',
-  'tcg/tcg-op-gvec.c',
-  'tcg/tcg-op-vec.c',
-  'tcg/tcg-op.c',
-  'tcg/tcg.c',
 ))
-specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('tcg/tci.c'))
 
 # Work around a gcc bug/misfeature wherein constant propagation looks
 # through an alias:
@@ -XXX,XX +XXX,XX @@ subdir('net')
 subdir('replay')
 subdir('semihosting')
 subdir('hw')
+subdir('tcg')
 subdir('accel')
 subdir('plugins')
 subdir('bsd-user')
diff --git a/tcg/meson.build b/tcg/meson.build
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tcg/meson.build
@@ -XXX,XX +XXX,XX @@
+tcg_ss = ss.source_set()
+
+tcg_ss.add(files(
+  'optimize.c',
+  'tcg.c',
+  'tcg-common.c',
+  'tcg-op.c',
+  'tcg-op-gvec.c',
+  'tcg-op-vec.c',
+))
+tcg_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('tci.c'))
+
+specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
--
2.25.1
Deleted patch
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build     | 4 +---
 fpu/meson.build | 1 +
 2 files changed, 2 insertions(+), 3 deletions(-)
 create mode 100644 fpu/meson.build

diff --git a/meson.build b/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/meson.build
+++ b/meson.build
@@ -XXX,XX +XXX,XX @@ subdir('softmmu')
 
 common_ss.add(capstone)
 specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
-specific_ss.add(when: 'CONFIG_TCG', if_true: files(
-  'fpu/softfloat.c',
-))
 
 # Work around a gcc bug/misfeature wherein constant propagation looks
 # through an alias:
@@ -XXX,XX +XXX,XX @@ subdir('replay')
 subdir('semihosting')
 subdir('hw')
 subdir('tcg')
+subdir('fpu')
 subdir('accel')
 subdir('plugins')
 subdir('bsd-user')
diff --git a/fpu/meson.build b/fpu/meson.build
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/fpu/meson.build
@@ -0,0 +1 @@
+specific_ss.add(when: 'CONFIG_TCG', if_true: files('softfloat.c'))
--
2.25.1
Instead of delaying tcg_region_init until after tcg_prologue_init
is complete, do tcg_region_init first and let tcg_prologue_init
shrink the first region by the size of the generated prologue.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tcg-all.c       | 11 ---------
 accel/tcg/translate-all.c |  3 +++
 bsd-user/main.c           |  1 -
 linux-user/main.c         |  1 -
 tcg/tcg.c                 | 52 ++++++++++++++-------------------
 5 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -XXX,XX +XXX,XX @@ static int tcg_init(MachineState *ms)
 
     tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
     mttcg_enabled = s->mttcg_enabled;
-
-    /*
-     * Initialize TCG regions only for softmmu.
-     *
-     * This needs to be done later for user mode, because the prologue
-     * generation needs to be delayed so that GUEST_BASE is already set.
-     */
-#ifndef CONFIG_USER_ONLY
-    tcg_region_init();
-#endif /* !CONFIG_USER_ONLY */
-
     return 0;
 }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -XXX,XX +XXX,XX @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
                                splitwx, &error_fatal);
     assert(ok);
 
+    /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
+    tcg_region_init();
+
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
        initialize the prologue now. */
diff --git a/bsd-user/main.c b/bsd-user/main.c
index XXXXXXX..XXXXXXX 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
      * the real value of GUEST_BASE into account.
      */
     tcg_prologue_init(tcg_ctx);
-    tcg_region_init();
 
     /* build Task State */
     memset(ts, 0, sizeof(TaskState));
diff --git a/linux-user/main.c b/linux-user/main.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv, char **envp)
        generating the prologue until now so that the prologue can take
        the real value of GUEST_BASE into account. */
     tcg_prologue_init(tcg_ctx);
-    tcg_region_init();
 
     target_cpu_copy_regs(env, regs);
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ TranslationBlock *tcg_tb_alloc(TCGContext *s)
 
 void tcg_prologue_init(TCGContext *s)
 {
-    size_t prologue_size, total_size;
-    void *buf0, *buf1;
+    size_t prologue_size;
 
     /* Put the prologue at the beginning of code_gen_buffer. */
-    buf0 = s->code_gen_buffer;
-    total_size = s->code_gen_buffer_size;
-    s->code_ptr = buf0;
-    s->code_buf = buf0;
+    tcg_region_assign(s, 0);
+    s->code_ptr = s->code_gen_ptr;
+    s->code_buf = s->code_gen_ptr;
     s->data_gen_ptr = NULL;
 
-    /*
-     * The region trees are not yet configured, but tcg_splitwx_to_rx
-     * needs the bounds for an assert.
-     */
-    region.start = buf0;
-    region.end = buf0 + total_size;
-
 #ifndef CONFIG_TCG_INTERPRETER
-    tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(buf0);
+    tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(s->code_ptr);
 #endif
 
-    /* Compute a high-water mark, at which we voluntarily flush the buffer
-       and start over.  The size here is arbitrary, significantly larger
-       than we expect the code generation for any one opcode to require.  */
-    s->code_gen_highwater = s->code_gen_buffer + (total_size - TCG_HIGHWATER);
-
 #ifdef TCG_TARGET_NEED_POOL_LABELS
     s->pool_labels = NULL;
 #endif
@@ -XXX,XX +XXX,XX @@ void tcg_prologue_init(TCGContext *s)
     }
 #endif
 
-    buf1 = s->code_ptr;
+    prologue_size = tcg_current_code_size(s);
+
 #ifndef CONFIG_TCG_INTERPRETER
-    flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(buf0), (uintptr_t)buf0,
-                        tcg_ptr_byte_diff(buf1, buf0));
+    flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(s->code_buf),
+                        (uintptr_t)s->code_buf, prologue_size);
 #endif
 
-    /* Deduct the prologue from the buffer.  */
-    prologue_size = tcg_current_code_size(s);
-    s->code_gen_ptr = buf1;
-    s->code_gen_buffer = buf1;
-    s->code_buf = buf1;
-    total_size -= prologue_size;
-    s->code_gen_buffer_size = total_size;
+    /* Deduct the prologue from the first region. */
+    region.start = s->code_ptr;
 
-    tcg_register_jit(tcg_splitwx_to_rx(s->code_gen_buffer), total_size);
+    /* Recompute boundaries of the first region. */
+    tcg_region_assign(s, 0);
+
+    tcg_register_jit(tcg_splitwx_to_rx(region.start),
+                     region.end - region.start);
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
         FILE *logfile = qemu_log_lock();
         qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
         if (s->data_gen_ptr) {
-            size_t code_size = s->data_gen_ptr - buf0;
+            size_t code_size = s->data_gen_ptr - s->code_gen_ptr;
             size_t data_size = prologue_size - code_size;
             size_t i;
 
-            log_disas(buf0, code_size);
+            log_disas(s->code_gen_ptr, code_size);
 
             for (i = 0; i < data_size; i += sizeof(tcg_target_ulong)) {
                 if (sizeof(tcg_target_ulong) == 8) {
@@ -XXX,XX +XXX,XX @@ void tcg_prologue_init(TCGContext *s)
                 }
             }
         } else {
-            log_disas(buf0, prologue_size);
+            log_disas(s->code_gen_ptr, prologue_size);
         }
         qemu_log("\n");
         qemu_log_flush();
--
2.25.1

When allocating new temps during tcg_optimize, do not re-use
any EBB temps that were used within the TB.  We do not have
any idea of the span of the TB in which the temp was live.

Introduce tcg_temp_ebb_reset_freed and use before tcg_optimize,
as well as replacing the equivalent in plugin_gen_inject and
tcg_func_start.

Cc: qemu-stable@nongnu.org
Fixes: fb04ab7ddd8 ("tcg/optimize: Lower TCG_COND_TST{EQ,NE} if unsupported")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2711
Reported-by: wannacu <wannacu2049@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 include/tcg/tcg-temp-internal.h | 6 ++++++
 accel/tcg/plugin-gen.c          | 2 +-
 tcg/tcg.c                       | 5 ++++-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/tcg/tcg-temp-internal.h b/include/tcg/tcg-temp-internal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/tcg/tcg-temp-internal.h
+++ b/include/tcg/tcg-temp-internal.h
@@ -XXX,XX +XXX,XX @@ TCGv_i64 tcg_temp_ebb_new_i64(void);
 TCGv_ptr tcg_temp_ebb_new_ptr(void);
 TCGv_i128 tcg_temp_ebb_new_i128(void);
 
+/* Forget all freed EBB temps, so that new allocations produce new temps. */
+static inline void tcg_temp_ebb_reset_freed(TCGContext *s)
+{
+    memset(s->free_temps, 0, sizeof(s->free_temps));
+}
+
 #endif /* TCG_TEMP_FREE_H */
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -XXX,XX +XXX,XX @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
      * that might be live within the existing opcode stream.
      * The simplest solution is to release them all and create new.
      */
-    memset(tcg_ctx->free_temps, 0, sizeof(tcg_ctx->free_temps));
+    tcg_temp_ebb_reset_freed(tcg_ctx);
 
     QTAILQ_FOREACH_SAFE(op, &tcg_ctx->ops, link, next) {
         switch (op->opc) {
diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ void tcg_func_start(TCGContext *s)
     s->nb_temps = s->nb_globals;
 
     /* No temps have been previously allocated for size or locality.  */
-    memset(s->free_temps, 0, sizeof(s->free_temps));
+    tcg_temp_ebb_reset_freed(s);
 
     /* No constant temps have been previously allocated. */
     for (int i = 0; i < TCG_TYPE_COUNT; ++i) {
@@ -XXX,XX +XXX,XX @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb, uint64_t pc_start)
     }
 #endif
 
+    /* Do not reuse any EBB that may be allocated within the TB. */
+    tcg_temp_ebb_reset_freed(s);
+
     tcg_optimize(s);
 
     reachable_code_pass(s);
--
2.43.0
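A minimal sketch of the hazard the free_temps patch above closes.  The
struct and helper names below are simplified stand-ins, not QEMU's actual
TCGContext layout; only the final reset mirrors tcg_temp_ebb_reset_freed:

    #include <stdint.h>
    #include <string.h>

    #define NB_TEMPS 64

    typedef struct {
        uint64_t free_temps;   /* bit i set => temp i is free for reuse */
    } Ctx;

    /* Freeing a temp within one extended basic block marks it reusable. */
    static void temp_free(Ctx *s, int i)
    {
        s->free_temps |= UINT64_C(1) << i;
    }

    /* A later pass may pop the same index and reuse the temp, even though
       that pass no longer knows where in the TB temp i was live. */
    static int temp_alloc(Ctx *s)
    {
        for (int i = 0; i < NB_TEMPS; i++) {
            if (s->free_temps & (UINT64_C(1) << i)) {
                s->free_temps &= ~(UINT64_C(1) << i);
                return i;
            }
        }
        return -1;             /* nothing freed: caller creates a new temp */
    }

    /* The fix: forget all freed temps before a pass that allocates anew,
       forcing fresh temps instead of unsound reuse across the whole TB. */
    static void temp_ebb_reset_freed(Ctx *s)
    {
        memset(&s->free_temps, 0, sizeof(s->free_temps));
    }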
Deleted patch
All callers immediately assert on error, so move the assert
into the function itself.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ static bool tcg_region_alloc(TCGContext *s)
  * Perform a context's first region allocation.
  * This function does _not_ increment region.agg_size_full.
  */
-static inline bool tcg_region_initial_alloc__locked(TCGContext *s)
+static void tcg_region_initial_alloc__locked(TCGContext *s)
 {
-    return tcg_region_alloc__locked(s);
+    bool err = tcg_region_alloc__locked(s);
+    g_assert(!err);
 }
 
 /* Call from a safe-work context */
@@ -XXX,XX +XXX,XX @@ void tcg_region_reset_all(void)
 
     for (i = 0; i < n_ctxs; i++) {
         TCGContext *s = qatomic_read(&tcg_ctxs[i]);
-        bool err = tcg_region_initial_alloc__locked(s);
-
-        g_assert(!err);
+        tcg_region_initial_alloc__locked(s);
     }
     qemu_mutex_unlock(&region.lock);
 
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(void)
 
     /* In user-mode we support only one ctx, so do the initial allocation now */
 #ifdef CONFIG_USER_ONLY
-    {
-        bool err = tcg_region_initial_alloc__locked(tcg_ctx);
-
-        g_assert(!err);
-    }
+    tcg_region_initial_alloc__locked(tcg_ctx);
 #endif
 }
 
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
     MachineState *ms = MACHINE(qdev_get_machine());
     TCGContext *s = g_malloc(sizeof(*s));
     unsigned int i, n;
-    bool err;
 
     *s = tcg_init_ctx;
 
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
 
     tcg_ctx = s;
     qemu_mutex_lock(&region.lock);
-    err = tcg_region_initial_alloc__locked(tcg_ctx);
-    g_assert(!err);
+    tcg_region_initial_alloc__locked(s);
     qemu_mutex_unlock(&region.lock);
 }
 #endif /* !CONFIG_USER_ONLY */
--
2.25.1
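The same refactor in miniature: when every caller of a bool-returning
allocator immediately asserts on failure, the assertion can live in the
callee and the return type becomes void.  Names here are illustrative
stand-ins, not the QEMU functions:

    #include <stdbool.h>
    #include <glib.h>

    static bool alloc_locked(void)      /* returns true on error */
    {
        return false;                   /* stub: always succeeds */
    }

    static void initial_alloc_locked(void)
    {
        bool err = alloc_locked();
        g_assert(!err);                 /* assertion now lives in the callee */
    }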
Deleted patch
This has only one user, and currently needs an ifdef,
but will make more sense after some code motion.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ static void tcg_region_initial_alloc__locked(TCGContext *s)
     g_assert(!err);
 }
 
+#ifndef CONFIG_USER_ONLY
+static void tcg_region_initial_alloc(TCGContext *s)
+{
+    qemu_mutex_lock(&region.lock);
+    tcg_region_initial_alloc__locked(s);
+    qemu_mutex_unlock(&region.lock);
+}
+#endif
+
 /* Call from a safe-work context */
 void tcg_region_reset_all(void)
 {
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
 }
 
 tcg_ctx = s;
-    qemu_mutex_lock(&region.lock);
-    tcg_region_initial_alloc__locked(s);
-    qemu_mutex_unlock(&region.lock);
+    tcg_region_initial_alloc(s);
 }
 #endif /* !CONFIG_USER_ONLY */
 
--
2.25.1
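The patch follows QEMU's usual "__locked plus wrapper" convention: the
suffixed function assumes the caller already holds the lock, and the
unsuffixed wrapper acquires and releases it.  A generic sketch with
illustrative names (QemuMutex and qemu_mutex_lock/unlock are the real
QEMU primitives; the mutex must be set up with qemu_mutex_init first):

    #include "qemu/osdep.h"
    #include "qemu/thread.h"

    static QemuMutex lock;              /* qemu_mutex_init(&lock) at startup */

    static void do_work__locked(void)
    {
        /* ... caller must hold 'lock' here ... */
    }

    static void do_work(void)
    {
        qemu_mutex_lock(&lock);
        do_work__locked();
        qemu_mutex_unlock(&lock);
    }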
Deleted patch
This has only one user, but will make more sense after some
code motion.

Always leave the tcg_init_ctx initialized to the first region,
in preparation for tcg_prologue_init().  This also requires
that we don't re-allocate the region for the first cpu, lest
we hit the assertion for the total number of regions allocated.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(void)
 
     tcg_region_trees_init();
 
-    /* In user-mode we support only one ctx, so do the initial allocation now */
-#ifdef CONFIG_USER_ONLY
-    tcg_region_initial_alloc__locked(tcg_ctx);
-#endif
+    /*
+     * Leave the initial context initialized to the first region.
+     * This will be the context into which we generate the prologue.
+     * It is also the only context for CONFIG_USER_ONLY.
+     */
+    tcg_region_initial_alloc__locked(&tcg_init_ctx);
+}
+
+static void tcg_region_prologue_set(TCGContext *s)
+{
+    /* Deduct the prologue from the first region. */
+    g_assert(region.start == s->code_gen_buffer);
+    region.start = s->code_ptr;
+
+    /* Recompute boundaries of the first region. */
+    tcg_region_assign(s, 0);
+
+    /* Register the balance of the buffer with gdb. */
+    tcg_register_jit(tcg_splitwx_to_rx(region.start),
+                     region.end - region.start);
 }
 
 #ifdef CONFIG_DEBUG_TCG
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
 
     if (n > 0) {
         alloc_tcg_plugin_context(s);
+        tcg_region_initial_alloc(s);
     }
 
     tcg_ctx = s;
-    tcg_region_initial_alloc(s);
 }
 #endif /* !CONFIG_USER_ONLY */
 
@@ -XXX,XX +XXX,XX @@ void tcg_prologue_init(TCGContext *s)
 {
     size_t prologue_size;
 
-    /* Put the prologue at the beginning of code_gen_buffer. */
-    tcg_region_assign(s, 0);
     s->code_ptr = s->code_gen_ptr;
     s->code_buf = s->code_gen_ptr;
     s->data_gen_ptr = NULL;
@@ -XXX,XX +XXX,XX @@ void tcg_prologue_init(TCGContext *s)
                         (uintptr_t)s->code_buf, prologue_size);
 #endif
 
-    /* Deduct the prologue from the first region. */
-    region.start = s->code_ptr;
-
-    /* Recompute boundaries of the first region. */
-    tcg_region_assign(s, 0);
-
-    tcg_register_jit(tcg_splitwx_to_rx(region.start),
-                     region.end - region.start);
+    tcg_region_prologue_set(s);
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
--
2.25.1
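A worked example of the deduction above, with assumed numbers: suppose
code_gen_buffer begins at 0x100000 and the generated prologue occupies
0x180 bytes.  On entry, g_assert(region.start == s->code_gen_buffer)
holds with region.start == 0x100000.  After the patch's code runs,
region.start = s->code_ptr == 0x100180; tcg_region_assign(s, 0) then
re-derives region 0's bounds from the new start, and tcg_register_jit()
registers the remaining [0x100180, region.end) window with the debugger.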
Deleted patch
1
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
2
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
---
5
tcg/tcg-internal.h | 37 +++
6
tcg/region.c | 572 +++++++++++++++++++++++++++++++++++++++++++++
7
tcg/tcg.c | 547 +------------------------------------------
8
tcg/meson.build | 1 +
9
4 files changed, 613 insertions(+), 544 deletions(-)
10
create mode 100644 tcg/tcg-internal.h
11
create mode 100644 tcg/region.c
12
1
13
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
14
new file mode 100644
15
index XXXXXXX..XXXXXXX
16
--- /dev/null
17
+++ b/tcg/tcg-internal.h
18
@@ -XXX,XX +XXX,XX @@
19
+/*
20
+ * Internal declarations for Tiny Code Generator for QEMU
21
+ *
22
+ * Copyright (c) 2008 Fabrice Bellard
23
+ *
24
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
25
+ * of this software and associated documentation files (the "Software"), to deal
26
+ * in the Software without restriction, including without limitation the rights
27
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
28
+ * copies of the Software, and to permit persons to whom the Software is
29
+ * furnished to do so, subject to the following conditions:
30
+ *
31
+ * The above copyright notice and this permission notice shall be included in
32
+ * all copies or substantial portions of the Software.
33
+ *
34
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
35
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
36
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
37
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
38
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
39
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
40
+ * THE SOFTWARE.
41
+ */
42
+
43
+#ifndef TCG_INTERNAL_H
44
+#define TCG_INTERNAL_H 1
45
+
46
+#define TCG_HIGHWATER 1024
47
+
48
+extern TCGContext **tcg_ctxs;
49
+extern unsigned int n_tcg_ctxs;
50
+
51
+bool tcg_region_alloc(TCGContext *s);
52
+void tcg_region_initial_alloc(TCGContext *s);
53
+void tcg_region_prologue_set(TCGContext *s);
54
+
55
+#endif /* TCG_INTERNAL_H */
56
diff --git a/tcg/region.c b/tcg/region.c
57
new file mode 100644
58
index XXXXXXX..XXXXXXX
59
--- /dev/null
60
+++ b/tcg/region.c
61
@@ -XXX,XX +XXX,XX @@
62
+/*
63
+ * Memory region management for Tiny Code Generator for QEMU
64
+ *
65
+ * Copyright (c) 2008 Fabrice Bellard
66
+ *
67
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
68
+ * of this software and associated documentation files (the "Software"), to deal
69
+ * in the Software without restriction, including without limitation the rights
70
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
71
+ * copies of the Software, and to permit persons to whom the Software is
72
+ * furnished to do so, subject to the following conditions:
73
+ *
74
+ * The above copyright notice and this permission notice shall be included in
75
+ * all copies or substantial portions of the Software.
76
+ *
77
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
78
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
79
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
80
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
81
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
82
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
83
+ * THE SOFTWARE.
84
+ */
85
+
86
+#include "qemu/osdep.h"
87
+#include "exec/exec-all.h"
88
+#include "tcg/tcg.h"
89
+#if !defined(CONFIG_USER_ONLY)
90
+#include "hw/boards.h"
91
+#endif
92
+#include "tcg-internal.h"
93
+
94
+
95
+struct tcg_region_tree {
96
+ QemuMutex lock;
97
+ GTree *tree;
98
+ /* padding to avoid false sharing is computed at run-time */
99
+};
100
+
101
+/*
102
+ * We divide code_gen_buffer into equally-sized "regions" that TCG threads
103
+ * dynamically allocate from as demand dictates. Given appropriate region
104
+ * sizing, this minimizes flushes even when some TCG threads generate a lot
105
+ * more code than others.
106
+ */
107
+struct tcg_region_state {
108
+ QemuMutex lock;
109
+
110
+ /* fields set at init time */
111
+ void *start;
112
+ void *start_aligned;
113
+ void *end;
114
+ size_t n;
115
+ size_t size; /* size of one region */
116
+ size_t stride; /* .size + guard size */
117
+
118
+ /* fields protected by the lock */
119
+ size_t current; /* current region index */
120
+ size_t agg_size_full; /* aggregate size of full regions */
121
+};
122
+
123
+static struct tcg_region_state region;
124
+
125
+/*
126
+ * This is an array of struct tcg_region_tree's, with padding.
127
+ * We use void * to simplify the computation of region_trees[i]; each
128
+ * struct is found every tree_size bytes.
129
+ */
130
+static void *region_trees;
131
+static size_t tree_size;
132
+
133
+/* compare a pointer @ptr and a tb_tc @s */
134
+static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
135
+{
136
+ if (ptr >= s->ptr + s->size) {
137
+ return 1;
138
+ } else if (ptr < s->ptr) {
139
+ return -1;
140
+ }
141
+ return 0;
142
+}
143
+
144
+static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp)
145
+{
146
+ const struct tb_tc *a = ap;
147
+ const struct tb_tc *b = bp;
148
+
149
+ /*
150
+ * When both sizes are set, we know this isn't a lookup.
151
+ * This is the most likely case: every TB must be inserted; lookups
152
+ * are a lot less frequent.
153
+ */
154
+ if (likely(a->size && b->size)) {
155
+ if (a->ptr > b->ptr) {
156
+ return 1;
157
+ } else if (a->ptr < b->ptr) {
158
+ return -1;
159
+ }
160
+ /* a->ptr == b->ptr should happen only on deletions */
161
+ g_assert(a->size == b->size);
162
+ return 0;
163
+ }
164
+ /*
165
+ * All lookups have either .size field set to 0.
166
+ * From the glib sources we see that @ap is always the lookup key. However
167
+ * the docs provide no guarantee, so we just mark this case as likely.
168
+ */
169
+ if (likely(a->size == 0)) {
170
+ return ptr_cmp_tb_tc(a->ptr, b);
171
+ }
172
+ return ptr_cmp_tb_tc(b->ptr, a);
173
+}
174
+
175
+static void tcg_region_trees_init(void)
176
+{
177
+ size_t i;
178
+
179
+ tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize);
180
+ region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size);
181
+ for (i = 0; i < region.n; i++) {
182
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
183
+
184
+ qemu_mutex_init(&rt->lock);
185
+ rt->tree = g_tree_new(tb_tc_cmp);
186
+ }
187
+}
188
+
189
+static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
190
+{
191
+ size_t region_idx;
192
+
193
+ /*
194
+ * Like tcg_splitwx_to_rw, with no assert. The pc may come from
195
+ * a signal handler over which the caller has no control.
196
+ */
197
+ if (!in_code_gen_buffer(p)) {
198
+ p -= tcg_splitwx_diff;
199
+ if (!in_code_gen_buffer(p)) {
200
+ return NULL;
201
+ }
202
+ }
203
+
204
+ if (p < region.start_aligned) {
205
+ region_idx = 0;
206
+ } else {
207
+ ptrdiff_t offset = p - region.start_aligned;
208
+
209
+ if (offset > region.stride * (region.n - 1)) {
210
+ region_idx = region.n - 1;
211
+ } else {
212
+ region_idx = offset / region.stride;
213
+ }
214
+ }
215
+ return region_trees + region_idx * tree_size;
216
+}
217
+
218
+void tcg_tb_insert(TranslationBlock *tb)
219
+{
220
+ struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
221
+
222
+ g_assert(rt != NULL);
223
+ qemu_mutex_lock(&rt->lock);
224
+ g_tree_insert(rt->tree, &tb->tc, tb);
225
+ qemu_mutex_unlock(&rt->lock);
226
+}
227
+
228
+void tcg_tb_remove(TranslationBlock *tb)
229
+{
230
+ struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
231
+
232
+ g_assert(rt != NULL);
233
+ qemu_mutex_lock(&rt->lock);
234
+ g_tree_remove(rt->tree, &tb->tc);
235
+ qemu_mutex_unlock(&rt->lock);
236
+}
237
+
238
+/*
239
+ * Find the TB 'tb' such that
240
+ * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size
241
+ * Return NULL if not found.
242
+ */
243
+TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
244
+{
245
+ struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr);
246
+ TranslationBlock *tb;
247
+ struct tb_tc s = { .ptr = (void *)tc_ptr };
248
+
249
+ if (rt == NULL) {
250
+ return NULL;
251
+ }
252
+
253
+ qemu_mutex_lock(&rt->lock);
254
+ tb = g_tree_lookup(rt->tree, &s);
255
+ qemu_mutex_unlock(&rt->lock);
256
+ return tb;
257
+}
258
+
259
+static void tcg_region_tree_lock_all(void)
260
+{
261
+ size_t i;
262
+
263
+ for (i = 0; i < region.n; i++) {
264
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
265
+
266
+ qemu_mutex_lock(&rt->lock);
267
+ }
268
+}
269
+
270
+static void tcg_region_tree_unlock_all(void)
271
+{
272
+ size_t i;
273
+
274
+ for (i = 0; i < region.n; i++) {
275
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
276
+
277
+ qemu_mutex_unlock(&rt->lock);
278
+ }
279
+}
280
+
281
+void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
282
+{
283
+ size_t i;
284
+
285
+ tcg_region_tree_lock_all();
286
+ for (i = 0; i < region.n; i++) {
287
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
288
+
289
+ g_tree_foreach(rt->tree, func, user_data);
290
+ }
291
+ tcg_region_tree_unlock_all();
292
+}
293
+
294
+size_t tcg_nb_tbs(void)
295
+{
296
+ size_t nb_tbs = 0;
297
+ size_t i;
298
+
299
+ tcg_region_tree_lock_all();
300
+ for (i = 0; i < region.n; i++) {
301
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
302
+
303
+ nb_tbs += g_tree_nnodes(rt->tree);
304
+ }
305
+ tcg_region_tree_unlock_all();
306
+ return nb_tbs;
307
+}
308
+
309
+static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
310
+{
311
+ TranslationBlock *tb = v;
312
+
313
+ tb_destroy(tb);
314
+ return FALSE;
315
+}
316
+
317
+static void tcg_region_tree_reset_all(void)
318
+{
319
+ size_t i;
320
+
321
+ tcg_region_tree_lock_all();
322
+ for (i = 0; i < region.n; i++) {
323
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
324
+
325
+ g_tree_foreach(rt->tree, tcg_region_tree_traverse, NULL);
326
+ /* Increment the refcount first so that destroy acts as a reset */
327
+ g_tree_ref(rt->tree);
328
+ g_tree_destroy(rt->tree);
329
+ }
330
+ tcg_region_tree_unlock_all();
331
+}
332
+
333
+static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
334
+{
335
+ void *start, *end;
336
+
337
+ start = region.start_aligned + curr_region * region.stride;
338
+ end = start + region.size;
339
+
340
+ if (curr_region == 0) {
341
+ start = region.start;
342
+ }
343
+ if (curr_region == region.n - 1) {
344
+ end = region.end;
345
+ }
346
+
347
+ *pstart = start;
348
+ *pend = end;
349
+}
350
+
351
+static void tcg_region_assign(TCGContext *s, size_t curr_region)
352
+{
353
+ void *start, *end;
354
+
355
+ tcg_region_bounds(curr_region, &start, &end);
356
+
357
+ s->code_gen_buffer = start;
358
+ s->code_gen_ptr = start;
359
+ s->code_gen_buffer_size = end - start;
360
+ s->code_gen_highwater = end - TCG_HIGHWATER;
361
+}
362
+
363
+static bool tcg_region_alloc__locked(TCGContext *s)
364
+{
365
+ if (region.current == region.n) {
366
+ return true;
367
+ }
368
+ tcg_region_assign(s, region.current);
369
+ region.current++;
370
+ return false;
371
+}
372
+
373
+/*
374
+ * Request a new region once the one in use has filled up.
375
+ * Returns true on error.
376
+ */
377
+bool tcg_region_alloc(TCGContext *s)
378
+{
379
+ bool err;
380
+ /* read the region size now; alloc__locked will overwrite it on success */
381
+ size_t size_full = s->code_gen_buffer_size;
382
+
383
+ qemu_mutex_lock(&region.lock);
384
+ err = tcg_region_alloc__locked(s);
385
+ if (!err) {
386
+ region.agg_size_full += size_full - TCG_HIGHWATER;
387
+ }
388
+ qemu_mutex_unlock(&region.lock);
389
+ return err;
390
+}
391
+
392
+/*
393
+ * Perform a context's first region allocation.
394
+ * This function does _not_ increment region.agg_size_full.
395
+ */
396
+static void tcg_region_initial_alloc__locked(TCGContext *s)
397
+{
398
+ bool err = tcg_region_alloc__locked(s);
399
+ g_assert(!err);
400
+}
401
+
402
+void tcg_region_initial_alloc(TCGContext *s)
403
+{
404
+ qemu_mutex_lock(&region.lock);
405
+ tcg_region_initial_alloc__locked(s);
406
+ qemu_mutex_unlock(&region.lock);
407
+}
408
+
409
+/* Call from a safe-work context */
410
+void tcg_region_reset_all(void)
411
+{
412
+ unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
413
+ unsigned int i;
414
+
415
+ qemu_mutex_lock(&region.lock);
416
+ region.current = 0;
417
+ region.agg_size_full = 0;
418
+
419
+ for (i = 0; i < n_ctxs; i++) {
420
+ TCGContext *s = qatomic_read(&tcg_ctxs[i]);
421
+ tcg_region_initial_alloc__locked(s);
422
+ }
423
+ qemu_mutex_unlock(&region.lock);
424
+
425
+ tcg_region_tree_reset_all();
426
+}
427
+
428
+#ifdef CONFIG_USER_ONLY
429
+static size_t tcg_n_regions(void)
430
+{
431
+ return 1;
432
+}
433
+#else
434
+/*
435
+ * It is likely that some vCPUs will translate more code than others, so we
436
+ * first try to set more regions than max_cpus, with those regions being of
437
+ * reasonable size. If that's not possible we make do by evenly dividing
438
+ * the code_gen_buffer among the vCPUs.
439
+ */
440
+static size_t tcg_n_regions(void)
441
+{
442
+ size_t i;
443
+
444
+ /* Use a single region if all we have is one vCPU thread */
445
+#if !defined(CONFIG_USER_ONLY)
446
+ MachineState *ms = MACHINE(qdev_get_machine());
447
+ unsigned int max_cpus = ms->smp.max_cpus;
448
+#endif
449
+ if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
450
+ return 1;
451
+ }
452
+
453
+ /* Try to have more regions than max_cpus, with each region being >= 2 MB */
454
+ for (i = 8; i > 0; i--) {
455
+ size_t regions_per_thread = i;
456
+ size_t region_size;
457
+
458
+ region_size = tcg_init_ctx.code_gen_buffer_size;
459
+ region_size /= max_cpus * regions_per_thread;
460
+
461
+ if (region_size >= 2 * 1024u * 1024) {
462
+ return max_cpus * regions_per_thread;
463
+ }
464
+ }
465
+ /* If we can't, then just allocate one region per vCPU thread */
466
+ return max_cpus;
467
+}
468
+#endif
469
+
470
+/*
471
+ * Initializes region partitioning.
472
+ *
473
+ * Called at init time from the parent thread (i.e. the one calling
474
+ * tcg_context_init), after the target's TCG globals have been set.
475
+ *
476
+ * Region partitioning works by splitting code_gen_buffer into separate regions,
477
+ * and then assigning regions to TCG threads so that the threads can translate
478
+ * code in parallel without synchronization.
479
+ *
480
+ * In softmmu the number of TCG threads is bounded by max_cpus, so we use at
481
+ * least max_cpus regions in MTTCG. In !MTTCG we use a single region.
482
+ * Note that the TCG options from the command-line (i.e. -accel accel=tcg,[...])
483
+ * must have been parsed before calling this function, since it calls
484
+ * qemu_tcg_mttcg_enabled().
485
+ *
486
+ * In user-mode we use a single region. Having multiple regions in user-mode
487
+ * is not supported, because the number of vCPU threads (recall that each thread
488
+ * spawned by the guest corresponds to a vCPU thread) is only bounded by the
489
+ * OS, and usually this number is huge (tens of thousands is not uncommon).
490
+ * Thus, given this large bound on the number of vCPU threads and the fact
491
+ * that code_gen_buffer is allocated at compile-time, we cannot guarantee
492
+ * that the availability of at least one region per vCPU thread.
493
+ *
494
+ * However, this user-mode limitation is unlikely to be a significant problem
495
+ * in practice. Multi-threaded guests share most if not all of their translated
496
+ * code, which makes parallel code generation less appealing than in softmmu.
497
+ */
498
+void tcg_region_init(void)
499
+{
500
+ void *buf = tcg_init_ctx.code_gen_buffer;
501
+ void *aligned;
502
+ size_t size = tcg_init_ctx.code_gen_buffer_size;
503
+ size_t page_size = qemu_real_host_page_size;
504
+ size_t region_size;
505
+ size_t n_regions;
506
+ size_t i;
507
+
508
+ n_regions = tcg_n_regions();
509
+
510
+ /* The first region will be 'aligned - buf' bytes larger than the others */
511
+ aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
512
+ g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
513
+ /*
514
+ * Make region_size a multiple of page_size, using aligned as the start.
515
+ * As a result of this we might end up with a few extra pages at the end of
516
+ * the buffer; we will assign those to the last region.
517
+ */
518
+ region_size = (size - (aligned - buf)) / n_regions;
519
+ region_size = QEMU_ALIGN_DOWN(region_size, page_size);
520
+
521
+ /* A region must have at least 2 pages; one code, one guard */
522
+ g_assert(region_size >= 2 * page_size);
523
+
524
+ /* init the region struct */
525
+ qemu_mutex_init(&region.lock);
526
+ region.n = n_regions;
527
+ region.size = region_size - page_size;
528
+ region.stride = region_size;
529
+ region.start = buf;
530
+ region.start_aligned = aligned;
531
+ /* page-align the end, since its last page will be a guard page */
532
+ region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
533
+ /* account for that last guard page */
534
+ region.end -= page_size;
535
+
536
+ /*
537
+ * Set guard pages in the rw buffer, as that's the one into which
538
+ * buffer overruns could occur. Do not set guard pages in the rx
539
+ * buffer -- let that one use hugepages throughout.
540
+ */
541
+ for (i = 0; i < region.n; i++) {
542
+ void *start, *end;
543
+
544
+ tcg_region_bounds(i, &start, &end);
545
+
546
+ /*
547
+ * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
548
+ * rejects a permission change from RWX -> NONE. Guard pages are
549
+ * nice for bug detection but are not essential; ignore any failure.
550
+ */
551
+ (void)qemu_mprotect_none(end, page_size);
552
+ }
553
+
554
+ tcg_region_trees_init();
555
+
556
+ /*
557
+ * Leave the initial context initialized to the first region.
558
+ * This will be the context into which we generate the prologue.
559
+ * It is also the only context for CONFIG_USER_ONLY.
560
+ */
561
+ tcg_region_initial_alloc__locked(&tcg_init_ctx);
562
+}
563
+
564
+void tcg_region_prologue_set(TCGContext *s)
565
+{
566
+ /* Deduct the prologue from the first region. */
567
+ g_assert(region.start == s->code_gen_buffer);
568
+ region.start = s->code_ptr;
569
+
570
+ /* Recompute boundaries of the first region. */
571
+ tcg_region_assign(s, 0);
572
+
573
+ /* Register the balance of the buffer with gdb. */
574
+ tcg_register_jit(tcg_splitwx_to_rx(region.start),
575
+ region.end - region.start);
576
+}
577
+
578
+/*
579
+ * Returns the size (in bytes) of all translated code (i.e. from all regions)
580
+ * currently in the cache.
581
+ * See also: tcg_code_capacity()
582
+ * Do not confuse with tcg_current_code_size(); that one applies to a single
583
+ * TCG context.
584
+ */
585
+size_t tcg_code_size(void)
586
+{
587
+ unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
588
+ unsigned int i;
589
+ size_t total;
590
+
591
+ qemu_mutex_lock(&region.lock);
592
+ total = region.agg_size_full;
593
+ for (i = 0; i < n_ctxs; i++) {
594
+ const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
595
+ size_t size;
596
+
597
+ size = qatomic_read(&s->code_gen_ptr) - s->code_gen_buffer;
598
+ g_assert(size <= s->code_gen_buffer_size);
599
+ total += size;
600
+ }
601
+ qemu_mutex_unlock(&region.lock);
602
+ return total;
603
+}
604
+
605
+/*
606
+ * Returns the code capacity (in bytes) of the entire cache, i.e. including all
607
+ * regions.
608
+ * See also: tcg_code_size()
609
+ */
610
+size_t tcg_code_capacity(void)
611
+{
612
+ size_t guard_size, capacity;
613
+
614
+ /* no need for synchronization; these variables are set at init time */
615
+ guard_size = region.stride - region.size;
616
+ capacity = region.end + guard_size - region.start;
617
+ capacity -= region.n * (guard_size + TCG_HIGHWATER);
618
+ return capacity;
619
+}
620
+
621
+size_t tcg_tb_phys_invalidate_count(void)
622
+{
623
+ unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
624
+ unsigned int i;
625
+ size_t total = 0;
626
+
627
+ for (i = 0; i < n_ctxs; i++) {
628
+ const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
629
+
630
+ total += qatomic_read(&s->tb_phys_invalidate_count);
631
+ }
632
+ return total;
633
+}
634
diff --git a/tcg/tcg.c b/tcg/tcg.c
635
index XXXXXXX..XXXXXXX 100644
636
--- a/tcg/tcg.c
637
+++ b/tcg/tcg.c
638
@@ -XXX,XX +XXX,XX @@
639
640
#include "elf.h"
641
#include "exec/log.h"
642
+#include "tcg-internal.h"
643
644
/* Forward declarations for functions declared in tcg-target.c.inc and
645
used here. */
646
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct);
647
static int tcg_out_ldst_finalize(TCGContext *s);
648
#endif
649
650
-#define TCG_HIGHWATER 1024
651
-
652
-static TCGContext **tcg_ctxs;
653
-static unsigned int n_tcg_ctxs;
654
+TCGContext **tcg_ctxs;
655
+unsigned int n_tcg_ctxs;
656
TCGv_env cpu_env = 0;
657
const void *tcg_code_gen_epilogue;
658
uintptr_t tcg_splitwx_diff;
659
@@ -XXX,XX +XXX,XX @@ uintptr_t tcg_splitwx_diff;
660
tcg_prologue_fn *tcg_qemu_tb_exec;
661
#endif
662
663
-struct tcg_region_tree {
664
- QemuMutex lock;
665
- GTree *tree;
666
- /* padding to avoid false sharing is computed at run-time */
667
-};
668
-
669
-/*
670
- * We divide code_gen_buffer into equally-sized "regions" that TCG threads
671
- * dynamically allocate from as demand dictates. Given appropriate region
672
- * sizing, this minimizes flushes even when some TCG threads generate a lot
673
- * more code than others.
674
- */
675
-struct tcg_region_state {
676
- QemuMutex lock;
677
-
678
- /* fields set at init time */
679
- void *start;
680
- void *start_aligned;
681
- void *end;
682
- size_t n;
683
- size_t size; /* size of one region */
684
- size_t stride; /* .size + guard size */
685
-
686
- /* fields protected by the lock */
687
- size_t current; /* current region index */
688
- size_t agg_size_full; /* aggregate size of full regions */
689
-};
690
-
691
-static struct tcg_region_state region;
692
-/*
693
- * This is an array of struct tcg_region_tree's, with padding.
694
- * We use void * to simplify the computation of region_trees[i]; each
695
- * struct is found every tree_size bytes.
696
- */
697
-static void *region_trees;
698
-static size_t tree_size;
699
static TCGRegSet tcg_target_available_regs[TCG_TYPE_COUNT];
700
static TCGRegSet tcg_target_call_clobber_regs;
701
702
@@ -XXX,XX +XXX,XX @@ static const TCGTargetOpDef constraint_sets[] = {
703
704
#include "tcg-target.c.inc"
705
706
-/* compare a pointer @ptr and a tb_tc @s */
707
-static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
708
-{
709
- if (ptr >= s->ptr + s->size) {
710
- return 1;
711
- } else if (ptr < s->ptr) {
712
- return -1;
713
- }
714
- return 0;
715
-}
716
-
717
-static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp)
718
-{
719
- const struct tb_tc *a = ap;
720
- const struct tb_tc *b = bp;
721
-
722
- /*
723
- * When both sizes are set, we know this isn't a lookup.
724
- * This is the most likely case: every TB must be inserted; lookups
725
- * are a lot less frequent.
726
- */
727
- if (likely(a->size && b->size)) {
728
- if (a->ptr > b->ptr) {
729
- return 1;
730
- } else if (a->ptr < b->ptr) {
731
- return -1;
732
- }
733
- /* a->ptr == b->ptr should happen only on deletions */
734
- g_assert(a->size == b->size);
735
- return 0;
736
- }
737
- /*
738
- * All lookups have either .size field set to 0.
739
- * From the glib sources we see that @ap is always the lookup key. However
740
- * the docs provide no guarantee, so we just mark this case as likely.
741
- */
742
- if (likely(a->size == 0)) {
743
- return ptr_cmp_tb_tc(a->ptr, b);
744
- }
745
- return ptr_cmp_tb_tc(b->ptr, a);
746
-}
747
-
748
-static void tcg_region_trees_init(void)
749
-{
750
- size_t i;
751
-
752
- tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize);
753
- region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size);
754
- for (i = 0; i < region.n; i++) {
755
- struct tcg_region_tree *rt = region_trees + i * tree_size;
756
-
757
- qemu_mutex_init(&rt->lock);
758
- rt->tree = g_tree_new(tb_tc_cmp);
759
- }
760
-}
761
-
762
-static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
763
-{
764
- size_t region_idx;
765
-
766
- /*
767
- * Like tcg_splitwx_to_rw, with no assert. The pc may come from
768
- * a signal handler over which the caller has no control.
769
- */
770
- if (!in_code_gen_buffer(p)) {
771
- p -= tcg_splitwx_diff;
772
- if (!in_code_gen_buffer(p)) {
773
- return NULL;
774
- }
775
- }
776
-
777
- if (p < region.start_aligned) {
778
- region_idx = 0;
779
- } else {
780
- ptrdiff_t offset = p - region.start_aligned;
781
-
782
- if (offset > region.stride * (region.n - 1)) {
783
- region_idx = region.n - 1;
784
- } else {
785
- region_idx = offset / region.stride;
786
- }
787
- }
788
- return region_trees + region_idx * tree_size;
789
-}
790
-
791
-void tcg_tb_insert(TranslationBlock *tb)
792
-{
793
- struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
794
-
795
- g_assert(rt != NULL);
796
- qemu_mutex_lock(&rt->lock);
797
- g_tree_insert(rt->tree, &tb->tc, tb);
798
- qemu_mutex_unlock(&rt->lock);
799
-}
800
-
801
-void tcg_tb_remove(TranslationBlock *tb)
802
-{
803
- struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
804
-
805
- g_assert(rt != NULL);
806
- qemu_mutex_lock(&rt->lock);
807
- g_tree_remove(rt->tree, &tb->tc);
808
- qemu_mutex_unlock(&rt->lock);
809
-}
810
-
811
-/*
812
- * Find the TB 'tb' such that
813
- * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size
814
- * Return NULL if not found.
815
- */
816
-TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
817
-{
818
- struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr);
819
- TranslationBlock *tb;
820
- struct tb_tc s = { .ptr = (void *)tc_ptr };
821
-
822
- if (rt == NULL) {
823
- return NULL;
824
- }
825
-
826
- qemu_mutex_lock(&rt->lock);
827
- tb = g_tree_lookup(rt->tree, &s);
828
- qemu_mutex_unlock(&rt->lock);
829
- return tb;
830
-}
831
-
832
-static void tcg_region_tree_lock_all(void)
833
-{
834
- size_t i;
835
-
836
- for (i = 0; i < region.n; i++) {
837
- struct tcg_region_tree *rt = region_trees + i * tree_size;
838
-
839
- qemu_mutex_lock(&rt->lock);
840
- }
841
-}
842
-
843
-static void tcg_region_tree_unlock_all(void)
844
-{
845
- size_t i;
846
-
847
- for (i = 0; i < region.n; i++) {
848
- struct tcg_region_tree *rt = region_trees + i * tree_size;
849
-
850
- qemu_mutex_unlock(&rt->lock);
851
- }
852
-}
853
-
854
-void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
855
-{
856
- size_t i;
857
-
858
- tcg_region_tree_lock_all();
859
- for (i = 0; i < region.n; i++) {
860
- struct tcg_region_tree *rt = region_trees + i * tree_size;
861
-
862
- g_tree_foreach(rt->tree, func, user_data);
863
- }
864
- tcg_region_tree_unlock_all();
865
-}
866
-
867
-size_t tcg_nb_tbs(void)
868
-{
869
- size_t nb_tbs = 0;
870
- size_t i;
871
-
872
- tcg_region_tree_lock_all();
873
- for (i = 0; i < region.n; i++) {
874
- struct tcg_region_tree *rt = region_trees + i * tree_size;
875
-
876
- nb_tbs += g_tree_nnodes(rt->tree);
877
- }
878
- tcg_region_tree_unlock_all();
879
- return nb_tbs;
880
-}
881
-
882
-static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
883
-{
884
- TranslationBlock *tb = v;
885
-
886
- tb_destroy(tb);
887
- return FALSE;
888
-}
889
-
890
-static void tcg_region_tree_reset_all(void)
891
-{
892
- size_t i;
893
-
894
- tcg_region_tree_lock_all();
895
- for (i = 0; i < region.n; i++) {
896
- struct tcg_region_tree *rt = region_trees + i * tree_size;
897
-
898
- g_tree_foreach(rt->tree, tcg_region_tree_traverse, NULL);
899
- /* Increment the refcount first so that destroy acts as a reset */
900
- g_tree_ref(rt->tree);
901
- g_tree_destroy(rt->tree);
902
- }
903
- tcg_region_tree_unlock_all();
904
-}
905
-
906
-static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
907
-{
908
- void *start, *end;
909
-
910
- start = region.start_aligned + curr_region * region.stride;
911
- end = start + region.size;
912
-
913
- if (curr_region == 0) {
914
- start = region.start;
915
- }
916
- if (curr_region == region.n - 1) {
917
- end = region.end;
918
- }
919
-
920
- *pstart = start;
921
- *pend = end;
922
-}
923
-
924
-static void tcg_region_assign(TCGContext *s, size_t curr_region)
925
-{
926
- void *start, *end;
927
-
928
- tcg_region_bounds(curr_region, &start, &end);
929
-
930
- s->code_gen_buffer = start;
931
- s->code_gen_ptr = start;
932
- s->code_gen_buffer_size = end - start;
933
- s->code_gen_highwater = end - TCG_HIGHWATER;
934
-}
935
-
936
-static bool tcg_region_alloc__locked(TCGContext *s)
937
-{
938
- if (region.current == region.n) {
939
- return true;
940
- }
941
- tcg_region_assign(s, region.current);
942
- region.current++;
943
- return false;
944
-}
945
-
946
-/*
947
- * Request a new region once the one in use has filled up.
948
- * Returns true on error.
949
- */
950
-static bool tcg_region_alloc(TCGContext *s)
951
-{
952
- bool err;
953
- /* read the region size now; alloc__locked will overwrite it on success */
954
- size_t size_full = s->code_gen_buffer_size;
955
-
956
- qemu_mutex_lock(&region.lock);
957
- err = tcg_region_alloc__locked(s);
958
- if (!err) {
959
- region.agg_size_full += size_full - TCG_HIGHWATER;
960
- }
961
- qemu_mutex_unlock(&region.lock);
962
- return err;
963
-}
964
-
965
-/*
966
- * Perform a context's first region allocation.
967
- * This function does _not_ increment region.agg_size_full.
968
- */
969
-static void tcg_region_initial_alloc__locked(TCGContext *s)
970
-{
971
- bool err = tcg_region_alloc__locked(s);
972
- g_assert(!err);
973
-}
974
-
975
-#ifndef CONFIG_USER_ONLY
976
-static void tcg_region_initial_alloc(TCGContext *s)
977
-{
978
- qemu_mutex_lock(&region.lock);
979
- tcg_region_initial_alloc__locked(s);
980
- qemu_mutex_unlock(&region.lock);
981
-}
982
-#endif
983
-
984
-/* Call from a safe-work context */
985
-void tcg_region_reset_all(void)
986
-{
987
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
988
- unsigned int i;
989
-
990
- qemu_mutex_lock(&region.lock);
991
- region.current = 0;
992
- region.agg_size_full = 0;
993
-
994
- for (i = 0; i < n_ctxs; i++) {
995
- TCGContext *s = qatomic_read(&tcg_ctxs[i]);
996
- tcg_region_initial_alloc__locked(s);
997
- }
998
- qemu_mutex_unlock(&region.lock);
999
-
1000
- tcg_region_tree_reset_all();
1001
-}
1002
-
1003
-#ifdef CONFIG_USER_ONLY
1004
-static size_t tcg_n_regions(void)
1005
-{
1006
- return 1;
1007
-}
1008
-#else
1009
-/*
1010
- * It is likely that some vCPUs will translate more code than others, so we
1011
- * first try to set more regions than max_cpus, with those regions being of
1012
- * reasonable size. If that's not possible we make do by evenly dividing
1013
- * the code_gen_buffer among the vCPUs.
1014
- */
1015
-static size_t tcg_n_regions(void)
1016
-{
1017
- size_t i;
1018
-
1019
- /* Use a single region if all we have is one vCPU thread */
1020
-#if !defined(CONFIG_USER_ONLY)
1021
- MachineState *ms = MACHINE(qdev_get_machine());
1022
- unsigned int max_cpus = ms->smp.max_cpus;
1023
-#endif
1024
- if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
1025
- return 1;
1026
- }
1027
-
1028
- /* Try to have more regions than max_cpus, with each region being >= 2 MB */
1029
- for (i = 8; i > 0; i--) {
1030
- size_t regions_per_thread = i;
1031
- size_t region_size;
1032
-
1033
- region_size = tcg_init_ctx.code_gen_buffer_size;
1034
- region_size /= max_cpus * regions_per_thread;
1035
-
1036
- if (region_size >= 2 * 1024u * 1024) {
1037
- return max_cpus * regions_per_thread;
1038
- }
1039
- }
1040
- /* If we can't, then just allocate one region per vCPU thread */
1041
- return max_cpus;
1042
-}
1043
-#endif
1044
-
1045
-/*
1046
- * Initializes region partitioning.
1047
- *
1048
- * Called at init time from the parent thread (i.e. the one calling
1049
- * tcg_context_init), after the target's TCG globals have been set.
1050
- *
1051
- * Region partitioning works by splitting code_gen_buffer into separate regions,
1052
- * and then assigning regions to TCG threads so that the threads can translate
1053
- * code in parallel without synchronization.
1054
- *
1055
- * In softmmu the number of TCG threads is bounded by max_cpus, so we use at
1056
- * least max_cpus regions in MTTCG. In !MTTCG we use a single region.
1057
- * Note that the TCG options from the command-line (i.e. -accel accel=tcg,[...])
1058
- * must have been parsed before calling this function, since it calls
1059
- * qemu_tcg_mttcg_enabled().
1060
- *
1061
- * In user-mode we use a single region. Having multiple regions in user-mode
1062
- * is not supported, because the number of vCPU threads (recall that each thread
1063
- * spawned by the guest corresponds to a vCPU thread) is only bounded by the
1064
- * OS, and usually this number is huge (tens of thousands is not uncommon).
1065
- * Thus, given this large bound on the number of vCPU threads and the fact
1066
- * that code_gen_buffer is allocated at compile-time, we cannot guarantee
1067
- * that the availability of at least one region per vCPU thread.
1068
- *
1069
- * However, this user-mode limitation is unlikely to be a significant problem
1070
- * in practice. Multi-threaded guests share most if not all of their translated
1071
- * code, which makes parallel code generation less appealing than in softmmu.
1072
- */
1073
-void tcg_region_init(void)
1074
-{
1075
- void *buf = tcg_init_ctx.code_gen_buffer;
1076
- void *aligned;
1077
- size_t size = tcg_init_ctx.code_gen_buffer_size;
1078
- size_t page_size = qemu_real_host_page_size;
1079
- size_t region_size;
1080
- size_t n_regions;
1081
- size_t i;
1082
-
1083
- n_regions = tcg_n_regions();
1084
-
1085
- /* The first region will be 'aligned - buf' bytes larger than the others */
1086
- aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
1087
- g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
1088
- /*
1089
- * Make region_size a multiple of page_size, using aligned as the start.
1090
- * As a result of this we might end up with a few extra pages at the end of
1091
- * the buffer; we will assign those to the last region.
1092
- */
1093
- region_size = (size - (aligned - buf)) / n_regions;
1094
- region_size = QEMU_ALIGN_DOWN(region_size, page_size);
1095
-
1096
- /* A region must have at least 2 pages; one code, one guard */
1097
- g_assert(region_size >= 2 * page_size);
1098
-
1099
- /* init the region struct */
1100
- qemu_mutex_init(&region.lock);
1101
- region.n = n_regions;
1102
- region.size = region_size - page_size;
1103
- region.stride = region_size;
1104
- region.start = buf;
1105
- region.start_aligned = aligned;
1106
- /* page-align the end, since its last page will be a guard page */
1107
- region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
1108
- /* account for that last guard page */
1109
- region.end -= page_size;
1110
-
1111
- /*
1112
- * Set guard pages in the rw buffer, as that's the one into which
1113
- * buffer overruns could occur. Do not set guard pages in the rx
1114
- * buffer -- let that one use hugepages throughout.
1115
- */
1116
- for (i = 0; i < region.n; i++) {
1117
- void *start, *end;
1118
-
1119
- tcg_region_bounds(i, &start, &end);
1120
-
1121
- /*
1122
- * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
1123
- * rejects a permission change from RWX -> NONE. Guard pages are
1124
- * nice for bug detection but are not essential; ignore any failure.
1125
- */
1126
- (void)qemu_mprotect_none(end, page_size);
1127
- }
1128
-
1129
- tcg_region_trees_init();
1130
-
1131
- /*
1132
- * Leave the initial context initialized to the first region.
1133
- * This will be the context into which we generate the prologue.
1134
- * It is also the only context for CONFIG_USER_ONLY.
1135
- */
1136
- tcg_region_initial_alloc__locked(&tcg_init_ctx);
1137
-}
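A worked layout example for the arithmetic above, again with hypothetical values:

    /*
     * buf already page-aligned, size = 32 MiB,
     * page_size = 4 KiB, n_regions = 4.
     *   region_size   = 32 MiB / 4 = 8 MiB (a page multiple)
     *   region.stride = 8 MiB
     *   region.size   = 8 MiB - 4 KiB (usable bytes)
     * The last page of each region is mprotect'ed away as a
     * guard page, and region.end backs off one further page so
     * that the final region is guarded as well.
     */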
1138
-
1139
-static void tcg_region_prologue_set(TCGContext *s)
1140
-{
1141
- /* Deduct the prologue from the first region. */
1142
- g_assert(region.start == s->code_gen_buffer);
1143
- region.start = s->code_ptr;
1144
-
1145
- /* Recompute boundaries of the first region. */
1146
- tcg_region_assign(s, 0);
1147
-
1148
- /* Register the balance of the buffer with gdb. */
1149
- tcg_register_jit(tcg_splitwx_to_rx(region.start),
1150
- region.end - region.start);
1151
-}
1152
-
1153
#ifdef CONFIG_DEBUG_TCG
1154
const void *tcg_splitwx_to_rx(void *rw)
1155
{
1156
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
1157
}
1158
#endif /* !CONFIG_USER_ONLY */
1159
1160
-/*
1161
- * Returns the size (in bytes) of all translated code (i.e. from all regions)
1162
- * currently in the cache.
1163
- * See also: tcg_code_capacity()
1164
- * Do not confuse with tcg_current_code_size(); that one applies to a single
1165
- * TCG context.
1166
- */
1167
-size_t tcg_code_size(void)
1168
-{
1169
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
1170
- unsigned int i;
1171
- size_t total;
1172
-
1173
- qemu_mutex_lock(&region.lock);
1174
- total = region.agg_size_full;
1175
- for (i = 0; i < n_ctxs; i++) {
1176
- const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
1177
- size_t size;
1178
-
1179
- size = qatomic_read(&s->code_gen_ptr) - s->code_gen_buffer;
1180
- g_assert(size <= s->code_gen_buffer_size);
1181
- total += size;
1182
- }
1183
- qemu_mutex_unlock(&region.lock);
1184
- return total;
1185
-}
1186
-
1187
-/*
1188
- * Returns the code capacity (in bytes) of the entire cache, i.e. including all
1189
- * regions.
1190
- * See also: tcg_code_size()
1191
- */
1192
-size_t tcg_code_capacity(void)
1193
-{
1194
- size_t guard_size, capacity;
1195
-
1196
- /* no need for synchronization; these variables are set at init time */
1197
- guard_size = region.stride - region.size;
1198
- capacity = region.end + guard_size - region.start;
1199
- capacity -= region.n * (guard_size + TCG_HIGHWATER);
1200
- return capacity;
1201
-}
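Continuing that example for the capacity accessor:

    /*
     * guard_size = region.stride - region.size = one 4 KiB page.
     * tcg_code_capacity() subtracts, per region, that guard page
     * plus TCG_HIGHWATER bytes -- the slack each context keeps
     * free so a translation already in progress can always be
     * completed before the region is declared full.
     */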
1202
-
1203
-size_t tcg_tb_phys_invalidate_count(void)
1204
-{
1205
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
1206
- unsigned int i;
1207
- size_t total = 0;
1208
-
1209
- for (i = 0; i < n_ctxs; i++) {
1210
- const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
1211
-
1212
- total += qatomic_read(&s->tb_phys_invalidate_count);
1213
- }
1214
- return total;
1215
-}
1216
-
1217
/* pool based memory allocation */
1218
void *tcg_malloc_internal(TCGContext *s, int size)
1219
{
1220
diff --git a/tcg/meson.build b/tcg/meson.build
1221
index XXXXXXX..XXXXXXX 100644
1222
--- a/tcg/meson.build
1223
+++ b/tcg/meson.build
1224
@@ -XXX,XX +XXX,XX @@ tcg_ss = ss.source_set()
1225
1226
tcg_ss.add(files(
1227
'optimize.c',
1228
+ 'region.c',
1229
'tcg.c',
1230
'tcg-common.c',
1231
'tcg-op.c',
1232
--
1233
2.25.1
1234
1235
Deleted patch
1
It consists of one function call and has only one caller.
2
1
3
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
accel/tcg/translate-all.c | 7 +------
9
1 file changed, 1 insertion(+), 6 deletions(-)
10
11
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/accel/tcg/translate-all.c
14
+++ b/accel/tcg/translate-all.c
15
@@ -XXX,XX +XXX,XX @@ static void page_table_config_init(void)
16
assert(v_l2_levels >= 0);
17
}
18
19
-static void cpu_gen_init(void)
20
-{
21
- tcg_context_init(&tcg_init_ctx);
22
-}
23
-
24
/* Encode VAL as a signed leb128 sequence at P.
25
Return P incremented past the encoded value. */
26
static uint8_t *encode_sleb128(uint8_t *p, target_long val)
27
@@ -XXX,XX +XXX,XX @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
28
bool ok;
29
30
tcg_allowed = true;
31
- cpu_gen_init();
32
+ tcg_context_init(&tcg_init_ctx);
33
page_init();
34
tb_htable_init();
35
36
--
37
2.25.1
38
39
1
Typo in the conversion to FloatParts64.
1
From: Roman Artemev <roman.artemev@syntacore.com>
2
2
3
Fixes: 572c4d862ff2
3
On RISC-V, the StoreStore barrier corresponds to
4
Fixes: Coverity CID 1457457
4
`fence w, w`, not `fence r, r`.
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Cc: qemu-stable@nongnu.org
7
Message-Id: <20210607223812.110596-1-richard.henderson@linaro.org>
7
Fixes: efbea94c76b ("tcg/riscv: Add slowpath load and store instructions")
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Denis Tomashev <denis.tomashev@syntacore.com>
10
Signed-off-by: Roman Artemev <roman.artemev@syntacore.com>
11
Message-ID: <e2f2131e294a49e79959d4fa9ec02cf4@syntacore.com>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
---
13
---
10
fpu/softfloat.c | 2 +-
14
tcg/riscv/tcg-target.c.inc | 2 +-
11
1 file changed, 1 insertion(+), 1 deletion(-)
15
1 file changed, 1 insertion(+), 1 deletion(-)
12
16
13
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
17
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
14
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
15
--- a/fpu/softfloat.c
19
--- a/tcg/riscv/tcg-target.c.inc
16
+++ b/fpu/softfloat.c
20
+++ b/tcg/riscv/tcg-target.c.inc
17
@@ -XXX,XX +XXX,XX @@ float32 float32_exp2(float32 a, float_status *status)
21
@@ -XXX,XX +XXX,XX @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
18
22
insn |= 0x02100000;
19
float_raise(float_flag_inexact, status);
23
}
20
24
if (a0 & TCG_MO_ST_ST) {
21
- float64_unpack_canonical(&xnp, float64_ln2, status);
25
- insn |= 0x02200000;
22
+ float64_unpack_canonical(&tp, float64_ln2, status);
26
+ insn |= 0x01100000;
23
xp = *parts_mul(&xp, &tp, status);
27
}
24
xnp = xp;
28
tcg_out32(s, insn);
25
29
}
26
--
30
--
27
2.25.1
31
2.43.0
28
29
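For readers checking the two constants against the ISA: the RISC-V FENCE instruction keeps its predecessor set in bits [27:24] (PI, PO, PR, PW) and its successor set in bits [23:20] (SI, SO, SR, SW). A minimal standalone sketch that decodes them (the constants are from the patch; everything else is illustrative):

    #include <stdio.h>

    int main(void)
    {
        unsigned old_insn = 0x02200000; /* PR | SR => fence r, r */
        unsigned new_insn = 0x01100000; /* PW | SW => fence w, w */

        printf("old: pred=0x%x succ=0x%x\n",
               (old_insn >> 24) & 0xf, (old_insn >> 20) & 0xf);
        printf("new: pred=0x%x succ=0x%x\n",
               (new_insn >> 24) & 0xf, (new_insn >> 20) & 0xf);
        return 0;
    }

The old constant set the R bit in both fields, i.e. a read barrier, which is why stores could still be reordered across TCG_MO_ST_ST.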
1
From: Luis Pires <luis.pires@eldorado.org.br>
1
This allows targets to declare that the helper requires a
2
float_status pointer instead of a generic void pointer.
2
3
3
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Message-Id: <20210601125143.191165-1-luis.pires@eldorado.org.br>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
6
---
7
docs/devel/tcg.rst | 101 ++++++++++++++++++++++++++++++++++++++++-----
7
include/exec/helper-head.h.inc | 3 +++
8
1 file changed, 90 insertions(+), 11 deletions(-)
8
1 file changed, 3 insertions(+)
9
9
10
diff --git a/docs/devel/tcg.rst b/docs/devel/tcg.rst
10
diff --git a/include/exec/helper-head.h.inc b/include/exec/helper-head.h.inc
11
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
12
--- a/docs/devel/tcg.rst
12
--- a/include/exec/helper-head.h.inc
13
+++ b/docs/devel/tcg.rst
13
+++ b/include/exec/helper-head.h.inc
14
@@ -XXX,XX +XXX,XX @@ performances.
14
@@ -XXX,XX +XXX,XX @@
15
QEMU's dynamic translation backend is called TCG, for "Tiny Code
15
#define dh_alias_ptr ptr
16
Generator". For more information, please take a look at ``tcg/README``.
16
#define dh_alias_cptr ptr
17
17
#define dh_alias_env ptr
18
-Some notable features of QEMU's dynamic translator are:
18
+#define dh_alias_fpst ptr
19
+The following sections outline some notable features and implementation
19
#define dh_alias_void void
20
+details of QEMU's dynamic translator.
20
#define dh_alias_noreturn noreturn
21
21
#define dh_alias(t) glue(dh_alias_, t)
22
CPU state optimisations
22
@@ -XXX,XX +XXX,XX @@
23
-----------------------
23
#define dh_ctype_ptr void *
24
24
#define dh_ctype_cptr const void *
25
-The target CPUs have many internal states which change the way it
25
#define dh_ctype_env CPUArchState *
26
-evaluates instructions. In order to achieve a good speed, the
26
+#define dh_ctype_fpst float_status *
27
+The target CPUs have many internal states which change the way they
27
#define dh_ctype_void void
28
+evaluate instructions. In order to achieve a good speed, the
28
#define dh_ctype_noreturn G_NORETURN void
29
translation phase considers that some state information of the virtual
29
#define dh_ctype(t) dh_ctype_##t
30
CPU cannot change in it. The state is recorded in the Translation
30
@@ -XXX,XX +XXX,XX @@
31
Block (TB). If the state changes (e.g. privilege level), a new TB will
31
#define dh_typecode_f64 dh_typecode_i64
32
@@ -XXX,XX +XXX,XX @@ Direct block chaining
32
#define dh_typecode_cptr dh_typecode_ptr
33
---------------------
33
#define dh_typecode_env dh_typecode_ptr
34
34
+#define dh_typecode_fpst dh_typecode_ptr
35
After each translated basic block is executed, QEMU uses the simulated
35
#define dh_typecode(t) dh_typecode_##t
36
-Program Counter (PC) and other cpu state information (such as the CS
36
37
+Program Counter (PC) and other CPU state information (such as the CS
37
#define dh_callflag_i32 0
38
segment base value) to find the next basic block.
39
40
-In order to accelerate the most common cases where the new simulated PC
41
-is known, QEMU can patch a basic block so that it jumps directly to the
42
-next one.
43
+In its simplest, less optimized form, this is done by exiting from the
44
+current TB, going through the TB epilogue, and then back to the
45
+main loop. That’s where QEMU looks for the next TB to execute,
46
+translating it from the guest architecture if it isn’t already available
47
+in memory. Then QEMU proceeds to execute this next TB, starting at the
48
+prologue and then moving on to the translated instructions.
49
50
-The most portable code uses an indirect jump. An indirect jump makes
51
-it easier to make the jump target modification atomic. On some host
52
-architectures (such as x86 or PowerPC), the ``JUMP`` opcode is
53
-directly patched so that the block chaining has no overhead.
54
+Exiting from the TB this way will cause the ``cpu_exec_interrupt()``
55
+callback to be re-evaluated before executing additional instructions.
56
+It is mandatory to exit this way after any CPU state changes that may
57
+unmask interrupts.
58
+
59
+In order to accelerate the cases where the TB for the new
60
+simulated PC is already available, QEMU has mechanisms that allow
61
+multiple TBs to be chained directly, without having to go back to the
62
+main loop as described above. These mechanisms are:
63
+
64
+``lookup_and_goto_ptr``
65
+^^^^^^^^^^^^^^^^^^^^^^^
66
+
67
+Calling ``tcg_gen_lookup_and_goto_ptr()`` will emit a call to
68
+``helper_lookup_tb_ptr``. This helper will look for an existing TB that
69
+matches the current CPU state. If the destination TB is available its
70
+code address is returned, otherwise the address of the JIT epilogue is
71
+returned. The call to the helper is always followed by the tcg ``goto_ptr``
72
+opcode, which branches to the returned address. In this way, we either
73
+branch to the next TB or return to the main loop.
74
+
75
+``goto_tb + exit_tb``
76
+^^^^^^^^^^^^^^^^^^^^^
77
+
78
+The translation code usually implements branching by performing the
79
+following steps:
80
+
81
+1. Call ``tcg_gen_goto_tb()`` passing a jump slot index (either 0 or 1)
82
+ as a parameter.
83
+
84
+2. Emit TCG instructions to update the CPU state with any information
85
+ that has been assumed constant and is required by the main loop to
86
+ correctly locate and execute the next TB. For most guests, this is
87
+ just the PC of the branch destination, but others may store additional
88
+ data. The information updated in this step must be inferable from both
89
+ ``cpu_get_tb_cpu_state()`` and ``cpu_restore_state()``.
90
+
91
+3. Call ``tcg_gen_exit_tb()`` passing the address of the current TB and
92
+ the jump slot index again.
93
+
94
+Step 1, ``tcg_gen_goto_tb()``, will emit a ``goto_tb`` TCG
95
+instruction that later on gets translated to a jump to an address
96
+associated with the specified jump slot. Initially, this is the address
97
+of step 2's instructions, which update the CPU state information. Step 3,
98
+``tcg_gen_exit_tb()``, exits from the current TB returning a tagged
99
+pointer composed of the last executed TB’s address and the jump slot
100
+index.
101
+
102
+The first time this whole sequence is executed, step 1 simply jumps
103
+to step 2. Then the CPU state information gets updated and we exit from
104
+the current TB. As a result, the behavior is very similar to the less
105
+optimized form described earlier in this section.
106
+
107
+Next, the main loop looks for the next TB to execute using the
108
+current CPU state information (creating the TB if it wasn’t already
109
+available) and, before starting to execute the new TB’s instructions,
110
+patches the previously executed TB by associating one of its jump
111
+slots (the one specified in the call to ``tcg_gen_exit_tb()``) with the
112
+address of the new TB.
113
+
114
+The next time this previous TB is executed and we get to that same
115
+``goto_tb`` step, it will already be patched (assuming the destination TB
116
+is still in memory) and will jump directly to the first instruction of
117
+the destination TB, without going back to the main loop.
118
+
119
+For the ``goto_tb + exit_tb`` mechanism to be used, the following
120
+conditions need to be satisfied:
121
+
122
+* The change in CPU state must be constant, e.g., a direct branch and
123
+ not an indirect branch.
124
+
125
+* The direct branch cannot cross a page boundary. Memory mappings
126
+ may change, causing the code at the destination address to change.
127
+
128
+Note that, on step 3 (``tcg_gen_exit_tb()``), in addition to the
129
+jump slot index, the address of the TB just executed is also returned.
130
+This address corresponds to the TB that will be patched; it may be
131
+different than the one that was directly executed from the main loop
132
+if the latter had already been chained to other TBs.
133
134
Self-modifying code and translated code invalidation
135
----------------------------------------------------
136
--
38
--
137
2.25.1
39
2.43.0
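A sketch of what the new alias buys, using a hypothetical helper declaration (the helper name is illustrative, not from the patch):

    /* Declared with the new fpst alias: */
    DEF_HELPER_2(sqrt_f32, f32, f32, fpst)

    /* dh_ctype_fpst is float_status *, so the generated
     * prototype carries the real pointer type and the helper
     * body no longer needs to cast a void * argument: */
    float32 helper_sqrt_f32(float32 arg, float_status *status);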
138
40
139
41
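To tie the two chaining mechanisms described above to the emitted opcodes, a minimal translator-side sketch (names such as use_goto_tb and cpu_pc are illustrative; each target has its own variant):

    static void gen_jmp(DisasContext *dc, target_ulong dest)
    {
        if (use_goto_tb(dc, dest)) {
            /* Direct chaining: constant, same-page destination. */
            tcg_gen_goto_tb(0);               /* step 1: jump slot 0  */
            tcg_gen_movi_tl(cpu_pc, dest);    /* step 2: update state */
            tcg_gen_exit_tb(dc->base.tb, 0);  /* step 3: tagged exit  */
        } else {
            /* Indirect chaining via the TB lookup helper. */
            tcg_gen_movi_tl(cpu_pc, dest);
            tcg_gen_lookup_and_goto_ptr();
        }
    }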
1
Buffer management is integral to tcg. Do not leave the allocation
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
to code outside of tcg/. This is code movement, with further
3
cleanups to follow.
4
2
5
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
3
Rather than manually copying each register, use
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
the libc memcpy(), which is well optimized nowadays.
5
6
Suggested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
7
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Message-ID: <20241205205418.67613-1-philmd@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
12
---
9
include/tcg/tcg.h | 2 +-
13
target/sparc/win_helper.c | 26 ++++++++------------------
10
accel/tcg/translate-all.c | 414 +-----------------------------------
14
1 file changed, 8 insertions(+), 18 deletions(-)
11
tcg/region.c | 431 +++++++++++++++++++++++++++++++++++++-
12
3 files changed, 428 insertions(+), 419 deletions(-)
13
15
14
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
16
diff --git a/target/sparc/win_helper.c b/target/sparc/win_helper.c
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/include/tcg/tcg.h
18
--- a/target/sparc/win_helper.c
17
+++ b/include/tcg/tcg.h
19
+++ b/target/sparc/win_helper.c
18
@@ -XXX,XX +XXX,XX @@ void *tcg_malloc_internal(TCGContext *s, int size);
19
void tcg_pool_reset(TCGContext *s);
20
TranslationBlock *tcg_tb_alloc(TCGContext *s);
21
22
-void tcg_region_init(void);
23
+void tcg_region_init(size_t tb_size, int splitwx);
24
void tb_destroy(TranslationBlock *tb);
25
void tcg_region_reset_all(void);
26
27
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/accel/tcg/translate-all.c
30
+++ b/accel/tcg/translate-all.c
31
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@
32
*/
21
#include "exec/helper-proto.h"
33
22
#include "trace.h"
34
#include "qemu/osdep.h"
23
35
-#include "qemu/units.h"
24
-static inline void memcpy32(target_ulong *dst, const target_ulong *src)
36
#include "qemu-common.h"
25
-{
37
26
- dst[0] = src[0];
38
#define NO_CPU_IO_DEFS
27
- dst[1] = src[1];
39
@@ -XXX,XX +XXX,XX @@
28
- dst[2] = src[2];
40
#include "exec/cputlb.h"
29
- dst[3] = src[3];
41
#include "exec/translate-all.h"
30
- dst[4] = src[4];
42
#include "qemu/bitmap.h"
31
- dst[5] = src[5];
43
-#include "qemu/error-report.h"
32
- dst[6] = src[6];
44
#include "qemu/qemu-print.h"
33
- dst[7] = src[7];
45
#include "qemu/timer.h"
34
-}
46
#include "qemu/main-loop.h"
35
-
47
@@ -XXX,XX +XXX,XX @@ static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
36
void cpu_set_cwp(CPUSPARCState *env, int new_cwp)
37
{
38
/* put the modified wrap registers at their proper location */
39
if (env->cwp == env->nwindows - 1) {
40
- memcpy32(env->regbase, env->regbase + env->nwindows * 16);
41
+ memcpy(env->regbase, env->regbase + env->nwindows * 16,
42
+ sizeof(env->gregs));
43
}
44
env->cwp = new_cwp;
45
46
/* put the wrap registers at their temporary location */
47
if (new_cwp == env->nwindows - 1) {
48
- memcpy32(env->regbase + env->nwindows * 16, env->regbase);
49
+ memcpy(env->regbase + env->nwindows * 16, env->regbase,
50
+ sizeof(env->gregs));
51
}
52
env->regwptr = env->regbase + (new_cwp * 16);
53
}
54
@@ -XXX,XX +XXX,XX @@ void cpu_gl_switch_gregs(CPUSPARCState *env, uint32_t new_gl)
55
dst = get_gl_gregset(env, env->gl);
56
57
if (src != dst) {
58
- memcpy32(dst, env->gregs);
59
- memcpy32(env->gregs, src);
60
+ memcpy(dst, env->gregs, sizeof(env->gregs));
61
+ memcpy(env->gregs, src, sizeof(env->gregs));
48
}
62
}
49
}
63
}
50
64
51
-/* Minimum size of the code gen buffer. This number is randomly chosen,
65
@@ -XXX,XX +XXX,XX @@ void cpu_change_pstate(CPUSPARCState *env, uint32_t new_pstate)
52
- but not so small that we can't have a fair number of TB's live. */
66
/* Switch global register bank */
53
-#define MIN_CODE_GEN_BUFFER_SIZE (1 * MiB)
67
src = get_gregset(env, new_pstate_regs);
54
-
68
dst = get_gregset(env, pstate_regs);
55
-/* Maximum size of the code gen buffer we'd like to use. Unless otherwise
69
- memcpy32(dst, env->gregs);
56
- indicated, this is constrained by the range of direct branches on the
70
- memcpy32(env->gregs, src);
57
- host cpu, as used by the TCG implementation of goto_tb. */
71
+ memcpy(dst, env->gregs, sizeof(env->gregs));
58
-#if defined(__x86_64__)
72
+ memcpy(env->gregs, src, sizeof(env->gregs));
59
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
73
} else {
60
-#elif defined(__sparc__)
74
trace_win_helper_no_switch_pstate(new_pstate_regs);
61
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
75
}
62
-#elif defined(__powerpc64__)
63
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
64
-#elif defined(__powerpc__)
65
-# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
66
-#elif defined(__aarch64__)
67
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
68
-#elif defined(__s390x__)
69
- /* We have a +- 4GB range on the branches; leave some slop. */
70
-# define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
71
-#elif defined(__mips__)
72
- /* We have a 256MB branch region, but leave room to make sure the
73
- main executable is also within that region. */
74
-# define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
75
-#else
76
-# define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
77
-#endif
78
-
79
-#if TCG_TARGET_REG_BITS == 32
80
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
81
-#ifdef CONFIG_USER_ONLY
82
-/*
83
- * For user mode on smaller 32 bit systems we may run into trouble
84
- * allocating big chunks of data in the right place. On these systems
85
- * we utilise a static code generation buffer directly in the binary.
86
- */
87
-#define USE_STATIC_CODE_GEN_BUFFER
88
-#endif
89
-#else /* TCG_TARGET_REG_BITS == 64 */
90
-#ifdef CONFIG_USER_ONLY
91
-/*
92
- * As user-mode emulation typically means running multiple instances
93
- * of the translator don't go too nuts with our default code gen
94
- * buffer lest we make things too hard for the OS.
95
- */
96
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (128 * MiB)
97
-#else
98
-/*
99
- * We expect most system emulation to run one or two guests per host.
100
- * Users running large scale system emulation may want to tweak their
101
- * runtime setup via the tb-size control on the command line.
102
- */
103
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
104
-#endif
105
-#endif
106
-
107
-#define DEFAULT_CODE_GEN_BUFFER_SIZE \
108
- (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
109
- ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
110
-
111
-static size_t size_code_gen_buffer(size_t tb_size)
112
-{
113
- /* Size the buffer. */
114
- if (tb_size == 0) {
115
- size_t phys_mem = qemu_get_host_physmem();
116
- if (phys_mem == 0) {
117
- tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
118
- } else {
119
- tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
120
- }
121
- }
122
- if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
123
- tb_size = MIN_CODE_GEN_BUFFER_SIZE;
124
- }
125
- if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
126
- tb_size = MAX_CODE_GEN_BUFFER_SIZE;
127
- }
128
- return tb_size;
129
-}
130
-
131
-#ifdef __mips__
132
-/* In order to use J and JAL within the code_gen_buffer, we require
133
- that the buffer not cross a 256MB boundary. */
134
-static inline bool cross_256mb(void *addr, size_t size)
135
-{
136
- return ((uintptr_t)addr ^ ((uintptr_t)addr + size)) & ~0x0ffffffful;
137
-}
138
-
139
-/* We weren't able to allocate a buffer without crossing that boundary,
140
- so make do with the larger portion of the buffer that doesn't cross.
141
- Returns the new base of the buffer, and adjusts code_gen_buffer_size. */
142
-static inline void *split_cross_256mb(void *buf1, size_t size1)
143
-{
144
- void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
145
- size_t size2 = buf1 + size1 - buf2;
146
-
147
- size1 = buf2 - buf1;
148
- if (size1 < size2) {
149
- size1 = size2;
150
- buf1 = buf2;
151
- }
152
-
153
- tcg_ctx->code_gen_buffer_size = size1;
154
- return buf1;
155
-}
156
-#endif
157
-
158
-#ifdef USE_STATIC_CODE_GEN_BUFFER
159
-static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
160
- __attribute__((aligned(CODE_GEN_ALIGN)));
161
-
162
-static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
163
-{
164
- void *buf, *end;
165
- size_t size;
166
-
167
- if (splitwx > 0) {
168
- error_setg(errp, "jit split-wx not supported");
169
- return false;
170
- }
171
-
172
- /* page-align the beginning and end of the buffer */
173
- buf = static_code_gen_buffer;
174
- end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
175
- buf = QEMU_ALIGN_PTR_UP(buf, qemu_real_host_page_size);
176
- end = QEMU_ALIGN_PTR_DOWN(end, qemu_real_host_page_size);
177
-
178
- size = end - buf;
179
-
180
- /* Honor a command-line option limiting the size of the buffer. */
181
- if (size > tb_size) {
182
- size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
183
- }
184
- tcg_ctx->code_gen_buffer_size = size;
185
-
186
-#ifdef __mips__
187
- if (cross_256mb(buf, size)) {
188
- buf = split_cross_256mb(buf, size);
189
- size = tcg_ctx->code_gen_buffer_size;
190
- }
191
-#endif
192
-
193
- if (qemu_mprotect_rwx(buf, size)) {
194
- error_setg_errno(errp, errno, "mprotect of jit buffer");
195
- return false;
196
- }
197
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
198
-
199
- tcg_ctx->code_gen_buffer = buf;
200
- return true;
201
-}
202
-#elif defined(_WIN32)
203
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
204
-{
205
- void *buf;
206
-
207
- if (splitwx > 0) {
208
- error_setg(errp, "jit split-wx not supported");
209
- return false;
210
- }
211
-
212
- buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
213
- PAGE_EXECUTE_READWRITE);
214
- if (buf == NULL) {
215
- error_setg_win32(errp, GetLastError(),
216
- "allocate %zu bytes for jit buffer", size);
217
- return false;
218
- }
219
-
220
- tcg_ctx->code_gen_buffer = buf;
221
- tcg_ctx->code_gen_buffer_size = size;
222
- return true;
223
-}
224
-#else
225
-static bool alloc_code_gen_buffer_anon(size_t size, int prot,
226
- int flags, Error **errp)
227
-{
228
- void *buf;
229
-
230
- buf = mmap(NULL, size, prot, flags, -1, 0);
231
- if (buf == MAP_FAILED) {
232
- error_setg_errno(errp, errno,
233
- "allocate %zu bytes for jit buffer", size);
234
- return false;
235
- }
236
- tcg_ctx->code_gen_buffer_size = size;
237
-
238
-#ifdef __mips__
239
- if (cross_256mb(buf, size)) {
240
- /*
241
- * Try again, with the original still mapped, to avoid re-acquiring
242
- * the same 256mb crossing.
243
- */
244
- size_t size2;
245
- void *buf2 = mmap(NULL, size, prot, flags, -1, 0);
246
- switch ((int)(buf2 != MAP_FAILED)) {
247
- case 1:
248
- if (!cross_256mb(buf2, size)) {
249
- /* Success! Use the new buffer. */
250
- munmap(buf, size);
251
- break;
252
- }
253
- /* Failure. Work with what we had. */
254
- munmap(buf2, size);
255
- /* fallthru */
256
- default:
257
- /* Split the original buffer. Free the smaller half. */
258
- buf2 = split_cross_256mb(buf, size);
259
- size2 = tcg_ctx->code_gen_buffer_size;
260
- if (buf == buf2) {
261
- munmap(buf + size2, size - size2);
262
- } else {
263
- munmap(buf, size - size2);
264
- }
265
- size = size2;
266
- break;
267
- }
268
- buf = buf2;
269
- }
270
-#endif
271
-
272
- /* Request large pages for the buffer. */
273
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
274
-
275
- tcg_ctx->code_gen_buffer = buf;
276
- return true;
277
-}
278
-
279
-#ifndef CONFIG_TCG_INTERPRETER
280
-#ifdef CONFIG_POSIX
281
-#include "qemu/memfd.h"
282
-
283
-static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
284
-{
285
- void *buf_rw = NULL, *buf_rx = MAP_FAILED;
286
- int fd = -1;
287
-
288
-#ifdef __mips__
289
- /* Find space for the RX mapping, vs the 256MiB regions. */
290
- if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
291
- MAP_PRIVATE | MAP_ANONYMOUS |
292
- MAP_NORESERVE, errp)) {
293
- return false;
294
- }
295
- /* The size of the mapping may have been adjusted. */
296
- size = tcg_ctx->code_gen_buffer_size;
297
- buf_rx = tcg_ctx->code_gen_buffer;
298
-#endif
299
-
300
- buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
301
- if (buf_rw == NULL) {
302
- goto fail;
303
- }
304
-
305
-#ifdef __mips__
306
- void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
307
- MAP_SHARED | MAP_FIXED, fd, 0);
308
- if (tmp != buf_rx) {
309
- goto fail_rx;
310
- }
311
-#else
312
- buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
313
- if (buf_rx == MAP_FAILED) {
314
- goto fail_rx;
315
- }
316
-#endif
317
-
318
- close(fd);
319
- tcg_ctx->code_gen_buffer = buf_rw;
320
- tcg_ctx->code_gen_buffer_size = size;
321
- tcg_splitwx_diff = buf_rx - buf_rw;
322
-
323
- /* Request large pages for the buffer and the splitwx. */
324
- qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
325
- qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
326
- return true;
327
-
328
- fail_rx:
329
- error_setg_errno(errp, errno, "failed to map shared memory for execute");
330
- fail:
331
- if (buf_rx != MAP_FAILED) {
332
- munmap(buf_rx, size);
333
- }
334
- if (buf_rw) {
335
- munmap(buf_rw, size);
336
- }
337
- if (fd >= 0) {
338
- close(fd);
339
- }
340
- return false;
341
-}
342
-#endif /* CONFIG_POSIX */
343
-
344
-#ifdef CONFIG_DARWIN
345
-#include <mach/mach.h>
346
-
347
-extern kern_return_t mach_vm_remap(vm_map_t target_task,
348
- mach_vm_address_t *target_address,
349
- mach_vm_size_t size,
350
- mach_vm_offset_t mask,
351
- int flags,
352
- vm_map_t src_task,
353
- mach_vm_address_t src_address,
354
- boolean_t copy,
355
- vm_prot_t *cur_protection,
356
- vm_prot_t *max_protection,
357
- vm_inherit_t inheritance);
358
-
359
-static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
360
-{
361
- kern_return_t ret;
362
- mach_vm_address_t buf_rw, buf_rx;
363
- vm_prot_t cur_prot, max_prot;
364
-
365
- /* Map the read-write portion via normal anon memory. */
366
- if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
367
- MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
368
- return false;
369
- }
370
-
371
- buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
372
- buf_rx = 0;
373
- ret = mach_vm_remap(mach_task_self(),
374
- &buf_rx,
375
- size,
376
- 0,
377
- VM_FLAGS_ANYWHERE,
378
- mach_task_self(),
379
- buf_rw,
380
- false,
381
- &cur_prot,
382
- &max_prot,
383
- VM_INHERIT_NONE);
384
- if (ret != KERN_SUCCESS) {
385
- /* TODO: Convert "ret" to a human readable error message. */
386
- error_setg(errp, "vm_remap for jit splitwx failed");
387
- munmap((void *)buf_rw, size);
388
- return false;
389
- }
390
-
391
- if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
392
- error_setg_errno(errp, errno, "mprotect for jit splitwx");
393
- munmap((void *)buf_rx, size);
394
- munmap((void *)buf_rw, size);
395
- return false;
396
- }
397
-
398
- tcg_splitwx_diff = buf_rx - buf_rw;
399
- return true;
400
-}
401
-#endif /* CONFIG_DARWIN */
402
-#endif /* CONFIG_TCG_INTERPRETER */
403
-
404
-static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
405
-{
406
-#ifndef CONFIG_TCG_INTERPRETER
407
-# ifdef CONFIG_DARWIN
408
- return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
409
-# endif
410
-# ifdef CONFIG_POSIX
411
- return alloc_code_gen_buffer_splitwx_memfd(size, errp);
412
-# endif
413
-#endif
414
- error_setg(errp, "jit split-wx not supported");
415
- return false;
416
-}
417
-
418
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
419
-{
420
- ERRP_GUARD();
421
- int prot, flags;
422
-
423
- if (splitwx) {
424
- if (alloc_code_gen_buffer_splitwx(size, errp)) {
425
- return true;
426
- }
427
- /*
428
- * If splitwx force-on (1), fail;
429
- * if splitwx default-on (-1), fall through to splitwx off.
430
- */
431
- if (splitwx > 0) {
432
- return false;
433
- }
434
- error_free_or_abort(errp);
435
- }
436
-
437
- prot = PROT_READ | PROT_WRITE | PROT_EXEC;
438
- flags = MAP_PRIVATE | MAP_ANONYMOUS;
439
-#ifdef CONFIG_TCG_INTERPRETER
440
- /* The tcg interpreter does not need execute permission. */
441
- prot = PROT_READ | PROT_WRITE;
442
-#elif defined(CONFIG_DARWIN)
443
- /* Applicable to both iOS and macOS (Apple Silicon). */
444
- if (!splitwx) {
445
- flags |= MAP_JIT;
446
- }
447
-#endif
448
-
449
- return alloc_code_gen_buffer_anon(size, prot, flags, errp);
450
-}
451
-#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
452
-
453
static bool tb_cmp(const void *ap, const void *bp)
454
{
455
const TranslationBlock *a = ap;
456
@@ -XXX,XX +XXX,XX @@ static void tb_htable_init(void)
457
size. */
458
void tcg_exec_init(unsigned long tb_size, int splitwx)
459
{
460
- bool ok;
461
-
462
tcg_allowed = true;
463
tcg_context_init(&tcg_init_ctx);
464
page_init();
465
tb_htable_init();
466
-
467
- ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
468
- splitwx, &error_fatal);
469
- assert(ok);
470
-
471
- /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
472
- tcg_region_init();
473
+ tcg_region_init(tb_size, splitwx);
474
475
#if defined(CONFIG_SOFTMMU)
476
/* There's no guest base to take into account, so go ahead and
477
diff --git a/tcg/region.c b/tcg/region.c
478
index XXXXXXX..XXXXXXX 100644
479
--- a/tcg/region.c
480
+++ b/tcg/region.c
481
@@ -XXX,XX +XXX,XX @@
482
*/
483
484
#include "qemu/osdep.h"
485
+#include "qemu/units.h"
486
+#include "qapi/error.h"
487
#include "exec/exec-all.h"
488
#include "tcg/tcg.h"
489
#if !defined(CONFIG_USER_ONLY)
490
@@ -XXX,XX +XXX,XX @@ static size_t tcg_n_regions(void)
491
}
492
#endif
493
494
+/*
495
+ * Minimum size of the code gen buffer. This number is randomly chosen,
496
+ * but not so small that we can't have a fair number of TB's live.
497
+ */
498
+#define MIN_CODE_GEN_BUFFER_SIZE (1 * MiB)
499
+
500
+/*
501
+ * Maximum size of the code gen buffer we'd like to use. Unless otherwise
502
+ * indicated, this is constrained by the range of direct branches on the
503
+ * host cpu, as used by the TCG implementation of goto_tb.
504
+ */
505
+#if defined(__x86_64__)
506
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
507
+#elif defined(__sparc__)
508
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
509
+#elif defined(__powerpc64__)
510
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
511
+#elif defined(__powerpc__)
512
+# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
513
+#elif defined(__aarch64__)
514
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
515
+#elif defined(__s390x__)
516
+ /* We have a +- 4GB range on the branches; leave some slop. */
517
+# define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
518
+#elif defined(__mips__)
519
+ /*
520
+ * We have a 256MB branch region, but leave room to make sure the
521
+ * main executable is also within that region.
522
+ */
523
+# define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
524
+#else
525
+# define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
526
+#endif
527
+
528
+#if TCG_TARGET_REG_BITS == 32
529
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
530
+#ifdef CONFIG_USER_ONLY
531
+/*
532
+ * For user mode on smaller 32 bit systems we may run into trouble
533
+ * allocating big chunks of data in the right place. On these systems
534
+ * we utilise a static code generation buffer directly in the binary.
535
+ */
536
+#define USE_STATIC_CODE_GEN_BUFFER
537
+#endif
538
+#else /* TCG_TARGET_REG_BITS == 64 */
539
+#ifdef CONFIG_USER_ONLY
540
+/*
541
+ * As user-mode emulation typically means running multiple instances
542
+ * of the translator don't go too nuts with our default code gen
543
+ * buffer lest we make things too hard for the OS.
544
+ */
545
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (128 * MiB)
546
+#else
547
+/*
548
+ * We expect most system emulation to run one or two guests per host.
549
+ * Users running large scale system emulation may want to tweak their
550
+ * runtime setup via the tb-size control on the command line.
551
+ */
552
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
553
+#endif
554
+#endif
555
+
556
+#define DEFAULT_CODE_GEN_BUFFER_SIZE \
557
+ (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
558
+ ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
559
+
560
+static size_t size_code_gen_buffer(size_t tb_size)
561
+{
562
+ /* Size the buffer. */
563
+ if (tb_size == 0) {
564
+ size_t phys_mem = qemu_get_host_physmem();
565
+ if (phys_mem == 0) {
566
+ tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
567
+ } else {
568
+ tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
569
+ }
570
+ }
571
+ if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
572
+ tb_size = MIN_CODE_GEN_BUFFER_SIZE;
573
+ }
574
+ if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
575
+ tb_size = MAX_CODE_GEN_BUFFER_SIZE;
576
+ }
577
+ return tb_size;
578
+}
579
+
580
+#ifdef __mips__
581
+/*
582
+ * In order to use J and JAL within the code_gen_buffer, we require
583
+ * that the buffer not cross a 256MB boundary.
584
+ */
585
+static inline bool cross_256mb(void *addr, size_t size)
586
+{
587
+ return ((uintptr_t)addr ^ ((uintptr_t)addr + size)) & ~0x0ffffffful;
588
+}
589
+
590
+/*
591
+ * We weren't able to allocate a buffer without crossing that boundary,
592
+ * so make do with the larger portion of the buffer that doesn't cross.
593
+ * Returns the new base of the buffer, and adjusts code_gen_buffer_size.
594
+ */
595
+static inline void *split_cross_256mb(void *buf1, size_t size1)
596
+{
597
+ void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
598
+ size_t size2 = buf1 + size1 - buf2;
599
+
600
+ size1 = buf2 - buf1;
601
+ if (size1 < size2) {
602
+ size1 = size2;
603
+ buf1 = buf2;
604
+ }
605
+
606
+ tcg_ctx->code_gen_buffer_size = size1;
607
+ return buf1;
608
+}
609
+#endif
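A worked example of the boundary test above (hypothetical addresses):

    /*
     * buf1 = 0x2fff0000, size1 = 32 MiB = 0x02000000.
     * buf1 ^ (buf1 + size1) = 0x2fff0000 ^ 0x31ff0000
     *                       = 0x1e000000, which survives the
     * ~0x0ffffffful mask, so the buffer crosses a 256 MiB
     * window.  split_cross_256mb() then computes
     *   buf2  = 0x30000000 (start of the next window)
     *   size2 = 0x01ff0000 (just under 32 MiB), size1 = 64 KiB
     * and keeps the larger piece, [0x30000000, 0x31ff0000).
     */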
610
+
611
+#ifdef USE_STATIC_CODE_GEN_BUFFER
612
+static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
613
+ __attribute__((aligned(CODE_GEN_ALIGN)));
614
+
615
+static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
616
+{
617
+ void *buf, *end;
618
+ size_t size;
619
+
620
+ if (splitwx > 0) {
621
+ error_setg(errp, "jit split-wx not supported");
622
+ return false;
623
+ }
624
+
625
+ /* page-align the beginning and end of the buffer */
626
+ buf = static_code_gen_buffer;
627
+ end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
628
+ buf = QEMU_ALIGN_PTR_UP(buf, qemu_real_host_page_size);
629
+ end = QEMU_ALIGN_PTR_DOWN(end, qemu_real_host_page_size);
630
+
631
+ size = end - buf;
632
+
633
+ /* Honor a command-line option limiting the size of the buffer. */
634
+ if (size > tb_size) {
635
+ size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
636
+ }
637
+ tcg_ctx->code_gen_buffer_size = size;
638
+
639
+#ifdef __mips__
640
+ if (cross_256mb(buf, size)) {
641
+ buf = split_cross_256mb(buf, size);
642
+ size = tcg_ctx->code_gen_buffer_size;
643
+ }
644
+#endif
645
+
646
+ if (qemu_mprotect_rwx(buf, size)) {
647
+ error_setg_errno(errp, errno, "mprotect of jit buffer");
648
+ return false;
649
+ }
650
+ qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
651
+
652
+ tcg_ctx->code_gen_buffer = buf;
653
+ return true;
654
+}
655
+#elif defined(_WIN32)
656
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
657
+{
658
+ void *buf;
659
+
660
+ if (splitwx > 0) {
661
+ error_setg(errp, "jit split-wx not supported");
662
+ return false;
663
+ }
664
+
665
+ buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
666
+ PAGE_EXECUTE_READWRITE);
667
+ if (buf == NULL) {
668
+ error_setg_win32(errp, GetLastError(),
669
+ "allocate %zu bytes for jit buffer", size);
670
+ return false;
671
+ }
672
+
673
+ tcg_ctx->code_gen_buffer = buf;
674
+ tcg_ctx->code_gen_buffer_size = size;
675
+ return true;
676
+}
677
+#else
678
+static bool alloc_code_gen_buffer_anon(size_t size, int prot,
679
+ int flags, Error **errp)
680
+{
681
+ void *buf;
682
+
683
+ buf = mmap(NULL, size, prot, flags, -1, 0);
684
+ if (buf == MAP_FAILED) {
685
+ error_setg_errno(errp, errno,
686
+ "allocate %zu bytes for jit buffer", size);
687
+ return false;
688
+ }
689
+ tcg_ctx->code_gen_buffer_size = size;
690
+
691
+#ifdef __mips__
692
+ if (cross_256mb(buf, size)) {
693
+ /*
694
+ * Try again, with the original still mapped, to avoid re-acquiring
695
+ * the same 256mb crossing.
696
+ */
697
+ size_t size2;
698
+ void *buf2 = mmap(NULL, size, prot, flags, -1, 0);
699
+ switch ((int)(buf2 != MAP_FAILED)) {
700
+ case 1:
701
+ if (!cross_256mb(buf2, size)) {
702
+ /* Success! Use the new buffer. */
703
+ munmap(buf, size);
704
+ break;
705
+ }
706
+ /* Failure. Work with what we had. */
707
+ munmap(buf2, size);
708
+ /* fallthru */
709
+ default:
710
+ /* Split the original buffer. Free the smaller half. */
711
+ buf2 = split_cross_256mb(buf, size);
712
+ size2 = tcg_ctx->code_gen_buffer_size;
713
+ if (buf == buf2) {
714
+ munmap(buf + size2, size - size2);
715
+ } else {
716
+ munmap(buf, size - size2);
717
+ }
718
+ size = size2;
719
+ break;
720
+ }
721
+ buf = buf2;
722
+ }
723
+#endif
724
+
725
+ /* Request large pages for the buffer. */
726
+ qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
727
+
728
+ tcg_ctx->code_gen_buffer = buf;
729
+ return true;
730
+}
731
+
732
+#ifndef CONFIG_TCG_INTERPRETER
733
+#ifdef CONFIG_POSIX
734
+#include "qemu/memfd.h"
735
+
736
+static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
737
+{
738
+ void *buf_rw = NULL, *buf_rx = MAP_FAILED;
739
+ int fd = -1;
740
+
741
+#ifdef __mips__
742
+ /* Find space for the RX mapping, vs the 256MiB regions. */
743
+ if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
744
+ MAP_PRIVATE | MAP_ANONYMOUS |
745
+ MAP_NORESERVE, errp)) {
746
+ return false;
747
+ }
748
+ /* The size of the mapping may have been adjusted. */
749
+ size = tcg_ctx->code_gen_buffer_size;
750
+ buf_rx = tcg_ctx->code_gen_buffer;
751
+#endif
752
+
753
+ buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
754
+ if (buf_rw == NULL) {
755
+ goto fail;
756
+ }
757
+
758
+#ifdef __mips__
759
+ void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
760
+ MAP_SHARED | MAP_FIXED, fd, 0);
761
+ if (tmp != buf_rx) {
762
+ goto fail_rx;
763
+ }
764
+#else
765
+ buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
766
+ if (buf_rx == MAP_FAILED) {
767
+ goto fail_rx;
768
+ }
769
+#endif
770
+
771
+ close(fd);
772
+ tcg_ctx->code_gen_buffer = buf_rw;
773
+ tcg_ctx->code_gen_buffer_size = size;
774
+ tcg_splitwx_diff = buf_rx - buf_rw;
775
+
776
+ /* Request large pages for the buffer and the splitwx. */
777
+ qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
778
+ qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
779
+ return true;
780
+
781
+ fail_rx:
782
+ error_setg_errno(errp, errno, "failed to map shared memory for execute");
783
+ fail:
784
+ if (buf_rx != MAP_FAILED) {
785
+ munmap(buf_rx, size);
786
+ }
787
+ if (buf_rw) {
788
+ munmap(buf_rw, size);
789
+ }
790
+ if (fd >= 0) {
791
+ close(fd);
792
+ }
793
+ return false;
794
+}
795
+#endif /* CONFIG_POSIX */
796
+
797
+#ifdef CONFIG_DARWIN
798
+#include <mach/mach.h>
799
+
800
+extern kern_return_t mach_vm_remap(vm_map_t target_task,
801
+ mach_vm_address_t *target_address,
802
+ mach_vm_size_t size,
803
+ mach_vm_offset_t mask,
804
+ int flags,
805
+ vm_map_t src_task,
806
+ mach_vm_address_t src_address,
807
+ boolean_t copy,
808
+ vm_prot_t *cur_protection,
809
+ vm_prot_t *max_protection,
810
+ vm_inherit_t inheritance);
811
+
812
+static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
813
+{
814
+ kern_return_t ret;
815
+ mach_vm_address_t buf_rw, buf_rx;
816
+ vm_prot_t cur_prot, max_prot;
817
+
818
+ /* Map the read-write portion via normal anon memory. */
819
+ if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
820
+ MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
821
+ return false;
822
+ }
823
+
824
+ buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
825
+ buf_rx = 0;
826
+ ret = mach_vm_remap(mach_task_self(),
827
+ &buf_rx,
828
+ size,
829
+ 0,
830
+ VM_FLAGS_ANYWHERE,
831
+ mach_task_self(),
832
+ buf_rw,
833
+ false,
834
+ &cur_prot,
835
+ &max_prot,
836
+ VM_INHERIT_NONE);
837
+ if (ret != KERN_SUCCESS) {
838
+ /* TODO: Convert "ret" to a human readable error message. */
839
+ error_setg(errp, "vm_remap for jit splitwx failed");
840
+ munmap((void *)buf_rw, size);
841
+ return false;
842
+ }
843
+
844
+ if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
845
+ error_setg_errno(errp, errno, "mprotect for jit splitwx");
846
+ munmap((void *)buf_rx, size);
847
+ munmap((void *)buf_rw, size);
848
+ return false;
849
+ }
850
+
851
+ tcg_splitwx_diff = buf_rx - buf_rw;
852
+ return true;
853
+}
854
+#endif /* CONFIG_DARWIN */
855
+#endif /* CONFIG_TCG_INTERPRETER */
856
+
857
+static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
858
+{
859
+#ifndef CONFIG_TCG_INTERPRETER
860
+# ifdef CONFIG_DARWIN
861
+ return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
862
+# endif
863
+# ifdef CONFIG_POSIX
864
+ return alloc_code_gen_buffer_splitwx_memfd(size, errp);
865
+# endif
866
+#endif
867
+ error_setg(errp, "jit split-wx not supported");
868
+ return false;
869
+}
870
+
871
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
872
+{
873
+ ERRP_GUARD();
874
+ int prot, flags;
875
+
876
+ if (splitwx) {
877
+ if (alloc_code_gen_buffer_splitwx(size, errp)) {
878
+ return true;
879
+ }
880
+ /*
881
+ * If splitwx force-on (1), fail;
882
+ * if splitwx default-on (-1), fall through to splitwx off.
883
+ */
884
+ if (splitwx > 0) {
885
+ return false;
886
+ }
887
+ error_free_or_abort(errp);
888
+ }
889
+
890
+ prot = PROT_READ | PROT_WRITE | PROT_EXEC;
891
+ flags = MAP_PRIVATE | MAP_ANONYMOUS;
892
+#ifdef CONFIG_TCG_INTERPRETER
893
+ /* The tcg interpreter does not need execute permission. */
894
+ prot = PROT_READ | PROT_WRITE;
895
+#elif defined(CONFIG_DARWIN)
896
+ /* Applicable to both iOS and macOS (Apple Silicon). */
897
+ if (!splitwx) {
898
+ flags |= MAP_JIT;
899
+ }
900
+#endif
901
+
902
+ return alloc_code_gen_buffer_anon(size, prot, flags, errp);
903
+}
904
+#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
905
+
906
/*
907
* Initializes region partitioning.
908
*
909
@@ -XXX,XX +XXX,XX @@ static size_t tcg_n_regions(void)
910
* in practice. Multi-threaded guests share most if not all of their translated
911
* code, which makes parallel code generation less appealing than in softmmu.
912
*/
913
-void tcg_region_init(void)
914
+void tcg_region_init(size_t tb_size, int splitwx)
915
{
916
- void *buf = tcg_init_ctx.code_gen_buffer;
917
- void *aligned;
918
- size_t size = tcg_init_ctx.code_gen_buffer_size;
919
- size_t page_size = qemu_real_host_page_size;
920
+ void *buf, *aligned;
921
+ size_t size;
922
+ size_t page_size;
923
size_t region_size;
924
size_t n_regions;
925
size_t i;
926
+ bool ok;
927
928
+ ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
929
+ splitwx, &error_fatal);
930
+ assert(ok);
931
+
932
+ buf = tcg_init_ctx.code_gen_buffer;
933
+ size = tcg_init_ctx.code_gen_buffer_size;
934
+ page_size = qemu_real_host_page_size;
935
n_regions = tcg_n_regions();
936
937
/* The first region will be 'aligned - buf' bytes larger than the others */
938
--
76
--
939
2.25.1
77
2.43.0
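With the movement complete, buffer allocation is driven from tcg/region.c; a sketch of the resulting call chain at startup:

    tcg_exec_init(tb_size, splitwx)
        -> tcg_region_init(tb_size, splitwx)
               -> alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
                                        splitwx, &error_fatal)
               -> region partitioning + guard pages
               -> tcg_region_trees_init()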
940
78
941
79
Deleted patch
1
We shortly want to use tcg_init for something else.
2
Since the hook is called init_machine, match that.
3
1
4
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
accel/tcg/tcg-all.c | 4 ++--
10
1 file changed, 2 insertions(+), 2 deletions(-)
11
12
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/accel/tcg/tcg-all.c
15
+++ b/accel/tcg/tcg-all.c
16
@@ -XXX,XX +XXX,XX @@ static void tcg_accel_instance_init(Object *obj)
17
18
bool mttcg_enabled;
19
20
-static int tcg_init(MachineState *ms)
21
+static int tcg_init_machine(MachineState *ms)
22
{
23
TCGState *s = TCG_STATE(current_accel());
24
25
@@ -XXX,XX +XXX,XX @@ static void tcg_accel_class_init(ObjectClass *oc, void *data)
26
{
27
AccelClass *ac = ACCEL_CLASS(oc);
28
ac->name = "tcg";
29
- ac->init_machine = tcg_init;
30
+ ac->init_machine = tcg_init_machine;
31
ac->allowed = &tcg_allowed;
32
33
object_class_property_add_str(oc, "thread",
34
--
35
2.25.1
36
37
Deleted patch
1
Perform both tcg_context_init and tcg_region_init.
2
Do not leave this split to the caller.
3
1
4
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
include/tcg/tcg.h | 3 +--
9
tcg/tcg-internal.h | 1 +
10
accel/tcg/translate-all.c | 3 +--
11
tcg/tcg.c | 9 ++++++++-
12
4 files changed, 11 insertions(+), 5 deletions(-)
13
14
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/include/tcg/tcg.h
17
+++ b/include/tcg/tcg.h
18
@@ -XXX,XX +XXX,XX @@ void *tcg_malloc_internal(TCGContext *s, int size);
19
void tcg_pool_reset(TCGContext *s);
20
TranslationBlock *tcg_tb_alloc(TCGContext *s);
21
22
-void tcg_region_init(size_t tb_size, int splitwx);
23
void tb_destroy(TranslationBlock *tb);
24
void tcg_region_reset_all(void);
25
26
@@ -XXX,XX +XXX,XX @@ static inline void *tcg_malloc(int size)
27
}
28
}
29
30
-void tcg_context_init(TCGContext *s);
31
+void tcg_init(size_t tb_size, int splitwx);
32
void tcg_register_thread(void);
33
void tcg_prologue_init(TCGContext *s);
34
void tcg_func_start(TCGContext *s);
35
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
36
index XXXXXXX..XXXXXXX 100644
37
--- a/tcg/tcg-internal.h
38
+++ b/tcg/tcg-internal.h
39
@@ -XXX,XX +XXX,XX @@
40
extern TCGContext **tcg_ctxs;
41
extern unsigned int n_tcg_ctxs;
42
43
+void tcg_region_init(size_t tb_size, int splitwx);
44
bool tcg_region_alloc(TCGContext *s);
45
void tcg_region_initial_alloc(TCGContext *s);
46
void tcg_region_prologue_set(TCGContext *s);
47
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/accel/tcg/translate-all.c
50
+++ b/accel/tcg/translate-all.c
51
@@ -XXX,XX +XXX,XX @@ static void tb_htable_init(void)
52
void tcg_exec_init(unsigned long tb_size, int splitwx)
53
{
54
tcg_allowed = true;
55
- tcg_context_init(&tcg_init_ctx);
56
page_init();
57
tb_htable_init();
58
- tcg_region_init(tb_size, splitwx);
59
+ tcg_init(tb_size, splitwx);
60
61
#if defined(CONFIG_SOFTMMU)
62
/* There's no guest base to take into account, so go ahead and
63
diff --git a/tcg/tcg.c b/tcg/tcg.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/tcg/tcg.c
66
+++ b/tcg/tcg.c
67
@@ -XXX,XX +XXX,XX @@ static void process_op_defs(TCGContext *s);
68
static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
69
TCGReg reg, const char *name);
70
71
-void tcg_context_init(TCGContext *s)
72
+static void tcg_context_init(void)
73
{
74
+ TCGContext *s = &tcg_init_ctx;
75
int op, total_args, n, i;
76
TCGOpDef *def;
77
TCGArgConstraint *args_ct;
78
@@ -XXX,XX +XXX,XX @@ void tcg_context_init(TCGContext *s)
79
cpu_env = temp_tcgv_ptr(ts);
80
}
81
82
+void tcg_init(size_t tb_size, int splitwx)
83
+{
84
+ tcg_context_init();
85
+ tcg_region_init(tb_size, splitwx);
86
+}
87
+
88
/*
89
* Allocate TBs right before their corresponding translated code, making
90
* sure that TBs and code are on different cache lines.
91
--
92
2.25.1
93
94
Deleted patch
1
There is only one caller, and shortly we will need access
2
to the MachineState, which tcg_init_machine already has.
3
1
4
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
accel/tcg/internal.h | 2 ++
9
include/sysemu/tcg.h | 2 --
10
accel/tcg/tcg-all.c | 16 +++++++++++++++-
11
accel/tcg/translate-all.c | 21 ++-------------------
12
bsd-user/main.c | 2 +-
13
5 files changed, 20 insertions(+), 23 deletions(-)
14
15
diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/accel/tcg/internal.h
18
+++ b/accel/tcg/internal.h
19
@@ -XXX,XX +XXX,XX @@ TranslationBlock *tb_gen_code(CPUState *cpu, target_ulong pc,
20
int cflags);
21
22
void QEMU_NORETURN cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
23
+void page_init(void);
24
+void tb_htable_init(void);
25
26
#endif /* ACCEL_TCG_INTERNAL_H */
27
diff --git a/include/sysemu/tcg.h b/include/sysemu/tcg.h
28
index XXXXXXX..XXXXXXX 100644
29
--- a/include/sysemu/tcg.h
30
+++ b/include/sysemu/tcg.h
31
@@ -XXX,XX +XXX,XX @@
32
#ifndef SYSEMU_TCG_H
33
#define SYSEMU_TCG_H
34
35
-void tcg_exec_init(unsigned long tb_size, int splitwx);
36
-
37
#ifdef CONFIG_TCG
38
extern bool tcg_allowed;
39
#define tcg_enabled() (tcg_allowed)
40
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/accel/tcg/tcg-all.c
43
+++ b/accel/tcg/tcg-all.c
44
@@ -XXX,XX +XXX,XX @@
45
#include "qemu/error-report.h"
46
#include "qemu/accel.h"
47
#include "qapi/qapi-builtin-visit.h"
48
+#include "internal.h"
49
50
struct TCGState {
51
AccelState parent_obj;
52
@@ -XXX,XX +XXX,XX @@ static int tcg_init_machine(MachineState *ms)
53
{
54
TCGState *s = TCG_STATE(current_accel());
55
56
- tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
57
+ tcg_allowed = true;
58
mttcg_enabled = s->mttcg_enabled;
59
+
60
+ page_init();
61
+ tb_htable_init();
62
+ tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
63
+
64
+#if defined(CONFIG_SOFTMMU)
65
+ /*
66
+ * There's no guest base to take into account, so go ahead and
67
+ * initialize the prologue now.
68
+ */
69
+ tcg_prologue_init(tcg_ctx);
70
+#endif
71
+
72
return 0;
73
}
74
75
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
76
index XXXXXXX..XXXXXXX 100644
77
--- a/accel/tcg/translate-all.c
78
+++ b/accel/tcg/translate-all.c
79
@@ -XXX,XX +XXX,XX @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
80
return false;
81
}
82
83
-static void page_init(void)
84
+void page_init(void)
85
{
86
page_size_init();
87
page_table_config_init();
88
@@ -XXX,XX +XXX,XX @@ static bool tb_cmp(const void *ap, const void *bp)
89
a->page_addr[1] == b->page_addr[1];
90
}
91
92
-static void tb_htable_init(void)
93
+void tb_htable_init(void)
94
{
95
unsigned int mode = QHT_MODE_AUTO_RESIZE;
96
97
qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
98
}
99
100
-/* Must be called before using the QEMU cpus. 'tb_size' is the size
101
- (in bytes) allocated to the translation buffer. Zero means default
102
- size. */
103
-void tcg_exec_init(unsigned long tb_size, int splitwx)
104
-{
105
- tcg_allowed = true;
106
- page_init();
107
- tb_htable_init();
108
- tcg_init(tb_size, splitwx);
109
-
110
-#if defined(CONFIG_SOFTMMU)
111
- /* There's no guest base to take into account, so go ahead and
112
- initialize the prologue now. */
113
- tcg_prologue_init(tcg_ctx);
114
-#endif
115
-}
116
-
117
/* call with @p->lock held */
118
static inline void invalidate_page_bitmap(PageDesc *p)
119
{
120
diff --git a/bsd-user/main.c b/bsd-user/main.c
121
index XXXXXXX..XXXXXXX 100644
122
--- a/bsd-user/main.c
123
+++ b/bsd-user/main.c
124
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
125
envlist_free(envlist);
126
127
/*
128
- * Now that page sizes are configured in tcg_exec_init() we can do
129
+ * Now that page sizes are configured we can do
130
* proper page alignment for guest_base.
131
*/
132
guest_base = HOST_PAGE_ALIGN(guest_base);
133
--
134
2.25.1
Deleted patch

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/tcg-all.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
10
index XXXXXXX..XXXXXXX 100644
11
--- a/accel/tcg/tcg-all.c
12
+++ b/accel/tcg/tcg-all.c
13
@@ -XXX,XX +XXX,XX @@
14
#include "qemu/error-report.h"
15
#include "qemu/accel.h"
16
#include "qapi/qapi-builtin-visit.h"
17
+#include "qemu/units.h"
18
#include "internal.h"
19
20
struct TCGState {
21
@@ -XXX,XX +XXX,XX @@ static int tcg_init_machine(MachineState *ms)
22
23
page_init();
24
tb_htable_init();
25
- tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
26
+ tcg_init(s->tb_size * MiB, s->splitwx_enabled);
27
28
#if defined(CONFIG_SOFTMMU)
29
/*
30
--
31
2.25.1
Deleted patch

Start removing the include of hw/boards.h from tcg/.
Pass down the max_cpus value from tcg_init_machine,
where we have the MachineState already.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/tcg/tcg.h | 2 +-
tcg/tcg-internal.h | 2 +-
accel/tcg/tcg-all.c | 10 +++++++++-
tcg/region.c | 32 +++++++++++---------------------
tcg/tcg.c | 10 ++++------
5 files changed, 26 insertions(+), 30 deletions(-)

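As a rough before/after sketch of the call shape, abbreviated from the
hunks below ('ms' and 's' being the locals of tcg_init_machine):

    /* before: tcg/ dug the value out of global machine state */
    unsigned max_cpus = MACHINE(qdev_get_machine())->smp.max_cpus;

    /* after: the caller that already has a MachineState passes it down */
    tcg_init(s->tb_size * MiB, s->splitwx_enabled, max_cpus);
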
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/include/tcg/tcg.h
20
+++ b/include/tcg/tcg.h
21
@@ -XXX,XX +XXX,XX @@ static inline void *tcg_malloc(int size)
22
}
23
}
24
25
-void tcg_init(size_t tb_size, int splitwx);
26
+void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
27
void tcg_register_thread(void);
28
void tcg_prologue_init(TCGContext *s);
29
void tcg_func_start(TCGContext *s);
30
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
31
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/tcg-internal.h
33
+++ b/tcg/tcg-internal.h
34
@@ -XXX,XX +XXX,XX @@
35
extern TCGContext **tcg_ctxs;
36
extern unsigned int n_tcg_ctxs;
37
38
-void tcg_region_init(size_t tb_size, int splitwx);
39
+void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus);
40
bool tcg_region_alloc(TCGContext *s);
41
void tcg_region_initial_alloc(TCGContext *s);
42
void tcg_region_prologue_set(TCGContext *s);
43
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/accel/tcg/tcg-all.c
46
+++ b/accel/tcg/tcg-all.c
47
@@ -XXX,XX +XXX,XX @@
48
#include "qemu/accel.h"
49
#include "qapi/qapi-builtin-visit.h"
50
#include "qemu/units.h"
51
+#if !defined(CONFIG_USER_ONLY)
52
+#include "hw/boards.h"
53
+#endif
54
#include "internal.h"
55
56
struct TCGState {
57
@@ -XXX,XX +XXX,XX @@ bool mttcg_enabled;
58
static int tcg_init_machine(MachineState *ms)
59
{
60
TCGState *s = TCG_STATE(current_accel());
61
+#ifdef CONFIG_USER_ONLY
62
+ unsigned max_cpus = 1;
63
+#else
64
+ unsigned max_cpus = ms->smp.max_cpus;
65
+#endif
66
67
tcg_allowed = true;
68
mttcg_enabled = s->mttcg_enabled;
69
70
page_init();
71
tb_htable_init();
72
- tcg_init(s->tb_size * MiB, s->splitwx_enabled);
73
+ tcg_init(s->tb_size * MiB, s->splitwx_enabled, max_cpus);
74
75
#if defined(CONFIG_SOFTMMU)
76
/*
77
diff --git a/tcg/region.c b/tcg/region.c
78
index XXXXXXX..XXXXXXX 100644
79
--- a/tcg/region.c
80
+++ b/tcg/region.c
81
@@ -XXX,XX +XXX,XX @@
82
#include "qapi/error.h"
83
#include "exec/exec-all.h"
84
#include "tcg/tcg.h"
85
-#if !defined(CONFIG_USER_ONLY)
86
-#include "hw/boards.h"
87
-#endif
88
#include "tcg-internal.h"
89
90
91
@@ -XXX,XX +XXX,XX @@ void tcg_region_reset_all(void)
92
tcg_region_tree_reset_all();
93
}
94
95
+static size_t tcg_n_regions(unsigned max_cpus)
96
+{
97
#ifdef CONFIG_USER_ONLY
98
-static size_t tcg_n_regions(void)
99
-{
100
return 1;
101
-}
102
#else
103
-/*
104
- * It is likely that some vCPUs will translate more code than others, so we
105
- * first try to set more regions than max_cpus, with those regions being of
106
- * reasonable size. If that's not possible we make do by evenly dividing
107
- * the code_gen_buffer among the vCPUs.
108
- */
109
-static size_t tcg_n_regions(void)
110
-{
111
+ /*
112
+ * It is likely that some vCPUs will translate more code than others,
113
+ * so we first try to set more regions than max_cpus, with those regions
114
+ * being of reasonable size. If that's not possible we make do by evenly
115
+ * dividing the code_gen_buffer among the vCPUs.
116
+ */
117
size_t i;
118
119
/* Use a single region if all we have is one vCPU thread */
120
-#if !defined(CONFIG_USER_ONLY)
121
- MachineState *ms = MACHINE(qdev_get_machine());
122
- unsigned int max_cpus = ms->smp.max_cpus;
123
-#endif
124
if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
125
return 1;
126
}
127
@@ -XXX,XX +XXX,XX @@ static size_t tcg_n_regions(void)
128
}
129
/* If we can't, then just allocate one region per vCPU thread */
130
return max_cpus;
131
-}
132
#endif
133
+}
134
135
/*
136
* Minimum size of the code gen buffer. This number is randomly chosen,
137
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
138
* in practice. Multi-threaded guests share most if not all of their translated
139
* code, which makes parallel code generation less appealing than in softmmu.
140
*/
141
-void tcg_region_init(size_t tb_size, int splitwx)
142
+void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
143
{
144
void *buf, *aligned;
145
size_t size;
146
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx)
147
buf = tcg_init_ctx.code_gen_buffer;
148
size = tcg_init_ctx.code_gen_buffer_size;
149
page_size = qemu_real_host_page_size;
150
- n_regions = tcg_n_regions();
151
+ n_regions = tcg_n_regions(max_cpus);
152
153
/* The first region will be 'aligned - buf' bytes larger than the others */
154
aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
155
diff --git a/tcg/tcg.c b/tcg/tcg.c
156
index XXXXXXX..XXXXXXX 100644
157
--- a/tcg/tcg.c
158
+++ b/tcg/tcg.c
159
@@ -XXX,XX +XXX,XX @@ static void process_op_defs(TCGContext *s);
160
static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
161
TCGReg reg, const char *name);
162
163
-static void tcg_context_init(void)
164
+static void tcg_context_init(unsigned max_cpus)
165
{
166
TCGContext *s = &tcg_init_ctx;
167
int op, total_args, n, i;
168
@@ -XXX,XX +XXX,XX @@ static void tcg_context_init(void)
169
tcg_ctxs = &tcg_ctx;
170
n_tcg_ctxs = 1;
171
#else
172
- MachineState *ms = MACHINE(qdev_get_machine());
173
- unsigned int max_cpus = ms->smp.max_cpus;
174
tcg_ctxs = g_new(TCGContext *, max_cpus);
175
#endif
176
177
@@ -XXX,XX +XXX,XX @@ static void tcg_context_init(void)
178
cpu_env = temp_tcgv_ptr(ts);
179
}
180
181
-void tcg_init(size_t tb_size, int splitwx)
182
+void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus)
183
{
184
- tcg_context_init();
185
- tcg_region_init(tb_size, splitwx);
186
+ tcg_context_init(max_cpus);
187
+ tcg_region_init(tb_size, splitwx, max_cpus);
188
}
189
190
/*
191
--
192
2.25.1
Deleted patch

Finish the divorce of tcg/ from hw/, and do not take
the max cpu value from MachineState; just remember what
we were passed in tcg_init.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-internal.h | 3 ++-
tcg/region.c | 6 +++---
tcg/tcg.c | 23 ++++++++++-------------
3 files changed, 15 insertions(+), 17 deletions(-)

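The tcg_cur_ctxs/tcg_max_ctxs pair below is a bounded lock-free
registry. A self-contained C11 analogue of the pattern, using standard
atomics rather than QEMU's qatomic helpers; MAX_CTXS stands in for the
value fixed at init time:

    #include <assert.h>
    #include <stdatomic.h>

    #define MAX_CTXS 8                    /* stand-in for tcg_max_ctxs */
    static _Atomic(void *) ctxs[MAX_CTXS];
    static atomic_uint cur_ctxs;          /* stand-in for tcg_cur_ctxs */

    static void register_ctx(void *ctx)
    {
        unsigned n = atomic_fetch_add(&cur_ctxs, 1); /* claim next slot */
        assert(n < MAX_CTXS);       /* capacity is fixed once, at init */
        atomic_store(&ctxs[n], ctx);      /* publish to other threads */
    }
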
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/tcg/tcg-internal.h
18
+++ b/tcg/tcg-internal.h
19
@@ -XXX,XX +XXX,XX @@
20
#define TCG_HIGHWATER 1024
21
22
extern TCGContext **tcg_ctxs;
23
-extern unsigned int n_tcg_ctxs;
24
+extern unsigned int tcg_cur_ctxs;
25
+extern unsigned int tcg_max_ctxs;
26
27
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus);
28
bool tcg_region_alloc(TCGContext *s);
29
diff --git a/tcg/region.c b/tcg/region.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/tcg/region.c
32
+++ b/tcg/region.c
33
@@ -XXX,XX +XXX,XX @@ void tcg_region_initial_alloc(TCGContext *s)
34
/* Call from a safe-work context */
35
void tcg_region_reset_all(void)
36
{
37
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
38
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
39
unsigned int i;
40
41
qemu_mutex_lock(&region.lock);
42
@@ -XXX,XX +XXX,XX @@ void tcg_region_prologue_set(TCGContext *s)
43
*/
44
size_t tcg_code_size(void)
45
{
46
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
47
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
48
unsigned int i;
49
size_t total;
50
51
@@ -XXX,XX +XXX,XX @@ size_t tcg_code_capacity(void)
52
53
size_t tcg_tb_phys_invalidate_count(void)
54
{
55
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
56
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
57
unsigned int i;
58
size_t total = 0;
59
60
diff --git a/tcg/tcg.c b/tcg/tcg.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/tcg/tcg.c
63
+++ b/tcg/tcg.c
64
@@ -XXX,XX +XXX,XX @@
65
#define NO_CPU_IO_DEFS
66
67
#include "exec/exec-all.h"
68
-
69
-#if !defined(CONFIG_USER_ONLY)
70
-#include "hw/boards.h"
71
-#endif
72
-
73
#include "tcg/tcg-op.h"
74
75
#if UINTPTR_MAX == UINT32_MAX
76
@@ -XXX,XX +XXX,XX @@ static int tcg_out_ldst_finalize(TCGContext *s);
77
#endif
78
79
TCGContext **tcg_ctxs;
80
-unsigned int n_tcg_ctxs;
81
+unsigned int tcg_cur_ctxs;
82
+unsigned int tcg_max_ctxs;
83
TCGv_env cpu_env = 0;
84
const void *tcg_code_gen_epilogue;
85
uintptr_t tcg_splitwx_diff;
86
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
87
#else
88
void tcg_register_thread(void)
89
{
90
- MachineState *ms = MACHINE(qdev_get_machine());
91
TCGContext *s = g_malloc(sizeof(*s));
92
unsigned int i, n;
93
94
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void)
95
}
96
97
/* Claim an entry in tcg_ctxs */
98
- n = qatomic_fetch_inc(&n_tcg_ctxs);
99
- g_assert(n < ms->smp.max_cpus);
100
+ n = qatomic_fetch_inc(&tcg_cur_ctxs);
101
+ g_assert(n < tcg_max_ctxs);
102
qatomic_set(&tcg_ctxs[n], s);
103
104
if (n > 0) {
105
@@ -XXX,XX +XXX,XX @@ static void tcg_context_init(unsigned max_cpus)
106
*/
107
#ifdef CONFIG_USER_ONLY
108
tcg_ctxs = &tcg_ctx;
109
- n_tcg_ctxs = 1;
110
+ tcg_cur_ctxs = 1;
111
+ tcg_max_ctxs = 1;
112
#else
113
- tcg_ctxs = g_new(TCGContext *, max_cpus);
114
+ tcg_max_ctxs = max_cpus;
115
+ tcg_ctxs = g_new0(TCGContext *, max_cpus);
116
#endif
117
118
tcg_debug_assert(!tcg_regset_test_reg(s->reserved_regs, TCG_AREG0));
119
@@ -XXX,XX +XXX,XX @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
120
static inline
121
void tcg_profile_snapshot(TCGProfile *prof, bool counters, bool table)
122
{
123
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
124
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
125
unsigned int i;
126
127
for (i = 0; i < n_ctxs; i++) {
128
@@ -XXX,XX +XXX,XX @@ void tcg_dump_op_count(void)
129
130
int64_t tcg_cpu_exec_time(void)
131
{
132
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
133
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
134
unsigned int i;
135
int64_t ret = 0;
136
137
--
138
2.25.1
Deleted patch

Remove the ifdef ladder and move each define into the
appropriate header file.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/aarch64/tcg-target.h | 1 +
tcg/arm/tcg-target.h | 1 +
tcg/i386/tcg-target.h | 2 ++
tcg/mips/tcg-target.h | 6 ++++++
tcg/ppc/tcg-target.h | 2 ++
tcg/riscv/tcg-target.h | 1 +
tcg/s390/tcg-target.h | 3 +++
tcg/sparc/tcg-target.h | 1 +
tcg/tci/tcg-target.h | 1 +
tcg/region.c | 33 +++++----------------------------
10 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
21
index XXXXXXX..XXXXXXX 100644
22
--- a/tcg/aarch64/tcg-target.h
23
+++ b/tcg/aarch64/tcg-target.h
24
@@ -XXX,XX +XXX,XX @@
25
26
#define TCG_TARGET_INSN_UNIT_SIZE 4
27
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 24
28
+#define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
29
#undef TCG_TARGET_STACK_GROWSUP
30
31
typedef enum {
32
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/tcg/arm/tcg-target.h
35
+++ b/tcg/arm/tcg-target.h
36
@@ -XXX,XX +XXX,XX @@ extern int arm_arch;
37
#undef TCG_TARGET_STACK_GROWSUP
38
#define TCG_TARGET_INSN_UNIT_SIZE 4
39
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
40
+#define MAX_CODE_GEN_BUFFER_SIZE UINT32_MAX
41
42
typedef enum {
43
TCG_REG_R0 = 0,
44
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
45
index XXXXXXX..XXXXXXX 100644
46
--- a/tcg/i386/tcg-target.h
47
+++ b/tcg/i386/tcg-target.h
48
@@ -XXX,XX +XXX,XX @@
49
#ifdef __x86_64__
50
# define TCG_TARGET_REG_BITS 64
51
# define TCG_TARGET_NB_REGS 32
52
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
53
#else
54
# define TCG_TARGET_REG_BITS 32
55
# define TCG_TARGET_NB_REGS 24
56
+# define MAX_CODE_GEN_BUFFER_SIZE UINT32_MAX
57
#endif
58
59
typedef enum {
60
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
61
index XXXXXXX..XXXXXXX 100644
62
--- a/tcg/mips/tcg-target.h
63
+++ b/tcg/mips/tcg-target.h
64
@@ -XXX,XX +XXX,XX @@
65
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
66
#define TCG_TARGET_NB_REGS 32
67
68
+/*
69
+ * We have a 256MB branch region, but leave room to make sure the
70
+ * main executable is also within that region.
71
+ */
72
+#define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
73
+
74
typedef enum {
75
TCG_REG_ZERO = 0,
76
TCG_REG_AT,
77
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
78
index XXXXXXX..XXXXXXX 100644
79
--- a/tcg/ppc/tcg-target.h
80
+++ b/tcg/ppc/tcg-target.h
81
@@ -XXX,XX +XXX,XX @@
82
83
#ifdef _ARCH_PPC64
84
# define TCG_TARGET_REG_BITS 64
85
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
86
#else
87
# define TCG_TARGET_REG_BITS 32
88
+# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
89
#endif
90
91
#define TCG_TARGET_NB_REGS 64
92
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
93
index XXXXXXX..XXXXXXX 100644
94
--- a/tcg/riscv/tcg-target.h
95
+++ b/tcg/riscv/tcg-target.h
96
@@ -XXX,XX +XXX,XX @@
97
#define TCG_TARGET_INSN_UNIT_SIZE 4
98
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
99
#define TCG_TARGET_NB_REGS 32
100
+#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
101
102
typedef enum {
103
TCG_REG_ZERO,
104
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
105
index XXXXXXX..XXXXXXX 100644
106
--- a/tcg/s390/tcg-target.h
107
+++ b/tcg/s390/tcg-target.h
108
@@ -XXX,XX +XXX,XX @@
109
#define TCG_TARGET_INSN_UNIT_SIZE 2
110
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 19
111
112
+/* We have a +- 4GB range on the branches; leave some slop. */
113
+#define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
114
+
115
typedef enum TCGReg {
116
TCG_REG_R0 = 0,
117
TCG_REG_R1,
118
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
119
index XXXXXXX..XXXXXXX 100644
120
--- a/tcg/sparc/tcg-target.h
121
+++ b/tcg/sparc/tcg-target.h
122
@@ -XXX,XX +XXX,XX @@
123
#define TCG_TARGET_INSN_UNIT_SIZE 4
124
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
125
#define TCG_TARGET_NB_REGS 32
126
+#define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
127
128
typedef enum {
129
TCG_REG_G0 = 0,
130
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
131
index XXXXXXX..XXXXXXX 100644
132
--- a/tcg/tci/tcg-target.h
133
+++ b/tcg/tci/tcg-target.h
134
@@ -XXX,XX +XXX,XX @@
135
#define TCG_TARGET_INTERPRETER 1
136
#define TCG_TARGET_INSN_UNIT_SIZE 1
137
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
138
+#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
139
140
#if UINTPTR_MAX == UINT32_MAX
141
# define TCG_TARGET_REG_BITS 32
142
diff --git a/tcg/region.c b/tcg/region.c
143
index XXXXXXX..XXXXXXX 100644
144
--- a/tcg/region.c
145
+++ b/tcg/region.c
146
@@ -XXX,XX +XXX,XX @@ static size_t tcg_n_regions(unsigned max_cpus)
147
/*
148
* Minimum size of the code gen buffer. This number is randomly chosen,
149
* but not so small that we can't have a fair number of TB's live.
150
+ *
151
+ * Maximum size, MAX_CODE_GEN_BUFFER_SIZE, is defined in tcg-target.h.
152
+ * Unless otherwise indicated, this is constrained by the range of
153
+ * direct branches on the host cpu, as used by the TCG implementation
154
+ * of goto_tb.
155
*/
156
#define MIN_CODE_GEN_BUFFER_SIZE (1 * MiB)
157
158
-/*
159
- * Maximum size of the code gen buffer we'd like to use. Unless otherwise
160
- * indicated, this is constrained by the range of direct branches on the
161
- * host cpu, as used by the TCG implementation of goto_tb.
162
- */
163
-#if defined(__x86_64__)
164
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
165
-#elif defined(__sparc__)
166
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
167
-#elif defined(__powerpc64__)
168
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
169
-#elif defined(__powerpc__)
170
-# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
171
-#elif defined(__aarch64__)
172
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
173
-#elif defined(__s390x__)
174
- /* We have a +- 4GB range on the branches; leave some slop. */
175
-# define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
176
-#elif defined(__mips__)
177
- /*
178
- * We have a 256MB branch region, but leave room to make sure the
179
- * main executable is also within that region.
180
- */
181
-# define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
182
-#else
183
-# define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
184
-#endif
185
-
186
#if TCG_TARGET_REG_BITS == 32
187
#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
188
#ifdef CONFIG_USER_ONLY
189
--
190
2.25.1
Deleted patch

A size is easier to work with than an end point,
particularly during initial buffer allocation.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)

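The tcg_code_capacity() rewrite at the end of this hunk is where the
size-based form pays off. A worked instance of the new formula, with
invented numbers; total_size already excludes the final guard page, so
only the n - 1 interior guards and the per-region highwater reserve
come off:

    size_t total_size = 16 * 1024 * 1024; /* trimmed buffer size */
    size_t n = 4;                         /* number of regions */
    size_t guard = 4096;                  /* guard page between regions */
    size_t highwater = 1024;              /* TCG_HIGHWATER per region */
    size_t capacity = total_size - (n - 1) * guard - n * highwater;
    /* 16777216 - 12288 - 4096 = 16760832 usable bytes */
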
diff --git a/tcg/region.c b/tcg/region.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/region.c
14
+++ b/tcg/region.c
15
@@ -XXX,XX +XXX,XX @@ struct tcg_region_state {
16
/* fields set at init time */
17
void *start;
18
void *start_aligned;
19
- void *end;
20
size_t n;
21
size_t size; /* size of one region */
22
size_t stride; /* .size + guard size */
23
+ size_t total_size; /* size of entire buffer, >= n * stride */
24
25
/* fields protected by the lock */
26
size_t current; /* current region index */
27
@@ -XXX,XX +XXX,XX @@ static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
28
if (curr_region == 0) {
29
start = region.start;
30
}
31
+ /* The final region may have a few extra pages due to earlier rounding. */
32
if (curr_region == region.n - 1) {
33
- end = region.end;
34
+ end = region.start_aligned + region.total_size;
35
}
36
37
*pstart = start;
38
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
39
*/
40
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
41
{
42
- void *buf, *aligned;
43
- size_t size;
44
+ void *buf, *aligned, *end;
45
+ size_t total_size;
46
size_t page_size;
47
size_t region_size;
48
size_t n_regions;
49
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
50
assert(ok);
51
52
buf = tcg_init_ctx.code_gen_buffer;
53
- size = tcg_init_ctx.code_gen_buffer_size;
54
+ total_size = tcg_init_ctx.code_gen_buffer_size;
55
page_size = qemu_real_host_page_size;
56
n_regions = tcg_n_regions(max_cpus);
57
58
/* The first region will be 'aligned - buf' bytes larger than the others */
59
aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
60
- g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
61
+ g_assert(aligned < tcg_init_ctx.code_gen_buffer + total_size);
62
+
63
/*
64
* Make region_size a multiple of page_size, using aligned as the start.
65
* As a result of this we might end up with a few extra pages at the end of
66
* the buffer; we will assign those to the last region.
67
*/
68
- region_size = (size - (aligned - buf)) / n_regions;
69
+ region_size = (total_size - (aligned - buf)) / n_regions;
70
region_size = QEMU_ALIGN_DOWN(region_size, page_size);
71
72
/* A region must have at least 2 pages; one code, one guard */
73
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
74
region.start = buf;
75
region.start_aligned = aligned;
76
/* page-align the end, since its last page will be a guard page */
77
- region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
78
+ end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
79
/* account for that last guard page */
80
- region.end -= page_size;
81
+ end -= page_size;
82
+ total_size = end - aligned;
83
+ region.total_size = total_size;
84
85
/*
86
* Set guard pages in the rw buffer, as that's the one into which
87
@@ -XXX,XX +XXX,XX @@ void tcg_region_prologue_set(TCGContext *s)
88
89
/* Register the balance of the buffer with gdb. */
90
tcg_register_jit(tcg_splitwx_to_rx(region.start),
91
- region.end - region.start);
92
+ region.start_aligned + region.total_size - region.start);
93
}
94
95
/*
96
@@ -XXX,XX +XXX,XX @@ size_t tcg_code_capacity(void)
97
98
/* no need for synchronization; these variables are set at init time */
99
guard_size = region.stride - region.size;
100
- capacity = region.end + guard_size - region.start;
101
- capacity -= region.n * (guard_size + TCG_HIGHWATER);
102
+ capacity = region.total_size;
103
+ capacity -= (region.n - 1) * guard_size;
104
+ capacity -= region.n * TCG_HIGHWATER;
105
+
106
return capacity;
107
}
108
109
--
110
2.25.1
Deleted patch

Give the field a name reflecting its actual meaning.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/region.c
13
+++ b/tcg/region.c
14
@@ -XXX,XX +XXX,XX @@ struct tcg_region_state {
15
QemuMutex lock;
16
17
/* fields set at init time */
18
- void *start;
19
void *start_aligned;
20
+ void *after_prologue;
21
size_t n;
22
size_t size; /* size of one region */
23
size_t stride; /* .size + guard size */
24
@@ -XXX,XX +XXX,XX @@ static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
25
end = start + region.size;
26
27
if (curr_region == 0) {
28
- start = region.start;
29
+ start = region.after_prologue;
30
}
31
/* The final region may have a few extra pages due to earlier rounding. */
32
if (curr_region == region.n - 1) {
33
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
34
region.n = n_regions;
35
region.size = region_size - page_size;
36
region.stride = region_size;
37
- region.start = buf;
38
+ region.after_prologue = buf;
39
region.start_aligned = aligned;
40
/* page-align the end, since its last page will be a guard page */
41
end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
42
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
43
void tcg_region_prologue_set(TCGContext *s)
44
{
45
/* Deduct the prologue from the first region. */
46
- g_assert(region.start == s->code_gen_buffer);
47
- region.start = s->code_ptr;
48
+ g_assert(region.start_aligned == s->code_gen_buffer);
49
+ region.after_prologue = s->code_ptr;
50
51
/* Recompute boundaries of the first region. */
52
tcg_region_assign(s, 0);
53
54
/* Register the balance of the buffer with gdb. */
55
- tcg_register_jit(tcg_splitwx_to_rx(region.start),
56
- region.start_aligned + region.total_size - region.start);
57
+ tcg_register_jit(tcg_splitwx_to_rx(region.after_prologue),
58
+ region.start_aligned + region.total_size -
59
+ region.after_prologue);
60
}
61
62
/*
63
--
64
2.25.1
Deleted patch

Compute the value using straight division and bounds,
rather than a loop. Pass in tb_size rather than reading
from tcg_init_ctx.code_gen_buffer_size.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)

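A worked example of the closed-form computation below, with invented
numbers and QEMU's MIN() macro: a 68 MiB buffer on a 4-vCPU MTTCG
guest gives 34 candidate regions of 2 MiB, clamped to 8 per vCPU:

    size_t tb_size = 68 * MiB;          /* hypothetical buffer size */
    unsigned max_cpus = 4;
    size_t n = tb_size / (2 * MiB);     /* 34 regions of >= 2 MiB */
    if (n <= max_cpus) {
        n = max_cpus;                   /* too small: one per vCPU */
    } else {
        n = MIN(n, (size_t)max_cpus * 8);   /* here: MIN(34, 32) = 32 */
    }
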
diff --git a/tcg/region.c b/tcg/region.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/region.c
15
+++ b/tcg/region.c
16
@@ -XXX,XX +XXX,XX @@ void tcg_region_reset_all(void)
17
tcg_region_tree_reset_all();
18
}
19
20
-static size_t tcg_n_regions(unsigned max_cpus)
21
+static size_t tcg_n_regions(size_t tb_size, unsigned max_cpus)
22
{
23
#ifdef CONFIG_USER_ONLY
24
return 1;
25
#else
26
+ size_t n_regions;
27
+
28
/*
29
* It is likely that some vCPUs will translate more code than others,
30
* so we first try to set more regions than max_cpus, with those regions
31
* being of reasonable size. If that's not possible we make do by evenly
32
* dividing the code_gen_buffer among the vCPUs.
33
*/
34
- size_t i;
35
-
36
/* Use a single region if all we have is one vCPU thread */
37
if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
38
return 1;
39
}
40
41
- /* Try to have more regions than max_cpus, with each region being >= 2 MB */
42
- for (i = 8; i > 0; i--) {
43
- size_t regions_per_thread = i;
44
- size_t region_size;
45
-
46
- region_size = tcg_init_ctx.code_gen_buffer_size;
47
- region_size /= max_cpus * regions_per_thread;
48
-
49
- if (region_size >= 2 * 1024u * 1024) {
50
- return max_cpus * regions_per_thread;
51
- }
52
+ /*
53
+ * Try to have more regions than max_cpus, with each region being >= 2 MB.
54
+ * If we can't, then just allocate one region per vCPU thread.
55
+ */
56
+ n_regions = tb_size / (2 * MiB);
57
+ if (n_regions <= max_cpus) {
58
+ return max_cpus;
59
}
60
- /* If we can't, then just allocate one region per vCPU thread */
61
- return max_cpus;
62
+ return MIN(n_regions, max_cpus * 8);
63
#endif
64
}
65
66
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
67
buf = tcg_init_ctx.code_gen_buffer;
68
total_size = tcg_init_ctx.code_gen_buffer_size;
69
page_size = qemu_real_host_page_size;
70
- n_regions = tcg_n_regions(max_cpus);
71
+ n_regions = tcg_n_regions(total_size, max_cpus);
72
73
/* The first region will be 'aligned - buf' bytes larger than the others */
74
aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
75
--
76
2.25.1
Deleted patch

Return output buffer and size via output pointer arguments,
rather than returning size via tcg_ctx->code_gen_buffer_size.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/region.c
14
+++ b/tcg/region.c
15
@@ -XXX,XX +XXX,XX @@ static inline bool cross_256mb(void *addr, size_t size)
16
/*
17
* We weren't able to allocate a buffer without crossing that boundary,
18
* so make do with the larger portion of the buffer that doesn't cross.
19
- * Returns the new base of the buffer, and adjusts code_gen_buffer_size.
20
+ * Returns the new base and size of the buffer in *obuf and *osize.
21
*/
22
-static inline void *split_cross_256mb(void *buf1, size_t size1)
23
+static inline void split_cross_256mb(void **obuf, size_t *osize,
24
+ void *buf1, size_t size1)
25
{
26
void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
27
size_t size2 = buf1 + size1 - buf2;
28
@@ -XXX,XX +XXX,XX @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
29
buf1 = buf2;
30
}
31
32
- tcg_ctx->code_gen_buffer_size = size1;
33
- return buf1;
34
+ *obuf = buf1;
35
+ *osize = size1;
36
}
37
#endif
38
39
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
40
if (size > tb_size) {
41
size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
42
}
43
- tcg_ctx->code_gen_buffer_size = size;
44
45
#ifdef __mips__
46
if (cross_256mb(buf, size)) {
47
- buf = split_cross_256mb(buf, size);
48
- size = tcg_ctx->code_gen_buffer_size;
49
+ split_cross_256mb(&buf, &size, buf, size);
50
}
51
#endif
52
53
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
54
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
55
56
tcg_ctx->code_gen_buffer = buf;
57
+ tcg_ctx->code_gen_buffer_size = size;
58
return true;
59
}
60
#elif defined(_WIN32)
61
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
62
"allocate %zu bytes for jit buffer", size);
63
return false;
64
}
65
- tcg_ctx->code_gen_buffer_size = size;
66
67
#ifdef __mips__
68
if (cross_256mb(buf, size)) {
69
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
70
/* fallthru */
71
default:
72
/* Split the original buffer. Free the smaller half. */
73
- buf2 = split_cross_256mb(buf, size);
74
- size2 = tcg_ctx->code_gen_buffer_size;
75
+ split_cross_256mb(&buf2, &size2, buf, size);
76
if (buf == buf2) {
77
munmap(buf + size2, size - size2);
78
} else {
79
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
80
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
81
82
tcg_ctx->code_gen_buffer = buf;
83
+ tcg_ctx->code_gen_buffer_size = size;
84
return true;
85
}
86
87
--
88
2.25.1
Deleted patch

Shortly, the full code_gen_buffer will only be visible
to region.c, so move in_code_gen_buffer out-of-line.

Move the debugging versions of tcg_splitwx_to_{rx,rw}
to region.c as well, so that the compiler gets to see
the implementation of in_code_gen_buffer.

This leaves exactly one use of in_code_gen_buffer outside
of region.c, in cpu_restore_state, which, being on the
exception path, is not performance critical.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/tcg/tcg.h | 11 +----------
tcg/region.c | 34 ++++++++++++++++++++++++++++++++++
tcg/tcg.c | 23 -----------------------
3 files changed, 35 insertions(+), 33 deletions(-)

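The function being moved relies on a standard unsigned-wraparound
idiom. A self-contained sketch, with hypothetical buffer variables in
place of the tcg context fields:

    #include <stdbool.h>
    #include <stddef.h>

    static const char *buf_start;   /* hypothetical buffer base */
    static size_t buf_size;         /* hypothetical buffer size */

    static bool in_buffer_sketch(const void *p)
    {
        /*
         * If p is below buf_start, the subtraction wraps to a huge
         * unsigned value, so one comparison rejects both directions;
         * "<=" deliberately admits the one-past-the-end pointer.
         */
        return (size_t)((const char *)p - buf_start) <= buf_size;
    }
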
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
22
index XXXXXXX..XXXXXXX 100644
23
--- a/include/tcg/tcg.h
24
+++ b/include/tcg/tcg.h
25
@@ -XXX,XX +XXX,XX @@ extern const void *tcg_code_gen_epilogue;
26
extern uintptr_t tcg_splitwx_diff;
27
extern TCGv_env cpu_env;
28
29
-static inline bool in_code_gen_buffer(const void *p)
30
-{
31
- const TCGContext *s = &tcg_init_ctx;
32
- /*
33
- * Much like it is valid to have a pointer to the byte past the
34
- * end of an array (so long as you don't dereference it), allow
35
- * a pointer to the byte past the end of the code gen buffer.
36
- */
37
- return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
38
-}
39
+bool in_code_gen_buffer(const void *p);
40
41
#ifdef CONFIG_DEBUG_TCG
42
const void *tcg_splitwx_to_rx(void *rw);
43
diff --git a/tcg/region.c b/tcg/region.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/tcg/region.c
46
+++ b/tcg/region.c
47
@@ -XXX,XX +XXX,XX @@ static struct tcg_region_state region;
48
static void *region_trees;
49
static size_t tree_size;
50
51
+bool in_code_gen_buffer(const void *p)
52
+{
53
+ const TCGContext *s = &tcg_init_ctx;
54
+ /*
55
+ * Much like it is valid to have a pointer to the byte past the
56
+ * end of an array (so long as you don't dereference it), allow
57
+ * a pointer to the byte past the end of the code gen buffer.
58
+ */
59
+ return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
60
+}
61
+
62
+#ifdef CONFIG_DEBUG_TCG
63
+const void *tcg_splitwx_to_rx(void *rw)
64
+{
65
+ /* Pass NULL pointers unchanged. */
66
+ if (rw) {
67
+ g_assert(in_code_gen_buffer(rw));
68
+ rw += tcg_splitwx_diff;
69
+ }
70
+ return rw;
71
+}
72
+
73
+void *tcg_splitwx_to_rw(const void *rx)
74
+{
75
+ /* Pass NULL pointers unchanged. */
76
+ if (rx) {
77
+ rx -= tcg_splitwx_diff;
78
+ /* Assert that we end with a pointer in the rw region. */
79
+ g_assert(in_code_gen_buffer(rx));
80
+ }
81
+ return (void *)rx;
82
+}
83
+#endif /* CONFIG_DEBUG_TCG */
84
+
85
/* compare a pointer @ptr and a tb_tc @s */
86
static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
87
{
88
diff --git a/tcg/tcg.c b/tcg/tcg.c
89
index XXXXXXX..XXXXXXX 100644
90
--- a/tcg/tcg.c
91
+++ b/tcg/tcg.c
92
@@ -XXX,XX +XXX,XX @@ static const TCGTargetOpDef constraint_sets[] = {
93
94
#include "tcg-target.c.inc"
95
96
-#ifdef CONFIG_DEBUG_TCG
97
-const void *tcg_splitwx_to_rx(void *rw)
98
-{
99
- /* Pass NULL pointers unchanged. */
100
- if (rw) {
101
- g_assert(in_code_gen_buffer(rw));
102
- rw += tcg_splitwx_diff;
103
- }
104
- return rw;
105
-}
106
-
107
-void *tcg_splitwx_to_rw(const void *rx)
108
-{
109
- /* Pass NULL pointers unchanged. */
110
- if (rx) {
111
- rx -= tcg_splitwx_diff;
112
- /* Assert that we end with a pointer in the rw region. */
113
- g_assert(in_code_gen_buffer(rx));
114
- }
115
- return (void *)rx;
116
-}
117
-#endif /* CONFIG_DEBUG_TCG */
118
-
119
static void alloc_tcg_plugin_context(TCGContext *s)
120
{
121
#ifdef CONFIG_PLUGIN
122
--
123
2.25.1
Deleted patch

Do not mess around with setting values within tcg_init_ctx.
Put the values into 'region' directly, which is where they
will live for the lifetime of the program.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 64 ++++++++++++++++++++++------------------------------
1 file changed, 27 insertions(+), 37 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/region.c
15
+++ b/tcg/region.c
16
@@ -XXX,XX +XXX,XX @@ static size_t tree_size;
17
18
bool in_code_gen_buffer(const void *p)
19
{
20
- const TCGContext *s = &tcg_init_ctx;
21
/*
22
* Much like it is valid to have a pointer to the byte past the
23
* end of an array (so long as you don't dereference it), allow
24
* a pointer to the byte past the end of the code gen buffer.
25
*/
26
- return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
27
+ return (size_t)(p - region.start_aligned) <= region.total_size;
28
}
29
30
#ifdef CONFIG_DEBUG_TCG
31
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
32
}
33
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
34
35
- tcg_ctx->code_gen_buffer = buf;
36
- tcg_ctx->code_gen_buffer_size = size;
37
+ region.start_aligned = buf;
38
+ region.total_size = size;
39
return true;
40
}
41
#elif defined(_WIN32)
42
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
43
return false;
44
}
45
46
- tcg_ctx->code_gen_buffer = buf;
47
- tcg_ctx->code_gen_buffer_size = size;
48
+ region.start_aligned = buf;
49
+ region.total_size = size;
50
return true;
51
}
52
#else
53
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
54
/* Request large pages for the buffer. */
55
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
56
57
- tcg_ctx->code_gen_buffer = buf;
58
- tcg_ctx->code_gen_buffer_size = size;
59
+ region.start_aligned = buf;
60
+ region.total_size = size;
61
return true;
62
}
63
64
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
65
return false;
66
}
67
/* The size of the mapping may have been adjusted. */
68
- size = tcg_ctx->code_gen_buffer_size;
69
- buf_rx = tcg_ctx->code_gen_buffer;
70
+ buf_rx = region.start_aligned;
71
+ size = region.total_size;
72
#endif
73
74
buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
75
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
76
#endif
77
78
close(fd);
79
- tcg_ctx->code_gen_buffer = buf_rw;
80
- tcg_ctx->code_gen_buffer_size = size;
81
+ region.start_aligned = buf_rw;
82
+ region.total_size = size;
83
tcg_splitwx_diff = buf_rx - buf_rw;
84
85
/* Request large pages for the buffer and the splitwx. */
86
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
87
return false;
88
}
89
90
- buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
91
+ buf_rw = region.start_aligned;
92
buf_rx = 0;
93
ret = mach_vm_remap(mach_task_self(),
94
&buf_rx,
95
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
96
*/
97
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
98
{
99
- void *buf, *aligned, *end;
100
- size_t total_size;
101
size_t page_size;
102
size_t region_size;
103
- size_t n_regions;
104
size_t i;
105
bool ok;
106
107
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
108
splitwx, &error_fatal);
109
assert(ok);
110
111
- buf = tcg_init_ctx.code_gen_buffer;
112
- total_size = tcg_init_ctx.code_gen_buffer_size;
113
- page_size = qemu_real_host_page_size;
114
- n_regions = tcg_n_regions(total_size, max_cpus);
115
-
116
- /* The first region will be 'aligned - buf' bytes larger than the others */
117
- aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
118
- g_assert(aligned < tcg_init_ctx.code_gen_buffer + total_size);
119
-
120
/*
121
* Make region_size a multiple of page_size, using aligned as the start.
122
* As a result of this we might end up with a few extra pages at the end of
123
* the buffer; we will assign those to the last region.
124
*/
125
- region_size = (total_size - (aligned - buf)) / n_regions;
126
+ region.n = tcg_n_regions(region.total_size, max_cpus);
127
+ page_size = qemu_real_host_page_size;
128
+ region_size = region.total_size / region.n;
129
region_size = QEMU_ALIGN_DOWN(region_size, page_size);
130
131
/* A region must have at least 2 pages; one code, one guard */
132
g_assert(region_size >= 2 * page_size);
133
+ region.stride = region_size;
134
+
135
+ /* Reserve space for guard pages. */
136
+ region.size = region_size - page_size;
137
+ region.total_size -= page_size;
138
+
139
+ /*
140
+ * The first region will be smaller than the others, via the prologue,
141
+ * which has yet to be allocated. For now, the first region begins at
142
+ * the page boundary.
143
+ */
144
+ region.after_prologue = region.start_aligned;
145
146
/* init the region struct */
147
qemu_mutex_init(&region.lock);
148
- region.n = n_regions;
149
- region.size = region_size - page_size;
150
- region.stride = region_size;
151
- region.after_prologue = buf;
152
- region.start_aligned = aligned;
153
- /* page-align the end, since its last page will be a guard page */
154
- end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
155
- /* account for that last guard page */
156
- end -= page_size;
157
- total_size = end - aligned;
158
- region.total_size = total_size;
159
160
/*
161
* Set guard pages in the rw buffer, as that's the one into which
162
--
163
2.25.1
Deleted patch

Change the interface from a boolean error indication to a
negative error vs a non-negative protection. For the moment
this is only an interface change, not making use of the new data.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 63 +++++++++++++++++++++++++++-------------------------
1 file changed, 33 insertions(+), 30 deletions(-)

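A minimal standalone allocator following the same convention; POSIX
mmap only, purely illustrative, since the real functions below also
cover split-wx and Windows:

    #include <sys/mman.h>

    /* Return -1 on failure, otherwise the PROT_* bits of the mapping. */
    static int alloc_sketch(void **buf, size_t size)
    {
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            return -1;
        }
        *buf = p;
        return PROT_READ | PROT_WRITE;
    }
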
diff --git a/tcg/region.c b/tcg/region.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/region.c
15
+++ b/tcg/region.c
16
@@ -XXX,XX +XXX,XX @@ static inline void split_cross_256mb(void **obuf, size_t *osize,
17
static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
18
__attribute__((aligned(CODE_GEN_ALIGN)));
19
20
-static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
21
+static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
22
{
23
void *buf, *end;
24
size_t size;
25
26
if (splitwx > 0) {
27
error_setg(errp, "jit split-wx not supported");
28
- return false;
29
+ return -1;
30
}
31
32
/* page-align the beginning and end of the buffer */
33
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
34
35
region.start_aligned = buf;
36
region.total_size = size;
37
- return true;
38
+
39
+ return PROT_READ | PROT_WRITE;
40
}
41
#elif defined(_WIN32)
42
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
43
+static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
44
{
45
void *buf;
46
47
if (splitwx > 0) {
48
error_setg(errp, "jit split-wx not supported");
49
- return false;
50
+ return -1;
51
}
52
53
buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
54
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
55
56
region.start_aligned = buf;
57
region.total_size = size;
58
- return true;
59
+
60
+ return PAGE_READ | PAGE_WRITE | PAGE_EXEC;
61
}
62
#else
63
-static bool alloc_code_gen_buffer_anon(size_t size, int prot,
64
- int flags, Error **errp)
65
+static int alloc_code_gen_buffer_anon(size_t size, int prot,
66
+ int flags, Error **errp)
67
{
68
void *buf;
69
70
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
71
if (buf == MAP_FAILED) {
72
error_setg_errno(errp, errno,
73
"allocate %zu bytes for jit buffer", size);
74
- return false;
75
+ return -1;
76
}
77
78
#ifdef __mips__
79
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
80
81
region.start_aligned = buf;
82
region.total_size = size;
83
- return true;
84
+ return prot;
85
}
86
87
#ifndef CONFIG_TCG_INTERPRETER
88
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
89
90
#ifdef __mips__
91
/* Find space for the RX mapping, vs the 256MiB regions. */
92
- if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
93
- MAP_PRIVATE | MAP_ANONYMOUS |
94
- MAP_NORESERVE, errp)) {
95
+ if (alloc_code_gen_buffer_anon(size, PROT_NONE,
96
+ MAP_PRIVATE | MAP_ANONYMOUS |
97
+ MAP_NORESERVE, errp) < 0) {
98
return false;
99
}
100
/* The size of the mapping may have been adjusted. */
101
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
102
/* Request large pages for the buffer and the splitwx. */
103
qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
104
qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
105
- return true;
106
+ return PROT_READ | PROT_WRITE;
107
108
fail_rx:
109
error_setg_errno(errp, errno, "failed to map shared memory for execute");
110
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
111
if (fd >= 0) {
112
close(fd);
113
}
114
- return false;
115
+ return -1;
116
}
117
#endif /* CONFIG_POSIX */
118
119
@@ -XXX,XX +XXX,XX @@ extern kern_return_t mach_vm_remap(vm_map_t target_task,
120
vm_prot_t *max_protection,
121
vm_inherit_t inheritance);
122
123
-static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
124
+static int alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
125
{
126
kern_return_t ret;
127
mach_vm_address_t buf_rw, buf_rx;
128
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
129
/* Map the read-write portion via normal anon memory. */
130
if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
131
MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
132
- return false;
133
+ return -1;
134
}
135
136
buf_rw = region.start_aligned;
137
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
138
/* TODO: Convert "ret" to a human readable error message. */
139
error_setg(errp, "vm_remap for jit splitwx failed");
140
munmap((void *)buf_rw, size);
141
- return false;
142
+ return -1;
143
}
144
145
if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
146
error_setg_errno(errp, errno, "mprotect for jit splitwx");
147
munmap((void *)buf_rx, size);
148
munmap((void *)buf_rw, size);
149
- return false;
150
+ return -1;
151
}
152
153
tcg_splitwx_diff = buf_rx - buf_rw;
154
- return true;
155
+ return PROT_READ | PROT_WRITE;
156
}
157
#endif /* CONFIG_DARWIN */
158
#endif /* CONFIG_TCG_INTERPRETER */
159
160
-static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
161
+static int alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
162
{
163
#ifndef CONFIG_TCG_INTERPRETER
164
# ifdef CONFIG_DARWIN
165
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
166
# endif
167
#endif
168
error_setg(errp, "jit split-wx not supported");
169
- return false;
170
+ return -1;
171
}
172
173
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
174
+static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
175
{
176
ERRP_GUARD();
177
int prot, flags;
178
179
if (splitwx) {
180
- if (alloc_code_gen_buffer_splitwx(size, errp)) {
181
- return true;
182
+ prot = alloc_code_gen_buffer_splitwx(size, errp);
183
+ if (prot >= 0) {
184
+ return prot;
185
}
186
/*
187
* If splitwx force-on (1), fail;
188
* if splitwx default-on (-1), fall through to splitwx off.
189
*/
190
if (splitwx > 0) {
191
- return false;
192
+ return -1;
193
}
194
error_free_or_abort(errp);
195
}
196
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
197
size_t page_size;
198
size_t region_size;
199
size_t i;
200
- bool ok;
201
+ int have_prot;
202
203
- ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
204
- splitwx, &error_fatal);
205
- assert(ok);
206
+ have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
207
+ splitwx, &error_fatal);
208
+ assert(have_prot >= 0);
209
210
/*
211
* Make region_size a multiple of page_size, using aligned as the start.
212
--
213
2.25.1
Deleted patch

Move the call out of the N versions of alloc_code_gen_buffer
and into tcg_region_init.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/region.c
14
+++ b/tcg/region.c
15
@@ -XXX,XX +XXX,XX @@ static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
16
error_setg_errno(errp, errno, "mprotect of jit buffer");
17
return false;
18
}
19
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
20
21
region.start_aligned = buf;
22
region.total_size = size;
23
@@ -XXX,XX +XXX,XX @@ static int alloc_code_gen_buffer_anon(size_t size, int prot,
24
}
25
#endif
26
27
- /* Request large pages for the buffer. */
28
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
29
-
30
region.start_aligned = buf;
31
region.total_size = size;
32
return prot;
33
@@ -XXX,XX +XXX,XX @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
34
region.total_size = size;
35
tcg_splitwx_diff = buf_rx - buf_rw;
36
37
- /* Request large pages for the buffer and the splitwx. */
38
- qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
39
- qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
40
return PROT_READ | PROT_WRITE;
41
42
fail_rx:
43
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
44
splitwx, &error_fatal);
45
assert(have_prot >= 0);
46
47
+ /* Request large pages for the buffer and the splitwx. */
48
+ qemu_madvise(region.start_aligned, region.total_size, QEMU_MADV_HUGEPAGE);
49
+ if (tcg_splitwx_diff) {
50
+ qemu_madvise(region.start_aligned + tcg_splitwx_diff,
51
+ region.total_size, QEMU_MADV_HUGEPAGE);
52
+ }
53
+
54
/*
55
* Make region_size a multiple of page_size, using aligned as the start.
56
* As a result of this we might end up with a few extra pages at the end of
57
--
58
2.25.1
Deleted patch

For --enable-tcg-interpreter on Windows, we will need this.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/qemu/osdep.h | 1 +
util/osdep.c | 9 +++++++++
2 files changed, 10 insertions(+)

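Sketch of the intended use; region_base and region_bytes are
hypothetical, and the helper follows mprotect's 0/-1 convention via
qemu_mprotect__osdep:

    /* Drop execute permission from an already-mapped jit region. */
    if (qemu_mprotect_rw(region_base, region_bytes) != 0) {
        error_report("mprotect of jit buffer failed");
    }
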
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/include/qemu/osdep.h
15
+++ b/include/qemu/osdep.h
16
@@ -XXX,XX +XXX,XX @@ void sigaction_invoke(struct sigaction *action,
17
#endif
18
19
int qemu_madvise(void *addr, size_t len, int advice);
20
+int qemu_mprotect_rw(void *addr, size_t size);
21
int qemu_mprotect_rwx(void *addr, size_t size);
22
int qemu_mprotect_none(void *addr, size_t size);
23
24
diff --git a/util/osdep.c b/util/osdep.c
25
index XXXXXXX..XXXXXXX 100644
26
--- a/util/osdep.c
27
+++ b/util/osdep.c
28
@@ -XXX,XX +XXX,XX @@ static int qemu_mprotect__osdep(void *addr, size_t size, int prot)
29
#endif
30
}
31
32
+int qemu_mprotect_rw(void *addr, size_t size)
33
+{
34
+#ifdef _WIN32
35
+ return qemu_mprotect__osdep(addr, size, PAGE_READWRITE);
36
+#else
37
+ return qemu_mprotect__osdep(addr, size, PROT_READ | PROT_WRITE);
38
+#endif
39
+}
40
+
41
int qemu_mprotect_rwx(void *addr, size_t size)
42
{
43
#ifdef _WIN32
44
--
45
2.25.1
Deleted patch

If qemu_get_host_physmem returns an odd number of pages,
then physmem / 8 will not be a multiple of the page size.

The following was observed on a gitlab runner:

ERROR qtest-arm/boot-serial-test - Bail out!
ERROR:../util/osdep.c:80:qemu_mprotect__osdep: \
assertion failed: (!(size & ~qemu_real_host_page_mask))

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 47 +++++++++++++++++++++--------------------------
1 file changed, 21 insertions(+), 26 deletions(-)

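The failure is easiest to see with concrete, invented numbers,
assuming 4 KiB host pages:

    size_t page_size = 4096;
    size_t phys_mem = 0x3ffff000;      /* 262143 pages: an odd count */
    size_t tb_size = phys_mem / 8;     /* 0x07fffe00: not page-aligned */
    tb_size = QEMU_ALIGN_DOWN(tb_size, page_size);   /* 0x07fff000 */
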
diff --git a/tcg/region.c b/tcg/region.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/tcg/region.c
20
+++ b/tcg/region.c
21
@@ -XXX,XX +XXX,XX @@ static size_t tcg_n_regions(size_t tb_size, unsigned max_cpus)
22
(DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
23
? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
24
25
-static size_t size_code_gen_buffer(size_t tb_size)
26
-{
27
- /* Size the buffer. */
28
- if (tb_size == 0) {
29
- size_t phys_mem = qemu_get_host_physmem();
30
- if (phys_mem == 0) {
31
- tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
32
- } else {
33
- tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
34
- }
35
- }
36
- if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
37
- tb_size = MIN_CODE_GEN_BUFFER_SIZE;
38
- }
39
- if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
40
- tb_size = MAX_CODE_GEN_BUFFER_SIZE;
41
- }
42
- return tb_size;
43
-}
44
-
45
#ifdef __mips__
46
/*
47
* In order to use J and JAL within the code_gen_buffer, we require
48
@@ -XXX,XX +XXX,XX @@ static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
49
*/
50
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
51
{
52
- size_t page_size;
53
+ const size_t page_size = qemu_real_host_page_size;
54
size_t region_size;
55
size_t i;
56
int have_prot;
57
58
- have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
59
- splitwx, &error_fatal);
60
+ /* Size the buffer. */
61
+ if (tb_size == 0) {
62
+ size_t phys_mem = qemu_get_host_physmem();
63
+ if (phys_mem == 0) {
64
+ tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
65
+ } else {
66
+ tb_size = QEMU_ALIGN_DOWN(phys_mem / 8, page_size);
67
+ tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, tb_size);
68
+ }
69
+ }
70
+ if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
71
+ tb_size = MIN_CODE_GEN_BUFFER_SIZE;
72
+ }
73
+ if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
74
+ tb_size = MAX_CODE_GEN_BUFFER_SIZE;
75
+ }
76
+
77
+ have_prot = alloc_code_gen_buffer(tb_size, splitwx, &error_fatal);
78
assert(have_prot >= 0);
79
80
/* Request large pages for the buffer and the splitwx. */
81
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
82
* As a result of this we might end up with a few extra pages at the end of
83
* the buffer; we will assign those to the last region.
84
*/
85
- region.n = tcg_n_regions(region.total_size, max_cpus);
86
- page_size = qemu_real_host_page_size;
87
- region_size = region.total_size / region.n;
88
+ region.n = tcg_n_regions(tb_size, max_cpus);
89
+ region_size = tb_size / region.n;
90
region_size = QEMU_ALIGN_DOWN(region_size, page_size);
91
92
/* A region must have at least 2 pages; one code, one guard */
93
--
94
2.25.1
Deleted patch

Do not handle protections on a case-by-case basis in the
various alloc_code_gen_buffer instances; do it within a
single loop in tcg_region_init.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 45 +++++++++++++++++++++++++++++++--------------
1 file changed, 31 insertions(+), 14 deletions(-)

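Informal summary of the loop below: with split-wx the rw view needs
only PAGE_READ | PAGE_WRITE, otherwise the single mapping needs
PAGE_EXEC as well; and when the allocator already returned exactly the
needed bits, no mprotect call is made at all:

    int need_prot = PAGE_READ | PAGE_WRITE;
    #ifndef CONFIG_TCG_INTERPRETER
    if (tcg_splitwx_diff == 0) {
        need_prot |= PAGE_EXEC;     /* one mapping serves rw and rx */
    }
    #endif
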
diff --git a/tcg/region.c b/tcg/region.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/region.c
15
+++ b/tcg/region.c
16
@@ -XXX,XX +XXX,XX @@ static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
17
}
18
#endif
19
20
- if (qemu_mprotect_rwx(buf, size)) {
21
- error_setg_errno(errp, errno, "mprotect of jit buffer");
22
- return false;
23
- }
24
-
25
region.start_aligned = buf;
26
region.total_size = size;
27
28
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
29
{
30
const size_t page_size = qemu_real_host_page_size;
31
size_t region_size;
32
- size_t i;
33
- int have_prot;
34
+ int have_prot, need_prot;
35
36
/* Size the buffer. */
37
if (tb_size == 0) {
38
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
39
* Set guard pages in the rw buffer, as that's the one into which
40
* buffer overruns could occur. Do not set guard pages in the rx
41
* buffer -- let that one use hugepages throughout.
42
+ * Work with the page protections set up with the initial mapping.
43
*/
44
- for (i = 0; i < region.n; i++) {
45
+ need_prot = PAGE_READ | PAGE_WRITE;
46
+#ifndef CONFIG_TCG_INTERPRETER
47
+ if (tcg_splitwx_diff == 0) {
48
+ need_prot |= PAGE_EXEC;
49
+ }
50
+#endif
51
+ for (size_t i = 0, n = region.n; i < n; i++) {
52
void *start, *end;
53
54
tcg_region_bounds(i, &start, &end);
55
+ if (have_prot != need_prot) {
56
+ int rc;
57
58
- /*
59
- * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
60
- * rejects a permission change from RWX -> NONE. Guard pages are
61
- * nice for bug detection but are not essential; ignore any failure.
62
- */
63
- (void)qemu_mprotect_none(end, page_size);
64
+ if (need_prot == (PAGE_READ | PAGE_WRITE | PAGE_EXEC)) {
65
+ rc = qemu_mprotect_rwx(start, end - start);
66
+ } else if (need_prot == (PAGE_READ | PAGE_WRITE)) {
67
+ rc = qemu_mprotect_rw(start, end - start);
68
+ } else {
69
+ g_assert_not_reached();
70
+ }
71
+ if (rc) {
72
+ error_setg_errno(&error_fatal, errno,
73
+ "mprotect of jit buffer");
74
+ }
75
+ }
76
+ if (have_prot != 0) {
77
+ /*
78
+ * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
79
+ * rejects a permission change from RWX -> NONE. Guard pages are
80
+ * nice for bug detection but are not essential; ignore any failure.
81
+ */
82
+ (void)qemu_mprotect_none(end, page_size);
83
+ }
84
}
85
86
tcg_region_trees_init();
87
--
88
2.25.1
89
90
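
The shape of the new per-region loop body, reduced to plain POSIX calls
rather than QEMU's qemu_mprotect_* wrappers -- a sketch under those
assumptions, not the patch itself (here have_prot/need_prot would be
PROT_* masks):

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>

    /* Bring one region from the protection the initial mapping provided
     * (have_prot) up to what the JIT needs (need_prot), then turn the
     * trailing page into a guard page when the mapping is accessible. */
    static int fix_region_prot(char *start, size_t len, size_t page_size,
                               int have_prot, int need_prot)
    {
        if (have_prot != need_prot && mprotect(start, len, need_prot)) {
            perror("mprotect of jit buffer");
            return -1;
        }
        if (have_prot != 0) {
            /* Guard pages help catch bugs but are not essential. */
            (void)mprotect(start + len - page_size, page_size, PROT_NONE);
        }
        return 0;
    }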
Deleted patch

There's a change in mprotect() behaviour [1] in the latest macOS
on M1 and it's not yet clear if it's going to be fixed by Apple.

In this case, instead of changing permissions of N guard pages,
we change permissions of N rwx regions. The same number of
syscalls is required either way.

[1] https://gist.github.com/hikalium/75ae822466ee4da13cbbe486498a191f

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -XXX,XX +XXX,XX @@ static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
         error_free_or_abort(errp);
     }

-    prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+    /*
+     * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
+     * rejects a permission change from RWX -> NONE when reserving the
+     * guard pages later. We can go the other way with the same number
+     * of syscalls, so always begin with PROT_NONE.
+     */
+    prot = PROT_NONE;
     flags = MAP_PRIVATE | MAP_ANONYMOUS;
-#ifdef CONFIG_TCG_INTERPRETER
-    /* The tcg interpreter does not need execute permission. */
-    prot = PROT_READ | PROT_WRITE;
-#elif defined(CONFIG_DARWIN)
+#ifdef CONFIG_DARWIN
     /* Applicable to both iOS and macOS (Apple Silicon). */
     if (!splitwx) {
         flags |= MAP_JIT;
@@ -XXX,XX +XXX,XX @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
         }
     }
     if (have_prot != 0) {
-        /*
-         * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
-         * rejects a permission change from RWX -> NONE. Guard pages are
-         * nice for bug detection but are not essential; ignore any failure.
-         */
+        /* Guard pages are nice for bug detection but are not essential. */
         (void)qemu_mprotect_none(end, page_size);
     }
 }
--
2.25.1
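
The resulting allocation order -- reserve everything inaccessible, then
raise protections upward rather than dropping them -- as a self-contained
POSIX sketch (sizes are parameters, error handling trimmed; this is an
illustration of the idea, not QEMU's allocator):

    #include <stddef.h>
    #include <sys/mman.h>

    /* Reserve the whole buffer with PROT_NONE, then raise each region
     * to RWX, leaving one trailing guard page per region at PROT_NONE.
     * NONE -> RWX transitions avoid the rejected RWX -> NONE path. */
    static void *alloc_jit_sketch(size_t total, size_t region, size_t page)
    {
        void *buf = mmap(NULL, total, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
            return NULL;
        }
        for (size_t off = 0; off < total; off += region) {
            mprotect((char *)buf + off, region - page,
                     PROT_READ | PROT_WRITE | PROT_EXEC);
        }
        return buf;
    }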
Deleted patch

These variables belong to the jit side, not the user side.

Since tcg_init_ctx is no longer used outside of tcg/, move
the declaration to tcg-internal.h.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h         | 1 -
 tcg/tcg-internal.h        | 1 +
 accel/tcg/translate-all.c | 3 ---
 tcg/tcg.c                 | 3 +++
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index XXXXXXX..XXXXXXX 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -XXX,XX +XXX,XX @@ static inline bool temp_readonly(TCGTemp *ts)
     return ts->kind >= TEMP_FIXED;
 }

-extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
 extern const void *tcg_code_gen_epilogue;
 extern uintptr_t tcg_splitwx_diff;
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg-internal.h
+++ b/tcg/tcg-internal.h
@@ -XXX,XX +XXX,XX @@

 #define TCG_HIGHWATER 1024

+extern TCGContext tcg_init_ctx;
 extern TCGContext **tcg_ctxs;
 extern unsigned int tcg_cur_ctxs;
 extern unsigned int tcg_max_ctxs;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -XXX,XX +XXX,XX @@ static int v_l2_levels;

 static void *l1_map[V_L1_MAX_SIZE];

-/* code generation context */
-TCGContext tcg_init_ctx;
-__thread TCGContext *tcg_ctx;
 TBContext tb_ctx;

 static void page_table_config_init(void)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct);
 static int tcg_out_ldst_finalize(TCGContext *s);
 #endif

+TCGContext tcg_init_ctx;
+__thread TCGContext *tcg_ctx;
+
 TCGContext **tcg_ctxs;
 unsigned int tcg_cur_ctxs;
 unsigned int tcg_max_ctxs;
--
2.25.1
Deleted patch

Introduce a function to remove everything emitted
since a given point.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h | 10 ++++++++++
 tcg/tcg.c         | 13 +++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index XXXXXXX..XXXXXXX 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -XXX,XX +XXX,XX @@ void tcg_op_remove(TCGContext *s, TCGOp *op);
 TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *op, TCGOpcode opc);
 TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *op, TCGOpcode opc);

+/**
+ * tcg_remove_ops_after:
+ * @op: target operation
+ *
+ * Discard any opcodes emitted since @op. Expected usage is to save
+ * a starting point with tcg_last_op(), speculatively emit opcodes,
+ * then decide whether or not to keep those opcodes after the fact.
+ */
+void tcg_remove_ops_after(TCGOp *op);
+
 void tcg_optimize(TCGContext *s);

 /* Allocate a new temporary and initialize it with a constant. */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ void tcg_op_remove(TCGContext *s, TCGOp *op)
 #endif
 }

+void tcg_remove_ops_after(TCGOp *op)
+{
+    TCGContext *s = tcg_ctx;
+
+    while (true) {
+        TCGOp *last = tcg_last_op();
+        if (last == op) {
+            return;
+        }
+        tcg_op_remove(s, last);
+    }
+}
+
 static TCGOp *tcg_op_alloc(TCGOpcode opc)
 {
     TCGContext *s = tcg_ctx;
--
2.25.1
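
The expected usage described in the doc comment, as a hedged sketch --
the emitted op and the predicate are placeholders, not part of the patch:

    #include "tcg/tcg.h"
    #include "tcg/tcg-op.h"

    static void emit_speculatively(TCGv_i32 dst, TCGv_i32 src, bool keep)
    {
        TCGOp *start = tcg_last_op();    /* remember the current tail */

        tcg_gen_addi_i32(dst, src, 1);   /* speculative emission */

        if (!keep) {
            tcg_remove_ops_after(start); /* drop everything since start */
        }
    }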
Deleted patch

At some point during the development of tcg_constant_*, I changed
my mind about whether such temps could be passed to
tcg_temp_free_*. The final version committed allows this, but the
commentary was not updated to match.

Fixes: c0522136adf
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index XXXXXXX..XXXXXXX 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -XXX,XX +XXX,XX @@ TCGv_vec tcg_const_ones_vec_matching(TCGv_vec);

 /*
  * Locate or create a read-only temporary that is a constant.
- * This kind of temporary need not and should not be freed.
+ * This kind of temporary need not be freed, but for convenience
+ * will be silently ignored by tcg_temp_free_*.
  */
 TCGTemp *tcg_constant_internal(TCGType type, int64_t val);
--
2.25.1
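
With the corrected comment, freeing a constant temp is a harmless no-op;
a brief sketch (function name and operands are illustrative):

    #include "tcg/tcg.h"
    #include "tcg/tcg-op.h"

    static void add_one(TCGv_i32 dst)
    {
        TCGv_i32 one = tcg_constant_i32(1);

        tcg_gen_add_i32(dst, dst, one);
        tcg_temp_free_i32(one);   /* permitted: silently ignored */
    }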
Deleted patch

From: "Jose R. Ziviani" <jziviani@suse.de>

Commit 5e8892db93 fixed several function signatures, but missed
tcg_out_op for arm. Fix it as well.

Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210610224450.23425-1-jziviani@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 static void tcg_out_epilogue(TCGContext *s);

 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
-                              const TCGArg *args, const int *const_args)
+                              const TCGArg args[TCG_MAX_OP_ARGS],
+                              const int const_args[TCG_MAX_OP_ARGS])
 {
     TCGArg a0, a1, a2, a3, a4, a5;
     int c;
--
2.25.1
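
For background: in C the two spellings declare compatible types, since an
array bound in a parameter decays to a pointer, but newer compilers can
warn when a declaration and definition disagree on the bound (e.g. GCC's
-Warray-parameter), so keeping every backend textually in sync with the
shared prototype matters. A standalone illustration, with hypothetical
names:

    #define TCG_MAX_OP_ARGS 16
    typedef unsigned long TCGArg;

    /* Prototype as the common code spells it ... */
    void f(const TCGArg args[TCG_MAX_OP_ARGS]);

    /* ... and a definition using the pointer form: the same type after
     * array-to-pointer decay, but easy to let drift out of sync. */
    void f(const TCGArg *args) { (void)args; }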