The following changes since commit 7fe6cb68117ac856e03c93d18aca09de015392b0:

  Merge tag 'pull-target-arm-20230530-1' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2023-05-30 08:02:05 -0700)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230530

for you to fetch changes up to 276d77de503e8f5f5cbd3f7d94302ca12d1d982e:

  tests/decode: Add tests for various named-field cases (2023-05-30 10:55:39 -0700)

----------------------------------------------------------------
Improvements to 128-bit atomics:
  - Separate __int128_t type and arithmetic detection
  - Support 128-bit load/store in backend for i386, aarch64, ppc64, s390x
  - Accelerate atomics via host/include/
Decodetree:
  - Add named field syntax
  - Move tests to meson

----------------------------------------------------------------
Peter Maydell (5):
      docs: Document decodetree named field syntax
      scripts/decodetree: Pass lvalue-formatter function to str_extract()
      scripts/decodetree: Implement a topological sort
      scripts/decodetree: Implement named field support
      tests/decode: Add tests for various named-field cases

Richard Henderson (22):
      tcg: Fix register move type in tcg_out_ld_helper_ret
      accel/tcg: Fix check for page writeability in load_atomic16_or_exit
      meson: Split test for __int128_t type from __int128_t arithmetic
      qemu/atomic128: Add x86_64 atomic128-ldst.h
      tcg/i386: Support 128-bit load/store
      tcg/aarch64: Rename temporaries
      tcg/aarch64: Reserve TCG_REG_TMP1, TCG_REG_TMP2
      tcg/aarch64: Simplify constraints on qemu_ld/st
      tcg/aarch64: Support 128-bit load/store
      tcg/ppc: Support 128-bit load/store
      tcg/s390x: Support 128-bit load/store
      accel/tcg: Extract load_atom_extract_al16_or_al8 to host header
      accel/tcg: Extract store_atom_insert_al16 to host header
      accel/tcg: Add x86_64 load_atom_extract_al16_or_al8
      accel/tcg: Add aarch64 lse2 load_atom_extract_al16_or_al8
      accel/tcg: Add aarch64 store_atom_insert_al16
      tcg: Remove TCG_TARGET_TLB_DISPLACEMENT_BITS
      decodetree: Add --test-for-error
      decodetree: Fix recursion in prop_format and build_tree
      decodetree: Diagnose empty pattern group
      decodetree: Do not remove output_file from /dev
      tests/decode: Convert tests to meson

 docs/devel/decodetree.rst                         |  33 ++-
 meson.build                                       |  15 +-
 host/include/aarch64/host/load-extract-al16-al8.h |  40 ++++
 host/include/aarch64/host/store-insert-al16.h     |  47 ++++
 host/include/generic/host/load-extract-al16-al8.h |  45 ++++
 host/include/generic/host/store-insert-al16.h     |  50 ++++
 host/include/x86_64/host/atomic128-ldst.h         |  68 ++++++
 host/include/x86_64/host/load-extract-al16-al8.h  |  50 ++++
 include/qemu/int128.h                             |   4 +-
 tcg/aarch64/tcg-target-con-set.h                  |   4 +-
 tcg/aarch64/tcg-target-con-str.h                  |   1 -
 tcg/aarch64/tcg-target.h                          |  12 +-
 tcg/arm/tcg-target.h                              |   1 -
 tcg/i386/tcg-target.h                             |   5 +-
 tcg/mips/tcg-target.h                             |   1 -
 tcg/ppc/tcg-target-con-set.h                      |   2 +
 tcg/ppc/tcg-target-con-str.h                      |   1 +
 tcg/ppc/tcg-target.h                              |   4 +-
 tcg/riscv/tcg-target.h                            |   1 -
 tcg/s390x/tcg-target-con-set.h                    |   2 +
 tcg/s390x/tcg-target.h                            |   3 +-
 tcg/sparc64/tcg-target.h                          |   1 -
 tcg/tci/tcg-target.h                              |   1 -
 tests/decode/err_field10.decode                   |   7 +
 tests/decode/err_field7.decode                    |   7 +
 tests/decode/err_field8.decode                    |   8 +
 tests/decode/err_field9.decode                    |  14 ++
 tests/decode/succ_named_field.decode              |  19 ++
 tcg/tcg.c                                         |   4 +-
 accel/tcg/ldst_atomicity.c.inc                    |  80 +------
 tcg/aarch64/tcg-target.c.inc                      | 243 +++++++++++++++-----
 tcg/i386/tcg-target.c.inc                         | 191 +++++++++++++++-
 tcg/ppc/tcg-target.c.inc                          | 108 ++++++++-
 tcg/s390x/tcg-target.c.inc                        | 107 ++++++++-
 scripts/decodetree.py                             | 265 ++++++++++++++++++++--
 tests/decode/check.sh                             |  24 --
 tests/decode/meson.build                          |  64 ++++++
 tests/meson.build                                 |   5 +-
 38 files changed, 1312 insertions(+), 225 deletions(-)
 create mode 100644 host/include/aarch64/host/load-extract-al16-al8.h
 create mode 100644 host/include/aarch64/host/store-insert-al16.h
 create mode 100644 host/include/generic/host/load-extract-al16-al8.h
 create mode 100644 host/include/generic/host/store-insert-al16.h
 create mode 100644 host/include/x86_64/host/atomic128-ldst.h
 create mode 100644 host/include/x86_64/host/load-extract-al16-al8.h
 create mode 100644 tests/decode/err_field10.decode
 create mode 100644 tests/decode/err_field7.decode
 create mode 100644 tests/decode/err_field8.decode
 create mode 100644 tests/decode/err_field9.decode
 create mode 100644 tests/decode/succ_named_field.decode
 delete mode 100755 tests/decode/check.sh
 create mode 100644 tests/decode/meson.build
The first move was incorrectly using TCG_TYPE_I32 while the second
move was correctly using TCG_TYPE_REG.  Using TCG_TYPE_I32 prevented
a 64-bit host from moving all 128 bits of the return value.

Fixes: ebebea53ef8 ("tcg: Support TCG_TYPE_I128 in tcg_out_{ld,st}_helper_{args,ret}")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 tcg/tcg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *ldst,
     mov[0].dst = ldst->datalo_reg;
     mov[0].src =
         tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, HOST_BIG_ENDIAN);
-    mov[0].dst_type = TCG_TYPE_I32;
-    mov[0].src_type = TCG_TYPE_I32;
+    mov[0].dst_type = TCG_TYPE_REG;
+    mov[0].src_type = TCG_TYPE_REG;
     mov[0].src_ext = TCG_TARGET_REG_BITS == 32 ? MO_32 : MO_64;

     mov[1].dst = ldst->datahi_reg;
--
2.34.1
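A rough standalone illustration of the failure mode, separate from the TCG code above (the enum and helper below are made up for the example, not TCG types): selecting a 32-bit move type truncates a 64-bit half of an I128 value that should have been copied at full register width.

    /* Illustration only; mirrors the dst_type/src_ext idea, not TCG itself. */
    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    typedef enum { MOVE_I32, MOVE_REG } move_type;

    static uint64_t apply_move(uint64_t src, move_type t)
    {
        /* A 32-bit typed move keeps only the low 32 bits of the register. */
        return t == MOVE_I32 ? (uint32_t)src : src;
    }

    int main(void)
    {
        uint64_t lo = 0x1122334455667788ull;  /* low half of a 128-bit return */
        printf("moved as I32: 0x%" PRIx64 "\n", apply_move(lo, MOVE_I32));
        printf("moved as REG: 0x%" PRIx64 "\n", apply_move(lo, MOVE_REG));
        return 0;
    }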
PAGE_WRITE is current writability, as modified by TB protection;
PAGE_WRITE_ORG is the original page writability.

Fixes: cdfac37be0d ("accel/tcg: Honor atomicity of loads")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/ldst_atomicity.c.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/ldst_atomicity.c.inc
+++ b/accel/tcg/ldst_atomicity.c.inc
@@ -XXX,XX +XXX,XX @@ static uint64_t load_atomic8_or_exit(CPUArchState *env, uintptr_t ra, void *pv)
      * another process, because the fallback start_exclusive solution
      * provides no protection across processes.
      */
-    if (!page_check_range(h2g(pv), 8, PAGE_WRITE)) {
+    if (!page_check_range(h2g(pv), 8, PAGE_WRITE_ORG)) {
         uint64_t *p = __builtin_assume_aligned(pv, 8);
         return *p;
     }
@@ -XXX,XX +XXX,XX @@ static Int128 load_atomic16_or_exit(CPUArchState *env, uintptr_t ra, void *pv)
      * another process, because the fallback start_exclusive solution
      * provides no protection across processes.
      */
-    if (!page_check_range(h2g(p), 16, PAGE_WRITE)) {
+    if (!page_check_range(h2g(p), 16, PAGE_WRITE_ORG)) {
         return *p;
     }
 #endif
--
2.34.1
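For readers not familiar with the two flags, a small sketch of the reasoning behind the fix (illustration only, not QEMU code; the flag values below are hypothetical): a plain, possibly-torn load is safe only when no writer can exist, and that judgement must use the original protection, because pages holding translated code have PAGE_WRITE temporarily cleared while still being writable in principle.

    #include <stdbool.h>

    /* Hypothetical flag values; QEMU defines the real ones elsewhere. */
    #define PAGE_WRITE      0x0001  /* current writability (TBs may clear it) */
    #define PAGE_WRITE_ORG  0x0002  /* writability originally requested       */

    static bool plain_load_is_safe(int page_flags)
    {
        /* Only a page that was never writable can have no racing writer. */
        return !(page_flags & PAGE_WRITE_ORG);
    }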
Older versions of clang have missing runtime functions for arithmetic
with -fsanitize=undefined (see 464e3671f9d5c), so we cannot use
__int128_t for implementing Int128.  But __int128_t is present,
data movement works, and it can be used for atomic128.

Probe for both CONFIG_INT128_TYPE and CONFIG_INT128, adjust
qemu/int128.h to define Int128Alias if CONFIG_INT128_TYPE,
and adjust the meson probe for atomics to use has_int128_type.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build           | 15 ++++++++++-----
 include/qemu/int128.h |  4 ++--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/meson.build b/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/meson.build
+++ b/meson.build
@@ -XXX,XX +XXX,XX @@ config_host_data.set('CONFIG_ATOMIC64', cc.links('''
     return 0;
   }'''))

-has_int128 = cc.links('''
+has_int128_type = cc.compiles('''
+  __int128_t a;
+  __uint128_t b;
+  int main(void) { b = a; }''')
+config_host_data.set('CONFIG_INT128_TYPE', has_int128_type)
+
+has_int128 = has_int128_type and cc.links('''
   __int128_t a;
   __uint128_t b;
   int main (void) {
@@ -XXX,XX +XXX,XX @@ has_int128 = cc.links('''
     a = a * a;
     return 0;
   }''')
-
 config_host_data.set('CONFIG_INT128', has_int128)

-if has_int128
+if has_int128_type
   # "do we have 128-bit atomics which are handled inline and specifically not
   # via libatomic". The reason we can't use libatomic is documented in the
   # comment starting "GCC is a house divided" in include/qemu/atomic128.h.
@@ -XXX,XX +XXX,XX @@ if has_int128
   # __alignof(unsigned __int128) for the host.
   atomic_test_128 = '''
     int main(int ac, char **av) {
-      unsigned __int128 *p = __builtin_assume_aligned(av[ac - 1], 16);
+      __uint128_t *p = __builtin_assume_aligned(av[ac - 1], 16);
       p[1] = __atomic_load_n(&p[0], __ATOMIC_RELAXED);
       __atomic_store_n(&p[2], p[3], __ATOMIC_RELAXED);
       __atomic_compare_exchange_n(&p[4], &p[5], p[6], 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
@@ -XXX,XX +XXX,XX @@ if has_int128
   config_host_data.set('CONFIG_CMPXCHG128', cc.links('''
     int main(void)
     {
-      unsigned __int128 x = 0, y = 0;
+      __uint128_t x = 0, y = 0;
       __sync_val_compare_and_swap_16(&x, y, x);
       return 0;
     }
diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -XXX,XX +XXX,XX @@ static inline void bswap128s(Int128 *s)
  * a possible structure and the native types.  Ease parameter passing
  * via use of the transparent union extension.
  */
-#ifdef CONFIG_INT128
+#ifdef CONFIG_INT128_TYPE
 typedef union {
     __uint128_t u;
     __int128_t i;
@@ -XXX,XX +XXX,XX @@ typedef union {
 } Int128Alias __attribute__((transparent_union));
 #else
 typedef Int128 Int128Alias;
-#endif /* CONFIG_INT128 */
+#endif /* CONFIG_INT128_TYPE */

 #endif /* INT128_H */
--
2.34.1
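The distinction the new probes draw can be shown in a few lines of plain C, independent of the meson snippets above: a toolchain may accept the __int128_t type and simple data movement (enough for Int128Alias and the atomics probes) while 128-bit arithmetic still drags in runtime helpers that old clang lacks under -fsanitize=undefined.

    /* Sketch of what CONFIG_INT128_TYPE vs CONFIG_INT128 roughly test. */
    __uint128_t int128_copy(__int128_t a)
    {
        __uint128_t b = a;   /* data movement only: always inline code */
        return b;
    }

    __int128_t int128_square(__int128_t a)
    {
        return a * a;        /* may call compiler-rt helpers, e.g. with UBSan */
    }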
With CPUINFO_ATOMIC_VMOVDQA, we can perform proper atomic
load/store without cmpxchg16b.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 host/include/x86_64/host/atomic128-ldst.h | 68 +++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 host/include/x86_64/host/atomic128-ldst.h

diff --git a/host/include/x86_64/host/atomic128-ldst.h b/host/include/x86_64/host/atomic128-ldst.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/host/include/x86_64/host/atomic128-ldst.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Load/store for 128-bit atomic operations, x86_64 version.
+ *
+ * Copyright (C) 2023 Linaro, Ltd.
+ *
+ * See docs/devel/atomics.rst for discussion about the guarantees each
+ * atomic primitive is meant to provide.
+ */
+
+#ifndef AARCH64_ATOMIC128_LDST_H
+#define AARCH64_ATOMIC128_LDST_H
+
+#ifdef CONFIG_INT128_TYPE
+#include "host/cpuinfo.h"
+#include "tcg/debug-assert.h"
+
+/*
+ * Through clang 16, with -mcx16, __atomic_load_n is incorrectly
+ * expanded to a read-write operation: lock cmpxchg16b.
+ */
+
+#define HAVE_ATOMIC128_RO likely(cpuinfo & CPUINFO_ATOMIC_VMOVDQA)
+#define HAVE_ATOMIC128_RW 1
+
+static inline Int128 atomic16_read_ro(const Int128 *ptr)
+{
+    Int128Alias r;
+
+    tcg_debug_assert(HAVE_ATOMIC128_RO);
+    asm("vmovdqa %1, %0" : "=x" (r.i) : "m" (*ptr));
+
+    return r.s;
+}
+
+static inline Int128 atomic16_read_rw(Int128 *ptr)
+{
+    __int128_t *ptr_align = __builtin_assume_aligned(ptr, 16);
+    Int128Alias r;
+
+    if (HAVE_ATOMIC128_RO) {
+        asm("vmovdqa %1, %0" : "=x" (r.i) : "m" (*ptr_align));
+    } else {
+        r.i = __sync_val_compare_and_swap_16(ptr_align, 0, 0);
+    }
+    return r.s;
+}
+
+static inline void atomic16_set(Int128 *ptr, Int128 val)
+{
+    __int128_t *ptr_align = __builtin_assume_aligned(ptr, 16);
+    Int128Alias new = { .s = val };
+
+    if (HAVE_ATOMIC128_RO) {
+        asm("vmovdqa %1, %0" : "=m"(*ptr_align) : "x" (new.i));
+    } else {
+        __int128_t old;
+        do {
+            old = *ptr_align;
+        } while (!__sync_bool_compare_and_swap_16(ptr_align, old, new.i));
+    }
+}
+#else
+/* Provide QEMU_ERROR stubs. */
+#include "host/include/generic/host/atomic128-ldst.h"
+#endif
+
+#endif /* AARCH64_ATOMIC128_LDST_H */
--
2.34.1
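A hedged usage sketch for the helpers introduced above (the caller and its name are illustrative only; it assumes the header above plus qemu/int128.h are in scope): each call is individually atomic at 16 bytes, the read taking a single VMOVDQA when the AVX guarantee was detected and a cmpxchg16b-based path otherwise.

    /* Illustration only: reading a snapshot and then publishing a new value.
     * The read and the store are each 16-byte atomic, but the pair is not. */
    static Int128 snapshot_then_store(Int128 *slot, Int128 newval)
    {
        Int128 old;

        if (HAVE_ATOMIC128_RO) {
            old = atomic16_read_ro(slot);   /* single vmovdqa load */
        } else {
            old = atomic16_read_rw(slot);   /* cmpxchg16b fallback */
        }
        atomic16_set(slot, newval);         /* 16-byte atomic store */
        return old;
    }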
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.h     |   4 +-
 tcg/i386/tcg-target.c.inc | 191 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 190 insertions(+), 5 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
9
index XXXXXXX..XXXXXXX 100644
10
--- a/tcg/i386/tcg-target.h
11
+++ b/tcg/i386/tcg-target.h
12
@@ -XXX,XX +XXX,XX @@ typedef enum {
13
#define have_avx1 (cpuinfo & CPUINFO_AVX1)
14
#define have_avx2 (cpuinfo & CPUINFO_AVX2)
15
#define have_movbe (cpuinfo & CPUINFO_MOVBE)
16
-#define have_atomic16 (cpuinfo & CPUINFO_ATOMIC_VMOVDQA)
17
18
/*
19
* There are interesting instructions in AVX512, so long as we have AVX512VL,
20
@@ -XXX,XX +XXX,XX @@ typedef enum {
21
#define TCG_TARGET_HAS_qemu_st8_i32 1
22
#endif
23
24
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
25
+#define TCG_TARGET_HAS_qemu_ldst_i128 \
26
+ (TCG_TARGET_REG_BITS == 64 && (cpuinfo & CPUINFO_ATOMIC_VMOVDQA))
27
28
/* We do not support older SSE systems, only beginning with AVX1. */
29
#define TCG_TARGET_HAS_v64 have_avx1
30
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
31
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/i386/tcg-target.c.inc
33
+++ b/tcg/i386/tcg-target.c.inc
34
@@ -XXX,XX +XXX,XX @@ static const int tcg_target_reg_alloc_order[] = {
35
#endif
36
};
37
38
+#define TCG_TMP_VEC TCG_REG_XMM5
39
+
40
static const int tcg_target_call_iarg_regs[] = {
41
#if TCG_TARGET_REG_BITS == 64
42
#if defined(_WIN64)
43
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
44
#define OPC_PCMPGTW (0x65 | P_EXT | P_DATA16)
45
#define OPC_PCMPGTD (0x66 | P_EXT | P_DATA16)
46
#define OPC_PCMPGTQ (0x37 | P_EXT38 | P_DATA16)
47
+#define OPC_PEXTRD (0x16 | P_EXT3A | P_DATA16)
48
+#define OPC_PINSRD (0x22 | P_EXT3A | P_DATA16)
49
#define OPC_PMAXSB (0x3c | P_EXT38 | P_DATA16)
50
#define OPC_PMAXSW (0xee | P_EXT | P_DATA16)
51
#define OPC_PMAXSD (0x3d | P_EXT38 | P_DATA16)
52
@@ -XXX,XX +XXX,XX @@ typedef struct {
53
54
bool tcg_target_has_memory_bswap(MemOp memop)
55
{
56
- return have_movbe;
57
+ TCGAtomAlign aa;
58
+
59
+ if (!have_movbe) {
60
+ return false;
61
+ }
62
+ if ((memop & MO_SIZE) < MO_128) {
63
+ return true;
64
+ }
65
+
66
+ /*
67
+ * Reject 16-byte memop with 16-byte atomicity, i.e. VMOVDQA,
68
+ * but do allow a pair of 64-bit operations, i.e. MOVBEQ.
69
+ */
70
+ aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
71
+ return aa.atom < MO_128;
72
}
73
74
/*
75
@@ -XXX,XX +XXX,XX @@ static const TCGLdstHelperParam ldst_helper_param = {
76
static const TCGLdstHelperParam ldst_helper_param = { };
77
#endif
78
79
+static void tcg_out_vec_to_pair(TCGContext *s, TCGType type,
80
+ TCGReg l, TCGReg h, TCGReg v)
81
+{
82
+ int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
83
+
84
+ /* vpmov{d,q} %v, %l */
85
+ tcg_out_vex_modrm(s, OPC_MOVD_EyVy + rexw, v, 0, l);
86
+ /* vpextr{d,q} $1, %v, %h */
87
+ tcg_out_vex_modrm(s, OPC_PEXTRD + rexw, v, 0, h);
88
+ tcg_out8(s, 1);
89
+}
90
+
91
+static void tcg_out_pair_to_vec(TCGContext *s, TCGType type,
92
+ TCGReg v, TCGReg l, TCGReg h)
93
+{
94
+ int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
95
+
96
+ /* vmov{d,q} %l, %v */
97
+ tcg_out_vex_modrm(s, OPC_MOVD_VyEy + rexw, v, 0, l);
98
+ /* vpinsr{d,q} $1, %h, %v, %v */
99
+ tcg_out_vex_modrm(s, OPC_PINSRD + rexw, v, v, h);
100
+ tcg_out8(s, 1);
101
+}
102
+
103
/*
104
* Generate code for the slow path for a load at the end of block
105
*/
106
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
107
{
108
TCGLabelQemuLdst *ldst = NULL;
109
MemOp opc = get_memop(oi);
110
+ MemOp s_bits = opc & MO_SIZE;
111
unsigned a_mask;
112
113
#ifdef CONFIG_SOFTMMU
114
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
115
*h = x86_guest_base;
116
#endif
117
h->base = addrlo;
118
- h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
119
+ h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, s_bits == MO_128);
120
a_mask = (1 << h->aa.align) - 1;
121
122
#ifdef CONFIG_SOFTMMU
123
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
124
TCGType tlbtype = TCG_TYPE_I32;
125
int trexw = 0, hrexw = 0, tlbrexw = 0;
126
unsigned mem_index = get_mmuidx(oi);
127
- unsigned s_bits = opc & MO_SIZE;
128
unsigned s_mask = (1 << s_bits) - 1;
129
int tlb_mask;
130
131
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
132
h.base, h.index, 0, h.ofs + 4);
133
}
134
break;
135
+
136
+ case MO_128:
137
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
138
+
139
+ /*
140
+ * Without 16-byte atomicity, use integer regs.
141
+ * That is where we want the data, and it allows bswaps.
142
+ */
143
+ if (h.aa.atom < MO_128) {
144
+ if (use_movbe) {
145
+ TCGReg t = datalo;
146
+ datalo = datahi;
147
+ datahi = t;
148
+ }
149
+ if (h.base == datalo || h.index == datalo) {
150
+ tcg_out_modrm_sib_offset(s, OPC_LEA + P_REXW, datahi,
151
+ h.base, h.index, 0, h.ofs);
152
+ tcg_out_modrm_offset(s, movop + P_REXW + h.seg,
153
+ datalo, datahi, 0);
154
+ tcg_out_modrm_offset(s, movop + P_REXW + h.seg,
155
+ datahi, datahi, 8);
156
+ } else {
157
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
158
+ h.base, h.index, 0, h.ofs);
159
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datahi,
160
+ h.base, h.index, 0, h.ofs + 8);
161
+ }
162
+ break;
163
+ }
164
+
165
+ /*
166
+ * With 16-byte atomicity, a vector load is required.
167
+ * If we already have 16-byte alignment, then VMOVDQA always works.
168
+ * Else if VMOVDQU has atomicity with dynamic alignment, use that.
169
+ * Else use we require a runtime test for alignment for VMOVDQA;
170
+ * use VMOVDQU on the unaligned nonatomic path for simplicity.
171
+ */
172
+ if (h.aa.align >= MO_128) {
173
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_VxWx + h.seg,
174
+ TCG_TMP_VEC, 0,
175
+ h.base, h.index, 0, h.ofs);
176
+ } else if (cpuinfo & CPUINFO_ATOMIC_VMOVDQU) {
177
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_VxWx + h.seg,
178
+ TCG_TMP_VEC, 0,
179
+ h.base, h.index, 0, h.ofs);
180
+ } else {
181
+ TCGLabel *l1 = gen_new_label();
182
+ TCGLabel *l2 = gen_new_label();
183
+
184
+ tcg_out_testi(s, h.base, 15);
185
+ tcg_out_jxx(s, JCC_JNE, l1, true);
186
+
187
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_VxWx + h.seg,
188
+ TCG_TMP_VEC, 0,
189
+ h.base, h.index, 0, h.ofs);
190
+ tcg_out_jxx(s, JCC_JMP, l2, true);
191
+
192
+ tcg_out_label(s, l1);
193
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_VxWx + h.seg,
194
+ TCG_TMP_VEC, 0,
195
+ h.base, h.index, 0, h.ofs);
196
+ tcg_out_label(s, l2);
197
+ }
198
+ tcg_out_vec_to_pair(s, TCG_TYPE_I64, datalo, datahi, TCG_TMP_VEC);
199
+ break;
200
+
201
default:
202
g_assert_not_reached();
203
}
204
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
205
h.base, h.index, 0, h.ofs + 4);
206
}
207
break;
208
+
209
+ case MO_128:
210
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
211
+
212
+ /*
213
+ * Without 16-byte atomicity, use integer regs.
214
+ * That is where we have the data, and it allows bswaps.
215
+ */
216
+ if (h.aa.atom < MO_128) {
217
+ if (use_movbe) {
218
+ TCGReg t = datalo;
219
+ datalo = datahi;
220
+ datahi = t;
221
+ }
222
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
223
+ h.base, h.index, 0, h.ofs);
224
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datahi,
225
+ h.base, h.index, 0, h.ofs + 8);
226
+ break;
227
+ }
228
+
229
+ /*
230
+ * With 16-byte atomicity, a vector store is required.
231
+ * If we already have 16-byte alignment, then VMOVDQA always works.
232
+ * Else if VMOVDQU has atomicity with dynamic alignment, use that.
233
+ * Else use we require a runtime test for alignment for VMOVDQA;
234
+ * use VMOVDQU on the unaligned nonatomic path for simplicity.
235
+ */
236
+ tcg_out_pair_to_vec(s, TCG_TYPE_I64, TCG_TMP_VEC, datalo, datahi);
237
+ if (h.aa.align >= MO_128) {
238
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_WxVx + h.seg,
239
+ TCG_TMP_VEC, 0,
240
+ h.base, h.index, 0, h.ofs);
241
+ } else if (cpuinfo & CPUINFO_ATOMIC_VMOVDQU) {
242
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_WxVx + h.seg,
243
+ TCG_TMP_VEC, 0,
244
+ h.base, h.index, 0, h.ofs);
245
+ } else {
246
+ TCGLabel *l1 = gen_new_label();
247
+ TCGLabel *l2 = gen_new_label();
248
+
249
+ tcg_out_testi(s, h.base, 15);
250
+ tcg_out_jxx(s, JCC_JNE, l1, true);
251
+
252
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_WxVx + h.seg,
253
+ TCG_TMP_VEC, 0,
254
+ h.base, h.index, 0, h.ofs);
255
+ tcg_out_jxx(s, JCC_JMP, l2, true);
256
+
257
+ tcg_out_label(s, l1);
258
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_WxVx + h.seg,
259
+ TCG_TMP_VEC, 0,
260
+ h.base, h.index, 0, h.ofs);
261
+ tcg_out_label(s, l2);
262
+ }
263
+ break;
264
+
265
default:
266
g_assert_not_reached();
267
}
268
@@ -XXX,XX +XXX,XX @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
269
tcg_out_qemu_ld(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
270
}
271
break;
272
+ case INDEX_op_qemu_ld_a32_i128:
273
+ case INDEX_op_qemu_ld_a64_i128:
274
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
275
+ tcg_out_qemu_ld(s, a0, a1, a2, -1, args[3], TCG_TYPE_I128);
276
+ break;
277
278
case INDEX_op_qemu_st_a64_i32:
279
case INDEX_op_qemu_st8_a64_i32:
280
@@ -XXX,XX +XXX,XX @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
281
tcg_out_qemu_st(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
282
}
283
break;
284
+ case INDEX_op_qemu_st_a32_i128:
285
+ case INDEX_op_qemu_st_a64_i128:
286
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
287
+ tcg_out_qemu_st(s, a0, a1, a2, -1, args[3], TCG_TYPE_I128);
288
+ break;
289
290
OP_32_64(mulu2):
291
tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_MUL, args[3]);
292
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
293
case INDEX_op_qemu_st_a64_i64:
294
return TCG_TARGET_REG_BITS == 64 ? C_O0_I2(L, L) : C_O0_I4(L, L, L, L);
295
296
+ case INDEX_op_qemu_ld_a32_i128:
297
+ case INDEX_op_qemu_ld_a64_i128:
298
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
299
+ return C_O2_I1(r, r, L);
300
+ case INDEX_op_qemu_st_a32_i128:
301
+ case INDEX_op_qemu_st_a64_i128:
302
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
303
+ return C_O0_I3(L, L, L);
304
+
305
case INDEX_op_brcond2_i32:
306
return C_O0_I4(r, r, ri, ri);
307
308
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
309
310
s->reserved_regs = 0;
311
tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
312
+ tcg_regset_set_reg(s->reserved_regs, TCG_TMP_VEC);
313
#ifdef _WIN64
314
/* These are call saved, and we don't save them, so don't use them. */
315
tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM6);
316
--
317
2.34.1
We will need to allocate a second general-purpose temporary.
Rename the existing temps to add a distinguishing number.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/aarch64/tcg-target.c.inc | 50 ++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/aarch64/tcg-target.c.inc
13
+++ b/tcg/aarch64/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
15
return TCG_REG_X0 + slot;
16
}
17
18
-#define TCG_REG_TMP TCG_REG_X30
19
-#define TCG_VEC_TMP TCG_REG_V31
20
+#define TCG_REG_TMP0 TCG_REG_X30
21
+#define TCG_VEC_TMP0 TCG_REG_V31
22
23
#ifndef CONFIG_SOFTMMU
24
#define TCG_REG_GUEST_BASE TCG_REG_X28
25
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
26
static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
27
TCGReg r, TCGReg base, intptr_t offset)
28
{
29
- TCGReg temp = TCG_REG_TMP;
30
+ TCGReg temp = TCG_REG_TMP0;
31
32
if (offset < -0xffffff || offset > 0xffffff) {
33
tcg_out_movi(s, TCG_TYPE_PTR, temp, offset);
34
@@ -XXX,XX +XXX,XX @@ static void tcg_out_ldst(TCGContext *s, AArch64Insn insn, TCGReg rd,
35
}
36
37
/* Worst-case scenario, move offset to temp register, use reg offset. */
38
- tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, offset);
39
- tcg_out_ldst_r(s, insn, rd, rn, TCG_TYPE_I64, TCG_REG_TMP);
40
+ tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP0, offset);
41
+ tcg_out_ldst_r(s, insn, rd, rn, TCG_TYPE_I64, TCG_REG_TMP0);
42
}
43
44
static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
45
@@ -XXX,XX +XXX,XX @@ static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *target)
46
if (offset == sextract64(offset, 0, 26)) {
47
tcg_out_insn(s, 3206, BL, offset);
48
} else {
49
- tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, (intptr_t)target);
50
- tcg_out_insn(s, 3207, BLR, TCG_REG_TMP);
51
+ tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP0, (intptr_t)target);
52
+ tcg_out_insn(s, 3207, BLR, TCG_REG_TMP0);
53
}
54
}
55
56
@@ -XXX,XX +XXX,XX @@ static void tcg_out_addsub2(TCGContext *s, TCGType ext, TCGReg rl,
57
AArch64Insn insn;
58
59
if (rl == ah || (!const_bh && rl == bh)) {
60
- rl = TCG_REG_TMP;
61
+ rl = TCG_REG_TMP0;
62
}
63
64
if (const_bl) {
65
@@ -XXX,XX +XXX,XX @@ static void tcg_out_addsub2(TCGContext *s, TCGType ext, TCGReg rl,
66
possibility of adding 0+const in the low part, and the
67
immediate add instructions encode XSP not XZR. Don't try
68
anything more elaborate here than loading another zero. */
69
- al = TCG_REG_TMP;
70
+ al = TCG_REG_TMP0;
71
tcg_out_movi(s, ext, al, 0);
72
}
73
tcg_out_insn_3401(s, insn, ext, rl, al, bl);
74
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
75
{
76
TCGReg a1 = a0;
77
if (is_ctz) {
78
- a1 = TCG_REG_TMP;
79
+ a1 = TCG_REG_TMP0;
80
tcg_out_insn(s, 3507, RBIT, ext, a1, a0);
81
}
82
if (const_b && b == (ext ? 64 : 32)) {
83
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
84
AArch64Insn sel = I3506_CSEL;
85
86
tcg_out_cmp(s, ext, a0, 0, 1);
87
- tcg_out_insn(s, 3507, CLZ, ext, TCG_REG_TMP, a1);
88
+ tcg_out_insn(s, 3507, CLZ, ext, TCG_REG_TMP0, a1);
89
90
if (const_b) {
91
if (b == -1) {
92
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
93
b = d;
94
}
95
}
96
- tcg_out_insn_3506(s, sel, ext, d, TCG_REG_TMP, b, TCG_COND_NE);
97
+ tcg_out_insn_3506(s, sel, ext, d, TCG_REG_TMP0, b, TCG_COND_NE);
98
}
99
}
100
101
@@ -XXX,XX +XXX,XX @@ bool tcg_target_has_memory_bswap(MemOp memop)
102
}
103
104
static const TCGLdstHelperParam ldst_helper_param = {
105
- .ntmp = 1, .tmp = { TCG_REG_TMP }
106
+ .ntmp = 1, .tmp = { TCG_REG_TMP0 }
107
};
108
109
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
110
@@ -XXX,XX +XXX,XX @@ static void tcg_out_goto_tb(TCGContext *s, int which)
111
112
set_jmp_insn_offset(s, which);
113
tcg_out32(s, I3206_B);
114
- tcg_out_insn(s, 3207, BR, TCG_REG_TMP);
115
+ tcg_out_insn(s, 3207, BR, TCG_REG_TMP0);
116
set_jmp_reset_offset(s, which);
117
}
118
119
@@ -XXX,XX +XXX,XX @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
120
ptrdiff_t i_offset = i_addr - jmp_rx;
121
122
/* Note that we asserted this in range in tcg_out_goto_tb. */
123
- insn = deposit32(I3305_LDR | TCG_REG_TMP, 5, 19, i_offset >> 2);
124
+ insn = deposit32(I3305_LDR | TCG_REG_TMP0, 5, 19, i_offset >> 2);
125
}
126
qatomic_set((uint32_t *)jmp_rw, insn);
127
flush_idcache_range(jmp_rx, jmp_rw, 4);
128
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
129
130
case INDEX_op_rem_i64:
131
case INDEX_op_rem_i32:
132
- tcg_out_insn(s, 3508, SDIV, ext, TCG_REG_TMP, a1, a2);
133
- tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP, a2, a1);
134
+ tcg_out_insn(s, 3508, SDIV, ext, TCG_REG_TMP0, a1, a2);
135
+ tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP0, a2, a1);
136
break;
137
case INDEX_op_remu_i64:
138
case INDEX_op_remu_i32:
139
- tcg_out_insn(s, 3508, UDIV, ext, TCG_REG_TMP, a1, a2);
140
- tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP, a2, a1);
141
+ tcg_out_insn(s, 3508, UDIV, ext, TCG_REG_TMP0, a1, a2);
142
+ tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP0, a2, a1);
143
break;
144
145
case INDEX_op_shl_i64:
146
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
147
if (c2) {
148
tcg_out_rotl(s, ext, a0, a1, a2);
149
} else {
150
- tcg_out_insn(s, 3502, SUB, 0, TCG_REG_TMP, TCG_REG_XZR, a2);
151
- tcg_out_insn(s, 3508, RORV, ext, a0, a1, TCG_REG_TMP);
152
+ tcg_out_insn(s, 3502, SUB, 0, TCG_REG_TMP0, TCG_REG_XZR, a2);
153
+ tcg_out_insn(s, 3508, RORV, ext, a0, a1, TCG_REG_TMP0);
154
}
155
break;
156
157
@@ -XXX,XX +XXX,XX @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
158
break;
159
}
160
}
161
- tcg_out_dupi_vec(s, type, MO_8, TCG_VEC_TMP, 0);
162
- a2 = TCG_VEC_TMP;
163
+ tcg_out_dupi_vec(s, type, MO_8, TCG_VEC_TMP0, 0);
164
+ a2 = TCG_VEC_TMP0;
165
}
166
if (is_scalar) {
167
insn = cmp_scalar_insn[cond];
168
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
169
s->reserved_regs = 0;
170
tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP);
171
tcg_regset_set_reg(s->reserved_regs, TCG_REG_FP);
172
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP);
173
tcg_regset_set_reg(s->reserved_regs, TCG_REG_X18); /* platform register */
174
- tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP);
175
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
176
+ tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP0);
177
}
178
179
/* Saving pairs: (X19, X20) .. (X27, X28), (X29(fp), X30(lr)). */
180
--
181
2.34.1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/aarch64/tcg-target.c.inc | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -XXX,XX +XXX,XX @@ static const int tcg_target_reg_alloc_order[] = {

     TCG_REG_X8, TCG_REG_X9, TCG_REG_X10, TCG_REG_X11,
     TCG_REG_X12, TCG_REG_X13, TCG_REG_X14, TCG_REG_X15,
-    TCG_REG_X16, TCG_REG_X17,

     TCG_REG_X0, TCG_REG_X1, TCG_REG_X2, TCG_REG_X3,
     TCG_REG_X4, TCG_REG_X5, TCG_REG_X6, TCG_REG_X7,

+    /* X16 reserved as temporary */
+    /* X17 reserved as temporary */
     /* X18 reserved by system */
     /* X19 reserved for AREG0 */
     /* X29 reserved as fp */
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
     return TCG_REG_X0 + slot;
 }

-#define TCG_REG_TMP0 TCG_REG_X30
+#define TCG_REG_TMP0 TCG_REG_X16
+#define TCG_REG_TMP1 TCG_REG_X17
+#define TCG_REG_TMP2 TCG_REG_X30
 #define TCG_VEC_TMP0 TCG_REG_V31

 #ifndef CONFIG_SOFTMMU
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_FP);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_X18); /* platform register */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
     tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP0);
 }

--
2.34.1
Adjust the softmmu tlb to use TMP[0-2], not any of the normally available
registers.  Since we handle overlap between inputs and helper arguments,
we can allow any allocatable reg.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/aarch64/tcg-target-con-set.h |  2 --
 tcg/aarch64/tcg-target-con-str.h |  1 -
 tcg/aarch64/tcg-target.c.inc     | 45 ++++++++++++++------------------
 3 files changed, 19 insertions(+), 29 deletions(-)

diff --git a/tcg/aarch64/tcg-target-con-set.h b/tcg/aarch64/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/aarch64/tcg-target-con-set.h
16
+++ b/tcg/aarch64/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
18
* tcg-target-con-str.h; the constraint combination is inclusive or.
19
*/
20
C_O0_I1(r)
21
-C_O0_I2(lZ, l)
22
C_O0_I2(r, rA)
23
C_O0_I2(rZ, r)
24
C_O0_I2(w, r)
25
-C_O1_I1(r, l)
26
C_O1_I1(r, r)
27
C_O1_I1(w, r)
28
C_O1_I1(w, w)
29
diff --git a/tcg/aarch64/tcg-target-con-str.h b/tcg/aarch64/tcg-target-con-str.h
30
index XXXXXXX..XXXXXXX 100644
31
--- a/tcg/aarch64/tcg-target-con-str.h
32
+++ b/tcg/aarch64/tcg-target-con-str.h
33
@@ -XXX,XX +XXX,XX @@
34
* REGS(letter, register_mask)
35
*/
36
REGS('r', ALL_GENERAL_REGS)
37
-REGS('l', ALL_QLDST_REGS)
38
REGS('w', ALL_VECTOR_REGS)
39
40
/*
41
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
42
index XXXXXXX..XXXXXXX 100644
43
--- a/tcg/aarch64/tcg-target.c.inc
44
+++ b/tcg/aarch64/tcg-target.c.inc
45
@@ -XXX,XX +XXX,XX @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
46
#define ALL_GENERAL_REGS 0xffffffffu
47
#define ALL_VECTOR_REGS 0xffffffff00000000ull
48
49
-#ifdef CONFIG_SOFTMMU
50
-#define ALL_QLDST_REGS \
51
- (ALL_GENERAL_REGS & ~((1 << TCG_REG_X0) | (1 << TCG_REG_X1) | \
52
- (1 << TCG_REG_X2) | (1 << TCG_REG_X3)))
53
-#else
54
-#define ALL_QLDST_REGS ALL_GENERAL_REGS
55
-#endif
56
-
57
/* Match a constant valid for addition (12-bit, optionally shifted). */
58
static inline bool is_aimm(uint64_t val)
59
{
60
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
61
unsigned s_bits = opc & MO_SIZE;
62
unsigned s_mask = (1u << s_bits) - 1;
63
unsigned mem_index = get_mmuidx(oi);
64
- TCGReg x3;
65
+ TCGReg addr_adj;
66
TCGType mask_type;
67
uint64_t compare_mask;
68
69
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
70
mask_type = (s->page_bits + s->tlb_dyn_max_bits > 32
71
? TCG_TYPE_I64 : TCG_TYPE_I32);
72
73
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
74
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {tmp0,tmp1}. */
75
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
76
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
77
QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
78
QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
79
- tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
80
+ tcg_out_insn(s, 3314, LDP, TCG_REG_TMP0, TCG_REG_TMP1, TCG_AREG0,
81
TLB_MASK_TABLE_OFS(mem_index), 1, 0);
82
83
/* Extract the TLB index from the address into X0. */
84
tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
85
- TCG_REG_X0, TCG_REG_X0, addr_reg,
86
+ TCG_REG_TMP0, TCG_REG_TMP0, addr_reg,
87
s->page_bits - CPU_TLB_ENTRY_BITS);
88
89
- /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
90
- tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
91
+ /* Add the tlb_table pointer, forming the CPUTLBEntry address in TMP1. */
92
+ tcg_out_insn(s, 3502, ADD, 1, TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_TMP0);
93
94
- /* Load the tlb comparator into X0, and the fast path addend into X1. */
95
- tcg_out_ld(s, addr_type, TCG_REG_X0, TCG_REG_X1,
96
+ /* Load the tlb comparator into TMP0, and the fast path addend into TMP1. */
97
+ tcg_out_ld(s, addr_type, TCG_REG_TMP0, TCG_REG_TMP1,
98
is_ld ? offsetof(CPUTLBEntry, addr_read)
99
: offsetof(CPUTLBEntry, addr_write));
100
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
101
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
102
offsetof(CPUTLBEntry, addend));
103
104
/*
105
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
106
* cross pages using the address of the last byte of the access.
107
*/
108
if (a_mask >= s_mask) {
109
- x3 = addr_reg;
110
+ addr_adj = addr_reg;
111
} else {
112
+ addr_adj = TCG_REG_TMP2;
113
tcg_out_insn(s, 3401, ADDI, addr_type,
114
- TCG_REG_X3, addr_reg, s_mask - a_mask);
115
- x3 = TCG_REG_X3;
116
+ addr_adj, addr_reg, s_mask - a_mask);
117
}
118
compare_mask = (uint64_t)s->page_mask | a_mask;
119
120
- /* Store the page mask part of the address into X3. */
121
- tcg_out_logicali(s, I3404_ANDI, addr_type, TCG_REG_X3, x3, compare_mask);
122
+ /* Store the page mask part of the address into TMP2. */
123
+ tcg_out_logicali(s, I3404_ANDI, addr_type, TCG_REG_TMP2,
124
+ addr_adj, compare_mask);
125
126
/* Perform the address comparison. */
127
- tcg_out_cmp(s, addr_type, TCG_REG_X0, TCG_REG_X3, 0);
128
+ tcg_out_cmp(s, addr_type, TCG_REG_TMP0, TCG_REG_TMP2, 0);
129
130
/* If not equal, we jump to the slow path. */
131
ldst->label_ptr[0] = s->code_ptr;
132
tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
133
134
- h->base = TCG_REG_X1,
135
+ h->base = TCG_REG_TMP1;
136
h->index = addr_reg;
137
h->index_ext = addr_type;
138
#else
139
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
140
case INDEX_op_qemu_ld_a64_i32:
141
case INDEX_op_qemu_ld_a32_i64:
142
case INDEX_op_qemu_ld_a64_i64:
143
- return C_O1_I1(r, l);
144
+ return C_O1_I1(r, r);
145
case INDEX_op_qemu_st_a32_i32:
146
case INDEX_op_qemu_st_a64_i32:
147
case INDEX_op_qemu_st_a32_i64:
148
case INDEX_op_qemu_st_a64_i64:
149
- return C_O0_I2(lZ, l);
150
+ return C_O0_I2(rZ, r);
151
152
case INDEX_op_deposit_i32:
153
case INDEX_op_deposit_i64:
154
--
155
2.34.1
With FEAT_LSE2, LDP/STP suffices.  Without FEAT_LSE2, use LDXP+STXP when
16-byte atomicity is required, and LDP/STP otherwise.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/aarch64/tcg-target-con-set.h |   2 +
 tcg/aarch64/tcg-target.h         |  11 ++-
 tcg/aarch64/tcg-target.c.inc     | 141 ++++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 3 deletions(-)

diff --git a/tcg/aarch64/tcg-target-con-set.h b/tcg/aarch64/tcg-target-con-set.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/aarch64/tcg-target-con-set.h
15
+++ b/tcg/aarch64/tcg-target-con-set.h
16
@@ -XXX,XX +XXX,XX @@ C_O0_I1(r)
17
C_O0_I2(r, rA)
18
C_O0_I2(rZ, r)
19
C_O0_I2(w, r)
20
+C_O0_I3(rZ, rZ, r)
21
C_O1_I1(r, r)
22
C_O1_I1(w, r)
23
C_O1_I1(w, w)
24
@@ -XXX,XX +XXX,XX @@ C_O1_I2(w, w, wO)
25
C_O1_I2(w, w, wZ)
26
C_O1_I3(w, w, w, w)
27
C_O1_I4(r, r, rA, rZ, rZ)
28
+C_O2_I1(r, r, r)
29
C_O2_I4(r, r, rZ, rZ, rA, rMZ)
30
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
31
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/aarch64/tcg-target.h
33
+++ b/tcg/aarch64/tcg-target.h
34
@@ -XXX,XX +XXX,XX @@ typedef enum {
35
#define TCG_TARGET_HAS_muluh_i64 1
36
#define TCG_TARGET_HAS_mulsh_i64 1
37
38
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
39
+/*
40
+ * Without FEAT_LSE2, we must use LDXP+STXP to implement atomic 128-bit load,
41
+ * which requires writable pages. We must defer to the helper for user-only,
42
+ * but in system mode all ram is writable for the host.
43
+ */
44
+#ifdef CONFIG_USER_ONLY
45
+#define TCG_TARGET_HAS_qemu_ldst_i128 have_lse2
46
+#else
47
+#define TCG_TARGET_HAS_qemu_ldst_i128 1
48
+#endif
49
50
#define TCG_TARGET_HAS_v64 1
51
#define TCG_TARGET_HAS_v128 1
52
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
53
index XXXXXXX..XXXXXXX 100644
54
--- a/tcg/aarch64/tcg-target.c.inc
55
+++ b/tcg/aarch64/tcg-target.c.inc
56
@@ -XXX,XX +XXX,XX @@ typedef enum {
57
I3305_LDR_v64 = 0x5c000000,
58
I3305_LDR_v128 = 0x9c000000,
59
60
+ /* Load/store exclusive. */
61
+ I3306_LDXP = 0xc8600000,
62
+ I3306_STXP = 0xc8200000,
63
+
64
/* Load/store register. Described here as 3.3.12, but the helper
65
that emits them can transform to 3.3.10 or 3.3.13. */
66
I3312_STRB = 0x38000000 | LDST_ST << 22 | MO_8 << 30,
67
@@ -XXX,XX +XXX,XX @@ typedef enum {
68
I3406_ADR = 0x10000000,
69
I3406_ADRP = 0x90000000,
70
71
+ /* Add/subtract extended register instructions. */
72
+ I3501_ADD = 0x0b200000,
73
+
74
/* Add/subtract shifted register instructions (without a shift). */
75
I3502_ADD = 0x0b000000,
76
I3502_ADDS = 0x2b000000,
77
@@ -XXX,XX +XXX,XX @@ static void tcg_out_insn_3305(TCGContext *s, AArch64Insn insn,
78
tcg_out32(s, insn | (imm19 & 0x7ffff) << 5 | rt);
79
}
80
81
+static void tcg_out_insn_3306(TCGContext *s, AArch64Insn insn, TCGReg rs,
82
+ TCGReg rt, TCGReg rt2, TCGReg rn)
83
+{
84
+ tcg_out32(s, insn | rs << 16 | rt2 << 10 | rn << 5 | rt);
85
+}
86
+
87
static void tcg_out_insn_3201(TCGContext *s, AArch64Insn insn, TCGType ext,
88
TCGReg rt, int imm19)
89
{
90
@@ -XXX,XX +XXX,XX @@ static void tcg_out_insn_3406(TCGContext *s, AArch64Insn insn,
91
tcg_out32(s, insn | (disp & 3) << 29 | (disp & 0x1ffffc) << (5 - 2) | rd);
92
}
93
94
+static inline void tcg_out_insn_3501(TCGContext *s, AArch64Insn insn,
95
+ TCGType sf, TCGReg rd, TCGReg rn,
96
+ TCGReg rm, int opt, int imm3)
97
+{
98
+ tcg_out32(s, insn | sf << 31 | rm << 16 | opt << 13 |
99
+ imm3 << 10 | rn << 5 | rd);
100
+}
101
+
102
/* This function is for both 3.5.2 (Add/Subtract shifted register), for
103
the rare occasion when we actually want to supply a shift amount. */
104
static inline void tcg_out_insn_3502S(TCGContext *s, AArch64Insn insn,
105
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
106
TCGType addr_type = s->addr_type;
107
TCGLabelQemuLdst *ldst = NULL;
108
MemOp opc = get_memop(oi);
109
+ MemOp s_bits = opc & MO_SIZE;
110
unsigned a_mask;
111
112
h->aa = atom_and_align_for_opc(s, opc,
113
have_lse2 ? MO_ATOM_WITHIN16
114
: MO_ATOM_IFALIGN,
115
- false);
116
+ s_bits == MO_128);
117
a_mask = (1 << h->aa.align) - 1;
118
119
#ifdef CONFIG_SOFTMMU
120
- unsigned s_bits = opc & MO_SIZE;
121
unsigned s_mask = (1u << s_bits) - 1;
122
unsigned mem_index = get_mmuidx(oi);
123
TCGReg addr_adj;
124
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
125
}
126
}
127
128
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
129
+ TCGReg addr_reg, MemOpIdx oi, bool is_ld)
130
+{
131
+ TCGLabelQemuLdst *ldst;
132
+ HostAddress h;
133
+ TCGReg base;
134
+ bool use_pair;
135
+
136
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, is_ld);
137
+
138
+ /* Compose the final address, as LDP/STP have no indexing. */
139
+ if (h.index == TCG_REG_XZR) {
140
+ base = h.base;
141
+ } else {
142
+ base = TCG_REG_TMP2;
143
+ if (h.index_ext == TCG_TYPE_I32) {
144
+ /* add base, base, index, uxtw */
145
+ tcg_out_insn(s, 3501, ADD, TCG_TYPE_I64, base,
146
+ h.base, h.index, MO_32, 0);
147
+ } else {
148
+ /* add base, base, index */
149
+ tcg_out_insn(s, 3502, ADD, 1, base, h.base, h.index);
150
+ }
151
+ }
152
+
153
+ use_pair = h.aa.atom < MO_128 || have_lse2;
154
+
155
+ if (!use_pair) {
156
+ tcg_insn_unit *branch = NULL;
157
+ TCGReg ll, lh, sl, sh;
158
+
159
+ /*
160
+ * If we have already checked for 16-byte alignment, that's all
161
+ * we need. Otherwise we have determined that misaligned atomicity
162
+ * may be handled with two 8-byte loads.
163
+ */
164
+ if (h.aa.align < MO_128) {
165
+ /*
166
+ * TODO: align should be MO_64, so we only need test bit 3,
167
+ * which means we could use TBNZ instead of ANDS+B_C.
168
+ */
169
+ tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, 15);
170
+ branch = s->code_ptr;
171
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
172
+ use_pair = true;
173
+ }
174
+
175
+ if (is_ld) {
176
+ /*
177
+ * 16-byte atomicity without LSE2 requires LDXP+STXP loop:
178
+ * ldxp lo, hi, [base]
179
+ * stxp t0, lo, hi, [base]
180
+ * cbnz t0, .-8
181
+ * Require no overlap between data{lo,hi} and base.
182
+ */
183
+ if (datalo == base || datahi == base) {
184
+ tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_TMP2, base);
185
+ base = TCG_REG_TMP2;
186
+ }
187
+ ll = sl = datalo;
188
+ lh = sh = datahi;
189
+ } else {
190
+ /*
191
+ * 16-byte atomicity without LSE2 requires LDXP+STXP loop:
192
+ * 1: ldxp t0, t1, [base]
193
+ * stxp t0, lo, hi, [base]
194
+ * cbnz t0, 1b
195
+ */
196
+ tcg_debug_assert(base != TCG_REG_TMP0 && base != TCG_REG_TMP1);
197
+ ll = TCG_REG_TMP0;
198
+ lh = TCG_REG_TMP1;
199
+ sl = datalo;
200
+ sh = datahi;
201
+ }
202
+
203
+ tcg_out_insn(s, 3306, LDXP, TCG_REG_XZR, ll, lh, base);
204
+ tcg_out_insn(s, 3306, STXP, TCG_REG_TMP0, sl, sh, base);
205
+ tcg_out_insn(s, 3201, CBNZ, 0, TCG_REG_TMP0, -2);
206
+
207
+ if (use_pair) {
208
+ /* "b .+8", branching across the one insn of use_pair. */
209
+ tcg_out_insn(s, 3206, B, 2);
210
+ reloc_pc19(branch, tcg_splitwx_to_rx(s->code_ptr));
211
+ }
212
+ }
213
+
214
+ if (use_pair) {
215
+ if (is_ld) {
216
+ tcg_out_insn(s, 3314, LDP, datalo, datahi, base, 0, 1, 0);
217
+ } else {
218
+ tcg_out_insn(s, 3314, STP, datalo, datahi, base, 0, 1, 0);
219
+ }
220
+ }
221
+
222
+ if (ldst) {
223
+ ldst->type = TCG_TYPE_I128;
224
+ ldst->datalo_reg = datalo;
225
+ ldst->datahi_reg = datahi;
226
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
227
+ }
228
+}
229
+
230
static const tcg_insn_unit *tb_ret_addr;
231
232
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
233
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
234
case INDEX_op_qemu_st_a64_i64:
235
tcg_out_qemu_st(s, REG0(0), a1, a2, ext);
236
break;
237
+ case INDEX_op_qemu_ld_a32_i128:
238
+ case INDEX_op_qemu_ld_a64_i128:
239
+ tcg_out_qemu_ldst_i128(s, a0, a1, a2, args[3], true);
240
+ break;
241
+ case INDEX_op_qemu_st_a32_i128:
242
+ case INDEX_op_qemu_st_a64_i128:
243
+ tcg_out_qemu_ldst_i128(s, REG0(0), REG0(1), a2, args[3], false);
244
+ break;
245
246
case INDEX_op_bswap64_i64:
247
tcg_out_rev(s, TCG_TYPE_I64, MO_64, a0, a1);
248
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
249
case INDEX_op_qemu_ld_a32_i64:
250
case INDEX_op_qemu_ld_a64_i64:
251
return C_O1_I1(r, r);
252
+ case INDEX_op_qemu_ld_a32_i128:
253
+ case INDEX_op_qemu_ld_a64_i128:
254
+ return C_O2_I1(r, r, r);
255
case INDEX_op_qemu_st_a32_i32:
256
case INDEX_op_qemu_st_a64_i32:
257
case INDEX_op_qemu_st_a32_i64:
258
case INDEX_op_qemu_st_a64_i64:
259
return C_O0_I2(rZ, r);
260
+ case INDEX_op_qemu_st_a32_i128:
261
+ case INDEX_op_qemu_st_a64_i128:
262
+ return C_O0_I3(rZ, rZ, r);
263
264
case INDEX_op_deposit_i32:
265
case INDEX_op_deposit_i64:
266
--
267
2.34.1
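For reference, the LDXP/STXP fallback emitted by this patch corresponds to a hand-written loop along the following lines (a sketch only, not code from the patch; the constraint choices and the little-endian recombination are assumptions):

    #include <stdint.h>

    /* 16-byte atomic load on aarch64 without FEAT_LSE2: the store-exclusive
     * of the freshly loaded pair only succeeds if nothing else touched the
     * location, which makes the paired load atomic. */
    static inline __uint128_t load16_exclusive(__uint128_t *ptr)
    {
        uint64_t lo, hi;
        uint32_t fail;

        asm volatile("0: ldxp %[lo], %[hi], %[mem]\n\t"
                     "stxp %w[fail], %[lo], %[hi], %[mem]\n\t"
                     "cbnz %w[fail], 0b"
                     : [mem] "+Q"(*ptr), [lo] "=&r"(lo),
                       [hi] "=&r"(hi), [fail] "=&r"(fail));

        /* lo came from the lower address; little-endian host assumed. */
        return ((__uint128_t)hi << 64) | lo;
    }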
Use LQ/STQ, with ISA v2.07, when 16-byte atomicity is required.
Note that these instructions do not require 16-byte alignment.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/ppc/tcg-target-con-set.h |   2 +
 tcg/ppc/tcg-target-con-str.h |   1 +
 tcg/ppc/tcg-target.h         |   3 +-
 tcg/ppc/tcg-target.c.inc     | 108 +++++++++++++++++++++++++++++++----
 4 files changed, 101 insertions(+), 13 deletions(-)

diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/ppc/tcg-target-con-set.h
16
+++ b/tcg/ppc/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@ C_O0_I2(r, r)
18
C_O0_I2(r, ri)
19
C_O0_I2(v, r)
20
C_O0_I3(r, r, r)
21
+C_O0_I3(o, m, r)
22
C_O0_I4(r, r, ri, ri)
23
C_O0_I4(r, r, r, r)
24
C_O1_I1(r, r)
25
@@ -XXX,XX +XXX,XX @@ C_O1_I3(v, v, v, v)
26
C_O1_I4(r, r, ri, rZ, rZ)
27
C_O1_I4(r, r, r, ri, ri)
28
C_O2_I1(r, r, r)
29
+C_O2_I1(o, m, r)
30
C_O2_I2(r, r, r, r)
31
C_O2_I4(r, r, rI, rZM, r, r)
32
C_O2_I4(r, r, r, r, rI, rZM)
33
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
34
index XXXXXXX..XXXXXXX 100644
35
--- a/tcg/ppc/tcg-target-con-str.h
36
+++ b/tcg/ppc/tcg-target-con-str.h
37
@@ -XXX,XX +XXX,XX @@
38
* REGS(letter, register_mask)
39
*/
40
REGS('r', ALL_GENERAL_REGS)
41
+REGS('o', ALL_GENERAL_REGS & 0xAAAAAAAAu) /* odd registers */
42
REGS('v', ALL_VECTOR_REGS)
43
44
/*
45
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
46
index XXXXXXX..XXXXXXX 100644
47
--- a/tcg/ppc/tcg-target.h
48
+++ b/tcg/ppc/tcg-target.h
49
@@ -XXX,XX +XXX,XX @@ extern bool have_vsx;
50
#define TCG_TARGET_HAS_mulsh_i64 1
51
#endif
52
53
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
54
+#define TCG_TARGET_HAS_qemu_ldst_i128 \
55
+ (TCG_TARGET_REG_BITS == 64 && have_isa_2_07)
56
57
/*
58
* While technically Altivec could support V64, it has no 64-bit store
59
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
60
index XXXXXXX..XXXXXXX 100644
61
--- a/tcg/ppc/tcg-target.c.inc
62
+++ b/tcg/ppc/tcg-target.c.inc
63
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
64
65
#define B OPCD( 18)
66
#define BC OPCD( 16)
67
+
68
#define LBZ OPCD( 34)
69
#define LHZ OPCD( 40)
70
#define LHA OPCD( 42)
71
#define LWZ OPCD( 32)
72
#define LWZUX XO31( 55)
73
-#define STB OPCD( 38)
74
-#define STH OPCD( 44)
75
-#define STW OPCD( 36)
76
-
77
-#define STD XO62( 0)
78
-#define STDU XO62( 1)
79
-#define STDX XO31(149)
80
-
81
#define LD XO58( 0)
82
#define LDX XO31( 21)
83
#define LDU XO58( 1)
84
#define LDUX XO31( 53)
85
#define LWA XO58( 2)
86
#define LWAX XO31(341)
87
+#define LQ OPCD( 56)
88
+
89
+#define STB OPCD( 38)
90
+#define STH OPCD( 44)
91
+#define STW OPCD( 36)
92
+#define STD XO62( 0)
93
+#define STDU XO62( 1)
94
+#define STDX XO31(149)
95
+#define STQ XO62( 2)
96
97
#define ADDIC OPCD( 12)
98
#define ADDI OPCD( 14)
99
@@ -XXX,XX +XXX,XX @@ typedef struct {
100
101
bool tcg_target_has_memory_bswap(MemOp memop)
102
{
103
- return true;
104
+ TCGAtomAlign aa;
105
+
106
+ if ((memop & MO_SIZE) <= MO_64) {
107
+ return true;
108
+ }
109
+
110
+ /*
111
+ * Reject 16-byte memop with 16-byte atomicity,
112
+ * but do allow a pair of 64-bit operations.
113
+ */
114
+ aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
115
+ return aa.atom <= MO_64;
116
}
117
118
/*
119
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
120
{
121
TCGLabelQemuLdst *ldst = NULL;
122
MemOp opc = get_memop(oi);
123
- MemOp a_bits;
124
+ MemOp a_bits, s_bits;
125
126
/*
127
* Book II, Section 1.4, Single-Copy Atomicity, specifies:
128
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
129
* As of 3.0, "the non-atomic access is performed as described in
130
* the corresponding list", which matches MO_ATOM_SUBALIGN.
131
*/
132
+ s_bits = opc & MO_SIZE;
133
h->aa = atom_and_align_for_opc(s, opc,
134
have_isa_3_00 ? MO_ATOM_SUBALIGN
135
: MO_ATOM_IFALIGN,
136
- false);
137
+ s_bits == MO_128);
138
a_bits = h->aa.align;
139
140
#ifdef CONFIG_SOFTMMU
141
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
142
int fast_off = TLB_MASK_TABLE_OFS(mem_index);
143
int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
144
int table_off = fast_off + offsetof(CPUTLBDescFast, table);
145
- unsigned s_bits = opc & MO_SIZE;
146
147
ldst = new_ldst_label(s);
148
ldst->is_ld = is_ld;
149
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
150
}
151
}
152
153
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
154
+ TCGReg addr_reg, MemOpIdx oi, bool is_ld)
155
+{
156
+ TCGLabelQemuLdst *ldst;
157
+ HostAddress h;
158
+ bool need_bswap;
159
+ uint32_t insn;
160
+ TCGReg index;
161
+
162
+ ldst = prepare_host_addr(s, &h, addr_reg, -1, oi, is_ld);
163
+
164
+ /* Compose the final address, as LQ/STQ have no indexing. */
165
+ index = h.index;
166
+ if (h.base != 0) {
167
+ index = TCG_REG_TMP1;
168
+ tcg_out32(s, ADD | TAB(index, h.base, h.index));
169
+ }
170
+ need_bswap = get_memop(oi) & MO_BSWAP;
171
+
172
+ if (h.aa.atom == MO_128) {
173
+ tcg_debug_assert(!need_bswap);
174
+ tcg_debug_assert(datalo & 1);
175
+ tcg_debug_assert(datahi == datalo - 1);
176
+ insn = is_ld ? LQ : STQ;
177
+ tcg_out32(s, insn | TAI(datahi, index, 0));
178
+ } else {
179
+ TCGReg d1, d2;
180
+
181
+ if (HOST_BIG_ENDIAN ^ need_bswap) {
182
+ d1 = datahi, d2 = datalo;
183
+ } else {
184
+ d1 = datalo, d2 = datahi;
185
+ }
186
+
187
+ if (need_bswap) {
188
+ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R0, 8);
189
+ insn = is_ld ? LDBRX : STDBRX;
190
+ tcg_out32(s, insn | TAB(d1, 0, index));
191
+ tcg_out32(s, insn | TAB(d2, index, TCG_REG_R0));
192
+ } else {
193
+ insn = is_ld ? LD : STD;
194
+ tcg_out32(s, insn | TAI(d1, index, 0));
195
+ tcg_out32(s, insn | TAI(d2, index, 8));
196
+ }
197
+ }
198
+
199
+ if (ldst) {
200
+ ldst->type = TCG_TYPE_I128;
201
+ ldst->datalo_reg = datalo;
202
+ ldst->datahi_reg = datahi;
203
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
204
+ }
205
+}
206
+
207
static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
208
{
209
int i;
210
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
211
args[4], TCG_TYPE_I64);
212
}
213
break;
214
+ case INDEX_op_qemu_ld_a32_i128:
215
+ case INDEX_op_qemu_ld_a64_i128:
216
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
217
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], true);
218
+ break;
219
220
case INDEX_op_qemu_st_a64_i32:
221
if (TCG_TARGET_REG_BITS == 32) {
222
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
223
args[4], TCG_TYPE_I64);
224
}
225
break;
226
+ case INDEX_op_qemu_st_a32_i128:
227
+ case INDEX_op_qemu_st_a64_i128:
228
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
229
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], false);
230
+ break;
231
232
case INDEX_op_setcond_i32:
233
tcg_out_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2],
234
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
235
case INDEX_op_qemu_st_a64_i64:
236
return TCG_TARGET_REG_BITS == 64 ? C_O0_I2(r, r) : C_O0_I4(r, r, r, r);
237
238
+ case INDEX_op_qemu_ld_a32_i128:
239
+ case INDEX_op_qemu_ld_a64_i128:
240
+ return C_O2_I1(o, m, r);
241
+ case INDEX_op_qemu_st_a32_i128:
242
+ case INDEX_op_qemu_st_a64_i128:
243
+ return C_O0_I3(o, m, r);
244
+
245
case INDEX_op_add_vec:
246
case INDEX_op_sub_vec:
247
case INDEX_op_mul_vec:
248
--
249
2.34.1
1
From: Ilya Leoshkevich <iii@linux.ibm.com>
1
Use LPQ/STPQ when 16-byte atomicity is required.
2
Note that these instructions require 16-byte alignment.
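
For illustration, a minimal C sketch (not part of the patch; the names here are invented for the example) of the decision the emitted code has to make: LPQ/STPQ give single-copy 16-byte atomicity but only for 16-byte aligned addresses, so with lesser alignment the backend falls back to a pair of 8-byte accesses, which is all that is required in that case.

    #include <stdint.h>

    static void ld16_sketch(uint64_t *hi, uint64_t *lo, void *haddr)
    {
        if (((uintptr_t)haddr & 15) == 0) {
            /* Aligned: one single-copy-atomic 16-byte load (LPQ). */
            __uint128_t v;
            __atomic_load((__uint128_t *)haddr, &v, __ATOMIC_RELAXED);
            *hi = (uint64_t)(v >> 64);
            *lo = (uint64_t)v;
        } else {
            /* Only 8-byte aligned: two plain 8-byte loads (LG) suffice. */
            uint64_t *p = haddr;
            *hi = p[0];   /* big-endian host: first doubleword is the high half */
            *lo = p[1];
        }
    }
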
2
3
3
Currently, dying to one of the core-dump signals deadlocks, because
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
dump_core_and_abort() calls start_exclusive() two times: first via
5
stop_all_tasks(), and then via preexit_cleanup() ->
6
qemu_plugin_user_exit().
7
8
There are a number of ways to solve this: resume after dumping core;
9
check cpu_in_exclusive_context() in qemu_plugin_user_exit(); or make
10
{start,end}_exclusive() recursive. Pick the last option, since it's
11
the most straightforward one.
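
As an illustration of that option, a stripped-down C sketch (this is not the cpus-common.c code, which follows in the diff; the locking and pending-CPU bookkeeping are omitted) of turning the boolean flag into a nesting counter:

    typedef struct {
        int exclusive_context_count;   /* replaces bool in_exclusive_context */
    } SketchCPUState;

    static void start_exclusive_sketch(SketchCPUState *cpu)
    {
        if (cpu->exclusive_context_count++ > 0) {
            return;                    /* already exclusive: just nest */
        }
        /* first entry: stop all other vCPUs here */
    }

    static void end_exclusive_sketch(SketchCPUState *cpu)
    {
        if (--cpu->exclusive_context_count > 0) {
            return;                    /* still inside a nested section */
        }
        /* outermost exit: let the other vCPUs run again */
    }
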
12
13
Fixes: da91c1920242 ("linux-user: Clean up when exiting due to a signal")
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
16
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
17
Message-Id: <20230214140829.45392-3-iii@linux.ibm.com>
18
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
19
---
6
---
20
include/hw/core/cpu.h | 4 ++--
7
tcg/s390x/tcg-target-con-set.h | 2 +
21
cpus-common.c | 12 ++++++++++--
8
tcg/s390x/tcg-target.h | 2 +-
22
2 files changed, 12 insertions(+), 4 deletions(-)
9
tcg/s390x/tcg-target.c.inc | 107 ++++++++++++++++++++++++++++++++-
10
3 files changed, 107 insertions(+), 4 deletions(-)
23
11
24
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
12
diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
25
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
26
--- a/include/hw/core/cpu.h
14
--- a/tcg/s390x/tcg-target-con-set.h
27
+++ b/include/hw/core/cpu.h
15
+++ b/tcg/s390x/tcg-target-con-set.h
28
@@ -XXX,XX +XXX,XX @@ struct CPUState {
16
@@ -XXX,XX +XXX,XX @@ C_O0_I2(r, r)
29
bool unplug;
17
C_O0_I2(r, ri)
30
bool crash_occurred;
18
C_O0_I2(r, rA)
31
bool exit_request;
19
C_O0_I2(v, r)
32
- bool in_exclusive_context;
20
+C_O0_I3(o, m, r)
33
+ int exclusive_context_count;
21
C_O1_I1(r, r)
34
uint32_t cflags_next_tb;
22
C_O1_I1(v, r)
35
/* updates protected by BQL */
23
C_O1_I1(v, v)
36
uint32_t interrupt_request;
24
@@ -XXX,XX +XXX,XX @@ C_O1_I2(v, v, v)
37
@@ -XXX,XX +XXX,XX @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data
25
C_O1_I3(v, v, v, v)
38
*/
26
C_O1_I4(r, r, ri, rI, r)
39
static inline bool cpu_in_exclusive_context(const CPUState *cpu)
27
C_O1_I4(r, r, rA, rI, r)
28
+C_O2_I1(o, m, r)
29
C_O2_I2(o, m, 0, r)
30
C_O2_I2(o, m, r, r)
31
C_O2_I3(o, m, 0, 1, r)
32
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/tcg/s390x/tcg-target.h
35
+++ b/tcg/s390x/tcg-target.h
36
@@ -XXX,XX +XXX,XX @@ extern uint64_t s390_facilities[3];
37
#define TCG_TARGET_HAS_muluh_i64 0
38
#define TCG_TARGET_HAS_mulsh_i64 0
39
40
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
41
+#define TCG_TARGET_HAS_qemu_ldst_i128 1
42
43
#define TCG_TARGET_HAS_v64 HAVE_FACILITY(VECTOR)
44
#define TCG_TARGET_HAS_v128 HAVE_FACILITY(VECTOR)
45
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
46
index XXXXXXX..XXXXXXX 100644
47
--- a/tcg/s390x/tcg-target.c.inc
48
+++ b/tcg/s390x/tcg-target.c.inc
49
@@ -XXX,XX +XXX,XX @@ typedef enum S390Opcode {
50
RXY_LLGF = 0xe316,
51
RXY_LLGH = 0xe391,
52
RXY_LMG = 0xeb04,
53
+ RXY_LPQ = 0xe38f,
54
RXY_LRV = 0xe31e,
55
RXY_LRVG = 0xe30f,
56
RXY_LRVH = 0xe31f,
57
@@ -XXX,XX +XXX,XX @@ typedef enum S390Opcode {
58
RXY_STG = 0xe324,
59
RXY_STHY = 0xe370,
60
RXY_STMG = 0xeb24,
61
+ RXY_STPQ = 0xe38e,
62
RXY_STRV = 0xe33e,
63
RXY_STRVG = 0xe32f,
64
RXY_STRVH = 0xe33f,
65
@@ -XXX,XX +XXX,XX @@ typedef struct {
66
67
bool tcg_target_has_memory_bswap(MemOp memop)
40
{
68
{
41
- return cpu->in_exclusive_context;
69
- return true;
42
+ return cpu->exclusive_context_count;
70
+ TCGAtomAlign aa;
71
+
72
+ if ((memop & MO_SIZE) <= MO_64) {
73
+ return true;
74
+ }
75
+
76
+ /*
77
+ * Reject 16-byte memop with 16-byte atomicity,
78
+ * but do allow a pair of 64-bit operations.
79
+ */
80
+ aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
81
+ return aa.atom <= MO_64;
43
}
82
}
44
83
45
/**
84
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg data,
46
diff --git a/cpus-common.c b/cpus-common.c
85
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
47
index XXXXXXX..XXXXXXX 100644
86
{
48
--- a/cpus-common.c
87
TCGLabelQemuLdst *ldst = NULL;
49
+++ b/cpus-common.c
88
MemOp opc = get_memop(oi);
50
@@ -XXX,XX +XXX,XX @@ void start_exclusive(void)
89
+ MemOp s_bits = opc & MO_SIZE;
51
CPUState *other_cpu;
90
unsigned a_mask;
52
int running_cpus;
91
53
92
- h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
54
+ if (current_cpu->exclusive_context_count) {
93
+ h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, s_bits == MO_128);
55
+ current_cpu->exclusive_context_count++;
94
a_mask = (1 << h->aa.align) - 1;
56
+ return;
95
57
+ }
96
#ifdef CONFIG_SOFTMMU
58
+
97
- unsigned s_bits = opc & MO_SIZE;
59
qemu_mutex_lock(&qemu_cpu_list_lock);
98
unsigned s_mask = (1 << s_bits) - 1;
60
exclusive_idle();
99
int mem_index = get_mmuidx(oi);
61
100
int fast_off = TLB_MASK_TABLE_OFS(mem_index);
62
@@ -XXX,XX +XXX,XX @@ void start_exclusive(void)
101
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
63
*/
102
}
64
qemu_mutex_unlock(&qemu_cpu_list_lock);
65
66
- current_cpu->in_exclusive_context = true;
67
+ current_cpu->exclusive_context_count = 1;
68
}
103
}
69
104
70
/* Finish an exclusive operation. */
105
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
71
void end_exclusive(void)
106
+ TCGReg addr_reg, MemOpIdx oi, bool is_ld)
107
+{
108
+ TCGLabel *l1 = NULL, *l2 = NULL;
109
+ TCGLabelQemuLdst *ldst;
110
+ HostAddress h;
111
+ bool need_bswap;
112
+ bool use_pair;
113
+ S390Opcode insn;
114
+
115
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, is_ld);
116
+
117
+ use_pair = h.aa.atom < MO_128;
118
+ need_bswap = get_memop(oi) & MO_BSWAP;
119
+
120
+ if (!use_pair) {
121
+ /*
122
+ * Atomicity requires we use LPQ. If we've already checked for
123
+ * 16-byte alignment, that's all we need. If we arrive with
124
+ * lesser alignment, we have determined that less than 16-byte
125
+ * alignment can be satisfied with two 8-byte loads.
126
+ */
127
+ if (h.aa.align < MO_128) {
128
+ use_pair = true;
129
+ l1 = gen_new_label();
130
+ l2 = gen_new_label();
131
+
132
+ tcg_out_insn(s, RI, TMLL, addr_reg, 15);
133
+ tgen_branch(s, 7, l1); /* CC in {1,2,3} */
134
+ }
135
+
136
+ tcg_debug_assert(!need_bswap);
137
+ tcg_debug_assert(datalo & 1);
138
+ tcg_debug_assert(datahi == datalo - 1);
139
+ insn = is_ld ? RXY_LPQ : RXY_STPQ;
140
+ tcg_out_insn_RXY(s, insn, datahi, h.base, h.index, h.disp);
141
+
142
+ if (use_pair) {
143
+ tgen_branch(s, S390_CC_ALWAYS, l2);
144
+ tcg_out_label(s, l1);
145
+ }
146
+ }
147
+ if (use_pair) {
148
+ TCGReg d1, d2;
149
+
150
+ if (need_bswap) {
151
+ d1 = datalo, d2 = datahi;
152
+ insn = is_ld ? RXY_LRVG : RXY_STRVG;
153
+ } else {
154
+ d1 = datahi, d2 = datalo;
155
+ insn = is_ld ? RXY_LG : RXY_STG;
156
+ }
157
+
158
+ if (h.base == d1 || h.index == d1) {
159
+ tcg_out_insn(s, RXY, LAY, TCG_TMP0, h.base, h.index, h.disp);
160
+ h.base = TCG_TMP0;
161
+ h.index = TCG_REG_NONE;
162
+ h.disp = 0;
163
+ }
164
+ tcg_out_insn_RXY(s, insn, d1, h.base, h.index, h.disp);
165
+ tcg_out_insn_RXY(s, insn, d2, h.base, h.index, h.disp + 8);
166
+ }
167
+ if (l2) {
168
+ tcg_out_label(s, l2);
169
+ }
170
+
171
+ if (ldst) {
172
+ ldst->type = TCG_TYPE_I128;
173
+ ldst->datalo_reg = datalo;
174
+ ldst->datahi_reg = datahi;
175
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
176
+ }
177
+}
178
+
179
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
72
{
180
{
73
- current_cpu->in_exclusive_context = false;
181
/* Reuse the zeroing that exists for goto_ptr. */
74
+ current_cpu->exclusive_context_count--;
182
@@ -XXX,XX +XXX,XX @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
75
+ if (current_cpu->exclusive_context_count) {
183
case INDEX_op_qemu_st_a64_i64:
76
+ return;
184
tcg_out_qemu_st(s, args[0], args[1], args[2], TCG_TYPE_I64);
77
+ }
185
break;
78
186
+ case INDEX_op_qemu_ld_a32_i128:
79
qemu_mutex_lock(&qemu_cpu_list_lock);
187
+ case INDEX_op_qemu_ld_a64_i128:
80
qatomic_set(&pending_cpus, 0);
188
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], true);
189
+ break;
190
+ case INDEX_op_qemu_st_a32_i128:
191
+ case INDEX_op_qemu_st_a64_i128:
192
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], false);
193
+ break;
194
195
case INDEX_op_ld16s_i64:
196
tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
197
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
198
case INDEX_op_qemu_st_a32_i32:
199
case INDEX_op_qemu_st_a64_i32:
200
return C_O0_I2(r, r);
201
+ case INDEX_op_qemu_ld_a32_i128:
202
+ case INDEX_op_qemu_ld_a64_i128:
203
+ return C_O2_I1(o, m, r);
204
+ case INDEX_op_qemu_st_a32_i128:
205
+ case INDEX_op_qemu_st_a64_i128:
206
+ return C_O0_I3(o, m, r);
207
208
case INDEX_op_deposit_i32:
209
case INDEX_op_deposit_i64:
81
--
210
--
82
2.34.1
211
2.34.1
83
84
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
---
4
.../generic/host/load-extract-al16-al8.h | 45 +++++++++++++++++++
5
accel/tcg/ldst_atomicity.c.inc | 36 +--------------
6
2 files changed, 47 insertions(+), 34 deletions(-)
7
create mode 100644 host/include/generic/host/load-extract-al16-al8.h
1
8
9
diff --git a/host/include/generic/host/load-extract-al16-al8.h b/host/include/generic/host/load-extract-al16-al8.h
10
new file mode 100644
11
index XXXXXXX..XXXXXXX
12
--- /dev/null
13
+++ b/host/include/generic/host/load-extract-al16-al8.h
14
@@ -XXX,XX +XXX,XX @@
15
+/*
16
+ * SPDX-License-Identifier: GPL-2.0-or-later
17
+ * Atomic extract 64 from 128-bit, generic version.
18
+ *
19
+ * Copyright (C) 2023 Linaro, Ltd.
20
+ */
21
+
22
+#ifndef HOST_LOAD_EXTRACT_AL16_AL8_H
23
+#define HOST_LOAD_EXTRACT_AL16_AL8_H
24
+
25
+/**
26
+ * load_atom_extract_al16_or_al8:
27
+ * @pv: host address
28
+ * @s: object size in bytes, @s <= 8.
29
+ *
30
+ * Load @s bytes from @pv, when pv % s != 0. If [p, p+s-1] does not
31
+ * cross an 16-byte boundary then the access must be 16-byte atomic,
32
+ * otherwise the access must be 8-byte atomic.
33
+ */
34
+static inline uint64_t ATTRIBUTE_ATOMIC128_OPT
35
+load_atom_extract_al16_or_al8(void *pv, int s)
36
+{
37
+ uintptr_t pi = (uintptr_t)pv;
38
+ int o = pi & 7;
39
+ int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8;
40
+ Int128 r;
41
+
42
+ pv = (void *)(pi & ~7);
43
+ if (pi & 8) {
44
+ uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8);
45
+ uint64_t a = qatomic_read__nocheck(p8);
46
+ uint64_t b = qatomic_read__nocheck(p8 + 1);
47
+
48
+ if (HOST_BIG_ENDIAN) {
49
+ r = int128_make128(b, a);
50
+ } else {
51
+ r = int128_make128(a, b);
52
+ }
53
+ } else {
54
+ r = atomic16_read_ro(pv);
55
+ }
56
+ return int128_getlo(int128_urshift(r, shr));
57
+}
58
+
59
+#endif /* HOST_LOAD_EXTRACT_AL16_AL8_H */
60
diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
61
index XXXXXXX..XXXXXXX 100644
62
--- a/accel/tcg/ldst_atomicity.c.inc
63
+++ b/accel/tcg/ldst_atomicity.c.inc
64
@@ -XXX,XX +XXX,XX @@
65
* See the COPYING file in the top-level directory.
66
*/
67
68
+#include "host/load-extract-al16-al8.h"
69
+
70
#ifdef CONFIG_ATOMIC64
71
# define HAVE_al8 true
72
#else
73
@@ -XXX,XX +XXX,XX @@ static uint64_t load_atom_extract_al16_or_exit(CPUArchState *env, uintptr_t ra,
74
return int128_getlo(r);
75
}
76
77
-/**
78
- * load_atom_extract_al16_or_al8:
79
- * @p: host address
80
- * @s: object size in bytes, @s <= 8.
81
- *
82
- * Load @s bytes from @p, when p % s != 0. If [p, p+s-1] does not
83
- * cross an 16-byte boundary then the access must be 16-byte atomic,
84
- * otherwise the access must be 8-byte atomic.
85
- */
86
-static inline uint64_t ATTRIBUTE_ATOMIC128_OPT
87
-load_atom_extract_al16_or_al8(void *pv, int s)
88
-{
89
- uintptr_t pi = (uintptr_t)pv;
90
- int o = pi & 7;
91
- int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8;
92
- Int128 r;
93
-
94
- pv = (void *)(pi & ~7);
95
- if (pi & 8) {
96
- uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8);
97
- uint64_t a = qatomic_read__nocheck(p8);
98
- uint64_t b = qatomic_read__nocheck(p8 + 1);
99
-
100
- if (HOST_BIG_ENDIAN) {
101
- r = int128_make128(b, a);
102
- } else {
103
- r = int128_make128(a, b);
104
- }
105
- } else {
106
- r = atomic16_read_ro(pv);
107
- }
108
- return int128_getlo(int128_urshift(r, shr));
109
-}
110
-
111
/**
112
* load_atom_4_by_2:
113
* @pv: host address
114
--
115
2.34.1
1
From: Ilya Leoshkevich <iii@linux.ibm.com>
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
3
fork()ed processes currently start with
4
current_cpu->in_exclusive_context set, which is, strictly speaking, not
5
correct, but does not cause problems (not even assertion failures).
6
7
With one of the next patches, the code begins to rely on this value, so
8
fix it by always calling end_exclusive() in fork_end().
9
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
12
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
13
Message-Id: <20230214140829.45392-2-iii@linux.ibm.com>
14
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
15
---
3
---
16
linux-user/main.c | 10 ++++++----
4
host/include/generic/host/store-insert-al16.h | 50 +++++++++++++++++++
17
linux-user/syscall.c | 1 +
5
accel/tcg/ldst_atomicity.c.inc | 40 +--------------
18
2 files changed, 7 insertions(+), 4 deletions(-)
6
2 files changed, 51 insertions(+), 39 deletions(-)
7
create mode 100644 host/include/generic/host/store-insert-al16.h
19
8
20
diff --git a/linux-user/main.c b/linux-user/main.c
9
diff --git a/host/include/generic/host/store-insert-al16.h b/host/include/generic/host/store-insert-al16.h
10
new file mode 100644
11
index XXXXXXX..XXXXXXX
12
--- /dev/null
13
+++ b/host/include/generic/host/store-insert-al16.h
14
@@ -XXX,XX +XXX,XX @@
15
+/*
16
+ * SPDX-License-Identifier: GPL-2.0-or-later
17
+ * Atomic store insert into 128-bit, generic version.
18
+ *
19
+ * Copyright (C) 2023 Linaro, Ltd.
20
+ */
21
+
22
+#ifndef HOST_STORE_INSERT_AL16_H
23
+#define HOST_STORE_INSERT_AL16_H
24
+
25
+/**
26
+ * store_atom_insert_al16:
27
+ * @p: host address
28
+ * @val: shifted value to store
29
+ * @msk: mask for value to store
30
+ *
31
+ * Atomically store @val to @p masked by @msk.
32
+ */
33
+static inline void ATTRIBUTE_ATOMIC128_OPT
34
+store_atom_insert_al16(Int128 *ps, Int128 val, Int128 msk)
35
+{
36
+#if defined(CONFIG_ATOMIC128)
37
+ __uint128_t *pu;
38
+ Int128Alias old, new;
39
+
40
+ /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */
41
+ pu = __builtin_assume_aligned(ps, 16);
42
+ old.u = *pu;
43
+ msk = int128_not(msk);
44
+ do {
45
+ new.s = int128_and(old.s, msk);
46
+ new.s = int128_or(new.s, val);
47
+ } while (!__atomic_compare_exchange_n(pu, &old.u, new.u, true,
48
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED));
49
+#else
50
+ Int128 old, new, cmp;
51
+
52
+ ps = __builtin_assume_aligned(ps, 16);
53
+ old = *ps;
54
+ msk = int128_not(msk);
55
+ do {
56
+ cmp = old;
57
+ new = int128_and(old, msk);
58
+ new = int128_or(new, val);
59
+ old = atomic16_cmpxchg(ps, cmp, new);
60
+ } while (int128_ne(cmp, old));
61
+#endif
62
+}
63
+
64
+#endif /* HOST_STORE_INSERT_AL16_H */
65
diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
21
index XXXXXXX..XXXXXXX 100644
66
index XXXXXXX..XXXXXXX 100644
22
--- a/linux-user/main.c
67
--- a/accel/tcg/ldst_atomicity.c.inc
23
+++ b/linux-user/main.c
68
+++ b/accel/tcg/ldst_atomicity.c.inc
24
@@ -XXX,XX +XXX,XX @@ void fork_end(int child)
69
@@ -XXX,XX +XXX,XX @@
25
}
70
*/
26
qemu_init_cpu_list();
71
27
gdbserver_fork(thread_cpu);
72
#include "host/load-extract-al16-al8.h"
28
- /* qemu_init_cpu_list() takes care of reinitializing the
73
+#include "host/store-insert-al16.h"
29
- * exclusive state, so we don't need to end_exclusive() here.
74
30
- */
75
#ifdef CONFIG_ATOMIC64
31
} else {
76
# define HAVE_al8 true
32
cpu_list_unlock();
77
@@ -XXX,XX +XXX,XX @@ static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk)
33
- end_exclusive();
78
__ATOMIC_RELAXED, __ATOMIC_RELAXED));
34
}
35
+ /*
36
+ * qemu_init_cpu_list() reinitialized the child exclusive state, but we
37
+ * also need to keep current_cpu consistent, so call end_exclusive() for
38
+ * both child and parent.
39
+ */
40
+ end_exclusive();
41
}
79
}
42
80
43
__thread CPUState *thread_cpu;
81
-/**
44
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
82
- * store_atom_insert_al16:
45
index XXXXXXX..XXXXXXX 100644
83
- * @p: host address
46
--- a/linux-user/syscall.c
84
- * @val: shifted value to store
47
+++ b/linux-user/syscall.c
85
- * @msk: mask for value to store
48
@@ -XXX,XX +XXX,XX @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
86
- *
49
cpu_clone_regs_parent(env, flags);
87
- * Atomically store @val to @p masked by @msk.
50
fork_end(0);
88
- */
51
}
89
-static void ATTRIBUTE_ATOMIC128_OPT
52
+ g_assert(!cpu_in_exclusive_context(cpu));
90
-store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk)
53
}
91
-{
54
return ret;
92
-#if defined(CONFIG_ATOMIC128)
55
}
93
- __uint128_t *pu, old, new;
94
-
95
- /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */
96
- pu = __builtin_assume_aligned(ps, 16);
97
- old = *pu;
98
- do {
99
- new = (old & ~msk.u) | val.u;
100
- } while (!__atomic_compare_exchange_n(pu, &old, new, true,
101
- __ATOMIC_RELAXED, __ATOMIC_RELAXED));
102
-#elif defined(CONFIG_CMPXCHG128)
103
- __uint128_t *pu, old, new;
104
-
105
- /*
106
- * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always
107
- * defer to libatomic, so we must use __sync_*_compare_and_swap_16
108
- * and accept the sequential consistency that comes with it.
109
- */
110
- pu = __builtin_assume_aligned(ps, 16);
111
- do {
112
- old = *pu;
113
- new = (old & ~msk.u) | val.u;
114
- } while (!__sync_bool_compare_and_swap_16(pu, old, new));
115
-#else
116
- qemu_build_not_reached();
117
-#endif
118
-}
119
-
120
/**
121
* store_bytes_leN:
122
* @pv: host address
56
--
123
--
57
2.34.1
124
2.34.1
58
59
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
---
4
.../x86_64/host/load-extract-al16-al8.h | 50 +++++++++++++++++++
5
1 file changed, 50 insertions(+)
6
create mode 100644 host/include/x86_64/host/load-extract-al16-al8.h
1
7
8
diff --git a/host/include/x86_64/host/load-extract-al16-al8.h b/host/include/x86_64/host/load-extract-al16-al8.h
9
new file mode 100644
10
index XXXXXXX..XXXXXXX
11
--- /dev/null
12
+++ b/host/include/x86_64/host/load-extract-al16-al8.h
13
@@ -XXX,XX +XXX,XX @@
14
+/*
15
+ * SPDX-License-Identifier: GPL-2.0-or-later
16
+ * Atomic extract 64 from 128-bit, x86_64 version.
17
+ *
18
+ * Copyright (C) 2023 Linaro, Ltd.
19
+ */
20
+
21
+#ifndef X86_64_LOAD_EXTRACT_AL16_AL8_H
22
+#define X86_64_LOAD_EXTRACT_AL16_AL8_H
23
+
24
+#ifdef CONFIG_INT128_TYPE
25
+#include "host/cpuinfo.h"
26
+
27
+/**
28
+ * load_atom_extract_al16_or_al8:
29
+ * @pv: host address
30
+ * @s: object size in bytes, @s <= 8.
31
+ *
32
+ * Load @s bytes from @pv, when pv % s != 0. If [p, p+s-1] does not
33
+ * cross an 16-byte boundary then the access must be 16-byte atomic,
34
+ * otherwise the access must be 8-byte atomic.
35
+ */
36
+static inline uint64_t ATTRIBUTE_ATOMIC128_OPT
37
+load_atom_extract_al16_or_al8(void *pv, int s)
38
+{
39
+ uintptr_t pi = (uintptr_t)pv;
40
+ __int128_t *ptr_align = (__int128_t *)(pi & ~7);
41
+ int shr = (pi & 7) * 8;
42
+ Int128Alias r;
43
+
44
+ /*
45
+ * ptr_align % 16 is now only 0 or 8.
46
+ * If the host supports atomic loads with VMOVDQU, then always use that,
47
+ * making the branch highly predictable. Otherwise we must use VMOVDQA
48
+ * when ptr_align % 16 == 0 for 16-byte atomicity.
49
+ */
50
+ if ((cpuinfo & CPUINFO_ATOMIC_VMOVDQU) || (pi & 8)) {
51
+ asm("vmovdqu %1, %0" : "=x" (r.i) : "m" (*ptr_align));
52
+ } else {
53
+ asm("vmovdqa %1, %0" : "=x" (r.i) : "m" (*ptr_align));
54
+ }
55
+ return int128_getlo(int128_urshift(r.s, shr));
56
+}
57
+#else
58
+/* Fallback definition that must be optimized away, or error. */
59
+uint64_t QEMU_ERROR("unsupported atomic")
60
+ load_atom_extract_al16_or_al8(void *pv, int s);
61
+#endif
62
+
63
+#endif /* X86_64_LOAD_EXTRACT_AL16_AL8_H */
64
--
65
2.34.1
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
---
4
.../aarch64/host/load-extract-al16-al8.h | 40 +++++++++++++++++++
5
1 file changed, 40 insertions(+)
6
create mode 100644 host/include/aarch64/host/load-extract-al16-al8.h
1
7
8
diff --git a/host/include/aarch64/host/load-extract-al16-al8.h b/host/include/aarch64/host/load-extract-al16-al8.h
9
new file mode 100644
10
index XXXXXXX..XXXXXXX
11
--- /dev/null
12
+++ b/host/include/aarch64/host/load-extract-al16-al8.h
13
@@ -XXX,XX +XXX,XX @@
14
+/*
15
+ * SPDX-License-Identifier: GPL-2.0-or-later
16
+ * Atomic extract 64 from 128-bit, AArch64 version.
17
+ *
18
+ * Copyright (C) 2023 Linaro, Ltd.
19
+ */
20
+
21
+#ifndef AARCH64_LOAD_EXTRACT_AL16_AL8_H
22
+#define AARCH64_LOAD_EXTRACT_AL16_AL8_H
23
+
24
+#include "host/cpuinfo.h"
25
+#include "tcg/debug-assert.h"
26
+
27
+/**
28
+ * load_atom_extract_al16_or_al8:
29
+ * @pv: host address
30
+ * @s: object size in bytes, @s <= 8.
31
+ *
32
+ * Load @s bytes from @pv, when pv % s != 0. If [p, p+s-1] does not
33
+ * cross an 16-byte boundary then the access must be 16-byte atomic,
34
+ * otherwise the access must be 8-byte atomic.
35
+ */
36
+static inline uint64_t load_atom_extract_al16_or_al8(void *pv, int s)
37
+{
38
+ uintptr_t pi = (uintptr_t)pv;
39
+ __int128_t *ptr_align = (__int128_t *)(pi & ~7);
40
+ int shr = (pi & 7) * 8;
41
+ uint64_t l, h;
42
+
43
+ /*
44
+ * With FEAT_LSE2, LDP is single-copy atomic if 16-byte aligned
45
+ * and single-copy atomic on the parts if 8-byte aligned.
46
+ * All we need do is align the pointer mod 8.
47
+ */
48
+ tcg_debug_assert(HAVE_ATOMIC128_RO);
49
+ asm("ldp %0, %1, %2" : "=r"(l), "=r"(h) : "m"(*ptr_align));
50
+ return (l >> shr) | (h << (-shr & 63));
51
+}
52
+
53
+#endif /* AARCH64_LOAD_EXTRACT_AL16_AL8_H */
54
--
55
2.34.1
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
---
4
host/include/aarch64/host/store-insert-al16.h | 47 +++++++++++++++++++
5
1 file changed, 47 insertions(+)
6
create mode 100644 host/include/aarch64/host/store-insert-al16.h
1
7
8
diff --git a/host/include/aarch64/host/store-insert-al16.h b/host/include/aarch64/host/store-insert-al16.h
9
new file mode 100644
10
index XXXXXXX..XXXXXXX
11
--- /dev/null
12
+++ b/host/include/aarch64/host/store-insert-al16.h
13
@@ -XXX,XX +XXX,XX @@
14
+/*
15
+ * SPDX-License-Identifier: GPL-2.0-or-later
16
+ * Atomic store insert into 128-bit, AArch64 version.
17
+ *
18
+ * Copyright (C) 2023 Linaro, Ltd.
19
+ */
20
+
21
+#ifndef AARCH64_STORE_INSERT_AL16_H
22
+#define AARCH64_STORE_INSERT_AL16_H
23
+
24
+/**
25
+ * store_atom_insert_al16:
26
+ * @p: host address
27
+ * @val: shifted value to store
28
+ * @msk: mask for value to store
29
+ *
30
+ * Atomically store @val to @p masked by @msk.
31
+ */
32
+static inline void ATTRIBUTE_ATOMIC128_OPT
33
+store_atom_insert_al16(Int128 *ps, Int128 val, Int128 msk)
34
+{
35
+ /*
36
+ * GCC only implements __sync* primitives for int128 on aarch64.
37
+ * We can do better without the barriers, and integrating the
38
+ * arithmetic into the load-exclusive/store-conditional pair.
39
+ */
40
+ uint64_t tl, th, vl, vh, ml, mh;
41
+ uint32_t fail;
42
+
43
+ qemu_build_assert(!HOST_BIG_ENDIAN);
44
+ vl = int128_getlo(val);
45
+ vh = int128_gethi(val);
46
+ ml = int128_getlo(msk);
47
+ mh = int128_gethi(msk);
48
+
49
+ asm("0: ldxp %[l], %[h], %[mem]\n\t"
50
+ "bic %[l], %[l], %[ml]\n\t"
51
+ "bic %[h], %[h], %[mh]\n\t"
52
+ "orr %[l], %[l], %[vl]\n\t"
53
+ "orr %[h], %[h], %[vh]\n\t"
54
+ "stxp %w[f], %[l], %[h], %[mem]\n\t"
55
+ "cbnz %w[f], 0b\n"
56
+ : [mem] "+Q"(*ps), [f] "=&r"(fail), [l] "=&r"(tl), [h] "=&r"(th)
57
+ : [vl] "r"(vl), [vh] "r"(vh), [ml] "r"(ml), [mh] "r"(mh));
58
+}
59
+
60
+#endif /* AARCH64_STORE_INSERT_AL16_H */
61
--
62
2.34.1
1
If an instruction straddles a page boundary, and the first page
1
The last use was removed by e77c89fb086a.
2
was RAM, but the second page was MMIO, we would abort. Handle
3
this as if both pages are MMIO, by setting the ram_addr_t for
4
the first page to -1.
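
In other words (a hedged sketch of the control flow, not the literal hunk that follows): get_page_addr_code_hostp() already reports MMIO by returning -1, so the second page's -1 is now propagated to the first page instead of being asserted away:

    /* phys_page is the ram_addr_t of the page after the boundary */
    if (phys_page == -1) {              /* second half is MMIO */
        tb_set_page_addr0(tb, -1);      /* mark page 0 as MMIO as well ... */
        return NULL;                    /* ... so the TB is not cached */
    }
    tb_set_page_addr1(tb, phys_page);   /* both pages are RAM */
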
5
2
6
Reported-by: Sid Manning <sidneym@quicinc.com>
3
Fixes: e77c89fb086a ("cputlb: Remove static tlb sizing")
7
Reported-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
---
6
---
11
accel/tcg/translator.c | 12 ++++++++++--
7
tcg/aarch64/tcg-target.h | 1 -
12
1 file changed, 10 insertions(+), 2 deletions(-)
8
tcg/arm/tcg-target.h | 1 -
9
tcg/i386/tcg-target.h | 1 -
10
tcg/mips/tcg-target.h | 1 -
11
tcg/ppc/tcg-target.h | 1 -
12
tcg/riscv/tcg-target.h | 1 -
13
tcg/s390x/tcg-target.h | 1 -
14
tcg/sparc64/tcg-target.h | 1 -
15
tcg/tci/tcg-target.h | 1 -
16
9 files changed, 9 deletions(-)
13
17
14
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
18
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
15
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
16
--- a/accel/tcg/translator.c
20
--- a/tcg/aarch64/tcg-target.h
17
+++ b/accel/tcg/translator.c
21
+++ b/tcg/aarch64/tcg-target.h
18
@@ -XXX,XX +XXX,XX @@ static void *translator_access(CPUArchState *env, DisasContextBase *db,
22
@@ -XXX,XX +XXX,XX @@
19
if (host == NULL) {
23
#include "host/cpuinfo.h"
20
tb_page_addr_t phys_page =
24
21
get_page_addr_code_hostp(env, base, &db->host_addr[1]);
25
#define TCG_TARGET_INSN_UNIT_SIZE 4
22
- /* We cannot handle MMIO as second page. */
26
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 24
23
- assert(phys_page != -1);
27
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
24
+
28
25
+ /*
29
typedef enum {
26
+ * If the second page is MMIO, treat as if the first page
30
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
27
+ * was MMIO as well, so that we do not cache the TB.
31
index XXXXXXX..XXXXXXX 100644
28
+ */
32
--- a/tcg/arm/tcg-target.h
29
+ if (unlikely(phys_page == -1)) {
33
+++ b/tcg/arm/tcg-target.h
30
+ tb_set_page_addr0(tb, -1);
34
@@ -XXX,XX +XXX,XX @@ extern int arm_arch;
31
+ return NULL;
35
#define use_armv7_instructions (__ARM_ARCH >= 7 || arm_arch >= 7)
32
+ }
36
33
+
37
#define TCG_TARGET_INSN_UNIT_SIZE 4
34
tb_set_page_addr1(tb, phys_page);
38
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
35
#ifdef CONFIG_USER_ONLY
39
#define MAX_CODE_GEN_BUFFER_SIZE UINT32_MAX
36
page_protect(end);
40
41
typedef enum {
42
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
43
index XXXXXXX..XXXXXXX 100644
44
--- a/tcg/i386/tcg-target.h
45
+++ b/tcg/i386/tcg-target.h
46
@@ -XXX,XX +XXX,XX @@
47
#include "host/cpuinfo.h"
48
49
#define TCG_TARGET_INSN_UNIT_SIZE 1
50
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 31
51
52
#ifdef __x86_64__
53
# define TCG_TARGET_REG_BITS 64
54
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
55
index XXXXXXX..XXXXXXX 100644
56
--- a/tcg/mips/tcg-target.h
57
+++ b/tcg/mips/tcg-target.h
58
@@ -XXX,XX +XXX,XX @@
59
#endif
60
61
#define TCG_TARGET_INSN_UNIT_SIZE 4
62
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
63
#define TCG_TARGET_NB_REGS 32
64
65
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
66
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
67
index XXXXXXX..XXXXXXX 100644
68
--- a/tcg/ppc/tcg-target.h
69
+++ b/tcg/ppc/tcg-target.h
70
@@ -XXX,XX +XXX,XX @@
71
72
#define TCG_TARGET_NB_REGS 64
73
#define TCG_TARGET_INSN_UNIT_SIZE 4
74
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
75
76
typedef enum {
77
TCG_REG_R0, TCG_REG_R1, TCG_REG_R2, TCG_REG_R3,
78
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
79
index XXXXXXX..XXXXXXX 100644
80
--- a/tcg/riscv/tcg-target.h
81
+++ b/tcg/riscv/tcg-target.h
82
@@ -XXX,XX +XXX,XX @@
83
#define TCG_TARGET_REG_BITS 64
84
85
#define TCG_TARGET_INSN_UNIT_SIZE 4
86
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
87
#define TCG_TARGET_NB_REGS 32
88
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
89
90
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
91
index XXXXXXX..XXXXXXX 100644
92
--- a/tcg/s390x/tcg-target.h
93
+++ b/tcg/s390x/tcg-target.h
94
@@ -XXX,XX +XXX,XX @@
95
#define S390_TCG_TARGET_H
96
97
#define TCG_TARGET_INSN_UNIT_SIZE 2
98
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 19
99
100
/* We have a +- 4GB range on the branches; leave some slop. */
101
#define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
102
diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h
103
index XXXXXXX..XXXXXXX 100644
104
--- a/tcg/sparc64/tcg-target.h
105
+++ b/tcg/sparc64/tcg-target.h
106
@@ -XXX,XX +XXX,XX @@
107
#define SPARC_TCG_TARGET_H
108
109
#define TCG_TARGET_INSN_UNIT_SIZE 4
110
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
111
#define TCG_TARGET_NB_REGS 32
112
#define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
113
114
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
115
index XXXXXXX..XXXXXXX 100644
116
--- a/tcg/tci/tcg-target.h
117
+++ b/tcg/tci/tcg-target.h
118
@@ -XXX,XX +XXX,XX @@
119
120
#define TCG_TARGET_INTERPRETER 1
121
#define TCG_TARGET_INSN_UNIT_SIZE 4
122
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
123
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
124
125
#if UINTPTR_MAX == UINT32_MAX
37
--
126
--
38
2.34.1
127
2.34.1
39
128
40
129
1
Invert the exit code, for use with the testsuite.
1
2
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
---
5
scripts/decodetree.py | 9 +++++++--
6
1 file changed, 7 insertions(+), 2 deletions(-)
7
8
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
9
index XXXXXXX..XXXXXXX 100644
10
--- a/scripts/decodetree.py
11
+++ b/scripts/decodetree.py
12
@@ -XXX,XX +XXX,XX @@
13
formats = {}
14
allpatterns = []
15
anyextern = False
16
+testforerror = False
17
18
translate_prefix = 'trans'
19
translate_scope = 'static '
20
@@ -XXX,XX +XXX,XX @@ def error_with_file(file, lineno, *args):
21
if output_file and output_fd:
22
output_fd.close()
23
os.remove(output_file)
24
- exit(1)
25
+ exit(0 if testforerror else 1)
26
# end error_with_file
27
28
29
@@ -XXX,XX +XXX,XX @@ def main():
30
global bitop_width
31
global variablewidth
32
global anyextern
33
+ global testforerror
34
35
decode_scope = 'static '
36
37
long_opts = ['decode=', 'translate=', 'output=', 'insnwidth=',
38
- 'static-decode=', 'varinsnwidth=']
39
+ 'static-decode=', 'varinsnwidth=', 'test-for-error']
40
try:
41
(opts, args) = getopt.gnu_getopt(sys.argv[1:], 'o:vw:', long_opts)
42
except getopt.GetoptError as err:
43
@@ -XXX,XX +XXX,XX @@ def main():
44
bitop_width = 64
45
elif insnwidth != 32:
46
error(0, 'cannot handle insns of width', insnwidth)
47
+ elif o == '--test-for-error':
48
+ testforerror = True
49
else:
50
assert False, 'unhandled option'
51
52
@@ -XXX,XX +XXX,XX @@ def main():
53
54
if output_file:
55
output_fd.close()
56
+ exit(1 if testforerror else 0)
57
# end main
58
59
60
--
61
2.34.1
1
Two copy-paste errors walking the parse tree.
1
2
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
---
5
scripts/decodetree.py | 4 ++--
6
1 file changed, 2 insertions(+), 2 deletions(-)
7
8
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
9
index XXXXXXX..XXXXXXX 100644
10
--- a/scripts/decodetree.py
11
+++ b/scripts/decodetree.py
12
@@ -XXX,XX +XXX,XX @@ def build_tree(self):
13
14
def prop_format(self):
15
for p in self.pats:
16
- p.build_tree()
17
+ p.prop_format()
18
19
def prop_width(self):
20
width = None
21
@@ -XXX,XX +XXX,XX @@ def __build_tree(pats, outerbits, outermask):
22
return t
23
24
def build_tree(self):
25
- super().prop_format()
26
+ super().build_tree()
27
self.tree = self.__build_tree(self.pats, self.fixedbits,
28
self.fixedmask)
29
30
--
31
2.34.1
1
Test err_pattern_group_empty.decode failed with exception:
1
2
3
Traceback (most recent call last):
4
File "./scripts/decodetree.py", line 1424, in <module> main()
5
File "./scripts/decodetree.py", line 1342, in main toppat.build_tree()
6
File "./scripts/decodetree.py", line 627, in build_tree
7
self.tree = self.__build_tree(self.pats, self.fixedbits,
8
File "./scripts/decodetree.py", line 607, in __build_tree
9
fb = i.fixedbits & innermask
10
TypeError: unsupported operand type(s) for &: 'NoneType' and 'int'
11
12
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
---
14
scripts/decodetree.py | 6 ++++++
15
1 file changed, 6 insertions(+)
16
17
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
18
index XXXXXXX..XXXXXXX 100644
19
--- a/scripts/decodetree.py
20
+++ b/scripts/decodetree.py
21
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
22
output(ind, '}\n')
23
else:
24
p.output_code(i, extracted, p.fixedbits, p.fixedmask)
25
+
26
+ def build_tree(self):
27
+ if not self.pats:
28
+ error_with_file(self.file, self.lineno, 'empty pattern group')
29
+ super().build_tree()
30
+
31
#end IncMultiPattern
32
33
34
--
35
2.34.1
1
From: Ilya Leoshkevich <iii@linux.ibm.com>
1
Nor report any PermissionError on remove.
2
The primary purpose is testing with -o /dev/null.
2
3
3
Follow what the kernel's full_exception() does.
4
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
7
Message-Id: <20230214140829.45392-4-iii@linux.ibm.com>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
---
5
---
10
linux-user/microblaze/cpu_loop.c | 10 ++++++++--
6
scripts/decodetree.py | 7 ++++++-
11
1 file changed, 8 insertions(+), 2 deletions(-)
7
1 file changed, 6 insertions(+), 1 deletion(-)
12
8
13
diff --git a/linux-user/microblaze/cpu_loop.c b/linux-user/microblaze/cpu_loop.c
9
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
14
index XXXXXXX..XXXXXXX 100644
10
index XXXXXXX..XXXXXXX 100644
15
--- a/linux-user/microblaze/cpu_loop.c
11
--- a/scripts/decodetree.py
16
+++ b/linux-user/microblaze/cpu_loop.c
12
+++ b/scripts/decodetree.py
17
@@ -XXX,XX +XXX,XX @@
13
@@ -XXX,XX +XXX,XX @@ def error_with_file(file, lineno, *args):
18
14
19
void cpu_loop(CPUMBState *env)
15
if output_file and output_fd:
20
{
16
output_fd.close()
21
+ int trapnr, ret, si_code, sig;
17
- os.remove(output_file)
22
CPUState *cs = env_cpu(env);
18
+ # Do not try to remove e.g. -o /dev/null
23
- int trapnr, ret, si_code;
19
+ if not output_file.startswith("/dev"):
24
20
+ try:
25
while (1) {
21
+ os.remove(output_file)
26
cpu_exec_start(cs);
22
+ except PermissionError:
27
@@ -XXX,XX +XXX,XX @@ void cpu_loop(CPUMBState *env)
23
+ pass
28
env->iflags &= ~(IMM_FLAG | D_FLAG);
24
exit(0 if testforerror else 1)
29
switch (env->esr & 31) {
25
# end error_with_file
30
case ESR_EC_DIVZERO:
26
31
+ sig = TARGET_SIGFPE;
32
si_code = TARGET_FPE_INTDIV;
33
break;
34
case ESR_EC_FPU:
35
@@ -XXX,XX +XXX,XX @@ void cpu_loop(CPUMBState *env)
36
* if there's no recognized bit set. Possibly this
37
* implies that si_code is 0, but follow the structure.
38
*/
39
+ sig = TARGET_SIGFPE;
40
si_code = env->fsr;
41
if (si_code & FSR_IO) {
42
si_code = TARGET_FPE_FLTINV;
43
@@ -XXX,XX +XXX,XX @@ void cpu_loop(CPUMBState *env)
44
si_code = TARGET_FPE_FLTRES;
45
}
46
break;
47
+ case ESR_EC_PRIVINSN:
48
+ sig = SIGILL;
49
+ si_code = ILL_PRVOPC;
50
+ break;
51
default:
52
fprintf(stderr, "Unhandled hw-exception: 0x%x\n",
53
env->esr & ESR_EC_MASK);
54
cpu_dump_state(cs, stderr, 0);
55
exit(EXIT_FAILURE);
56
}
57
- force_sig_fault(TARGET_SIGFPE, si_code, env->pc);
58
+ force_sig_fault(sig, si_code, env->pc);
59
break;
60
61
case EXCP_DEBUG:
62
--
27
--
63
2.34.1
28
2.34.1
1
The Linux kernel's trap tables vector all unassigned trap
2
numbers to BAD_TRAP, which then raises SIGILL.
3
4
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
5
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
1
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
2
---
8
linux-user/sparc/cpu_loop.c | 8 ++++++++
3
tests/decode/check.sh | 24 ----------------
9
1 file changed, 8 insertions(+)
4
tests/decode/meson.build | 59 ++++++++++++++++++++++++++++++++++++++++
5
tests/meson.build | 5 +---
6
3 files changed, 60 insertions(+), 28 deletions(-)
7
delete mode 100755 tests/decode/check.sh
8
create mode 100644 tests/decode/meson.build
10
9
11
diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
10
diff --git a/tests/decode/check.sh b/tests/decode/check.sh
11
deleted file mode 100755
12
index XXXXXXX..XXXXXXX
13
--- a/tests/decode/check.sh
14
+++ /dev/null
15
@@ -XXX,XX +XXX,XX @@
16
-#!/bin/sh
17
-# This work is licensed under the terms of the GNU LGPL, version 2 or later.
18
-# See the COPYING.LIB file in the top-level directory.
19
-
20
-PYTHON=$1
21
-DECODETREE=$2
22
-E=0
23
-
24
-# All of these tests should produce errors
25
-for i in err_*.decode; do
26
- if $PYTHON $DECODETREE $i > /dev/null 2> /dev/null; then
27
- # Pass, aka failed to fail.
28
- echo FAIL: $i 1>&2
29
- E=1
30
- fi
31
-done
32
-
33
-for i in succ_*.decode; do
34
- if ! $PYTHON $DECODETREE $i > /dev/null 2> /dev/null; then
35
- echo FAIL:$i 1>&2
36
- fi
37
-done
38
-
39
-exit $E
40
diff --git a/tests/decode/meson.build b/tests/decode/meson.build
41
new file mode 100644
42
index XXXXXXX..XXXXXXX
43
--- /dev/null
44
+++ b/tests/decode/meson.build
45
@@ -XXX,XX +XXX,XX @@
46
+err_tests = [
47
+ 'err_argset1.decode',
48
+ 'err_argset2.decode',
49
+ 'err_field1.decode',
50
+ 'err_field2.decode',
51
+ 'err_field3.decode',
52
+ 'err_field4.decode',
53
+ 'err_field5.decode',
54
+ 'err_field6.decode',
55
+ 'err_init1.decode',
56
+ 'err_init2.decode',
57
+ 'err_init3.decode',
58
+ 'err_init4.decode',
59
+ 'err_overlap1.decode',
60
+ 'err_overlap2.decode',
61
+ 'err_overlap3.decode',
62
+ 'err_overlap4.decode',
63
+ 'err_overlap5.decode',
64
+ 'err_overlap6.decode',
65
+ 'err_overlap7.decode',
66
+ 'err_overlap8.decode',
67
+ 'err_overlap9.decode',
68
+ 'err_pattern_group_empty.decode',
69
+ 'err_pattern_group_ident1.decode',
70
+ 'err_pattern_group_ident2.decode',
71
+ 'err_pattern_group_nest1.decode',
72
+ 'err_pattern_group_nest2.decode',
73
+ 'err_pattern_group_nest3.decode',
74
+ 'err_pattern_group_overlap1.decode',
75
+ 'err_width1.decode',
76
+ 'err_width2.decode',
77
+ 'err_width3.decode',
78
+ 'err_width4.decode',
79
+]
80
+
81
+succ_tests = [
82
+ 'succ_argset_type1.decode',
83
+ 'succ_function.decode',
84
+ 'succ_ident1.decode',
85
+ 'succ_pattern_group_nest1.decode',
86
+ 'succ_pattern_group_nest2.decode',
87
+ 'succ_pattern_group_nest3.decode',
88
+ 'succ_pattern_group_nest4.decode',
89
+]
90
+
91
+suite = 'decodetree'
92
+decodetree = find_program(meson.project_source_root() / 'scripts/decodetree.py')
93
+
94
+foreach t: err_tests
95
+ test(fs.replace_suffix(t, ''),
96
+ decodetree, args: ['-o', '/dev/null', '--test-for-error', files(t)],
97
+ suite: suite)
98
+endforeach
99
+
100
+foreach t: succ_tests
101
+ test(fs.replace_suffix(t, ''),
102
+ decodetree, args: ['-o', '/dev/null', files(t)],
103
+ suite: suite)
104
+endforeach
105
diff --git a/tests/meson.build b/tests/meson.build
12
index XXXXXXX..XXXXXXX 100644
106
index XXXXXXX..XXXXXXX 100644
13
--- a/linux-user/sparc/cpu_loop.c
107
--- a/tests/meson.build
14
+++ b/linux-user/sparc/cpu_loop.c
108
+++ b/tests/meson.build
15
@@ -XXX,XX +XXX,XX @@ void cpu_loop (CPUSPARCState *env)
109
@@ -XXX,XX +XXX,XX @@ if have_tools and have_vhost_user and 'CONFIG_LINUX' in config_host
16
cpu_exec_step_atomic(cs);
110
dependencies: [qemuutil, vhost_user])
17
break;
111
endif
18
default:
112
19
+ /*
113
-test('decodetree', sh,
20
+ * Most software trap numbers vector to BAD_TRAP.
114
- args: [ files('decode/check.sh'), config_host['PYTHON'], files('../scripts/decodetree.py') ],
21
+ * Handle anything not explicitly matched above.
115
- workdir: meson.current_source_dir() / 'decode',
22
+ */
116
- suite: 'decodetree')
23
+ if (trapnr >= TT_TRAP && trapnr <= TT_TRAP + 0x7f) {
117
+subdir('decode')
24
+ force_sig_fault(TARGET_SIGILL, ILL_ILLTRP, env->pc);
118
25
+ break;
119
if 'CONFIG_TCG' in config_all
26
+ }
120
subdir('fp')
27
fprintf(stderr, "Unhandled trap: 0x%x\n", trapnr);
28
cpu_dump_state(cs, stderr, 0);
29
exit(EXIT_FAILURE);
30
--
121
--
31
2.34.1
122
2.34.1
1
From: Peter Maydell <peter.maydell@linaro.org>
1
2
3
Document the named field syntax that we want to implement for the
4
decodetree script. This allows a field to be defined in terms of
5
some other field that the instruction pattern has already set, for
6
example:
7
8
%sz_imm 10:3 sz:3 !function=expand_sz_imm
9
10
to allow a function to be passed both an immediate field from the
11
instruction and also a sz value which might have been specified by
12
the instruction pattern directly (sz=1, etc) rather than being a
13
simple field within the instruction.
14
15
Note that the restriction on not having the format referring to the
16
pattern and the pattern referring to the format simultaneously is a
17
restriction of the decoder generator rather than inherently being a
18
silly thing to do.
19
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
22
Message-Id: <20230523120447.728365-3-peter.maydell@linaro.org>
23
---
24
docs/devel/decodetree.rst | 33 ++++++++++++++++++++++++++++-----
25
1 file changed, 28 insertions(+), 5 deletions(-)
26
27
diff --git a/docs/devel/decodetree.rst b/docs/devel/decodetree.rst
28
index XXXXXXX..XXXXXXX 100644
29
--- a/docs/devel/decodetree.rst
30
+++ b/docs/devel/decodetree.rst
31
@@ -XXX,XX +XXX,XX @@ Fields
32
33
Syntax::
34
35
- field_def := '%' identifier ( unnamed_field )* ( !function=identifier )?
36
+ field_def := '%' identifier ( field )* ( !function=identifier )?
37
+ field := unnamed_field | named_field
38
unnamed_field := number ':' ( 's' ) number
39
+ named_field := identifier ':' ( 's' ) number
40
41
For *unnamed_field*, the first number is the least-significant bit position
42
of the field and the second number is the length of the field. If the 's' is
43
-present, the field is considered signed. If multiple ``unnamed_fields`` are
44
-present, they are concatenated. In this way one can define disjoint fields.
45
+present, the field is considered signed.
46
+
47
+A *named_field* refers to some other field in the instruction pattern
48
+or format. Regardless of the length of the other field where it is
49
+defined, it will be inserted into this field with the specified
50
+signedness and bit width.
51
+
52
+Field definitions that involve loops (i.e. where a field is defined
53
+directly or indirectly in terms of itself) are errors.
54
+
55
+A format can include fields that refer to named fields that are
56
+defined in the instruction pattern(s) that use the format.
57
+Conversely, an instruction pattern can include fields that refer to
58
+named fields that are defined in the format it uses. However you
59
+cannot currently do both at once (i.e. pattern P uses format F; F has
60
+a field A that refers to a named field B that is defined in P, and P
61
+has a field C that refers to a named field D that is defined in F).
62
+
63
+If multiple ``fields`` are present, they are concatenated.
64
+In this way one can define disjoint fields.
65
66
If ``!function`` is specified, the concatenated result is passed through the
67
named function, taking and returning an integral value.
68
69
-One may use ``!function`` with zero ``unnamed_fields``. This case is called
70
+One may use ``!function`` with zero ``fields``. This case is called
71
a *parameter*, and the named function is only passed the ``DisasContext``
72
and returns an integral value extracted from there.
73
74
-A field with no ``unnamed_fields`` and no ``!function`` is in error.
75
+A field with no ``fields`` and no ``!function`` is in error.
76
77
Field examples:
78
79
@@ -XXX,XX +XXX,XX @@ Field examples:
80
| %shimm8 5:s8 13:1 | expand_shimm8(sextract(i, 5, 8) << 1 | |
81
| !function=expand_shimm8 | extract(i, 13, 1)) |
82
+---------------------------+---------------------------------------------+
83
+| %sz_imm 10:2 sz:3 | expand_sz_imm(extract(i, 10, 2) << 3 | |
84
+| !function=expand_sz_imm | extract(a->sz, 0, 3)) |
85
++---------------------------+---------------------------------------------+
86
87
Argument Sets
88
=============
89
--
90
2.34.1
1
From: Peter Maydell <peter.maydell@linaro.org>
1
2
3
To support referring to other named fields in field definitions, we
4
need to pass the str_extract() method a function which tells it how
5
to emit the code for a previously initialized named field. (In
6
Pattern::output_code() the other field will be "u.f_foo.field", and
7
in Format::output_extract() it is "a->field".)
8
9
Refactor the two callsites that currently do "output code to
10
initialize each field", and have them pass a lambda that defines how
11
to format the lvalue in each case. This is then used both in
12
emitting the LHS of the assignment and also passed down to
13
str_extract() as a new argument (unused at the moment, but will be
14
used in the following patch).
15
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
18
Message-Id: <20230523120447.728365-4-peter.maydell@linaro.org>
19
---
20
scripts/decodetree.py | 26 +++++++++++++++-----------
21
1 file changed, 15 insertions(+), 11 deletions(-)
22
23
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
24
index XXXXXXX..XXXXXXX 100644
25
--- a/scripts/decodetree.py
26
+++ b/scripts/decodetree.py
27
@@ -XXX,XX +XXX,XX @@ def __str__(self):
28
s = ''
29
return str(self.pos) + ':' + s + str(self.len)
30
31
- def str_extract(self):
32
+ def str_extract(self, lvalue_formatter):
33
global bitop_width
34
s = 's' if self.sign else ''
35
return f'{s}extract{bitop_width}(insn, {self.pos}, {self.len})'
36
@@ -XXX,XX +XXX,XX @@ def __init__(self, subs, mask):
37
def __str__(self):
38
return str(self.subs)
39
40
- def str_extract(self):
41
+ def str_extract(self, lvalue_formatter):
42
global bitop_width
43
ret = '0'
44
pos = 0
45
for f in reversed(self.subs):
46
- ext = f.str_extract()
47
+ ext = f.str_extract(lvalue_formatter)
48
if pos == 0:
49
ret = ext
50
else:
51
@@ -XXX,XX +XXX,XX @@ def __init__(self, value):
52
def __str__(self):
53
return str(self.value)
54
55
- def str_extract(self):
56
+ def str_extract(self, lvalue_formatter):
57
return str(self.value)
58
59
def __cmp__(self, other):
60
@@ -XXX,XX +XXX,XX @@ def __init__(self, func, base):
61
def __str__(self):
62
return self.func + '(' + str(self.base) + ')'
63
64
- def str_extract(self):
65
- return self.func + '(ctx, ' + self.base.str_extract() + ')'
66
+ def str_extract(self, lvalue_formatter):
67
+ return (self.func + '(ctx, '
68
+ + self.base.str_extract(lvalue_formatter) + ')')
69
70
def __eq__(self, other):
71
return self.func == other.func and self.base == other.base
72
@@ -XXX,XX +XXX,XX @@ def __init__(self, func):
73
def __str__(self):
74
return self.func
75
76
- def str_extract(self):
77
+ def str_extract(self, lvalue_formatter):
78
return self.func + '(ctx)'
79
80
def __eq__(self, other):
81
@@ -XXX,XX +XXX,XX @@ def __str__(self):
82
83
def str1(self, i):
84
return str_indent(i) + self.__str__()
85
+
86
+ def output_fields(self, indent, lvalue_formatter):
87
+ for n, f in self.fields.items():
88
+ output(indent, lvalue_formatter(n), ' = ',
89
+ f.str_extract(lvalue_formatter), ';\n')
90
# end General
91
92
93
@@ -XXX,XX +XXX,XX @@ def extract_name(self):
94
def output_extract(self):
95
output('static void ', self.extract_name(), '(DisasContext *ctx, ',
96
self.base.struct_name(), ' *a, ', insntype, ' insn)\n{\n')
97
- for n, f in self.fields.items():
98
- output(' a->', n, ' = ', f.str_extract(), ';\n')
99
+ self.output_fields(str_indent(4), lambda n: 'a->' + n)
100
output('}\n\n')
101
# end Format
102
103
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
104
if not extracted:
105
output(ind, self.base.extract_name(),
106
'(ctx, &u.f_', arg, ', insn);\n')
107
- for n, f in self.fields.items():
108
- output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
109
+ self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
110
output(ind, 'if (', translate_prefix, '_', self.name,
111
'(ctx, &u.f_', arg, ')) return true;\n')
112
113
--
114
2.34.1
1
From: Peter Maydell <peter.maydell@linaro.org>
1
2
3
To support named fields, we will need to be able to do a topological
4
sort (so that we ensure that we output the assignment to field A
5
before the assignment to field B if field B refers to field A by
6
name). The good news is that there is a tsort in the python standard
7
library; the bad news is that it was only added in Python 3.9.
8
9
To bridge the gap between our current minimum supported Python
10
version and 3.9, provide a local implementation that has the
11
same API as the stdlib version for the parts we care about.
12
In future when QEMU's minimum Python version requirement reaches
13
3.9 we can delete this code and replace it with an 'import' line.
14
15
The core of this implementation is based on
16
https://code.activestate.com/recipes/578272-topological-sort/
17
which is MIT-licensed.
18
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Acked-by: Richard Henderson <richard.henderson@linaro.org>
21
Message-Id: <20230523120447.728365-5-peter.maydell@linaro.org>
22
---
23
scripts/decodetree.py | 74 +++++++++++++++++++++++++++++++++++++++++++
24
1 file changed, 74 insertions(+)
25
26
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
27
index XXXXXXX..XXXXXXX 100644
28
--- a/scripts/decodetree.py
29
+++ b/scripts/decodetree.py
30
@@ -XXX,XX +XXX,XX @@
31
re_fmt_ident = '@[a-zA-Z0-9_]*'
32
re_pat_ident = '[a-zA-Z0-9_]*'
33
34
+# Local implementation of a topological sort. We use the same API that
35
+# the Python graphlib does, so that when QEMU moves forward to a
36
+# baseline of Python 3.9 or newer this code can all be dropped and
37
+# replaced with:
38
+# from graphlib import TopologicalSorter, CycleError
39
+#
40
+# https://docs.python.org/3.9/library/graphlib.html#graphlib.TopologicalSorter
41
+#
42
+# We only implement the parts of TopologicalSorter we care about:
43
+# ts = TopologicalSorter(graph=None)
44
+# create the sorter. graph is a dictionary whose keys are
45
+# nodes and whose values are lists of the predecessors of that node.
46
+# (That is, if graph contains "A" -> ["B", "C"] then we must output
47
+# B and C before A.)
48
+# ts.static_order()
49
+# returns a list of all the nodes in sorted order, or raises CycleError
50
+# CycleError
51
+# exception raised if there are cycles in the graph. The second
52
+# element in the args attribute is a list of nodes which form a
53
+# cycle; the first and last element are the same, eg [a, b, c, a]
54
+# (Our implementation doesn't give the order correctly.)
55
+#
56
+# For our purposes we can assume that the data set is always small
57
+# (typically 10 nodes or less, actual links in the graph very rare),
58
+# so we don't need to worry about efficiency of implementation.
59
+#
60
+# The core of this implementation is from
61
+# https://code.activestate.com/recipes/578272-topological-sort/
62
+# (but updated to Python 3), and is under the MIT license.
63
+
64
+class CycleError(ValueError):
65
+ """Subclass of ValueError raised if cycles exist in the graph"""
66
+ pass
67
+
68
+class TopologicalSorter:
69
+ """Topologically sort a graph"""
70
+ def __init__(self, graph=None):
71
+ self.graph = graph
72
+
73
+ def static_order(self):
74
+ # We do the sort right here, unlike the stdlib version
75
+ from functools import reduce
76
+ data = {}
77
+ r = []
78
+
79
+ if not self.graph:
80
+ return []
81
+
82
+ # This code wants the values in the dict to be specifically sets
83
+ for k, v in self.graph.items():
84
+ data[k] = set(v)
85
+
86
+ # Find all items that don't depend on anything.
87
+ extra_items_in_deps = (reduce(set.union, data.values())
88
+ - set(data.keys()))
89
+ # Add empty dependencies where needed
90
+ data.update({item:{} for item in extra_items_in_deps})
91
+ while True:
92
+ ordered = set(item for item, dep in data.items() if not dep)
93
+ if not ordered:
94
+ break
95
+ r.extend(ordered)
96
+ data = {item: (dep - ordered)
97
+ for item, dep in data.items()
98
+ if item not in ordered}
99
+ if data:
100
+ # This doesn't give as nice results as the stdlib, which
101
+ # gives you the cycle by listing the nodes in order. Here
102
+ # we only know the nodes in the cycle but not their order.
103
+ raise CycleError(f'nodes are in a cycle', list(data.keys()))
104
+
105
+ return r
106
+# end TopologicalSorter
107
+
108
def error_with_file(file, lineno, *args):
109
"""Print an error message from file:line and args and exit."""
110
global output_file
111
--
112
2.34.1
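
As a rough illustration of the sorter API described in the comment above
(not part of the patch; the graph and node names are invented for the
example):

    # Values list each node's predecessors, so 'a' must come after 'b' and 'c'.
    graph = {'a': ['b', 'c'], 'b': ['c']}
    print(TopologicalSorter(graph).static_order())     # ['c', 'b', 'a']

    # A cyclic graph raises CycleError; args[1] names the nodes involved,
    # though unlike the stdlib the order within the cycle isn't meaningful.
    try:
        TopologicalSorter({'x': ['y'], 'y': ['x']}).static_order()
    except CycleError as e:
        print(e.args[1])                               # ['x', 'y']
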
New patch
1
1
From: Peter Maydell <peter.maydell@linaro.org>
2
3
Implement support for named fields, i.e. where one field is defined
4
in terms of another, rather than directly in terms of bits extracted
5
from the instruction.
6
7
The new method referenced_fields() on all the Field classes returns a
8
list of fields that this field references. For the existing classes
9
this is empty or simply delegates to sub-fields; only the new
NamedField class reports a reference.
10
11
We can then use referenced_fields() to:
12
* construct a list of 'dangling references' for a format or
13
pattern, i.e. the fields that the format/pattern uses but
14
doesn't define itself
15
* do a topological sort, so that we output "field = value"
16
assignments in an order that ensures we assign a field before
17
we reference it in a subsequent assignment
18
* check when we output the code for a pattern whether we need to
19
fill in the format fields before or after the pattern fields, and
20
do other error checking
21
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
24
Message-Id: <20230523120447.728365-6-peter.maydell@linaro.org>
25
---
26
scripts/decodetree.py | 145 ++++++++++++++++++++++++++++++++++++++++--
27
1 file changed, 139 insertions(+), 6 deletions(-)
28
29
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
30
index XXXXXXX..XXXXXXX 100644
31
--- a/scripts/decodetree.py
32
+++ b/scripts/decodetree.py
33
@@ -XXX,XX +XXX,XX @@ def str_extract(self, lvalue_formatter):
34
s = 's' if self.sign else ''
35
return f'{s}extract{bitop_width}(insn, {self.pos}, {self.len})'
36
37
+ def referenced_fields(self):
38
+ return []
39
+
40
def __eq__(self, other):
41
return self.sign == other.sign and self.mask == other.mask
42
43
@@ -XXX,XX +XXX,XX @@ def str_extract(self, lvalue_formatter):
44
pos += f.len
45
return ret
46
47
+ def referenced_fields(self):
48
+ l = []
49
+ for f in self.subs:
50
+ l.extend(f.referenced_fields())
51
+ return l
52
+
53
def __ne__(self, other):
54
if len(self.subs) != len(other.subs):
55
return True
56
@@ -XXX,XX +XXX,XX @@ def __str__(self):
57
def str_extract(self, lvalue_formatter):
58
return str(self.value)
59
60
+ def referenced_fields(self):
61
+ return []
62
+
63
def __cmp__(self, other):
64
return self.value - other.value
65
# end ConstField
66
@@ -XXX,XX +XXX,XX @@ def str_extract(self, lvalue_formatter):
67
return (self.func + '(ctx, '
68
+ self.base.str_extract(lvalue_formatter) + ')')
69
70
+ def referenced_fields(self):
71
+ return self.base.referenced_fields()
72
+
73
def __eq__(self, other):
74
return self.func == other.func and self.base == other.base
75
76
@@ -XXX,XX +XXX,XX @@ def __str__(self):
77
def str_extract(self, lvalue_formatter):
78
return self.func + '(ctx)'
79
80
+ def referenced_fields(self):
81
+ return []
82
+
83
def __eq__(self, other):
84
return self.func == other.func
85
86
@@ -XXX,XX +XXX,XX @@ def __ne__(self, other):
87
return not self.__eq__(other)
88
# end ParameterField
89
90
+class NamedField:
91
+ """Class representing a field already named in the pattern"""
92
+ def __init__(self, name, sign, len):
93
+ self.mask = 0
94
+ self.sign = sign
95
+ self.len = len
96
+ self.name = name
97
+
98
+ def __str__(self):
99
+ return self.name
100
+
101
+ def str_extract(self, lvalue_formatter):
102
+ global bitop_width
103
+ s = 's' if self.sign else ''
104
+ lvalue = lvalue_formatter(self.name)
105
+ return f'{s}extract{bitop_width}({lvalue}, 0, {self.len})'
106
+
107
+ def referenced_fields(self):
108
+ return [self.name]
109
+
110
+ def __eq__(self, other):
111
+ return self.name == other.name
112
+
113
+ def __ne__(self, other):
114
+ return not self.__eq__(other)
115
+# end NamedField
116
117
class Arguments:
118
"""Class representing the extracted fields of a format"""
119
@@ -XXX,XX +XXX,XX @@ def output_def(self):
120
output('} ', self.struct_name(), ';\n\n')
121
# end Arguments
122
123
-
124
class General:
125
"""Common code between instruction formats and instruction patterns"""
126
def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds, w):
127
@@ -XXX,XX +XXX,XX @@ def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds, w):
128
self.fieldmask = fldm
129
self.fields = flds
130
self.width = w
131
+ self.dangling = None
132
133
def __str__(self):
134
return self.name + ' ' + str_match_bits(self.fixedbits, self.fixedmask)
135
@@ -XXX,XX +XXX,XX @@ def __str__(self):
136
def str1(self, i):
137
return str_indent(i) + self.__str__()
138
139
+ def dangling_references(self):
140
+ # Return a list of all named references which aren't satisfied
141
+ # directly by this format/pattern. This will be either:
142
+ # * a format referring to a field which is specified by the
143
+ # pattern(s) using it
144
+ # * a pattern referring to a field which is specified by the
145
+ # format it uses
146
+ # * a user error (referring to a field that doesn't exist at all)
147
+ if self.dangling is None:
148
+ # Compute this once and cache the answer
149
+ dangling = []
150
+ for n, f in self.fields.items():
151
+ for r in f.referenced_fields():
152
+ if r not in self.fields:
153
+ dangling.append(r)
154
+ self.dangling = dangling
155
+ return self.dangling
156
+
157
def output_fields(self, indent, lvalue_formatter):
158
+ # We use a topological sort to ensure that any use of NamedField
159
+ # comes after the initialization of the field it is referencing.
160
+ graph = {}
161
for n, f in self.fields.items():
162
- output(indent, lvalue_formatter(n), ' = ',
163
- f.str_extract(lvalue_formatter), ';\n')
164
+ refs = f.referenced_fields()
165
+ graph[n] = refs
166
+
167
+ try:
168
+ ts = TopologicalSorter(graph)
169
+ for n in ts.static_order():
170
+ # We only want to emit assignments for the keys
171
+ # in our fields list, not for anything that ends up
172
+ # in the tsort graph only because it was referenced as
173
+ # a NamedField.
174
+ try:
175
+ f = self.fields[n]
176
+ output(indent, lvalue_formatter(n), ' = ',
177
+ f.str_extract(lvalue_formatter), ';\n')
178
+ except KeyError:
179
+ pass
180
+ except CycleError as e:
181
+ # The second element of args is a list of nodes which form
182
+ # a cycle (there might be others too, but only one is reported).
183
+ # Pretty-print it to tell the user.
184
+ cycle = ' => '.join(e.args[1])
185
+ error(self.lineno, 'field definitions form a cycle: ' + cycle)
186
# end General
187
188
189
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
190
ind = str_indent(i)
191
arg = self.base.base.name
192
output(ind, '/* ', self.file, ':', str(self.lineno), ' */\n')
193
+ # We might have named references in the format that refer to fields
194
+ # in the pattern, or named references in the pattern that refer
195
+ # to fields in the format. This affects whether we extract the fields
196
+ # for the format before or after the ones for the pattern.
197
+ # For simplicity we don't allow cross references in both directions.
198
+ # This is also where we catch the syntax error of referring to
199
+ # a nonexistent field.
200
+ fmt_refs = self.base.dangling_references()
201
+ for r in fmt_refs:
202
+ if r not in self.fields:
203
+ error(self.lineno, f'format refers to undefined field {r}')
204
+ pat_refs = self.dangling_references()
205
+ for r in pat_refs:
206
+ if r not in self.base.fields:
207
+ error(self.lineno, f'pattern refers to undefined field {r}')
208
+ if pat_refs and fmt_refs:
209
+ error(self.lineno, ('pattern that uses fields defined in format '
210
+ 'cannot use format that uses fields defined '
211
+ 'in pattern'))
212
+ if fmt_refs:
213
+ # pattern fields first
214
+ self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
215
+ assert not extracted, "dangling fmt refs but it was already extracted"
216
if not extracted:
217
output(ind, self.base.extract_name(),
218
'(ctx, &u.f_', arg, ', insn);\n')
219
- self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
220
+ if not fmt_refs:
221
+ # pattern fields last
222
+ self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
223
+
224
output(ind, 'if (', translate_prefix, '_', self.name,
225
'(ctx, &u.f_', arg, ')) return true;\n')
226
227
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
228
ind = str_indent(i)
229
230
# If we identified all nodes below have the same format,
231
- # extract the fields now.
232
- if not extracted and self.base:
233
+ # extract the fields now. But don't do it if the format relies
234
+ # on named fields from the insn pattern, as those won't have
235
+ # been initialised at this point.
236
+ if not extracted and self.base and not self.base.dangling_references():
237
output(ind, self.base.extract_name(),
238
'(ctx, &u.f_', self.base.base.name, ', insn);\n')
239
extracted = True
240
@@ -XXX,XX +XXX,XX @@ def parse_field(lineno, name, toks):
241
"""Parse one instruction field from TOKS at LINENO"""
242
global fields
243
global insnwidth
244
+ global re_C_ident
245
246
# A "simple" field will have only one entry;
247
# a "multifield" will have several.
248
@@ -XXX,XX +XXX,XX @@ def parse_field(lineno, name, toks):
249
func = func[1]
250
continue
251
252
+ if re.fullmatch(re_C_ident + ':s[0-9]+', t):
253
+ # Signed named field
254
+ subtoks = t.split(':s')
255
+ n = subtoks[0]
256
+ le = int(subtoks[1])
257
+ f = NamedField(n, True, le)
258
+ subs.append(f)
259
+ width += le
260
+ continue
261
+ if re.fullmatch(re_C_ident + ':[0-9]+', t):
262
+ # Unsigned named field
263
+ subtoks = t.split(':')
264
+ n = subtoks[0]
265
+ le = int(subtoks[1])
266
+ f = NamedField(n, False, le)
267
+ subs.append(f)
268
+ width += le
269
+ continue
270
+
271
if re.fullmatch('[0-9]+:s[0-9]+', t):
272
# Signed field extract
273
subtoks = t.split(':s')
274
--
275
2.34.1
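
To make the assignment-ordering machinery concrete, here is a small
sketch (hypothetical field names, not part of the patch) of the graph
that output_fields() builds from referenced_fields() and how the sort
decides the emission order:

    # 'imm' is defined via a named field and so references 'sz';
    # 'sz' is set directly by the pattern and references nothing.
    graph = {'imm': ['sz'], 'sz': []}
    for n in TopologicalSorter(graph).static_order():
        print(n)        # 'sz' first, then 'imm', so 'imm' can read a->sz

A cycle between two such fields is caught by the CycleError handler and
reported as "field definitions form a cycle: f1 => f2 => ...".
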
1
Mirroring the upstream gdb xml files, the two stack boundary
1
From: Peter Maydell <peter.maydell@linaro.org>
2
registers are separated out.
3
2
4
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
3
Add some tests for various cases of named-field use, both ones that
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
should work and ones that should be diagnosed as errors.
5
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-Id: <20230523120447.728365-7-peter.maydell@linaro.org>
6
---
9
---
7
configs/targets/microblaze-linux-user.mak | 1 +
10
tests/decode/err_field10.decode | 7 +++++++
8
configs/targets/microblaze-softmmu.mak | 1 +
11
tests/decode/err_field7.decode | 7 +++++++
9
configs/targets/microblazeel-linux-user.mak | 1 +
12
tests/decode/err_field8.decode | 8 ++++++++
10
configs/targets/microblazeel-softmmu.mak | 1 +
13
tests/decode/err_field9.decode | 14 ++++++++++++++
11
target/microblaze/cpu.h | 2 +
14
tests/decode/succ_named_field.decode | 19 +++++++++++++++++++
12
target/microblaze/cpu.c | 7 ++-
15
tests/decode/meson.build | 5 +++++
13
target/microblaze/gdbstub.c | 51 +++++++++++-----
16
6 files changed, 60 insertions(+)
14
gdb-xml/microblaze-core.xml | 67 +++++++++++++++++++++
17
create mode 100644 tests/decode/err_field10.decode
15
gdb-xml/microblaze-stack-protect.xml | 12 ++++
18
create mode 100644 tests/decode/err_field7.decode
16
9 files changed, 128 insertions(+), 15 deletions(-)
19
create mode 100644 tests/decode/err_field8.decode
17
create mode 100644 gdb-xml/microblaze-core.xml
20
create mode 100644 tests/decode/err_field9.decode
18
create mode 100644 gdb-xml/microblaze-stack-protect.xml
21
create mode 100644 tests/decode/succ_named_field.decode
19
22
20
diff --git a/configs/targets/microblaze-linux-user.mak b/configs/targets/microblaze-linux-user.mak
23
diff --git a/tests/decode/err_field10.decode b/tests/decode/err_field10.decode
21
index XXXXXXX..XXXXXXX 100644
22
--- a/configs/targets/microblaze-linux-user.mak
23
+++ b/configs/targets/microblaze-linux-user.mak
24
@@ -XXX,XX +XXX,XX @@ TARGET_SYSTBL_ABI=common
25
TARGET_SYSTBL=syscall.tbl
26
TARGET_BIG_ENDIAN=y
27
TARGET_HAS_BFLT=y
28
+TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml
29
diff --git a/configs/targets/microblaze-softmmu.mak b/configs/targets/microblaze-softmmu.mak
30
index XXXXXXX..XXXXXXX 100644
31
--- a/configs/targets/microblaze-softmmu.mak
32
+++ b/configs/targets/microblaze-softmmu.mak
33
@@ -XXX,XX +XXX,XX @@ TARGET_ARCH=microblaze
34
TARGET_BIG_ENDIAN=y
35
TARGET_SUPPORTS_MTTCG=y
36
TARGET_NEED_FDT=y
37
+TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml
38
diff --git a/configs/targets/microblazeel-linux-user.mak b/configs/targets/microblazeel-linux-user.mak
39
index XXXXXXX..XXXXXXX 100644
40
--- a/configs/targets/microblazeel-linux-user.mak
41
+++ b/configs/targets/microblazeel-linux-user.mak
42
@@ -XXX,XX +XXX,XX @@ TARGET_ARCH=microblaze
43
TARGET_SYSTBL_ABI=common
44
TARGET_SYSTBL=syscall.tbl
45
TARGET_HAS_BFLT=y
46
+TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml
47
diff --git a/configs/targets/microblazeel-softmmu.mak b/configs/targets/microblazeel-softmmu.mak
48
index XXXXXXX..XXXXXXX 100644
49
--- a/configs/targets/microblazeel-softmmu.mak
50
+++ b/configs/targets/microblazeel-softmmu.mak
51
@@ -XXX,XX +XXX,XX @@
52
TARGET_ARCH=microblaze
53
TARGET_SUPPORTS_MTTCG=y
54
TARGET_NEED_FDT=y
55
+TARGET_XML_FILES=gdb-xml/microblaze-core.xml gdb-xml/microblaze-stack-protect.xml
56
diff --git a/target/microblaze/cpu.h b/target/microblaze/cpu.h
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/microblaze/cpu.h
59
+++ b/target/microblaze/cpu.h
60
@@ -XXX,XX +XXX,XX @@ hwaddr mb_cpu_get_phys_page_attrs_debug(CPUState *cpu, vaddr addr,
61
MemTxAttrs *attrs);
62
int mb_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
63
int mb_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
64
+int mb_cpu_gdb_read_stack_protect(CPUArchState *cpu, GByteArray *buf, int reg);
65
+int mb_cpu_gdb_write_stack_protect(CPUArchState *cpu, uint8_t *buf, int reg);
66
67
static inline uint32_t mb_cpu_read_msr(const CPUMBState *env)
68
{
69
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/microblaze/cpu.c
72
+++ b/target/microblaze/cpu.c
73
@@ -XXX,XX +XXX,XX @@
74
#include "qemu/module.h"
75
#include "hw/qdev-properties.h"
76
#include "exec/exec-all.h"
77
+#include "exec/gdbstub.h"
78
#include "fpu/softfloat-helpers.h"
79
80
static const struct {
81
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_initfn(Object *obj)
82
CPUMBState *env = &cpu->env;
83
84
cpu_set_cpustate_pointers(cpu);
85
+ gdb_register_coprocessor(CPU(cpu), mb_cpu_gdb_read_stack_protect,
86
+ mb_cpu_gdb_write_stack_protect, 2,
87
+ "microblaze-stack-protect.xml", 0);
88
89
set_float_rounding_mode(float_round_nearest_even, &env->fp_status);
90
91
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_class_init(ObjectClass *oc, void *data)
92
cc->sysemu_ops = &mb_sysemu_ops;
93
#endif
94
device_class_set_props(dc, mb_properties);
95
- cc->gdb_num_core_regs = 32 + 27;
96
+ cc->gdb_num_core_regs = 32 + 25;
97
+ cc->gdb_core_xml_file = "microblaze-core.xml";
98
99
cc->disas_set_info = mb_disas_set_info;
100
cc->tcg_ops = &mb_tcg_ops;
101
diff --git a/target/microblaze/gdbstub.c b/target/microblaze/gdbstub.c
102
index XXXXXXX..XXXXXXX 100644
103
--- a/target/microblaze/gdbstub.c
104
+++ b/target/microblaze/gdbstub.c
105
@@ -XXX,XX +XXX,XX @@ enum {
106
GDB_PVR0 = 32 + 6,
107
GDB_PVR11 = 32 + 17,
108
GDB_EDR = 32 + 18,
109
- GDB_SLR = 32 + 25,
110
- GDB_SHR = 32 + 26,
111
+};
112
+
113
+enum {
114
+ GDB_SP_SHL,
115
+ GDB_SP_SHR,
116
};
117
118
int mb_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
119
@@ -XXX,XX +XXX,XX @@ int mb_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
120
case GDB_EDR:
121
val = env->edr;
122
break;
123
- case GDB_SLR:
124
- val = env->slr;
125
- break;
126
- case GDB_SHR:
127
- val = env->shr;
128
- break;
129
default:
130
/* Other SRegs aren't modeled, so report a value of 0 */
131
val = 0;
132
@@ -XXX,XX +XXX,XX @@ int mb_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
133
return gdb_get_reg32(mem_buf, val);
134
}
135
136
+int mb_cpu_gdb_read_stack_protect(CPUMBState *env, GByteArray *mem_buf, int n)
137
+{
138
+ uint32_t val;
139
+
140
+ switch (n) {
141
+ case GDB_SP_SHL:
142
+ val = env->slr;
143
+ break;
144
+ case GDB_SP_SHR:
145
+ val = env->shr;
146
+ break;
147
+ default:
148
+ return 0;
149
+ }
150
+ return gdb_get_reg32(mem_buf, val);
151
+}
152
+
153
int mb_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
154
{
155
MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
156
@@ -XXX,XX +XXX,XX @@ int mb_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
157
case GDB_EDR:
158
env->edr = tmp;
159
break;
160
- case GDB_SLR:
161
- env->slr = tmp;
162
- break;
163
- case GDB_SHR:
164
- env->shr = tmp;
165
- break;
166
+ }
167
+ return 4;
168
+}
169
+
170
+int mb_cpu_gdb_write_stack_protect(CPUMBState *env, uint8_t *mem_buf, int n)
171
+{
172
+ switch (n) {
173
+ case GDB_SP_SHL:
174
+ env->slr = ldl_p(mem_buf);
175
+ break;
176
+ case GDB_SP_SHR:
177
+ env->shr = ldl_p(mem_buf);
178
+ break;
179
+ default:
180
+ return 0;
181
}
182
return 4;
183
}
184
diff --git a/gdb-xml/microblaze-core.xml b/gdb-xml/microblaze-core.xml
185
new file mode 100644
24
new file mode 100644
186
index XXXXXXX..XXXXXXX
25
index XXXXXXX..XXXXXXX
187
--- /dev/null
26
--- /dev/null
188
+++ b/gdb-xml/microblaze-core.xml
27
+++ b/tests/decode/err_field10.decode
189
@@ -XXX,XX +XXX,XX @@
28
@@ -XXX,XX +XXX,XX @@
190
+<?xml version="1.0"?>
29
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
191
+<!-- Copyright (C) 2008 Free Software Foundation, Inc.
30
+# See the COPYING.LIB file in the top-level directory.
192
+
31
+
193
+ Copying and distribution of this file, with or without modification,
32
+# Diagnose formats which refer to undefined fields
194
+ are permitted in any medium without royalty provided the copyright
33
+%field1 field2:3
195
+ notice and this notice are preserved. -->
34
+@fmt ........ ........ ........ ........ %field1
196
+
35
+insn 00000000 00000000 00000000 00000000 @fmt
197
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
36
diff --git a/tests/decode/err_field7.decode b/tests/decode/err_field7.decode
198
+<feature name="org.gnu.gdb.microblaze.core">
199
+ <reg name="r0" bitsize="32" regnum="0"/>
200
+ <reg name="r1" bitsize="32" type="data_ptr"/>
201
+ <reg name="r2" bitsize="32"/>
202
+ <reg name="r3" bitsize="32"/>
203
+ <reg name="r4" bitsize="32"/>
204
+ <reg name="r5" bitsize="32"/>
205
+ <reg name="r6" bitsize="32"/>
206
+ <reg name="r7" bitsize="32"/>
207
+ <reg name="r8" bitsize="32"/>
208
+ <reg name="r9" bitsize="32"/>
209
+ <reg name="r10" bitsize="32"/>
210
+ <reg name="r11" bitsize="32"/>
211
+ <reg name="r12" bitsize="32"/>
212
+ <reg name="r13" bitsize="32"/>
213
+ <reg name="r14" bitsize="32"/>
214
+ <reg name="r15" bitsize="32"/>
215
+ <reg name="r16" bitsize="32"/>
216
+ <reg name="r17" bitsize="32"/>
217
+ <reg name="r18" bitsize="32"/>
218
+ <reg name="r19" bitsize="32"/>
219
+ <reg name="r20" bitsize="32"/>
220
+ <reg name="r21" bitsize="32"/>
221
+ <reg name="r22" bitsize="32"/>
222
+ <reg name="r23" bitsize="32"/>
223
+ <reg name="r24" bitsize="32"/>
224
+ <reg name="r25" bitsize="32"/>
225
+ <reg name="r26" bitsize="32"/>
226
+ <reg name="r27" bitsize="32"/>
227
+ <reg name="r28" bitsize="32"/>
228
+ <reg name="r29" bitsize="32"/>
229
+ <reg name="r30" bitsize="32"/>
230
+ <reg name="r31" bitsize="32"/>
231
+ <reg name="rpc" bitsize="32" type="code_ptr"/>
232
+ <reg name="rmsr" bitsize="32"/>
233
+ <reg name="rear" bitsize="32"/>
234
+ <reg name="resr" bitsize="32"/>
235
+ <reg name="rfsr" bitsize="32"/>
236
+ <reg name="rbtr" bitsize="32"/>
237
+ <reg name="rpvr0" bitsize="32"/>
238
+ <reg name="rpvr1" bitsize="32"/>
239
+ <reg name="rpvr2" bitsize="32"/>
240
+ <reg name="rpvr3" bitsize="32"/>
241
+ <reg name="rpvr4" bitsize="32"/>
242
+ <reg name="rpvr5" bitsize="32"/>
243
+ <reg name="rpvr6" bitsize="32"/>
244
+ <reg name="rpvr7" bitsize="32"/>
245
+ <reg name="rpvr8" bitsize="32"/>
246
+ <reg name="rpvr9" bitsize="32"/>
247
+ <reg name="rpvr10" bitsize="32"/>
248
+ <reg name="rpvr11" bitsize="32"/>
249
+ <reg name="redr" bitsize="32"/>
250
+ <reg name="rpid" bitsize="32"/>
251
+ <reg name="rzpr" bitsize="32"/>
252
+ <reg name="rtlbx" bitsize="32"/>
253
+ <reg name="rtlbsx" bitsize="32"/>
254
+ <reg name="rtlblo" bitsize="32"/>
255
+ <reg name="rtlbhi" bitsize="32"/>
256
+</feature>
257
diff --git a/gdb-xml/microblaze-stack-protect.xml b/gdb-xml/microblaze-stack-protect.xml
258
new file mode 100644
37
new file mode 100644
259
index XXXXXXX..XXXXXXX
38
index XXXXXXX..XXXXXXX
260
--- /dev/null
39
--- /dev/null
261
+++ b/gdb-xml/microblaze-stack-protect.xml
40
+++ b/tests/decode/err_field7.decode
262
@@ -XXX,XX +XXX,XX @@
41
@@ -XXX,XX +XXX,XX @@
263
+<?xml version="1.0"?>
42
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
264
+<!-- Copyright (C) 2008 Free Software Foundation, Inc.
43
+# See the COPYING.LIB file in the top-level directory.
265
+
44
+
266
+ Copying and distribution of this file, with or without modification,
45
+# Diagnose fields whose definitions form a loop
267
+ are permitted in any medium without royalty provided the copyright
46
+%field1 field2:3
268
+ notice and this notice are preserved. -->
47
+%field2 field1:4
48
+insn 00000000 00000000 00000000 00000000 %field1 %field2
49
diff --git a/tests/decode/err_field8.decode b/tests/decode/err_field8.decode
50
new file mode 100644
51
index XXXXXXX..XXXXXXX
52
--- /dev/null
53
+++ b/tests/decode/err_field8.decode
54
@@ -XXX,XX +XXX,XX @@
55
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
56
+# See the COPYING.LIB file in the top-level directory.
269
+
57
+
270
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
58
+# Diagnose patterns which refer to undefined fields
271
+<feature name="org.gnu.gdb.microblaze.stack-protect">
59
+&f1 f1 a
272
+ <reg name="rslr" bitsize="32"/>
60
+%field1 field2:3
273
+ <reg name="rshr" bitsize="32"/>
61
+@fmt ........ ........ ........ .... a:4 &f1
274
+</feature>
62
+insn 00000000 00000000 00000000 0000 .... @fmt f1=%field1
63
diff --git a/tests/decode/err_field9.decode b/tests/decode/err_field9.decode
64
new file mode 100644
65
index XXXXXXX..XXXXXXX
66
--- /dev/null
67
+++ b/tests/decode/err_field9.decode
68
@@ -XXX,XX +XXX,XX @@
69
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
70
+# See the COPYING.LIB file in the top-level directory.
71
+
72
+# Diagnose fields where the format refers to a field defined in the
73
+# pattern and the pattern refers to a field defined in the format.
74
+# This is theoretically not impossible to implement, but is not
75
+# supported by the script at this time.
76
+&abcd a b c d
77
+%refa a:3
78
+%refc c:4
79
+# Format defines 'c' and sets 'b' to an indirect ref to 'a'
80
+@fmt ........ ........ ........ c:8 &abcd b=%refa
81
+# Pattern defines 'a' and sets 'd' to an indirect ref to 'c'
82
+insn 00000000 00000000 00000000 ........ @fmt d=%refc a=6
83
diff --git a/tests/decode/succ_named_field.decode b/tests/decode/succ_named_field.decode
84
new file mode 100644
85
index XXXXXXX..XXXXXXX
86
--- /dev/null
87
+++ b/tests/decode/succ_named_field.decode
88
@@ -XXX,XX +XXX,XX @@
89
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
90
+# See the COPYING.LIB file in the top-level directory.
91
+
92
+# field using a named_field
93
+%imm_sz    8:8 sz:3
94
+insn 00000000 00000000 ........ 00000000 imm_sz=%imm_sz sz=1
95
+
96
+# Ditto, via a format. Here a field in the format
97
+# references a named field defined in the insn pattern:
98
+&imm_a imm alpha
99
+%foo 0:16 alpha:4
100
+@foo 00000001 ........ ........ ........ &imm_a imm=%foo
101
+i1 ........ 00000000 ........ ........ @foo alpha=1
102
+i2 ........ 00000001 ........ ........ @foo alpha=2
103
+
104
+# Here the named field is defined in the format and referenced
105
+# from the insn pattern:
106
+@bar 00000010 ........ ........ ........ &imm_a alpha=4
107
+i3 ........ 00000000 ........ ........ @bar imm=%foo
108
diff --git a/tests/decode/meson.build b/tests/decode/meson.build
109
index XXXXXXX..XXXXXXX 100644
110
--- a/tests/decode/meson.build
111
+++ b/tests/decode/meson.build
112
@@ -XXX,XX +XXX,XX @@ err_tests = [
113
'err_field4.decode',
114
'err_field5.decode',
115
'err_field6.decode',
116
+ 'err_field7.decode',
117
+ 'err_field8.decode',
118
+ 'err_field9.decode',
119
+ 'err_field10.decode',
120
'err_init1.decode',
121
'err_init2.decode',
122
'err_init3.decode',
123
@@ -XXX,XX +XXX,XX @@ succ_tests = [
124
'succ_argset_type1.decode',
125
'succ_function.decode',
126
'succ_ident1.decode',
127
+ 'succ_named_field.decode',
128
'succ_pattern_group_nest1.decode',
129
'succ_pattern_group_nest2.decode',
130
'succ_pattern_group_nest3.decode',
275
--
131
--
276
2.34.1
132
2.34.1
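
As a closing note on succ_named_field.decode above: in a multifield such
as "%imm_sz 8:8 sz:3" the named subfield is read from the already-assigned
argument field rather than from the instruction, and the subfields are
concatenated with the first-listed one occupying the more significant
bits. A rough sketch of the resulting extraction (invented helper name
and insn value; an illustration under those assumptions, not the literal
code the script generates):

    # imm_sz = (insn bits [15:8]) : (low 3 bits of the named field 'sz')
    def extract_imm_sz(insn, sz):
        return (((insn >> 8) & 0xff) << 3) | (sz & 0x7)

    # For "insn ... imm_sz=%imm_sz sz=1", sz is assigned first (the
    # topological sort guarantees this), then imm_sz is built from it:
    print(hex(extract_imm_sz(0x0000ab00, 1)))          # 0x559

The succ_named_field.decode patterns only use the unsigned form; a signed
named field ("name:sN") sign-extends the referenced value in the same way
a signed bit extract does.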