The following changes since commit 14556211bc6d7125a44d5b5df90caba019b0ec0e:

  Merge tag 'qemu-macppc-20240918' of https://github.com/mcayland/qemu into staging (2024-09-18 20:59:10 +0100)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20240919

for you to fetch changes up to 89b30b4921e51bb47313d2d8fdc3d7bce987e4c5:

  docs/devel: Remove nested-papr.txt (2024-09-19 13:33:15 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: Correct ID_AA64ISAR1_EL1 value for neoverse-v1
 * target/arm: More conversions to decodetree of A64 SIMD insns
 * hw/char/stm32l4x5_usart.c: Enable USART ACK bit response
 * tests: update aarch64/sbsa-ref tests
 * kvm: minor Coverity nit fixes
 * docs/devel: Remove nested-papr.txt

----------------------------------------------------------------
Jacob Abrams (1):
      hw/char/stm32l4x5_usart.c: Enable USART ACK bit response

Marcin Juszkiewicz (4):
      tests: use default cpu for aarch64/sbsa-ref
      tests: add FreeBSD tests for aarch64/sbsa-ref
      tests: expand timeout information for aarch64/sbsa-ref
      tests: drop OpenBSD tests for aarch64/sbsa-ref

Peter Maydell (4):
      kvm: Make 'mmap_size' be 'int' in kvm_init_vcpu(), do_kvm_destroy_vcpu()
      kvm: Remove unreachable code in kvm_dirty_ring_reaper_thread()
      target/arm: Correct ID_AA64ISAR1_EL1 value for neoverse-v1
      docs/devel: Remove nested-papr.txt

Richard Henderson (29):
      target/arm: Replace tcg_gen_dupi_vec with constants in gengvec.c
      target/arm: Replace tcg_gen_dupi_vec with constants in translate-sve.c
      target/arm: Use cmpsel in gen_ushl_vec
      target/arm: Use cmpsel in gen_sshl_vec
      target/arm: Use tcg_gen_extract2_i64 for EXT
      target/arm: Convert EXT to decodetree
      target/arm: Convert TBL, TBX to decodetree
      target/arm: Convert UZP, TRN, ZIP to decodetree
      target/arm: Simplify do_reduction_op
      target/arm: Convert ADDV, *ADDLV, *MAXV, *MINV to decodetree
      target/arm: Convert FMAXNMV, FMINNMV, FMAXV, FMINV to decodetree
      target/arm: Convert FMOVI (scalar, immediate) to decodetree
      target/arm: Convert MOVI, FMOV, ORR, BIC (vector immediate) to decodetree
      target/arm: Introduce gen_gvec_sshr, gen_gvec_ushr
      target/arm: Fix whitespace near gen_srshr64_i64
      target/arm: Convert handle_vec_simd_shri to decodetree
      target/arm: Convert handle_vec_simd_shli to decodetree
      target/arm: Use {, s}extract in handle_vec_simd_wshli
      target/arm: Convert SSHLL, USHLL to decodetree
      target/arm: Push tcg_rnd into handle_shri_with_rndacc
      target/arm: Split out subroutines of handle_shri_with_rndacc
      target/arm: Convert SHRN, RSHRN to decodetree
      target/arm: Convert handle_scalar_simd_shri to decodetree
      target/arm: Convert handle_scalar_simd_shli to decodetree
      target/arm: Convert VQSHL, VQSHLU to gvec
      target/arm: Widen NeonGenNarrowEnvFn return to 64 bits
      target/arm: Convert SQSHL, UQSHL, SQSHLU (immediate) to decodetree
      target/arm: Convert vector [US]QSHRN, [US]QRSHRN, SQSHRUN to decodetree
      target/arm: Convert scalar [US]QSHRN, [US]QRSHRN, SQSHRUN to decodetree

 docs/devel/nested-papr.txt               |  119 --
 target/arm/helper.h                      |   34 +-
 target/arm/tcg/translate.h               |   14 +-
 target/arm/tcg/a64.decode                |  257 ++++
 target/arm/tcg/neon-dp.decode            |    6 +-
 accel/kvm/kvm-all.c                      |   10 +-
 hw/char/stm32l4x5_usart.c                |   16 +
 target/arm/tcg/cpu64.c                   |    2 +-
 target/arm/tcg/gengvec.c                 |  121 +-
 target/arm/tcg/neon_helper.c             |   76 +-
 target/arm/tcg/translate-a64.c           | 2081 +++++++++++++-----------------
 target/arm/tcg/translate-neon.c          |  179 +--
 target/arm/tcg/translate-sve.c           |  128 +-
 tests/qtest/stm32l4x5_usart-test.c       |   36 +-
 tests/functional/test_aarch64_sbsaref.py |   58 +-
 15 files changed, 1479 insertions(+), 1658 deletions(-)
 delete mode 100644 docs/devel/nested-papr.txt


Hi; this is one last arm pullreq before the end of the year.
Mostly minor cleanups, and also implementation of the
FEAT_XS architectural feature.

thanks
-- PMM

The following changes since commit 8032c78e556cd0baec111740a6c636863f9bd7c8:

  Merge tag 'firmware-20241216-pull-request' of https://gitlab.com/kraxel/qemu into staging (2024-12-16 14:20:33 -0500)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20241217

for you to fetch changes up to e91254250acb8570bd7b8a8f89d30e6d18291d02:

  tests/functional: update sbsa-ref firmware used in test (2024-12-17 15:21:06 +0000)

----------------------------------------------------------------
target-arm queue:
 * remove a line of redundant code
 * convert various TCG helper fns to use 'fpst' alias
 * Use float_status in helper_fcvtx_f64_to_f32
 * Use float_status in helper_vfp_fcvt{ds,sd}
 * Implement FEAT_XS
 * hw/intc/arm_gicv3_its: Zero initialize local DTEntry etc structs
 * tests/functional: update sbsa-ref firmware used in test

----------------------------------------------------------------
Denis Rastyogin (1):
      target/arm: remove redundant code

Manos Pitsidianakis (3):
      target/arm: Add decodetree entry for DSB nXS variant
      target/arm: Enable FEAT_XS for the max cpu
      tests/tcg/aarch64: add system test for FEAT_XS

Marcin Juszkiewicz (1):
      tests/functional: update sbsa-ref firmware used in test

Peter Maydell (4):
      target/arm: Implement fine-grained-trap handling for FEAT_XS
      target/arm: Add ARM_CP_ADD_TLBI_NXS type flag for NXS insns
      target/arm: Add ARM_CP_ADD_TLBI_NXS type flag to TLBI insns
      hw/intc/arm_gicv3_its: Zero initialize local DTEntry etc structs

Richard Henderson (10):
      target/arm: Convert vfp_helper.c to fpst alias
      target/arm: Convert helper-a64.c to fpst alias
      target/arm: Convert vec_helper.c to fpst alias
      target/arm: Convert neon_helper.c to fpst alias
      target/arm: Convert sve_helper.c to fpst alias
      target/arm: Convert sme_helper.c to fpst alias
      target/arm: Convert vec_helper.c to use env alias
      target/arm: Convert neon_helper.c to use env alias
      target/arm: Use float_status in helper_fcvtx_f64_to_f32
      target/arm: Use float_status in helper_vfp_fcvt{ds,sd}

 docs/system/arm/emulation.rst            |    1 +
 target/arm/cpregs.h                      |   80 ++--
 target/arm/cpu-features.h                |    5 +
 target/arm/helper.h                      |  638 +++++++++++++++----------------
 target/arm/tcg/helper-a64.h              |  116 +++---
 target/arm/tcg/helper-sme.h              |    4 +-
 target/arm/tcg/helper-sve.h              |  426 ++++++++++-----------
 target/arm/tcg/a64.decode                |    3 +
 hw/intc/arm_gicv3_its.c                  |   44 +--
 target/arm/helper.c                      |   30 +-
 target/arm/tcg/cpu64.c                   |    1 +
 target/arm/tcg/helper-a64.c              |  101 ++---
 target/arm/tcg/neon_helper.c             |   27 +-
 target/arm/tcg/op_helper.c               |   11 +-
 target/arm/tcg/sme_helper.c              |    8 +-
 target/arm/tcg/sve_helper.c              |   96 ++---
 target/arm/tcg/tlb-insns.c               |  202 ++++----
 target/arm/tcg/translate-a64.c           |   26 +-
 target/arm/tcg/translate-vfp.c           |    4 +-
 target/arm/tcg/vec_helper.c              |   81 ++--
 target/arm/vfp_helper.c                  |  130 +++----
 tests/tcg/aarch64/system/feat-xs.c       |   27 ++
 tests/functional/test_aarch64_sbsaref.py |   20 +-
 23 files changed, 1083 insertions(+), 998 deletions(-)
 create mode 100644 tests/tcg/aarch64/system/feat-xs.c
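
For readers skimming the queue: the 'fpst' alias mentioned above is the helper-argument
type that lets the generated prototypes take a float_status * directly instead of an
untyped pointer. As a rough illustration (the DEF_HELPER line is quoted from the
helper.h hunks later in this series; the helper body is the VFP_BINOP macro from
target/arm/vfp_helper.c expanded by hand for the float32 'add' case, so treat it as a
sketch rather than literal source), the conversion turns

    DEF_HELPER_3(vfp_adds, f32, f32, f32, ptr)

    float32 helper_vfp_adds(float32 a, float32 b, void *fpstp)
    {
        float_status *fpst = fpstp;
        return float32_add(a, b, fpst);
    }

into

    DEF_HELPER_3(vfp_adds, f32, f32, f32, fpst)

    float32 helper_vfp_adds(float32 a, float32 b, float_status *fpst)
    {
        return float32_add(a, b, fpst);
    }

which is why the diffstat above is dominated by large but purely mechanical changes to
the helper headers.
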
From: Richard Henderson <richard.henderson@linaro.org>

Instead of copying a constant into a temporary with dupi,
use a vector constant directly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/gengvec.c | 43 ++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 24 deletions(-)

diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
     TCGv_vec t = tcg_temp_new_vec_matching(d);
-    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_constant_vec_matching(d, vece, 1);
 
     tcg_gen_shri_vec(vece, t, a, sh - 1);
-    tcg_gen_dupi_vec(vece, ones, 1);
     tcg_gen_and_vec(vece, t, t, ones);
     tcg_gen_sari_vec(vece, d, a, sh);
     tcg_gen_add_vec(vece, d, d, t);
@@ -XXX,XX +XXX,XX @@ void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
 {
     TCGv_vec t = tcg_temp_new_vec_matching(d);
-    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_constant_vec_matching(d, vece, 1);
 
     tcg_gen_shri_vec(vece, t, a, shift - 1);
-    tcg_gen_dupi_vec(vece, ones, 1);
     tcg_gen_and_vec(vece, t, t, ones);
     tcg_gen_shri_vec(vece, d, a, shift);
     tcg_gen_add_vec(vece, d, d, t);
@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
     TCGv_vec t = tcg_temp_new_vec_matching(d);
-    TCGv_vec m = tcg_temp_new_vec_matching(d);
+    int64_t mi = MAKE_64BIT_MASK((8 << vece) - sh, sh);
+    TCGv_vec m = tcg_constant_vec_matching(d, vece, mi);
 
-    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
     tcg_gen_shri_vec(vece, t, a, sh);
     tcg_gen_and_vec(vece, d, d, m);
     tcg_gen_or_vec(vece, d, d, t);
@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
     TCGv_vec t = tcg_temp_new_vec_matching(d);
-    TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_constant_vec_matching(d, vece, MAKE_64BIT_MASK(0, sh));
 
     tcg_gen_shli_vec(vece, t, a, sh);
-    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
     tcg_gen_and_vec(vece, d, d, m);
     tcg_gen_or_vec(vece, d, d, t);
 }
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     TCGv_vec rval = tcg_temp_new_vec_matching(dst);
     TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
     TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
-    TCGv_vec msk, max;
+    TCGv_vec max;
 
     tcg_gen_neg_vec(vece, rsh, shift);
     if (vece == MO_8) {
         tcg_gen_mov_vec(lsh, shift);
     } else {
-        msk = tcg_temp_new_vec_matching(dst);
-        tcg_gen_dupi_vec(vece, msk, 0xff);
+        TCGv_vec msk = tcg_constant_vec_matching(dst, vece, 0xff);
         tcg_gen_and_vec(vece, lsh, shift, msk);
         tcg_gen_and_vec(vece, rsh, rsh, msk);
     }
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     tcg_gen_shlv_vec(vece, lval, src, lsh);
     tcg_gen_shrv_vec(vece, rval, src, rsh);
 
-    max = tcg_temp_new_vec_matching(dst);
-    tcg_gen_dupi_vec(vece, max, 8 << vece);
-
     /*
      * The choice of LT (signed) and GEU (unsigned) are biased toward
      * the instructions of the x86_64 host. For MO_8, the whole byte
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
      * have already masked to a byte and so a signed compare works.
      * Other tcg hosts have a full set of comparisons and do not care.
      */
+    max = tcg_constant_vec_matching(dst, vece, 8 << vece);
     if (vece == MO_8) {
         tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
         tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
     TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
     TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
+    TCGv_vec max, zero;
 
     /*
      * Rely on the TCG guarantee that out of range shifts produce
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     if (vece == MO_8) {
         tcg_gen_mov_vec(lsh, shift);
     } else {
-        tcg_gen_dupi_vec(vece, tmp, 0xff);
-        tcg_gen_and_vec(vece, lsh, shift, tmp);
-        tcg_gen_and_vec(vece, rsh, rsh, tmp);
+        TCGv_vec msk = tcg_constant_vec_matching(dst, vece, 0xff);
+        tcg_gen_and_vec(vece, lsh, shift, msk);
+        tcg_gen_and_vec(vece, rsh, rsh, msk);
     }
 
     /* Bound rsh so out of bound right shift gets -1. */
-    tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
-    tcg_gen_umin_vec(vece, rsh, rsh, tmp);
-    tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
+    max = tcg_constant_vec_matching(dst, vece, (8 << vece) - 1);
+    tcg_gen_umin_vec(vece, rsh, rsh, max);
+    tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, max);
 
     tcg_gen_shlv_vec(vece, lval, src, lsh);
     tcg_gen_sarv_vec(vece, rval, src, rsh);
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     tcg_gen_andc_vec(vece, lval, lval, tmp);
 
     /* Select between left and right shift. */
+    zero = tcg_constant_vec_matching(dst, vece, 0);
     if (vece == MO_8) {
-        tcg_gen_dupi_vec(vece, tmp, 0);
-        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
+        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, zero, rval, lval);
     } else {
-        tcg_gen_dupi_vec(vece, tmp, 0x80);
-        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
+        TCGv_vec sgn = tcg_constant_vec_matching(dst, vece, 0x80);
+        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, sgn, lval, rval);
     }
 }
-- 
2.34.1
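
In condensed form (taken from the gen_srshr_vec() hunk above), the transformation this
patch applies throughout gengvec.c replaces a scratch vector that was filled with
tcg_gen_dupi_vec() by a constant vector created up front:

    /* before: allocate a temp, then dup the immediate into it */
    TCGv_vec ones = tcg_temp_new_vec_matching(d);
    tcg_gen_dupi_vec(vece, ones, 1);
    tcg_gen_and_vec(vece, t, t, ones);

    /* after: the constant needs no separate initialisation op */
    TCGv_vec ones = tcg_constant_vec_matching(d, vece, 1);
    tcg_gen_and_vec(vece, t, t, ones);

The constant value is read-only, so it cannot be clobbered by later ops, and the
explicit dupi disappears from the generated opcode stream.
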
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

OpenBSD 7.3 we use is EoL. Both 7.4 and 7.5 releases do not work on
anything above Neoverse-N1 due to PAC emulation:

https://marc.info/?l=openbsd-arm&m=171050428327850&w=2

OpenBSD 7.6 is not yet released.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240910-b4-move-to-freebsd-v5-4-0fb66d803c93@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/functional/test_aarch64_sbsaref.py | 44 ------------------------
 1 file changed, 44 deletions(-)

diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
index XXXXXXX..XXXXXXX 100755
--- a/tests/functional/test_aarch64_sbsaref.py
+++ b/tests/functional/test_aarch64_sbsaref.py
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_alpine_linux_max(self):
         self.boot_alpine_linux("max")
 
 
-    ASSET_OPENBSD_ISO = Asset(
-        ('https://cdn.openbsd.org/pub/OpenBSD/7.3/arm64/miniroot73.img'),
-        '7fc2c75401d6f01fbfa25f4953f72ad7d7c18650056d30755c44b9c129b707e5')
-
-    # This tests the whole boot chain from EFI to Userspace
-    # We only boot a whole OS for the current top level CPU and GIC
-    # Other test profiles should use more minimal boots
-    def boot_openbsd73(self, cpu=None):
-        self.fetch_firmware()
-
-        img_path = self.ASSET_OPENBSD_ISO.fetch()
-
-        self.vm.set_console()
-        self.vm.add_args(
-            "-drive", f"file={img_path},format=raw,snapshot=on",
-        )
-        if cpu:
-            self.vm.add_args("-cpu", cpu)
-
-        self.vm.launch()
-        wait_for_console_pattern(self,
-                                 "Welcome to the OpenBSD/arm64"
-                                 " 7.3 installation program.")
-
-    def test_sbsaref_openbsd73_cortex_a57(self):
-        self.boot_openbsd73("cortex-a57")
-
-    def test_sbsaref_openbsd73_default_cpu(self):
-        self.boot_openbsd73()
-
-    def test_sbsaref_openbsd73_max_pauth_off(self):
-        self.boot_openbsd73("max,pauth=off")
-
-    @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
-                'Test might timeout due to PAuth emulation')
-    def test_sbsaref_openbsd73_max_pauth_impdef(self):
-        self.boot_openbsd73("max,pauth-impdef=on")
-
-    @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
-                'Test might timeout due to PAuth emulation')
-    def test_sbsaref_openbsd73_max(self):
-        self.boot_openbsd73("max")
-
-
     ASSET_FREEBSD_ISO = Asset(
         ('https://download.freebsd.org/releases/arm64/aarch64/ISO-IMAGES/'
          '14.1/FreeBSD-14.1-RELEASE-arm64-aarch64-bootonly.iso'),
-- 
2.34.1

From: Denis Rastyogin <gerben@altlinux.org>

This call is redundant as it only retrieves a value that is not used further.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Denis Rastyogin <gerben@altlinux.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241212120618.518369-1-gerben@altlinux.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/vfp_helper.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rintd)(float64 x, void *fp_status)
 
     ret = float64_round_to_int(x, fp_status);
 
-    new_flags = get_float_exception_flags(fp_status);
-
     /* Suppress any inexact exceptions the conversion produced */
     if (!(old_flags & float_flag_inexact)) {
         new_flags = get_float_exception_flags(fp_status);
-- 
2.34.1
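
Pieced together from the context lines of the hunk above, HELPER(rintd) after this
change reads roughly as follows. The old_flags/new_flags declaration comes from context
quoted elsewhere in this series, and the final set_float_exception_flags() call sits
outside the quoted hunk, so it is an assumption about the surrounding code rather than
a quote of it:

    float64 HELPER(rintd)(float64 x, void *fp_status)
    {
        int old_flags = get_float_exception_flags(fp_status), new_flags;
        float64 ret;

        ret = float64_round_to_int(x, fp_status);

        /* Suppress any inexact exceptions the conversion produced */
        if (!(old_flags & float_flag_inexact)) {
            new_flags = get_float_exception_flags(fp_status);
            /* assumed: drop the inexact flag and write the flags back */
            set_float_exception_flags(new_flags & ~float_flag_inexact, fp_status);
        }

        return ret;
    }

The deleted line fetched the exception flags into new_flags before that value could
ever be used, which is the redundancy the commit message describes.
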
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
We always pass the same value for round; compute it
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
within common code.
5
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Message-id: 20241206031224.78525-3-richard.henderson@linaro.org
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20240912024114.1097832-21-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
7
---
12
target/arm/tcg/translate-a64.c | 32 ++++++--------------------------
8
target/arm/helper.h | 268 ++++++++++++++++++++--------------------
13
1 file changed, 6 insertions(+), 26 deletions(-)
9
target/arm/vfp_helper.c | 120 ++++++++----------
10
2 files changed, 186 insertions(+), 202 deletions(-)
14
11
15
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
12
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/tcg/translate-a64.c
14
--- a/target/arm/helper.h
18
+++ b/target/arm/tcg/translate-a64.c
15
+++ b/target/arm/helper.h
19
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(probe_access, TCG_CALL_NO_WG, void, env, tl, i32, i32, i32)
20
* the vector and scalar code.
17
DEF_HELPER_1(vfp_get_fpscr, i32, env)
18
DEF_HELPER_2(vfp_set_fpscr, void, env, i32)
19
20
-DEF_HELPER_3(vfp_addh, f16, f16, f16, ptr)
21
-DEF_HELPER_3(vfp_adds, f32, f32, f32, ptr)
22
-DEF_HELPER_3(vfp_addd, f64, f64, f64, ptr)
23
-DEF_HELPER_3(vfp_subh, f16, f16, f16, ptr)
24
-DEF_HELPER_3(vfp_subs, f32, f32, f32, ptr)
25
-DEF_HELPER_3(vfp_subd, f64, f64, f64, ptr)
26
-DEF_HELPER_3(vfp_mulh, f16, f16, f16, ptr)
27
-DEF_HELPER_3(vfp_muls, f32, f32, f32, ptr)
28
-DEF_HELPER_3(vfp_muld, f64, f64, f64, ptr)
29
-DEF_HELPER_3(vfp_divh, f16, f16, f16, ptr)
30
-DEF_HELPER_3(vfp_divs, f32, f32, f32, ptr)
31
-DEF_HELPER_3(vfp_divd, f64, f64, f64, ptr)
32
-DEF_HELPER_3(vfp_maxh, f16, f16, f16, ptr)
33
-DEF_HELPER_3(vfp_maxs, f32, f32, f32, ptr)
34
-DEF_HELPER_3(vfp_maxd, f64, f64, f64, ptr)
35
-DEF_HELPER_3(vfp_minh, f16, f16, f16, ptr)
36
-DEF_HELPER_3(vfp_mins, f32, f32, f32, ptr)
37
-DEF_HELPER_3(vfp_mind, f64, f64, f64, ptr)
38
-DEF_HELPER_3(vfp_maxnumh, f16, f16, f16, ptr)
39
-DEF_HELPER_3(vfp_maxnums, f32, f32, f32, ptr)
40
-DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, ptr)
41
-DEF_HELPER_3(vfp_minnumh, f16, f16, f16, ptr)
42
-DEF_HELPER_3(vfp_minnums, f32, f32, f32, ptr)
43
-DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr)
44
-DEF_HELPER_2(vfp_sqrth, f16, f16, ptr)
45
-DEF_HELPER_2(vfp_sqrts, f32, f32, ptr)
46
-DEF_HELPER_2(vfp_sqrtd, f64, f64, ptr)
47
+DEF_HELPER_3(vfp_addh, f16, f16, f16, fpst)
48
+DEF_HELPER_3(vfp_adds, f32, f32, f32, fpst)
49
+DEF_HELPER_3(vfp_addd, f64, f64, f64, fpst)
50
+DEF_HELPER_3(vfp_subh, f16, f16, f16, fpst)
51
+DEF_HELPER_3(vfp_subs, f32, f32, f32, fpst)
52
+DEF_HELPER_3(vfp_subd, f64, f64, f64, fpst)
53
+DEF_HELPER_3(vfp_mulh, f16, f16, f16, fpst)
54
+DEF_HELPER_3(vfp_muls, f32, f32, f32, fpst)
55
+DEF_HELPER_3(vfp_muld, f64, f64, f64, fpst)
56
+DEF_HELPER_3(vfp_divh, f16, f16, f16, fpst)
57
+DEF_HELPER_3(vfp_divs, f32, f32, f32, fpst)
58
+DEF_HELPER_3(vfp_divd, f64, f64, f64, fpst)
59
+DEF_HELPER_3(vfp_maxh, f16, f16, f16, fpst)
60
+DEF_HELPER_3(vfp_maxs, f32, f32, f32, fpst)
61
+DEF_HELPER_3(vfp_maxd, f64, f64, f64, fpst)
62
+DEF_HELPER_3(vfp_minh, f16, f16, f16, fpst)
63
+DEF_HELPER_3(vfp_mins, f32, f32, f32, fpst)
64
+DEF_HELPER_3(vfp_mind, f64, f64, f64, fpst)
65
+DEF_HELPER_3(vfp_maxnumh, f16, f16, f16, fpst)
66
+DEF_HELPER_3(vfp_maxnums, f32, f32, f32, fpst)
67
+DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, fpst)
68
+DEF_HELPER_3(vfp_minnumh, f16, f16, f16, fpst)
69
+DEF_HELPER_3(vfp_minnums, f32, f32, f32, fpst)
70
+DEF_HELPER_3(vfp_minnumd, f64, f64, f64, fpst)
71
+DEF_HELPER_2(vfp_sqrth, f16, f16, fpst)
72
+DEF_HELPER_2(vfp_sqrts, f32, f32, fpst)
73
+DEF_HELPER_2(vfp_sqrtd, f64, f64, fpst)
74
DEF_HELPER_3(vfp_cmph, void, f16, f16, env)
75
DEF_HELPER_3(vfp_cmps, void, f32, f32, env)
76
DEF_HELPER_3(vfp_cmpd, void, f64, f64, env)
77
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
78
79
DEF_HELPER_2(vfp_fcvtds, f64, f32, env)
80
DEF_HELPER_2(vfp_fcvtsd, f32, f64, env)
81
-DEF_HELPER_FLAGS_2(bfcvt, TCG_CALL_NO_RWG, i32, f32, ptr)
82
-DEF_HELPER_FLAGS_2(bfcvt_pair, TCG_CALL_NO_RWG, i32, i64, ptr)
83
+DEF_HELPER_FLAGS_2(bfcvt, TCG_CALL_NO_RWG, i32, f32, fpst)
84
+DEF_HELPER_FLAGS_2(bfcvt_pair, TCG_CALL_NO_RWG, i32, i64, fpst)
85
86
-DEF_HELPER_2(vfp_uitoh, f16, i32, ptr)
87
-DEF_HELPER_2(vfp_uitos, f32, i32, ptr)
88
-DEF_HELPER_2(vfp_uitod, f64, i32, ptr)
89
-DEF_HELPER_2(vfp_sitoh, f16, i32, ptr)
90
-DEF_HELPER_2(vfp_sitos, f32, i32, ptr)
91
-DEF_HELPER_2(vfp_sitod, f64, i32, ptr)
92
+DEF_HELPER_2(vfp_uitoh, f16, i32, fpst)
93
+DEF_HELPER_2(vfp_uitos, f32, i32, fpst)
94
+DEF_HELPER_2(vfp_uitod, f64, i32, fpst)
95
+DEF_HELPER_2(vfp_sitoh, f16, i32, fpst)
96
+DEF_HELPER_2(vfp_sitos, f32, i32, fpst)
97
+DEF_HELPER_2(vfp_sitod, f64, i32, fpst)
98
99
-DEF_HELPER_2(vfp_touih, i32, f16, ptr)
100
-DEF_HELPER_2(vfp_touis, i32, f32, ptr)
101
-DEF_HELPER_2(vfp_touid, i32, f64, ptr)
102
-DEF_HELPER_2(vfp_touizh, i32, f16, ptr)
103
-DEF_HELPER_2(vfp_touizs, i32, f32, ptr)
104
-DEF_HELPER_2(vfp_touizd, i32, f64, ptr)
105
-DEF_HELPER_2(vfp_tosih, s32, f16, ptr)
106
-DEF_HELPER_2(vfp_tosis, s32, f32, ptr)
107
-DEF_HELPER_2(vfp_tosid, s32, f64, ptr)
108
-DEF_HELPER_2(vfp_tosizh, s32, f16, ptr)
109
-DEF_HELPER_2(vfp_tosizs, s32, f32, ptr)
110
-DEF_HELPER_2(vfp_tosizd, s32, f64, ptr)
111
+DEF_HELPER_2(vfp_touih, i32, f16, fpst)
112
+DEF_HELPER_2(vfp_touis, i32, f32, fpst)
113
+DEF_HELPER_2(vfp_touid, i32, f64, fpst)
114
+DEF_HELPER_2(vfp_touizh, i32, f16, fpst)
115
+DEF_HELPER_2(vfp_touizs, i32, f32, fpst)
116
+DEF_HELPER_2(vfp_touizd, i32, f64, fpst)
117
+DEF_HELPER_2(vfp_tosih, s32, f16, fpst)
118
+DEF_HELPER_2(vfp_tosis, s32, f32, fpst)
119
+DEF_HELPER_2(vfp_tosid, s32, f64, fpst)
120
+DEF_HELPER_2(vfp_tosizh, s32, f16, fpst)
121
+DEF_HELPER_2(vfp_tosizs, s32, f32, fpst)
122
+DEF_HELPER_2(vfp_tosizd, s32, f64, fpst)
123
124
-DEF_HELPER_3(vfp_toshh_round_to_zero, i32, f16, i32, ptr)
125
-DEF_HELPER_3(vfp_toslh_round_to_zero, i32, f16, i32, ptr)
126
-DEF_HELPER_3(vfp_touhh_round_to_zero, i32, f16, i32, ptr)
127
-DEF_HELPER_3(vfp_toulh_round_to_zero, i32, f16, i32, ptr)
128
-DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr)
129
-DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr)
130
-DEF_HELPER_3(vfp_touhs_round_to_zero, i32, f32, i32, ptr)
131
-DEF_HELPER_3(vfp_touls_round_to_zero, i32, f32, i32, ptr)
132
-DEF_HELPER_3(vfp_toshd_round_to_zero, i64, f64, i32, ptr)
133
-DEF_HELPER_3(vfp_tosld_round_to_zero, i64, f64, i32, ptr)
134
-DEF_HELPER_3(vfp_tosqd_round_to_zero, i64, f64, i32, ptr)
135
-DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
136
-DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
137
-DEF_HELPER_3(vfp_touqd_round_to_zero, i64, f64, i32, ptr)
138
-DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
139
-DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
140
-DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
141
-DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
142
-DEF_HELPER_3(vfp_touqh, i64, f16, i32, ptr)
143
-DEF_HELPER_3(vfp_tosqh, i64, f16, i32, ptr)
144
-DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
145
-DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
146
-DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
147
-DEF_HELPER_3(vfp_touhs, i32, f32, i32, ptr)
148
-DEF_HELPER_3(vfp_touls, i32, f32, i32, ptr)
149
-DEF_HELPER_3(vfp_touqs, i64, f32, i32, ptr)
150
-DEF_HELPER_3(vfp_toshd, i64, f64, i32, ptr)
151
-DEF_HELPER_3(vfp_tosld, i64, f64, i32, ptr)
152
-DEF_HELPER_3(vfp_tosqd, i64, f64, i32, ptr)
153
-DEF_HELPER_3(vfp_touhd, i64, f64, i32, ptr)
154
-DEF_HELPER_3(vfp_tould, i64, f64, i32, ptr)
155
-DEF_HELPER_3(vfp_touqd, i64, f64, i32, ptr)
156
-DEF_HELPER_3(vfp_shtos, f32, i32, i32, ptr)
157
-DEF_HELPER_3(vfp_sltos, f32, i32, i32, ptr)
158
-DEF_HELPER_3(vfp_sqtos, f32, i64, i32, ptr)
159
-DEF_HELPER_3(vfp_uhtos, f32, i32, i32, ptr)
160
-DEF_HELPER_3(vfp_ultos, f32, i32, i32, ptr)
161
-DEF_HELPER_3(vfp_uqtos, f32, i64, i32, ptr)
162
-DEF_HELPER_3(vfp_shtod, f64, i64, i32, ptr)
163
-DEF_HELPER_3(vfp_sltod, f64, i64, i32, ptr)
164
-DEF_HELPER_3(vfp_sqtod, f64, i64, i32, ptr)
165
-DEF_HELPER_3(vfp_uhtod, f64, i64, i32, ptr)
166
-DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
167
-DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
168
-DEF_HELPER_3(vfp_shtoh, f16, i32, i32, ptr)
169
-DEF_HELPER_3(vfp_uhtoh, f16, i32, i32, ptr)
170
-DEF_HELPER_3(vfp_sltoh, f16, i32, i32, ptr)
171
-DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
172
-DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
173
-DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
174
+DEF_HELPER_3(vfp_toshh_round_to_zero, i32, f16, i32, fpst)
175
+DEF_HELPER_3(vfp_toslh_round_to_zero, i32, f16, i32, fpst)
176
+DEF_HELPER_3(vfp_touhh_round_to_zero, i32, f16, i32, fpst)
177
+DEF_HELPER_3(vfp_toulh_round_to_zero, i32, f16, i32, fpst)
178
+DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, fpst)
179
+DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, fpst)
180
+DEF_HELPER_3(vfp_touhs_round_to_zero, i32, f32, i32, fpst)
181
+DEF_HELPER_3(vfp_touls_round_to_zero, i32, f32, i32, fpst)
182
+DEF_HELPER_3(vfp_toshd_round_to_zero, i64, f64, i32, fpst)
183
+DEF_HELPER_3(vfp_tosld_round_to_zero, i64, f64, i32, fpst)
184
+DEF_HELPER_3(vfp_tosqd_round_to_zero, i64, f64, i32, fpst)
185
+DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, fpst)
186
+DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, fpst)
187
+DEF_HELPER_3(vfp_touqd_round_to_zero, i64, f64, i32, fpst)
188
+DEF_HELPER_3(vfp_touhh, i32, f16, i32, fpst)
189
+DEF_HELPER_3(vfp_toshh, i32, f16, i32, fpst)
190
+DEF_HELPER_3(vfp_toulh, i32, f16, i32, fpst)
191
+DEF_HELPER_3(vfp_toslh, i32, f16, i32, fpst)
192
+DEF_HELPER_3(vfp_touqh, i64, f16, i32, fpst)
193
+DEF_HELPER_3(vfp_tosqh, i64, f16, i32, fpst)
194
+DEF_HELPER_3(vfp_toshs, i32, f32, i32, fpst)
195
+DEF_HELPER_3(vfp_tosls, i32, f32, i32, fpst)
196
+DEF_HELPER_3(vfp_tosqs, i64, f32, i32, fpst)
197
+DEF_HELPER_3(vfp_touhs, i32, f32, i32, fpst)
198
+DEF_HELPER_3(vfp_touls, i32, f32, i32, fpst)
199
+DEF_HELPER_3(vfp_touqs, i64, f32, i32, fpst)
200
+DEF_HELPER_3(vfp_toshd, i64, f64, i32, fpst)
201
+DEF_HELPER_3(vfp_tosld, i64, f64, i32, fpst)
202
+DEF_HELPER_3(vfp_tosqd, i64, f64, i32, fpst)
203
+DEF_HELPER_3(vfp_touhd, i64, f64, i32, fpst)
204
+DEF_HELPER_3(vfp_tould, i64, f64, i32, fpst)
205
+DEF_HELPER_3(vfp_touqd, i64, f64, i32, fpst)
206
+DEF_HELPER_3(vfp_shtos, f32, i32, i32, fpst)
207
+DEF_HELPER_3(vfp_sltos, f32, i32, i32, fpst)
208
+DEF_HELPER_3(vfp_sqtos, f32, i64, i32, fpst)
209
+DEF_HELPER_3(vfp_uhtos, f32, i32, i32, fpst)
210
+DEF_HELPER_3(vfp_ultos, f32, i32, i32, fpst)
211
+DEF_HELPER_3(vfp_uqtos, f32, i64, i32, fpst)
212
+DEF_HELPER_3(vfp_shtod, f64, i64, i32, fpst)
213
+DEF_HELPER_3(vfp_sltod, f64, i64, i32, fpst)
214
+DEF_HELPER_3(vfp_sqtod, f64, i64, i32, fpst)
215
+DEF_HELPER_3(vfp_uhtod, f64, i64, i32, fpst)
216
+DEF_HELPER_3(vfp_ultod, f64, i64, i32, fpst)
217
+DEF_HELPER_3(vfp_uqtod, f64, i64, i32, fpst)
218
+DEF_HELPER_3(vfp_shtoh, f16, i32, i32, fpst)
219
+DEF_HELPER_3(vfp_uhtoh, f16, i32, i32, fpst)
220
+DEF_HELPER_3(vfp_sltoh, f16, i32, i32, fpst)
221
+DEF_HELPER_3(vfp_ultoh, f16, i32, i32, fpst)
222
+DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, fpst)
223
+DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, fpst)
224
225
-DEF_HELPER_3(vfp_shtos_round_to_nearest, f32, i32, i32, ptr)
226
-DEF_HELPER_3(vfp_sltos_round_to_nearest, f32, i32, i32, ptr)
227
-DEF_HELPER_3(vfp_uhtos_round_to_nearest, f32, i32, i32, ptr)
228
-DEF_HELPER_3(vfp_ultos_round_to_nearest, f32, i32, i32, ptr)
229
-DEF_HELPER_3(vfp_shtod_round_to_nearest, f64, i64, i32, ptr)
230
-DEF_HELPER_3(vfp_sltod_round_to_nearest, f64, i64, i32, ptr)
231
-DEF_HELPER_3(vfp_uhtod_round_to_nearest, f64, i64, i32, ptr)
232
-DEF_HELPER_3(vfp_ultod_round_to_nearest, f64, i64, i32, ptr)
233
-DEF_HELPER_3(vfp_shtoh_round_to_nearest, f16, i32, i32, ptr)
234
-DEF_HELPER_3(vfp_uhtoh_round_to_nearest, f16, i32, i32, ptr)
235
-DEF_HELPER_3(vfp_sltoh_round_to_nearest, f16, i32, i32, ptr)
236
-DEF_HELPER_3(vfp_ultoh_round_to_nearest, f16, i32, i32, ptr)
237
+DEF_HELPER_3(vfp_shtos_round_to_nearest, f32, i32, i32, fpst)
238
+DEF_HELPER_3(vfp_sltos_round_to_nearest, f32, i32, i32, fpst)
239
+DEF_HELPER_3(vfp_uhtos_round_to_nearest, f32, i32, i32, fpst)
240
+DEF_HELPER_3(vfp_ultos_round_to_nearest, f32, i32, i32, fpst)
241
+DEF_HELPER_3(vfp_shtod_round_to_nearest, f64, i64, i32, fpst)
242
+DEF_HELPER_3(vfp_sltod_round_to_nearest, f64, i64, i32, fpst)
243
+DEF_HELPER_3(vfp_uhtod_round_to_nearest, f64, i64, i32, fpst)
244
+DEF_HELPER_3(vfp_ultod_round_to_nearest, f64, i64, i32, fpst)
245
+DEF_HELPER_3(vfp_shtoh_round_to_nearest, f16, i32, i32, fpst)
246
+DEF_HELPER_3(vfp_uhtoh_round_to_nearest, f16, i32, i32, fpst)
247
+DEF_HELPER_3(vfp_sltoh_round_to_nearest, f16, i32, i32, fpst)
248
+DEF_HELPER_3(vfp_ultoh_round_to_nearest, f16, i32, i32, fpst)
249
250
-DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
251
+DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, fpst)
252
253
-DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f32, TCG_CALL_NO_RWG, f32, f16, ptr, i32)
254
-DEF_HELPER_FLAGS_3(vfp_fcvt_f32_to_f16, TCG_CALL_NO_RWG, f16, f32, ptr, i32)
255
-DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f64, TCG_CALL_NO_RWG, f64, f16, ptr, i32)
256
-DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
257
+DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f32, TCG_CALL_NO_RWG, f32, f16, fpst, i32)
258
+DEF_HELPER_FLAGS_3(vfp_fcvt_f32_to_f16, TCG_CALL_NO_RWG, f16, f32, fpst, i32)
259
+DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f64, TCG_CALL_NO_RWG, f64, f16, fpst, i32)
260
+DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, fpst, i32)
261
262
-DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
263
-DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
264
-DEF_HELPER_4(vfp_muladdh, f16, f16, f16, f16, ptr)
265
+DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, fpst)
266
+DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, fpst)
267
+DEF_HELPER_4(vfp_muladdh, f16, f16, f16, f16, fpst)
268
269
-DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
270
-DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
271
-DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
272
-DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
273
-DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
274
-DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
275
+DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, fpst)
276
+DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, fpst)
277
+DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, fpst)
278
+DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, fpst)
279
+DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, fpst)
280
+DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, fpst)
281
DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
282
DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
283
DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i64, env, i32, i64, i64)
284
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(shr_cc, i32, env, i32, i32)
285
DEF_HELPER_3(sar_cc, i32, env, i32, i32)
286
DEF_HELPER_3(ror_cc, i32, env, i32, i32)
287
288
-DEF_HELPER_FLAGS_2(rinth_exact, TCG_CALL_NO_RWG, f16, f16, ptr)
289
-DEF_HELPER_FLAGS_2(rints_exact, TCG_CALL_NO_RWG, f32, f32, ptr)
290
-DEF_HELPER_FLAGS_2(rintd_exact, TCG_CALL_NO_RWG, f64, f64, ptr)
291
-DEF_HELPER_FLAGS_2(rinth, TCG_CALL_NO_RWG, f16, f16, ptr)
292
-DEF_HELPER_FLAGS_2(rints, TCG_CALL_NO_RWG, f32, f32, ptr)
293
-DEF_HELPER_FLAGS_2(rintd, TCG_CALL_NO_RWG, f64, f64, ptr)
294
+DEF_HELPER_FLAGS_2(rinth_exact, TCG_CALL_NO_RWG, f16, f16, fpst)
295
+DEF_HELPER_FLAGS_2(rints_exact, TCG_CALL_NO_RWG, f32, f32, fpst)
296
+DEF_HELPER_FLAGS_2(rintd_exact, TCG_CALL_NO_RWG, f64, f64, fpst)
297
+DEF_HELPER_FLAGS_2(rinth, TCG_CALL_NO_RWG, f16, f16, fpst)
298
+DEF_HELPER_FLAGS_2(rints, TCG_CALL_NO_RWG, f32, f32, fpst)
299
+DEF_HELPER_FLAGS_2(rintd, TCG_CALL_NO_RWG, f64, f64, fpst)
300
301
DEF_HELPER_FLAGS_2(vjcvt, TCG_CALL_NO_RWG, i32, f64, env)
302
-DEF_HELPER_FLAGS_2(fjcvtzs, TCG_CALL_NO_RWG, i64, f64, ptr)
303
+DEF_HELPER_FLAGS_2(fjcvtzs, TCG_CALL_NO_RWG, i64, f64, fpst)
304
305
DEF_HELPER_FLAGS_3(check_hcr_el2_trap, TCG_CALL_NO_WG, void, env, i32, i32)
306
307
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a32, TCG_CALL_NO_RWG,
308
DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a64, TCG_CALL_NO_RWG,
309
void, ptr, ptr, ptr, ptr, i32)
310
311
-DEF_HELPER_FLAGS_2(frint32_s, TCG_CALL_NO_RWG, f32, f32, ptr)
312
-DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr)
313
-DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr)
314
-DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr)
315
+DEF_HELPER_FLAGS_2(frint32_s, TCG_CALL_NO_RWG, f32, f32, fpst)
316
+DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, fpst)
317
+DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, fpst)
318
+DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, fpst)
319
320
DEF_HELPER_FLAGS_3(gvec_ceq0_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
321
DEF_HELPER_FLAGS_3(gvec_ceq0_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
322
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
323
index XXXXXXX..XXXXXXX 100644
324
--- a/target/arm/vfp_helper.c
325
+++ b/target/arm/vfp_helper.c
326
@@ -XXX,XX +XXX,XX @@ void vfp_set_fpscr(CPUARMState *env, uint32_t val)
327
#define VFP_HELPER(name, p) HELPER(glue(glue(vfp_,name),p))
328
329
#define VFP_BINOP(name) \
330
-dh_ctype_f16 VFP_HELPER(name, h)(dh_ctype_f16 a, dh_ctype_f16 b, void *fpstp) \
331
+dh_ctype_f16 VFP_HELPER(name, h)(dh_ctype_f16 a, dh_ctype_f16 b, float_status *fpst) \
332
{ \
333
- float_status *fpst = fpstp; \
334
return float16_ ## name(a, b, fpst); \
335
} \
336
-float32 VFP_HELPER(name, s)(float32 a, float32 b, void *fpstp) \
337
+float32 VFP_HELPER(name, s)(float32 a, float32 b, float_status *fpst) \
338
{ \
339
- float_status *fpst = fpstp; \
340
return float32_ ## name(a, b, fpst); \
341
} \
342
-float64 VFP_HELPER(name, d)(float64 a, float64 b, void *fpstp) \
343
+float64 VFP_HELPER(name, d)(float64 a, float64 b, float_status *fpst) \
344
{ \
345
- float_status *fpst = fpstp; \
346
return float64_ ## name(a, b, fpst); \
347
}
348
VFP_BINOP(add)
349
@@ -XXX,XX +XXX,XX @@ VFP_BINOP(minnum)
350
VFP_BINOP(maxnum)
351
#undef VFP_BINOP
352
353
-dh_ctype_f16 VFP_HELPER(sqrt, h)(dh_ctype_f16 a, void *fpstp)
354
+dh_ctype_f16 VFP_HELPER(sqrt, h)(dh_ctype_f16 a, float_status *fpst)
355
{
356
- return float16_sqrt(a, fpstp);
357
+ return float16_sqrt(a, fpst);
358
}
359
360
-float32 VFP_HELPER(sqrt, s)(float32 a, void *fpstp)
361
+float32 VFP_HELPER(sqrt, s)(float32 a, float_status *fpst)
362
{
363
- return float32_sqrt(a, fpstp);
364
+ return float32_sqrt(a, fpst);
365
}
366
367
-float64 VFP_HELPER(sqrt, d)(float64 a, void *fpstp)
368
+float64 VFP_HELPER(sqrt, d)(float64 a, float_status *fpst)
369
{
370
- return float64_sqrt(a, fpstp);
371
+ return float64_sqrt(a, fpst);
372
}
373
374
static void softfloat_to_vfp_compare(CPUARMState *env, FloatRelation cmp)
375
@@ -XXX,XX +XXX,XX @@ DO_VFP_cmp(d, float64, float64, fp_status)
376
/* Integer to float and float to integer conversions */
377
378
#define CONV_ITOF(name, ftype, fsz, sign) \
379
-ftype HELPER(name)(uint32_t x, void *fpstp) \
380
+ftype HELPER(name)(uint32_t x, float_status *fpst) \
381
{ \
382
- float_status *fpst = fpstp; \
383
return sign##int32_to_##float##fsz((sign##int32_t)x, fpst); \
384
}
385
386
#define CONV_FTOI(name, ftype, fsz, sign, round) \
387
-sign##int32_t HELPER(name)(ftype x, void *fpstp) \
388
+sign##int32_t HELPER(name)(ftype x, float_status *fpst) \
389
{ \
390
- float_status *fpst = fpstp; \
391
if (float##fsz##_is_any_nan(x)) { \
392
float_raise(float_flag_invalid, fpst); \
393
return 0; \
394
@@ -XXX,XX +XXX,XX @@ float32 VFP_HELPER(fcvts, d)(float64 x, CPUARMState *env)
395
return float64_to_float32(x, &env->vfp.fp_status);
396
}
397
398
-uint32_t HELPER(bfcvt)(float32 x, void *status)
399
+uint32_t HELPER(bfcvt)(float32 x, float_status *status)
400
{
401
return float32_to_bfloat16(x, status);
402
}
403
404
-uint32_t HELPER(bfcvt_pair)(uint64_t pair, void *status)
405
+uint32_t HELPER(bfcvt_pair)(uint64_t pair, float_status *status)
406
{
407
bfloat16 lo = float32_to_bfloat16(extract64(pair, 0, 32), status);
408
bfloat16 hi = float32_to_bfloat16(extract64(pair, 32, 32), status);
409
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(bfcvt_pair)(uint64_t pair, void *status)
21
*/
410
*/
22
static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
411
#define VFP_CONV_FIX_FLOAT(name, p, fsz, ftype, isz, itype) \
23
- TCGv_i64 tcg_rnd, bool accumulate,
412
ftype HELPER(vfp_##name##to##p)(uint##isz##_t x, uint32_t shift, \
24
+ bool round, bool accumulate,
413
- void *fpstp) \
25
bool is_u, int size, int shift)
414
-{ return itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); }
26
{
415
+ float_status *fpst) \
27
bool extended_result = false;
416
+{ return itype##_to_##float##fsz##_scalbn(x, -shift, fpst); }
28
- bool round = tcg_rnd != NULL;
417
29
int ext_lshift = 0;
418
#define VFP_CONV_FIX_FLOAT_ROUND(name, p, fsz, ftype, isz, itype) \
30
TCGv_i64 tcg_src_hi;
419
ftype HELPER(vfp_##name##to##p##_round_to_nearest)(uint##isz##_t x, \
31
420
uint32_t shift, \
32
@@ -XXX,XX +XXX,XX @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
421
- void *fpstp) \
33
422
+ float_status *fpst) \
34
/* Deal with the rounding step */
423
{ \
35
if (round) {
424
ftype ret; \
36
+ TCGv_i64 tcg_rnd = tcg_constant_i64(1ull << (shift - 1));
425
- float_status *fpst = fpstp; \
37
if (extended_result) {
426
FloatRoundMode oldmode = fpst->float_rounding_mode; \
38
TCGv_i64 tcg_zero = tcg_constant_i64(0);
427
fpst->float_rounding_mode = float_round_nearest_even; \
39
if (!is_u) {
428
- ret = itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); \
40
@@ -XXX,XX +XXX,XX @@ static void handle_scalar_simd_shri(DisasContext *s,
429
+ ret = itype##_to_##float##fsz##_scalbn(x, -shift, fpst); \
41
bool insert = false;
430
fpst->float_rounding_mode = oldmode; \
42
TCGv_i64 tcg_rn;
431
return ret; \
43
TCGv_i64 tcg_rd;
44
- TCGv_i64 tcg_round;
45
46
if (!extract32(immh, 3, 1)) {
47
unallocated_encoding(s);
48
@@ -XXX,XX +XXX,XX @@ static void handle_scalar_simd_shri(DisasContext *s,
49
break;
50
}
432
}
51
433
52
- if (round) {
434
#define VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, ROUND, suff) \
53
- tcg_round = tcg_constant_i64(1ULL << (shift - 1));
435
uint##isz##_t HELPER(vfp_to##name##p##suff)(ftype x, uint32_t shift, \
54
- } else {
436
- void *fpst) \
55
- tcg_round = NULL;
437
+ float_status *fpst) \
56
- }
438
{ \
439
if (unlikely(float##fsz##_is_any_nan(x))) { \
440
float_raise(float_flag_invalid, fpst); \
441
@@ -XXX,XX +XXX,XX @@ VFP_CONV_FLOAT_FIX_ROUND(uq, d, 64, float64, 64, uint64,
442
/* Set the current fp rounding mode and return the old one.
443
* The argument is a softfloat float_round_ value.
444
*/
445
-uint32_t HELPER(set_rmode)(uint32_t rmode, void *fpstp)
446
+uint32_t HELPER(set_rmode)(uint32_t rmode, float_status *fp_status)
447
{
448
- float_status *fp_status = fpstp;
57
-
449
-
58
tcg_rn = read_fp_dreg(s, rn);
450
uint32_t prev_rmode = get_float_rounding_mode(fp_status);
59
tcg_rd = (accumulate || insert) ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
451
set_float_rounding_mode(rmode, fp_status);
60
452
61
@@ -XXX,XX +XXX,XX @@ static void handle_scalar_simd_shri(DisasContext *s,
453
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(set_rmode)(uint32_t rmode, void *fpstp)
62
tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_rn, 0, esize - shift);
454
}
455
456
/* Half precision conversions. */
457
-float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, void *fpstp, uint32_t ahp_mode)
458
+float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, float_status *fpst,
459
+ uint32_t ahp_mode)
460
{
461
/* Squash FZ16 to 0 for the duration of conversion. In this case,
462
* it would affect flushing input denormals.
463
*/
464
- float_status *fpst = fpstp;
465
bool save = get_flush_inputs_to_zero(fpst);
466
set_flush_inputs_to_zero(false, fpst);
467
float32 r = float16_to_float32(a, !ahp_mode, fpst);
468
@@ -XXX,XX +XXX,XX @@ float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, void *fpstp, uint32_t ahp_mode)
469
return r;
470
}
471
472
-uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, void *fpstp, uint32_t ahp_mode)
473
+uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, float_status *fpst,
474
+ uint32_t ahp_mode)
475
{
476
/* Squash FZ16 to 0 for the duration of conversion. In this case,
477
* it would affect flushing output denormals.
478
*/
479
- float_status *fpst = fpstp;
480
bool save = get_flush_to_zero(fpst);
481
set_flush_to_zero(false, fpst);
482
float16 r = float32_to_float16(a, !ahp_mode, fpst);
483
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, void *fpstp, uint32_t ahp_mode)
484
return r;
485
}
486
487
-float64 HELPER(vfp_fcvt_f16_to_f64)(uint32_t a, void *fpstp, uint32_t ahp_mode)
488
+float64 HELPER(vfp_fcvt_f16_to_f64)(uint32_t a, float_status *fpst,
489
+ uint32_t ahp_mode)
490
{
491
/* Squash FZ16 to 0 for the duration of conversion. In this case,
492
* it would affect flushing input denormals.
493
*/
494
- float_status *fpst = fpstp;
495
bool save = get_flush_inputs_to_zero(fpst);
496
set_flush_inputs_to_zero(false, fpst);
497
float64 r = float16_to_float64(a, !ahp_mode, fpst);
498
@@ -XXX,XX +XXX,XX @@ float64 HELPER(vfp_fcvt_f16_to_f64)(uint32_t a, void *fpstp, uint32_t ahp_mode)
499
return r;
500
}
501
502
-uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
503
+uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, float_status *fpst,
504
+ uint32_t ahp_mode)
505
{
506
/* Squash FZ16 to 0 for the duration of conversion. In this case,
507
* it would affect flushing output denormals.
508
*/
509
- float_status *fpst = fpstp;
510
bool save = get_flush_to_zero(fpst);
511
set_flush_to_zero(false, fpst);
512
float16 r = float64_to_float16(a, !ahp_mode, fpst);
513
@@ -XXX,XX +XXX,XX @@ static bool round_to_inf(float_status *fpst, bool sign_bit)
514
}
515
}
516
517
-uint32_t HELPER(recpe_f16)(uint32_t input, void *fpstp)
518
+uint32_t HELPER(recpe_f16)(uint32_t input, float_status *fpst)
519
{
520
- float_status *fpst = fpstp;
521
float16 f16 = float16_squash_input_denormal(input, fpst);
522
uint32_t f16_val = float16_val(f16);
523
uint32_t f16_sign = float16_is_neg(f16);
524
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_f16)(uint32_t input, void *fpstp)
525
return make_float16(f16_val);
526
}
527
528
-float32 HELPER(recpe_f32)(float32 input, void *fpstp)
529
+float32 HELPER(recpe_f32)(float32 input, float_status *fpst)
530
{
531
- float_status *fpst = fpstp;
532
float32 f32 = float32_squash_input_denormal(input, fpst);
533
uint32_t f32_val = float32_val(f32);
534
bool f32_sign = float32_is_neg(f32);
535
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recpe_f32)(float32 input, void *fpstp)
536
return make_float32(f32_val);
537
}
538
539
-float64 HELPER(recpe_f64)(float64 input, void *fpstp)
540
+float64 HELPER(recpe_f64)(float64 input, float_status *fpst)
541
{
542
- float_status *fpst = fpstp;
543
float64 f64 = float64_squash_input_denormal(input, fpst);
544
uint64_t f64_val = float64_val(f64);
545
bool f64_sign = float64_is_neg(f64);
546
@@ -XXX,XX +XXX,XX @@ static uint64_t recip_sqrt_estimate(int *exp , int exp_off, uint64_t frac)
547
return extract64(estimate, 0, 8) << 44;
548
}
549
550
-uint32_t HELPER(rsqrte_f16)(uint32_t input, void *fpstp)
551
+uint32_t HELPER(rsqrte_f16)(uint32_t input, float_status *s)
552
{
553
- float_status *s = fpstp;
554
float16 f16 = float16_squash_input_denormal(input, s);
555
uint16_t val = float16_val(f16);
556
bool f16_sign = float16_is_neg(f16);
557
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(rsqrte_f16)(uint32_t input, void *fpstp)
558
if (float16_is_signaling_nan(f16, s)) {
559
float_raise(float_flag_invalid, s);
560
if (!s->default_nan_mode) {
561
- nan = float16_silence_nan(f16, fpstp);
562
+ nan = float16_silence_nan(f16, s);
563
}
63
}
564
}
64
} else {
565
if (s->default_nan_mode) {
65
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
566
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(rsqrte_f16)(uint32_t input, void *fpstp)
66
+ handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
567
return make_float16(val);
67
accumulate, is_u, size, shift);
568
}
68
}
569
69
570
-float32 HELPER(rsqrte_f32)(float32 input, void *fpstp)
70
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
571
+float32 HELPER(rsqrte_f32)(float32 input, float_status *s)
71
int elements = is_scalar ? 1 : (64 / esize);
572
{
72
bool round = extract32(opcode, 0, 1);
573
- float_status *s = fpstp;
73
MemOp ldop = (size + 1) | (is_u_shift ? 0 : MO_SIGN);
574
float32 f32 = float32_squash_input_denormal(input, s);
74
- TCGv_i64 tcg_rn, tcg_rd, tcg_round;
575
uint32_t val = float32_val(f32);
75
+ TCGv_i64 tcg_rn, tcg_rd;
576
uint32_t f32_sign = float32_is_neg(f32);
76
TCGv_i32 tcg_rd_narrowed;
577
@@ -XXX,XX +XXX,XX @@ float32 HELPER(rsqrte_f32)(float32 input, void *fpstp)
77
TCGv_i64 tcg_final;
578
if (float32_is_signaling_nan(f32, s)) {
78
579
float_raise(float_flag_invalid, s);
79
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
580
if (!s->default_nan_mode) {
80
tcg_rd_narrowed = tcg_temp_new_i32();
581
- nan = float32_silence_nan(f32, fpstp);
81
tcg_final = tcg_temp_new_i64();
582
+ nan = float32_silence_nan(f32, s);
82
583
}
83
- if (round) {
584
}
84
- tcg_round = tcg_constant_i64(1ULL << (shift - 1));
585
if (s->default_nan_mode) {
85
- } else {
586
@@ -XXX,XX +XXX,XX @@ float32 HELPER(rsqrte_f32)(float32 input, void *fpstp)
86
- tcg_round = NULL;
587
return make_float32(val);
87
- }
588
}
88
-
589
89
for (i = 0; i < elements; i++) {
590
-float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
90
read_vec_element(s, tcg_rn, rn, i, ldop);
591
+float64 HELPER(rsqrte_f64)(float64 input, float_status *s)
91
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
592
{
92
+ handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
593
- float_status *s = fpstp;
93
false, is_u_shift, size+1, shift);
594
float64 f64 = float64_squash_input_denormal(input, s);
94
narrowfn(tcg_rd_narrowed, tcg_env, tcg_rd);
595
uint64_t val = float64_val(f64);
95
tcg_gen_extu_i32_i64(tcg_rd, tcg_rd_narrowed);
596
bool f64_sign = float64_is_neg(f64);
96
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
597
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
97
int shift = (2 * esize) - immhb;
598
if (float64_is_signaling_nan(f64, s)) {
98
bool round = extract32(opcode, 0, 1);
599
float_raise(float_flag_invalid, s);
99
TCGv_i64 tcg_rn, tcg_rd, tcg_final;
600
if (!s->default_nan_mode) {
100
- TCGv_i64 tcg_round;
601
- nan = float64_silence_nan(f64, fpstp);
101
int i;
602
+ nan = float64_silence_nan(f64, s);
102
603
}
103
if (extract32(immh, 3, 1)) {
604
}
104
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
605
if (s->default_nan_mode) {
105
tcg_final = tcg_temp_new_i64();
606
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(rsqrte_u32)(uint32_t a)
106
read_vec_element(s, tcg_final, rd, is_q ? 1 : 0, MO_64);
607
107
608
/* VFPv4 fused multiply-accumulate */
108
- if (round) {
609
dh_ctype_f16 VFP_HELPER(muladd, h)(dh_ctype_f16 a, dh_ctype_f16 b,
109
- tcg_round = tcg_constant_i64(1ULL << (shift - 1));
610
- dh_ctype_f16 c, void *fpstp)
110
- } else {
611
+ dh_ctype_f16 c, float_status *fpst)
111
- tcg_round = NULL;
612
{
112
- }
613
- float_status *fpst = fpstp;
113
-
614
return float16_muladd(a, b, c, 0, fpst);
114
for (i = 0; i < elements; i++) {
615
}
115
read_vec_element(s, tcg_rn, rn, i, size+1);
616
116
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
617
-float32 VFP_HELPER(muladd, s)(float32 a, float32 b, float32 c, void *fpstp)
117
+ handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
618
+float32 VFP_HELPER(muladd, s)(float32 a, float32 b, float32 c,
118
false, true, size+1, shift);
619
+ float_status *fpst)
119
620
{
120
tcg_gen_deposit_i64(tcg_final, tcg_final, tcg_rd, esize * i, esize);
621
- float_status *fpst = fpstp;
622
return float32_muladd(a, b, c, 0, fpst);
623
}
624
625
-float64 VFP_HELPER(muladd, d)(float64 a, float64 b, float64 c, void *fpstp)
626
+float64 VFP_HELPER(muladd, d)(float64 a, float64 b, float64 c,
627
+ float_status *fpst)
628
{
629
- float_status *fpst = fpstp;
630
return float64_muladd(a, b, c, 0, fpst);
631
}
632
633
/* ARMv8 round to integral */
634
-dh_ctype_f16 HELPER(rinth_exact)(dh_ctype_f16 x, void *fp_status)
635
+dh_ctype_f16 HELPER(rinth_exact)(dh_ctype_f16 x, float_status *fp_status)
636
{
637
return float16_round_to_int(x, fp_status);
638
}
639
640
-float32 HELPER(rints_exact)(float32 x, void *fp_status)
641
+float32 HELPER(rints_exact)(float32 x, float_status *fp_status)
642
{
643
return float32_round_to_int(x, fp_status);
644
}
645
646
-float64 HELPER(rintd_exact)(float64 x, void *fp_status)
647
+float64 HELPER(rintd_exact)(float64 x, float_status *fp_status)
648
{
649
return float64_round_to_int(x, fp_status);
650
}
651
652
-dh_ctype_f16 HELPER(rinth)(dh_ctype_f16 x, void *fp_status)
653
+dh_ctype_f16 HELPER(rinth)(dh_ctype_f16 x, float_status *fp_status)
654
{
655
int old_flags = get_float_exception_flags(fp_status), new_flags;
656
float16 ret;
657
@@ -XXX,XX +XXX,XX @@ dh_ctype_f16 HELPER(rinth)(dh_ctype_f16 x, void *fp_status)
658
return ret;
659
}
660
661
-float32 HELPER(rints)(float32 x, void *fp_status)
662
+float32 HELPER(rints)(float32 x, float_status *fp_status)
663
{
664
int old_flags = get_float_exception_flags(fp_status), new_flags;
665
float32 ret;
666
@@ -XXX,XX +XXX,XX @@ float32 HELPER(rints)(float32 x, void *fp_status)
667
return ret;
668
}
669
670
-float64 HELPER(rintd)(float64 x, void *fp_status)
671
+float64 HELPER(rintd)(float64 x, float_status *fp_status)
672
{
673
int old_flags = get_float_exception_flags(fp_status), new_flags;
674
float64 ret;
675
@@ -XXX,XX +XXX,XX @@ const FloatRoundMode arm_rmode_to_sf_map[] = {
676
* Implement float64 to int32_t conversion without saturation;
677
* the result is supplied modulo 2^32.
678
*/
679
-uint64_t HELPER(fjcvtzs)(float64 value, void *vstatus)
680
+uint64_t HELPER(fjcvtzs)(float64 value, float_status *status)
681
{
682
- float_status *status = vstatus;
683
uint32_t frac, e_old, e_new;
684
bool inexact;
685
686
@@ -XXX,XX +XXX,XX @@ static float32 frint_s(float32 f, float_status *fpst, int intsize)
687
return (0x100u + 126u + intsize) << 23;
688
}
689
690
-float32 HELPER(frint32_s)(float32 f, void *fpst)
691
+float32 HELPER(frint32_s)(float32 f, float_status *fpst)
692
{
693
return frint_s(f, fpst, 32);
694
}
695
696
-float32 HELPER(frint64_s)(float32 f, void *fpst)
697
+float32 HELPER(frint64_s)(float32 f, float_status *fpst)
698
{
699
return frint_s(f, fpst, 64);
700
}
701
@@ -XXX,XX +XXX,XX @@ static float64 frint_d(float64 f, float_status *fpst, int intsize)
702
return (uint64_t)(0x800 + 1022 + intsize) << 52;
703
}
704
705
-float64 HELPER(frint32_d)(float64 f, void *fpst)
706
+float64 HELPER(frint32_d)(float64 f, float_status *fpst)
707
{
708
return frint_d(f, fpst, 32);
709
}
710
711
-float64 HELPER(frint64_d)(float64 f, void *fpst)
712
+float64 HELPER(frint64_d)(float64 f, float_status *fpst)
713
{
714
return frint_d(f, fpst, 64);
715
}
121
--
716
--
122
2.34.1
717
2.34.1
123
718
124
719
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Message-id: 20241206031224.78525-4-richard.henderson@linaro.org
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20240912024114.1097832-8-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
7
---
9
target/arm/tcg/a64.decode | 4 +++
8
target/arm/tcg/helper-a64.h | 94 +++++++++++++++++------------------
10
target/arm/tcg/translate-a64.c | 47 ++++++++++------------------------
9
target/arm/tcg/helper-a64.c | 98 +++++++++++++------------------------
11
2 files changed, 18 insertions(+), 33 deletions(-)
10
2 files changed, 80 insertions(+), 112 deletions(-)
12
11
13
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
12
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/tcg/a64.decode
14
--- a/target/arm/tcg/helper-a64.h
16
+++ b/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/helper-a64.h
17
@@ -XXX,XX +XXX,XX @@ FNMSUB 0001 1111 .. 1 ..... 1 ..... ..... ..... @rrrr_hsd
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(msr_i_spsel, void, env, i32)
18
17
DEF_HELPER_2(msr_i_daifset, void, env, i32)
19
EXT_d 0010 1110 00 0 rm:5 00 imm:3 0 rn:5 rd:5
18
DEF_HELPER_2(msr_i_daifclear, void, env, i32)
20
EXT_q 0110 1110 00 0 rm:5 0 imm:4 0 rn:5 rd:5
19
DEF_HELPER_1(msr_set_allint_el1, void, env)
21
+
20
-DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, ptr)
22
+# Advanced SIMD Table Lookup
21
-DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, ptr)
23
+
22
-DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
24
+TBL_TBX 0 q:1 00 1110 000 rm:5 0 len:2 tbx:1 00 rn:5 rd:5
23
-DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
25
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
24
-DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
25
-DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
26
+DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, fpst)
27
+DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, fpst)
28
+DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, fpst)
29
+DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, fpst)
30
+DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, fpst)
31
+DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, fpst)
32
DEF_HELPER_FLAGS_4(simd_tblx, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
-DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
34
-DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
35
-DEF_HELPER_FLAGS_3(neon_ceq_f64, TCG_CALL_NO_RWG, i64, i64, i64, ptr)
36
-DEF_HELPER_FLAGS_3(neon_cge_f64, TCG_CALL_NO_RWG, i64, i64, i64, ptr)
37
-DEF_HELPER_FLAGS_3(neon_cgt_f64, TCG_CALL_NO_RWG, i64, i64, i64, ptr)
38
-DEF_HELPER_FLAGS_3(recpsf_f16, TCG_CALL_NO_RWG, f16, f16, f16, ptr)
39
-DEF_HELPER_FLAGS_3(recpsf_f32, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
40
-DEF_HELPER_FLAGS_3(recpsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
41
-DEF_HELPER_FLAGS_3(rsqrtsf_f16, TCG_CALL_NO_RWG, f16, f16, f16, ptr)
42
-DEF_HELPER_FLAGS_3(rsqrtsf_f32, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
43
-DEF_HELPER_FLAGS_3(rsqrtsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
44
-DEF_HELPER_FLAGS_2(frecpx_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
45
-DEF_HELPER_FLAGS_2(frecpx_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
46
-DEF_HELPER_FLAGS_2(frecpx_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
47
+DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, fpst)
48
+DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, fpst)
49
+DEF_HELPER_FLAGS_3(neon_ceq_f64, TCG_CALL_NO_RWG, i64, i64, i64, fpst)
50
+DEF_HELPER_FLAGS_3(neon_cge_f64, TCG_CALL_NO_RWG, i64, i64, i64, fpst)
51
+DEF_HELPER_FLAGS_3(neon_cgt_f64, TCG_CALL_NO_RWG, i64, i64, i64, fpst)
52
+DEF_HELPER_FLAGS_3(recpsf_f16, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
53
+DEF_HELPER_FLAGS_3(recpsf_f32, TCG_CALL_NO_RWG, f32, f32, f32, fpst)
54
+DEF_HELPER_FLAGS_3(recpsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, fpst)
55
+DEF_HELPER_FLAGS_3(rsqrtsf_f16, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
56
+DEF_HELPER_FLAGS_3(rsqrtsf_f32, TCG_CALL_NO_RWG, f32, f32, f32, fpst)
57
+DEF_HELPER_FLAGS_3(rsqrtsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, fpst)
58
+DEF_HELPER_FLAGS_2(frecpx_f64, TCG_CALL_NO_RWG, f64, f64, fpst)
59
+DEF_HELPER_FLAGS_2(frecpx_f32, TCG_CALL_NO_RWG, f32, f32, fpst)
60
+DEF_HELPER_FLAGS_2(frecpx_f16, TCG_CALL_NO_RWG, f16, f16, fpst)
61
DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, env)
62
DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
63
DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
64
-DEF_HELPER_FLAGS_3(advsimd_maxh, TCG_CALL_NO_RWG, f16, f16, f16, ptr)
65
-DEF_HELPER_FLAGS_3(advsimd_minh, TCG_CALL_NO_RWG, f16, f16, f16, ptr)
66
-DEF_HELPER_FLAGS_3(advsimd_maxnumh, TCG_CALL_NO_RWG, f16, f16, f16, ptr)
67
-DEF_HELPER_FLAGS_3(advsimd_minnumh, TCG_CALL_NO_RWG, f16, f16, f16, ptr)
68
-DEF_HELPER_3(advsimd_addh, f16, f16, f16, ptr)
69
-DEF_HELPER_3(advsimd_subh, f16, f16, f16, ptr)
70
-DEF_HELPER_3(advsimd_mulh, f16, f16, f16, ptr)
71
-DEF_HELPER_3(advsimd_divh, f16, f16, f16, ptr)
72
-DEF_HELPER_3(advsimd_ceq_f16, i32, f16, f16, ptr)
73
-DEF_HELPER_3(advsimd_cge_f16, i32, f16, f16, ptr)
74
-DEF_HELPER_3(advsimd_cgt_f16, i32, f16, f16, ptr)
75
-DEF_HELPER_3(advsimd_acge_f16, i32, f16, f16, ptr)
76
-DEF_HELPER_3(advsimd_acgt_f16, i32, f16, f16, ptr)
77
-DEF_HELPER_3(advsimd_mulxh, f16, f16, f16, ptr)
78
-DEF_HELPER_4(advsimd_muladdh, f16, f16, f16, f16, ptr)
79
-DEF_HELPER_3(advsimd_add2h, i32, i32, i32, ptr)
80
-DEF_HELPER_3(advsimd_sub2h, i32, i32, i32, ptr)
81
-DEF_HELPER_3(advsimd_mul2h, i32, i32, i32, ptr)
82
-DEF_HELPER_3(advsimd_div2h, i32, i32, i32, ptr)
83
-DEF_HELPER_3(advsimd_max2h, i32, i32, i32, ptr)
84
-DEF_HELPER_3(advsimd_min2h, i32, i32, i32, ptr)
85
-DEF_HELPER_3(advsimd_maxnum2h, i32, i32, i32, ptr)
86
-DEF_HELPER_3(advsimd_minnum2h, i32, i32, i32, ptr)
87
-DEF_HELPER_3(advsimd_mulx2h, i32, i32, i32, ptr)
88
-DEF_HELPER_4(advsimd_muladd2h, i32, i32, i32, i32, ptr)
89
-DEF_HELPER_2(advsimd_rinth_exact, f16, f16, ptr)
90
-DEF_HELPER_2(advsimd_rinth, f16, f16, ptr)
91
+DEF_HELPER_FLAGS_3(advsimd_maxh, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
92
+DEF_HELPER_FLAGS_3(advsimd_minh, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
93
+DEF_HELPER_FLAGS_3(advsimd_maxnumh, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
94
+DEF_HELPER_FLAGS_3(advsimd_minnumh, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
95
+DEF_HELPER_3(advsimd_addh, f16, f16, f16, fpst)
96
+DEF_HELPER_3(advsimd_subh, f16, f16, f16, fpst)
97
+DEF_HELPER_3(advsimd_mulh, f16, f16, f16, fpst)
98
+DEF_HELPER_3(advsimd_divh, f16, f16, f16, fpst)
99
+DEF_HELPER_3(advsimd_ceq_f16, i32, f16, f16, fpst)
100
+DEF_HELPER_3(advsimd_cge_f16, i32, f16, f16, fpst)
101
+DEF_HELPER_3(advsimd_cgt_f16, i32, f16, f16, fpst)
102
+DEF_HELPER_3(advsimd_acge_f16, i32, f16, f16, fpst)
103
+DEF_HELPER_3(advsimd_acgt_f16, i32, f16, f16, fpst)
104
+DEF_HELPER_3(advsimd_mulxh, f16, f16, f16, fpst)
105
+DEF_HELPER_4(advsimd_muladdh, f16, f16, f16, f16, fpst)
106
+DEF_HELPER_3(advsimd_add2h, i32, i32, i32, fpst)
107
+DEF_HELPER_3(advsimd_sub2h, i32, i32, i32, fpst)
108
+DEF_HELPER_3(advsimd_mul2h, i32, i32, i32, fpst)
109
+DEF_HELPER_3(advsimd_div2h, i32, i32, i32, fpst)
110
+DEF_HELPER_3(advsimd_max2h, i32, i32, i32, fpst)
111
+DEF_HELPER_3(advsimd_min2h, i32, i32, i32, fpst)
112
+DEF_HELPER_3(advsimd_maxnum2h, i32, i32, i32, fpst)
113
+DEF_HELPER_3(advsimd_minnum2h, i32, i32, i32, fpst)
114
+DEF_HELPER_3(advsimd_mulx2h, i32, i32, i32, fpst)
115
+DEF_HELPER_4(advsimd_muladd2h, i32, i32, i32, i32, fpst)
116
+DEF_HELPER_2(advsimd_rinth_exact, f16, f16, fpst)
117
+DEF_HELPER_2(advsimd_rinth, f16, f16, fpst)
118
119
DEF_HELPER_2(exception_return, void, env, i64)
120
DEF_HELPER_FLAGS_2(dc_zva, TCG_CALL_NO_WG, void, env, i64)
121
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
26
index XXXXXXX..XXXXXXX 100644
122
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/tcg/translate-a64.c
123
--- a/target/arm/tcg/helper-a64.c
28
+++ b/target/arm/tcg/translate-a64.c
124
+++ b/target/arm/tcg/helper-a64.c
29
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
125
@@ -XXX,XX +XXX,XX @@ static inline uint32_t float_rel_to_flags(int res)
30
return true;
126
return flags;
31
}
127
}
32
128
33
+static bool trans_TBL_TBX(DisasContext *s, arg_TBL_TBX *a)
129
-uint64_t HELPER(vfp_cmph_a64)(uint32_t x, uint32_t y, void *fp_status)
34
+{
130
+uint64_t HELPER(vfp_cmph_a64)(uint32_t x, uint32_t y, float_status *fp_status)
35
+ if (fp_access_check(s)) {
131
{
36
+ int len = (a->len + 1) * 16;
132
return float_rel_to_flags(float16_compare_quiet(x, y, fp_status));
37
+
133
}
38
+ tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd),
134
39
+ vec_full_reg_offset(s, a->rm), tcg_env,
135
-uint64_t HELPER(vfp_cmpeh_a64)(uint32_t x, uint32_t y, void *fp_status)
40
+ a->q ? 16 : 8, vec_full_reg_size(s),
136
+uint64_t HELPER(vfp_cmpeh_a64)(uint32_t x, uint32_t y, float_status *fp_status)
41
+ (len << 6) | (a->tbx << 5) | a->rn,
137
{
42
+ gen_helper_simd_tblx);
138
return float_rel_to_flags(float16_compare(x, y, fp_status));
43
+ }
139
}
44
+ return true;
140
45
+}
141
-uint64_t HELPER(vfp_cmps_a64)(float32 x, float32 y, void *fp_status)
46
+
142
+uint64_t HELPER(vfp_cmps_a64)(float32 x, float32 y, float_status *fp_status)
47
/*
143
{
48
* Cryptographic AES, SHA, SHA512
144
return float_rel_to_flags(float32_compare_quiet(x, y, fp_status));
145
}
146
147
-uint64_t HELPER(vfp_cmpes_a64)(float32 x, float32 y, void *fp_status)
148
+uint64_t HELPER(vfp_cmpes_a64)(float32 x, float32 y, float_status *fp_status)
149
{
150
return float_rel_to_flags(float32_compare(x, y, fp_status));
151
}
152
153
-uint64_t HELPER(vfp_cmpd_a64)(float64 x, float64 y, void *fp_status)
154
+uint64_t HELPER(vfp_cmpd_a64)(float64 x, float64 y, float_status *fp_status)
155
{
156
return float_rel_to_flags(float64_compare_quiet(x, y, fp_status));
157
}
158
159
-uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, void *fp_status)
160
+uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, float_status *fp_status)
161
{
162
return float_rel_to_flags(float64_compare(x, y, fp_status));
163
}
164
165
-float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
166
+float32 HELPER(vfp_mulxs)(float32 a, float32 b, float_status *fpst)
167
{
168
- float_status *fpst = fpstp;
169
-
170
a = float32_squash_input_denormal(a, fpst);
171
b = float32_squash_input_denormal(b, fpst);
172
173
@@ -XXX,XX +XXX,XX @@ float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
174
return float32_mul(a, b, fpst);
175
}
176
177
-float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
178
+float64 HELPER(vfp_mulxd)(float64 a, float64 b, float_status *fpst)
179
{
180
- float_status *fpst = fpstp;
181
-
182
a = float64_squash_input_denormal(a, fpst);
183
b = float64_squash_input_denormal(b, fpst);
184
185
@@ -XXX,XX +XXX,XX @@ float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
186
}
187
188
/* 64bit/double versions of the neon float compare functions */
189
-uint64_t HELPER(neon_ceq_f64)(float64 a, float64 b, void *fpstp)
190
+uint64_t HELPER(neon_ceq_f64)(float64 a, float64 b, float_status *fpst)
191
{
192
- float_status *fpst = fpstp;
193
return -float64_eq_quiet(a, b, fpst);
194
}
195
196
-uint64_t HELPER(neon_cge_f64)(float64 a, float64 b, void *fpstp)
197
+uint64_t HELPER(neon_cge_f64)(float64 a, float64 b, float_status *fpst)
198
{
199
- float_status *fpst = fpstp;
200
return -float64_le(b, a, fpst);
201
}
202
203
-uint64_t HELPER(neon_cgt_f64)(float64 a, float64 b, void *fpstp)
204
+uint64_t HELPER(neon_cgt_f64)(float64 a, float64 b, float_status *fpst)
205
{
206
- float_status *fpst = fpstp;
207
return -float64_lt(b, a, fpst);
208
}
209
210
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_cgt_f64)(float64 a, float64 b, void *fpstp)
211
* multiply-add-and-halve.
49
*/
212
*/
50
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
213
214
-uint32_t HELPER(recpsf_f16)(uint32_t a, uint32_t b, void *fpstp)
215
+uint32_t HELPER(recpsf_f16)(uint32_t a, uint32_t b, float_status *fpst)
216
{
217
- float_status *fpst = fpstp;
218
-
219
a = float16_squash_input_denormal(a, fpst);
220
b = float16_squash_input_denormal(b, fpst);
221
222
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpsf_f16)(uint32_t a, uint32_t b, void *fpstp)
223
return float16_muladd(a, b, float16_two, 0, fpst);
224
}
225
226
-float32 HELPER(recpsf_f32)(float32 a, float32 b, void *fpstp)
227
+float32 HELPER(recpsf_f32)(float32 a, float32 b, float_status *fpst)
228
{
229
- float_status *fpst = fpstp;
230
-
231
a = float32_squash_input_denormal(a, fpst);
232
b = float32_squash_input_denormal(b, fpst);
233
234
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recpsf_f32)(float32 a, float32 b, void *fpstp)
235
return float32_muladd(a, b, float32_two, 0, fpst);
236
}
237
238
-float64 HELPER(recpsf_f64)(float64 a, float64 b, void *fpstp)
239
+float64 HELPER(recpsf_f64)(float64 a, float64 b, float_status *fpst)
240
{
241
- float_status *fpst = fpstp;
242
-
243
a = float64_squash_input_denormal(a, fpst);
244
b = float64_squash_input_denormal(b, fpst);
245
246
@@ -XXX,XX +XXX,XX @@ float64 HELPER(recpsf_f64)(float64 a, float64 b, void *fpstp)
247
return float64_muladd(a, b, float64_two, 0, fpst);
248
}
249
250
-uint32_t HELPER(rsqrtsf_f16)(uint32_t a, uint32_t b, void *fpstp)
251
+uint32_t HELPER(rsqrtsf_f16)(uint32_t a, uint32_t b, float_status *fpst)
252
{
253
- float_status *fpst = fpstp;
254
-
255
a = float16_squash_input_denormal(a, fpst);
256
b = float16_squash_input_denormal(b, fpst);
257
258
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(rsqrtsf_f16)(uint32_t a, uint32_t b, void *fpstp)
259
return float16_muladd(a, b, float16_three, float_muladd_halve_result, fpst);
260
}
261
262
-float32 HELPER(rsqrtsf_f32)(float32 a, float32 b, void *fpstp)
263
+float32 HELPER(rsqrtsf_f32)(float32 a, float32 b, float_status *fpst)
264
{
265
- float_status *fpst = fpstp;
266
-
267
a = float32_squash_input_denormal(a, fpst);
268
b = float32_squash_input_denormal(b, fpst);
269
270
@@ -XXX,XX +XXX,XX @@ float32 HELPER(rsqrtsf_f32)(float32 a, float32 b, void *fpstp)
271
return float32_muladd(a, b, float32_three, float_muladd_halve_result, fpst);
272
}
273
274
-float64 HELPER(rsqrtsf_f64)(float64 a, float64 b, void *fpstp)
275
+float64 HELPER(rsqrtsf_f64)(float64 a, float64 b, float_status *fpst)
276
{
277
- float_status *fpst = fpstp;
278
-
279
a = float64_squash_input_denormal(a, fpst);
280
b = float64_squash_input_denormal(b, fpst);
281
282
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrtsf_f64)(float64 a, float64 b, void *fpstp)
283
}
284
285
/* Floating-point reciprocal exponent - see FPRecpX in ARM ARM */
286
-uint32_t HELPER(frecpx_f16)(uint32_t a, void *fpstp)
287
+uint32_t HELPER(frecpx_f16)(uint32_t a, float_status *fpst)
288
{
289
- float_status *fpst = fpstp;
290
uint16_t val16, sbit;
291
int16_t exp;
292
293
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(frecpx_f16)(uint32_t a, void *fpstp)
51
}
294
}
52
}
295
}
53
296
54
-/* TBL/TBX
297
-float32 HELPER(frecpx_f32)(float32 a, void *fpstp)
55
- * 31 30 29 24 23 22 21 20 16 15 14 13 12 11 10 9 5 4 0
298
+float32 HELPER(frecpx_f32)(float32 a, float_status *fpst)
56
- * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
299
{
57
- * | 0 | Q | 0 0 1 1 1 0 | op2 | 0 | Rm | 0 | len | op | 0 0 | Rn | Rd |
300
- float_status *fpst = fpstp;
58
- * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
301
uint32_t val32, sbit;
59
- */
302
int32_t exp;
60
-static void disas_simd_tb(DisasContext *s, uint32_t insn)
303
61
-{
304
@@ -XXX,XX +XXX,XX @@ float32 HELPER(frecpx_f32)(float32 a, void *fpstp)
62
- int op2 = extract32(insn, 22, 2);
305
}
63
- int is_q = extract32(insn, 30, 1);
306
}
64
- int rm = extract32(insn, 16, 5);
307
65
- int rn = extract32(insn, 5, 5);
308
-float64 HELPER(frecpx_f64)(float64 a, void *fpstp)
66
- int rd = extract32(insn, 0, 5);
309
+float64 HELPER(frecpx_f64)(float64 a, float_status *fpst)
67
- int is_tbx = extract32(insn, 12, 1);
310
{
68
- int len = (extract32(insn, 13, 2) + 1) * 16;
311
- float_status *fpst = fpstp;
69
-
312
uint64_t val64, sbit;
70
- if (op2 != 0) {
313
int64_t exp;
71
- unallocated_encoding(s);
314
72
- return;
315
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(crc32c_64)(uint64_t acc, uint64_t val, uint32_t bytes)
73
- }
316
#define ADVSIMD_HELPER(name, suffix) HELPER(glue(glue(advsimd_, name), suffix))
74
-
317
75
- if (!fp_access_check(s)) {
318
#define ADVSIMD_HALFOP(name) \
76
- return;
319
-uint32_t ADVSIMD_HELPER(name, h)(uint32_t a, uint32_t b, void *fpstp) \
77
- }
320
+uint32_t ADVSIMD_HELPER(name, h)(uint32_t a, uint32_t b, float_status *fpst) \
78
-
321
{ \
79
- tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, rd),
322
- float_status *fpst = fpstp; \
80
- vec_full_reg_offset(s, rm), tcg_env,
323
return float16_ ## name(a, b, fpst); \
81
- is_q ? 16 : 8, vec_full_reg_size(s),
324
}
82
- (len << 6) | (is_tbx << 5) | rn,
325
83
- gen_helper_simd_tblx);
326
@@ -XXX,XX +XXX,XX @@ ADVSIMD_HALFOP(minnum)
84
-}
327
ADVSIMD_HALFOP(maxnum)
85
-
328
86
/* ZIP/UZP/TRN
329
#define ADVSIMD_TWOHALFOP(name) \
87
* 31 30 29 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
330
-uint32_t ADVSIMD_HELPER(name, 2h)(uint32_t two_a, uint32_t two_b, void *fpstp) \
88
* +---+---+-------------+------+---+------+---+------------------+------+
331
+uint32_t ADVSIMD_HELPER(name, 2h)(uint32_t two_a, uint32_t two_b, \
89
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
332
+ float_status *fpst) \
90
/* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
333
{ \
91
{ 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
334
float16 a1, a2, b1, b2; \
92
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
335
uint32_t r1, r2; \
93
- { 0x0e000000, 0xbf208c00, disas_simd_tb },
336
- float_status *fpst = fpstp; \
94
{ 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
337
a1 = extract32(two_a, 0, 16); \
95
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
338
a2 = extract32(two_a, 16, 16); \
96
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
339
b1 = extract32(two_b, 0, 16); \
340
@@ -XXX,XX +XXX,XX @@ ADVSIMD_TWOHALFOP(minnum)
341
ADVSIMD_TWOHALFOP(maxnum)
342
343
/* Data processing - scalar floating-point and advanced SIMD */
344
-static float16 float16_mulx(float16 a, float16 b, void *fpstp)
345
+static float16 float16_mulx(float16 a, float16 b, float_status *fpst)
346
{
347
- float_status *fpst = fpstp;
348
-
349
a = float16_squash_input_denormal(a, fpst);
350
b = float16_squash_input_denormal(b, fpst);
351
352
@@ -XXX,XX +XXX,XX @@ ADVSIMD_TWOHALFOP(mulx)
353
354
/* fused multiply-accumulate */
355
uint32_t HELPER(advsimd_muladdh)(uint32_t a, uint32_t b, uint32_t c,
356
- void *fpstp)
357
+ float_status *fpst)
358
{
359
- float_status *fpst = fpstp;
360
return float16_muladd(a, b, c, 0, fpst);
361
}
362
363
uint32_t HELPER(advsimd_muladd2h)(uint32_t two_a, uint32_t two_b,
364
- uint32_t two_c, void *fpstp)
365
+ uint32_t two_c, float_status *fpst)
366
{
367
- float_status *fpst = fpstp;
368
float16 a1, a2, b1, b2, c1, c2;
369
uint32_t r1, r2;
370
a1 = extract32(two_a, 0, 16);
371
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(advsimd_muladd2h)(uint32_t two_a, uint32_t two_b,
372
373
#define ADVSIMD_CMPRES(test) (test) ? 0xffff : 0
374
375
-uint32_t HELPER(advsimd_ceq_f16)(uint32_t a, uint32_t b, void *fpstp)
376
+uint32_t HELPER(advsimd_ceq_f16)(uint32_t a, uint32_t b, float_status *fpst)
377
{
378
- float_status *fpst = fpstp;
379
int compare = float16_compare_quiet(a, b, fpst);
380
return ADVSIMD_CMPRES(compare == float_relation_equal);
381
}
382
383
-uint32_t HELPER(advsimd_cge_f16)(uint32_t a, uint32_t b, void *fpstp)
384
+uint32_t HELPER(advsimd_cge_f16)(uint32_t a, uint32_t b, float_status *fpst)
385
{
386
- float_status *fpst = fpstp;
387
int compare = float16_compare(a, b, fpst);
388
return ADVSIMD_CMPRES(compare == float_relation_greater ||
389
compare == float_relation_equal);
390
}
391
392
-uint32_t HELPER(advsimd_cgt_f16)(uint32_t a, uint32_t b, void *fpstp)
393
+uint32_t HELPER(advsimd_cgt_f16)(uint32_t a, uint32_t b, float_status *fpst)
394
{
395
- float_status *fpst = fpstp;
396
int compare = float16_compare(a, b, fpst);
397
return ADVSIMD_CMPRES(compare == float_relation_greater);
398
}
399
400
-uint32_t HELPER(advsimd_acge_f16)(uint32_t a, uint32_t b, void *fpstp)
401
+uint32_t HELPER(advsimd_acge_f16)(uint32_t a, uint32_t b, float_status *fpst)
402
{
403
- float_status *fpst = fpstp;
404
float16 f0 = float16_abs(a);
405
float16 f1 = float16_abs(b);
406
int compare = float16_compare(f0, f1, fpst);
407
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(advsimd_acge_f16)(uint32_t a, uint32_t b, void *fpstp)
408
compare == float_relation_equal);
409
}
410
411
-uint32_t HELPER(advsimd_acgt_f16)(uint32_t a, uint32_t b, void *fpstp)
412
+uint32_t HELPER(advsimd_acgt_f16)(uint32_t a, uint32_t b, float_status *fpst)
413
{
414
- float_status *fpst = fpstp;
415
float16 f0 = float16_abs(a);
416
float16 f1 = float16_abs(b);
417
int compare = float16_compare(f0, f1, fpst);
418
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(advsimd_acgt_f16)(uint32_t a, uint32_t b, void *fpstp)
419
}
420
421
/* round to integral */
422
-uint32_t HELPER(advsimd_rinth_exact)(uint32_t x, void *fp_status)
423
+uint32_t HELPER(advsimd_rinth_exact)(uint32_t x, float_status *fp_status)
424
{
425
return float16_round_to_int(x, fp_status);
426
}
427
428
-uint32_t HELPER(advsimd_rinth)(uint32_t x, void *fp_status)
429
+uint32_t HELPER(advsimd_rinth)(uint32_t x, float_status *fp_status)
430
{
431
int old_flags = get_float_exception_flags(fp_status), new_flags;
432
float16 ret;
--
2.34.1
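
On the translator side these conversions change nothing about how the helpers are invoked: the generated code still passes a pointer to a float_status that lives inside CPUARMState. In target/arm that pointer is produced by a small utility along the lines of the sketch below; the function and field names here are illustrative assumptions (the in-tree code has its own fpstatus_ptr()/FPST_* convention), but the mechanism is just an address computation on tcg_env.

    /* Sketch: build a TCGv_ptr pointing at a float_status embedded in CPUARMState. */
    static TCGv_ptr sketch_fpstatus_ptr(int status_offset)
    {
        TCGv_ptr fpst = tcg_temp_new_ptr();

        /*
         * e.g. status_offset = offsetof(CPUARMState, vfp.fp_status);
         * the exact field name depends on the tree.
         */
        tcg_gen_addi_ptr(fpst, tcg_env, status_offset);
        return fpst;
    }

The resulting TCGv_ptr is what ends up in the trailing float_status argument of calls such as gen_helper_vfp_mulxs(res, a, b, fpst).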
From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241206031224.78525-5-richard.henderson@linaro.org
Message-id: 20240912024114.1097832-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/gengvec.c    |   2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
target/arm/helper.h         | 284 ++++++++++++++++++------------------
target/arm/tcg/helper-a64.h |  18 +--
target/arm/tcg/helper-sve.h |  12 +-
target/arm/tcg/vec_helper.c |  60 ++++----
4 files changed, 183 insertions(+), 191 deletions(-)

diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
13
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/gengvec.c
16
--- a/target/arm/helper.h
15
+++ b/target/arm/tcg/gengvec.c
17
+++ b/target/arm/helper.h
16
@@ -XXX,XX +XXX,XX @@ void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_usdot_idx_b, TCG_CALL_NO_RWG,
17
tcg_gen_add_i32(d, d, t);
19
void, ptr, ptr, ptr, ptr, i32)
18
}
20
19
21
DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
20
- void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
22
- void, ptr, ptr, ptr, ptr, i32)
21
+void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
23
+ void, ptr, ptr, ptr, fpst, i32)
22
{
24
DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
23
TCGv_i64 t = tcg_temp_new_i64();
25
- void, ptr, ptr, ptr, ptr, i32)
24
26
+ void, ptr, ptr, ptr, fpst, i32)
27
DEF_HELPER_FLAGS_5(gvec_fcaddd, TCG_CALL_NO_RWG,
28
- void, ptr, ptr, ptr, ptr, i32)
29
+ void, ptr, ptr, ptr, fpst, i32)
30
31
DEF_HELPER_FLAGS_6(gvec_fcmlah, TCG_CALL_NO_RWG,
32
- void, ptr, ptr, ptr, ptr, ptr, i32)
33
+ void, ptr, ptr, ptr, ptr, fpst, i32)
34
DEF_HELPER_FLAGS_6(gvec_fcmlah_idx, TCG_CALL_NO_RWG,
35
- void, ptr, ptr, ptr, ptr, ptr, i32)
36
+ void, ptr, ptr, ptr, ptr, fpst, i32)
37
DEF_HELPER_FLAGS_6(gvec_fcmlas, TCG_CALL_NO_RWG,
38
- void, ptr, ptr, ptr, ptr, ptr, i32)
39
+ void, ptr, ptr, ptr, ptr, fpst, i32)
40
DEF_HELPER_FLAGS_6(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
41
- void, ptr, ptr, ptr, ptr, ptr, i32)
42
+ void, ptr, ptr, ptr, ptr, fpst, i32)
43
DEF_HELPER_FLAGS_6(gvec_fcmlad, TCG_CALL_NO_RWG,
44
- void, ptr, ptr, ptr, ptr, ptr, i32)
45
+ void, ptr, ptr, ptr, ptr, fpst, i32)
46
47
-DEF_HELPER_FLAGS_4(gvec_sstoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
48
-DEF_HELPER_FLAGS_4(gvec_sitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
49
-DEF_HELPER_FLAGS_4(gvec_ustoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
50
-DEF_HELPER_FLAGS_4(gvec_uitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
51
-DEF_HELPER_FLAGS_4(gvec_tosszh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
52
-DEF_HELPER_FLAGS_4(gvec_tosizs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
53
-DEF_HELPER_FLAGS_4(gvec_touszh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
54
-DEF_HELPER_FLAGS_4(gvec_touizs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
55
+DEF_HELPER_FLAGS_4(gvec_sstoh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
56
+DEF_HELPER_FLAGS_4(gvec_sitos, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
57
+DEF_HELPER_FLAGS_4(gvec_ustoh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
58
+DEF_HELPER_FLAGS_4(gvec_uitos, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
59
+DEF_HELPER_FLAGS_4(gvec_tosszh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
60
+DEF_HELPER_FLAGS_4(gvec_tosizs, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
61
+DEF_HELPER_FLAGS_4(gvec_touszh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
62
+DEF_HELPER_FLAGS_4(gvec_touizs, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
63
64
-DEF_HELPER_FLAGS_4(gvec_vcvt_sf, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
65
-DEF_HELPER_FLAGS_4(gvec_vcvt_uf, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
66
-DEF_HELPER_FLAGS_4(gvec_vcvt_rz_fs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
67
-DEF_HELPER_FLAGS_4(gvec_vcvt_rz_fu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
68
+DEF_HELPER_FLAGS_4(gvec_vcvt_sf, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
69
+DEF_HELPER_FLAGS_4(gvec_vcvt_uf, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
70
+DEF_HELPER_FLAGS_4(gvec_vcvt_rz_fs, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
71
+DEF_HELPER_FLAGS_4(gvec_vcvt_rz_fu, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
72
73
-DEF_HELPER_FLAGS_4(gvec_vcvt_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
74
-DEF_HELPER_FLAGS_4(gvec_vcvt_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
75
-DEF_HELPER_FLAGS_4(gvec_vcvt_rz_hs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
76
-DEF_HELPER_FLAGS_4(gvec_vcvt_rz_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
77
+DEF_HELPER_FLAGS_4(gvec_vcvt_sh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
78
+DEF_HELPER_FLAGS_4(gvec_vcvt_uh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
79
+DEF_HELPER_FLAGS_4(gvec_vcvt_rz_hs, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
80
+DEF_HELPER_FLAGS_4(gvec_vcvt_rz_hu, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
81
82
-DEF_HELPER_FLAGS_4(gvec_vcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
83
-DEF_HELPER_FLAGS_4(gvec_vcvt_ud, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
84
-DEF_HELPER_FLAGS_4(gvec_vcvt_rz_ds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
85
-DEF_HELPER_FLAGS_4(gvec_vcvt_rz_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
86
+DEF_HELPER_FLAGS_4(gvec_vcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
87
+DEF_HELPER_FLAGS_4(gvec_vcvt_ud, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
88
+DEF_HELPER_FLAGS_4(gvec_vcvt_rz_ds, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
89
+DEF_HELPER_FLAGS_4(gvec_vcvt_rz_du, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
90
91
-DEF_HELPER_FLAGS_4(gvec_vcvt_rm_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
92
-DEF_HELPER_FLAGS_4(gvec_vcvt_rm_ud, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
93
-DEF_HELPER_FLAGS_4(gvec_vcvt_rm_ss, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
94
-DEF_HELPER_FLAGS_4(gvec_vcvt_rm_us, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
95
-DEF_HELPER_FLAGS_4(gvec_vcvt_rm_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
96
-DEF_HELPER_FLAGS_4(gvec_vcvt_rm_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
97
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_sd, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
98
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_ud, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
99
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_ss, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
100
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_us, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
101
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_sh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
102
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_uh, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
103
104
-DEF_HELPER_FLAGS_4(gvec_vrint_rm_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
105
-DEF_HELPER_FLAGS_4(gvec_vrint_rm_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
106
+DEF_HELPER_FLAGS_4(gvec_vrint_rm_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
107
+DEF_HELPER_FLAGS_4(gvec_vrint_rm_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
108
109
-DEF_HELPER_FLAGS_4(gvec_vrintx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
110
-DEF_HELPER_FLAGS_4(gvec_vrintx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
111
+DEF_HELPER_FLAGS_4(gvec_vrintx_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
112
+DEF_HELPER_FLAGS_4(gvec_vrintx_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
113
114
-DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
115
-DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
116
-DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
117
+DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
118
+DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
119
+DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
120
121
-DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
122
-DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
123
-DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
124
+DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
125
+DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
126
+DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
127
128
-DEF_HELPER_FLAGS_4(gvec_fcgt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
129
-DEF_HELPER_FLAGS_4(gvec_fcgt0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
130
-DEF_HELPER_FLAGS_4(gvec_fcgt0_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
131
+DEF_HELPER_FLAGS_4(gvec_fcgt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
132
+DEF_HELPER_FLAGS_4(gvec_fcgt0_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
133
+DEF_HELPER_FLAGS_4(gvec_fcgt0_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
134
135
-DEF_HELPER_FLAGS_4(gvec_fcge0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
136
-DEF_HELPER_FLAGS_4(gvec_fcge0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
137
-DEF_HELPER_FLAGS_4(gvec_fcge0_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
138
+DEF_HELPER_FLAGS_4(gvec_fcge0_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
139
+DEF_HELPER_FLAGS_4(gvec_fcge0_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
140
+DEF_HELPER_FLAGS_4(gvec_fcge0_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
141
142
-DEF_HELPER_FLAGS_4(gvec_fceq0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
143
-DEF_HELPER_FLAGS_4(gvec_fceq0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
144
-DEF_HELPER_FLAGS_4(gvec_fceq0_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
145
+DEF_HELPER_FLAGS_4(gvec_fceq0_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
146
+DEF_HELPER_FLAGS_4(gvec_fceq0_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
147
+DEF_HELPER_FLAGS_4(gvec_fceq0_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
148
149
-DEF_HELPER_FLAGS_4(gvec_fcle0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
150
-DEF_HELPER_FLAGS_4(gvec_fcle0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
151
-DEF_HELPER_FLAGS_4(gvec_fcle0_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
152
+DEF_HELPER_FLAGS_4(gvec_fcle0_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
153
+DEF_HELPER_FLAGS_4(gvec_fcle0_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
154
+DEF_HELPER_FLAGS_4(gvec_fcle0_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
155
156
-DEF_HELPER_FLAGS_4(gvec_fclt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
157
-DEF_HELPER_FLAGS_4(gvec_fclt0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
158
-DEF_HELPER_FLAGS_4(gvec_fclt0_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
159
+DEF_HELPER_FLAGS_4(gvec_fclt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
160
+DEF_HELPER_FLAGS_4(gvec_fclt0_s, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
161
+DEF_HELPER_FLAGS_4(gvec_fclt0_d, TCG_CALL_NO_RWG, void, ptr, ptr, fpst, i32)
162
163
-DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
164
-DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
165
-DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
166
+DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
167
+DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
168
+DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
169
170
-DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
171
-DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
172
-DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
173
+DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
174
+DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
175
+DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
176
177
-DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
178
-DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
179
-DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
180
+DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
181
+DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
182
+DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
183
184
-DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
185
-DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
186
-DEF_HELPER_FLAGS_5(gvec_fabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
187
+DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
188
+DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
189
+DEF_HELPER_FLAGS_5(gvec_fabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
190
191
-DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
192
-DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
193
-DEF_HELPER_FLAGS_5(gvec_fceq_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
194
+DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
195
+DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
196
+DEF_HELPER_FLAGS_5(gvec_fceq_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
197
198
-DEF_HELPER_FLAGS_5(gvec_fcge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
199
-DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
200
-DEF_HELPER_FLAGS_5(gvec_fcge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
201
+DEF_HELPER_FLAGS_5(gvec_fcge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
202
+DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
203
+DEF_HELPER_FLAGS_5(gvec_fcge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
204
205
-DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
206
-DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
207
-DEF_HELPER_FLAGS_5(gvec_fcgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
208
+DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
209
+DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
210
+DEF_HELPER_FLAGS_5(gvec_fcgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
211
212
-DEF_HELPER_FLAGS_5(gvec_facge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
213
-DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
214
-DEF_HELPER_FLAGS_5(gvec_facge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
215
+DEF_HELPER_FLAGS_5(gvec_facge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
216
+DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
217
+DEF_HELPER_FLAGS_5(gvec_facge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
218
219
-DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
220
-DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
221
-DEF_HELPER_FLAGS_5(gvec_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
222
+DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
223
+DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
224
+DEF_HELPER_FLAGS_5(gvec_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
225
226
-DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
227
-DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
228
-DEF_HELPER_FLAGS_5(gvec_fmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
229
+DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
230
+DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
231
+DEF_HELPER_FLAGS_5(gvec_fmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
232
233
-DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
234
-DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
235
-DEF_HELPER_FLAGS_5(gvec_fmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
236
+DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
237
+DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
238
+DEF_HELPER_FLAGS_5(gvec_fmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
239
240
-DEF_HELPER_FLAGS_5(gvec_fmaxnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
241
-DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
242
-DEF_HELPER_FLAGS_5(gvec_fmaxnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
243
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
244
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
245
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
246
247
-DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
248
-DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
249
-DEF_HELPER_FLAGS_5(gvec_fminnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
250
+DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
251
+DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
252
+DEF_HELPER_FLAGS_5(gvec_fminnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
253
254
-DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
255
-DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
256
+DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
257
+DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
258
259
-DEF_HELPER_FLAGS_5(gvec_rsqrts_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
260
-DEF_HELPER_FLAGS_5(gvec_rsqrts_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
261
+DEF_HELPER_FLAGS_5(gvec_rsqrts_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
262
+DEF_HELPER_FLAGS_5(gvec_rsqrts_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
263
264
-DEF_HELPER_FLAGS_5(gvec_fmla_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
265
-DEF_HELPER_FLAGS_5(gvec_fmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
266
+DEF_HELPER_FLAGS_5(gvec_fmla_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
267
+DEF_HELPER_FLAGS_5(gvec_fmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
268
269
-DEF_HELPER_FLAGS_5(gvec_fmls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
270
-DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
271
+DEF_HELPER_FLAGS_5(gvec_fmls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
272
+DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
273
274
-DEF_HELPER_FLAGS_5(gvec_vfma_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
275
-DEF_HELPER_FLAGS_5(gvec_vfma_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
276
-DEF_HELPER_FLAGS_5(gvec_vfma_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
277
+DEF_HELPER_FLAGS_5(gvec_vfma_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
278
+DEF_HELPER_FLAGS_5(gvec_vfma_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
279
+DEF_HELPER_FLAGS_5(gvec_vfma_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
280
281
-DEF_HELPER_FLAGS_5(gvec_vfms_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
282
-DEF_HELPER_FLAGS_5(gvec_vfms_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
283
-DEF_HELPER_FLAGS_5(gvec_vfms_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
284
+DEF_HELPER_FLAGS_5(gvec_vfms_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
285
+DEF_HELPER_FLAGS_5(gvec_vfms_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
286
+DEF_HELPER_FLAGS_5(gvec_vfms_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
287
288
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
289
- void, ptr, ptr, ptr, ptr, i32)
290
+ void, ptr, ptr, ptr, fpst, i32)
291
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
292
- void, ptr, ptr, ptr, ptr, i32)
293
+ void, ptr, ptr, ptr, fpst, i32)
294
DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
295
- void, ptr, ptr, ptr, ptr, i32)
296
+ void, ptr, ptr, ptr, fpst, i32)
297
298
DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG,
299
- void, ptr, ptr, ptr, ptr, i32)
300
+ void, ptr, ptr, ptr, fpst, i32)
301
DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG,
302
- void, ptr, ptr, ptr, ptr, i32)
303
+ void, ptr, ptr, ptr, fpst, i32)
304
DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG,
305
- void, ptr, ptr, ptr, ptr, i32)
306
+ void, ptr, ptr, ptr, fpst, i32)
307
308
DEF_HELPER_FLAGS_5(gvec_fmla_nf_idx_h, TCG_CALL_NO_RWG,
309
- void, ptr, ptr, ptr, ptr, i32)
310
+ void, ptr, ptr, ptr, fpst, i32)
311
DEF_HELPER_FLAGS_5(gvec_fmla_nf_idx_s, TCG_CALL_NO_RWG,
312
- void, ptr, ptr, ptr, ptr, i32)
313
+ void, ptr, ptr, ptr, fpst, i32)
314
315
DEF_HELPER_FLAGS_5(gvec_fmls_nf_idx_h, TCG_CALL_NO_RWG,
316
- void, ptr, ptr, ptr, ptr, i32)
317
+ void, ptr, ptr, ptr, fpst, i32)
318
DEF_HELPER_FLAGS_5(gvec_fmls_nf_idx_s, TCG_CALL_NO_RWG,
319
- void, ptr, ptr, ptr, ptr, i32)
320
+ void, ptr, ptr, ptr, fpst, i32)
321
322
DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG,
323
- void, ptr, ptr, ptr, ptr, ptr, i32)
324
+ void, ptr, ptr, ptr, ptr, fpst, i32)
325
DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG,
326
- void, ptr, ptr, ptr, ptr, ptr, i32)
327
+ void, ptr, ptr, ptr, ptr, fpst, i32)
328
DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG,
329
- void, ptr, ptr, ptr, ptr, ptr, i32)
330
+ void, ptr, ptr, ptr, ptr, fpst, i32)
331
332
DEF_HELPER_FLAGS_5(gvec_uqadd_b, TCG_CALL_NO_RWG,
333
void, ptr, ptr, ptr, ptr, i32)
334
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_bfmmla, TCG_CALL_NO_RWG,
335
void, ptr, ptr, ptr, ptr, env, i32)
336
337
DEF_HELPER_FLAGS_6(gvec_bfmlal, TCG_CALL_NO_RWG,
338
- void, ptr, ptr, ptr, ptr, ptr, i32)
339
+ void, ptr, ptr, ptr, ptr, fpst, i32)
340
DEF_HELPER_FLAGS_6(gvec_bfmlal_idx, TCG_CALL_NO_RWG,
341
- void, ptr, ptr, ptr, ptr, ptr, i32)
342
+ void, ptr, ptr, ptr, ptr, fpst, i32)
343
344
DEF_HELPER_FLAGS_5(gvec_sclamp_b, TCG_CALL_NO_RWG,
345
void, ptr, ptr, ptr, ptr, i32)
346
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_uclamp_s, TCG_CALL_NO_RWG,
347
DEF_HELPER_FLAGS_5(gvec_uclamp_d, TCG_CALL_NO_RWG,
348
void, ptr, ptr, ptr, ptr, i32)
349
350
-DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
351
-DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
352
-DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
353
+DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
354
+DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
355
+DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
356
357
-DEF_HELPER_FLAGS_5(gvec_fmaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
358
-DEF_HELPER_FLAGS_5(gvec_fmaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
359
-DEF_HELPER_FLAGS_5(gvec_fmaxp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
360
+DEF_HELPER_FLAGS_5(gvec_fmaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
361
+DEF_HELPER_FLAGS_5(gvec_fmaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
362
+DEF_HELPER_FLAGS_5(gvec_fmaxp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
363
364
-DEF_HELPER_FLAGS_5(gvec_fminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
365
-DEF_HELPER_FLAGS_5(gvec_fminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
366
-DEF_HELPER_FLAGS_5(gvec_fminp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
367
+DEF_HELPER_FLAGS_5(gvec_fminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
368
+DEF_HELPER_FLAGS_5(gvec_fminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
369
+DEF_HELPER_FLAGS_5(gvec_fminp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
370
371
-DEF_HELPER_FLAGS_5(gvec_fmaxnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
372
-DEF_HELPER_FLAGS_5(gvec_fmaxnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
373
-DEF_HELPER_FLAGS_5(gvec_fmaxnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
374
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
375
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
376
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
377
378
-DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
379
-DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
380
-DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
381
+DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
382
+DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
383
+DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
384
385
DEF_HELPER_FLAGS_4(gvec_addp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
386
DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
387
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
388
index XXXXXXX..XXXXXXX 100644
389
--- a/target/arm/tcg/helper-a64.h
390
+++ b/target/arm/tcg/helper-a64.h
391
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
392
DEF_HELPER_FLAGS_1(guarded_page_check, TCG_CALL_NO_WG, void, env)
393
DEF_HELPER_FLAGS_2(guarded_page_br, TCG_CALL_NO_RWG, void, env, tl)
394
395
-DEF_HELPER_FLAGS_5(gvec_fdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
396
-DEF_HELPER_FLAGS_5(gvec_fdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
397
-DEF_HELPER_FLAGS_5(gvec_fdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
398
+DEF_HELPER_FLAGS_5(gvec_fdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
399
+DEF_HELPER_FLAGS_5(gvec_fdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
400
+DEF_HELPER_FLAGS_5(gvec_fdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
401
402
-DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
403
-DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
404
-DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
405
+DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
406
+DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
407
+DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
408
409
-DEF_HELPER_FLAGS_5(gvec_fmulx_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
410
-DEF_HELPER_FLAGS_5(gvec_fmulx_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
411
-DEF_HELPER_FLAGS_5(gvec_fmulx_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
412
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
413
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
414
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
415
diff --git a/target/arm/tcg/helper-sve.h b/target/arm/tcg/helper-sve.h
416
index XXXXXXX..XXXXXXX 100644
417
--- a/target/arm/tcg/helper-sve.h
418
+++ b/target/arm/tcg/helper-sve.h
419
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
420
DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
421
422
DEF_HELPER_FLAGS_5(gvec_recps_h, TCG_CALL_NO_RWG,
423
- void, ptr, ptr, ptr, ptr, i32)
424
+ void, ptr, ptr, ptr, fpst, i32)
425
DEF_HELPER_FLAGS_5(gvec_recps_s, TCG_CALL_NO_RWG,
426
- void, ptr, ptr, ptr, ptr, i32)
427
+ void, ptr, ptr, ptr, fpst, i32)
428
DEF_HELPER_FLAGS_5(gvec_recps_d, TCG_CALL_NO_RWG,
429
- void, ptr, ptr, ptr, ptr, i32)
430
+ void, ptr, ptr, ptr, fpst, i32)
431
432
DEF_HELPER_FLAGS_5(gvec_rsqrts_h, TCG_CALL_NO_RWG,
433
- void, ptr, ptr, ptr, ptr, i32)
434
+ void, ptr, ptr, ptr, fpst, i32)
435
DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
436
- void, ptr, ptr, ptr, ptr, i32)
437
+ void, ptr, ptr, ptr, fpst, i32)
438
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
439
- void, ptr, ptr, ptr, ptr, i32)
440
+ void, ptr, ptr, ptr, fpst, i32)
441
442
DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG,
443
i64, ptr, ptr, ptr, i32)
444
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
445
index XXXXXXX..XXXXXXX 100644
446
--- a/target/arm/tcg/vec_helper.c
447
+++ b/target/arm/tcg/vec_helper.c
448
@@ -XXX,XX +XXX,XX @@ DO_DOT_IDX(gvec_sdot_idx_h, int64_t, int16_t, int16_t, H8)
449
DO_DOT_IDX(gvec_udot_idx_h, uint64_t, uint16_t, uint16_t, H8)
450
451
void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
452
- void *vfpst, uint32_t desc)
453
+ float_status *fpst, uint32_t desc)
454
{
455
uintptr_t opr_sz = simd_oprsz(desc);
456
float16 *d = vd;
457
float16 *n = vn;
458
float16 *m = vm;
459
- float_status *fpst = vfpst;
460
uint32_t neg_real = extract32(desc, SIMD_DATA_SHIFT, 1);
461
uint32_t neg_imag = neg_real ^ 1;
462
uintptr_t i;
463
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
464
}
465
466
void HELPER(gvec_fcadds)(void *vd, void *vn, void *vm,
467
- void *vfpst, uint32_t desc)
468
+ float_status *fpst, uint32_t desc)
469
{
470
uintptr_t opr_sz = simd_oprsz(desc);
471
float32 *d = vd;
472
float32 *n = vn;
473
float32 *m = vm;
474
- float_status *fpst = vfpst;
475
uint32_t neg_real = extract32(desc, SIMD_DATA_SHIFT, 1);
476
uint32_t neg_imag = neg_real ^ 1;
477
uintptr_t i;
478
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcadds)(void *vd, void *vn, void *vm,
479
}
480
481
void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm,
482
- void *vfpst, uint32_t desc)
483
+ float_status *fpst, uint32_t desc)
484
{
485
uintptr_t opr_sz = simd_oprsz(desc);
486
float64 *d = vd;
487
float64 *n = vn;
488
float64 *m = vm;
489
- float_status *fpst = vfpst;
490
uint64_t neg_real = extract64(desc, SIMD_DATA_SHIFT, 1);
491
uint64_t neg_imag = neg_real ^ 1;
492
uintptr_t i;
493
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm,
494
}
495
496
void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, void *va,
497
- void *vfpst, uint32_t desc)
498
+ float_status *fpst, uint32_t desc)
499
{
500
uintptr_t opr_sz = simd_oprsz(desc);
501
float16 *d = vd, *n = vn, *m = vm, *a = va;
502
- float_status *fpst = vfpst;
503
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
504
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
505
uint32_t neg_real = flip ^ neg_imag;
506
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, void *va,
507
}
508
509
void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va,
510
- void *vfpst, uint32_t desc)
511
+ float_status *fpst, uint32_t desc)
512
{
513
uintptr_t opr_sz = simd_oprsz(desc);
514
float16 *d = vd, *n = vn, *m = vm, *a = va;
515
- float_status *fpst = vfpst;
516
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
517
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
518
intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
519
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va,
520
}
521
522
void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, void *va,
523
- void *vfpst, uint32_t desc)
524
+ float_status *fpst, uint32_t desc)
525
{
526
uintptr_t opr_sz = simd_oprsz(desc);
527
float32 *d = vd, *n = vn, *m = vm, *a = va;
528
- float_status *fpst = vfpst;
529
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
530
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
531
uint32_t neg_real = flip ^ neg_imag;
532
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, void *va,
533
}
534
535
void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va,
536
- void *vfpst, uint32_t desc)
537
+ float_status *fpst, uint32_t desc)
538
{
539
uintptr_t opr_sz = simd_oprsz(desc);
540
float32 *d = vd, *n = vn, *m = vm, *a = va;
541
- float_status *fpst = vfpst;
542
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
543
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
544
intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
545
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va,
546
}
547
548
void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, void *va,
549
- void *vfpst, uint32_t desc)
550
+ float_status *fpst, uint32_t desc)
551
{
552
uintptr_t opr_sz = simd_oprsz(desc);
553
float64 *d = vd, *n = vn, *m = vm, *a = va;
554
- float_status *fpst = vfpst;
555
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
556
uint64_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
557
uint64_t neg_real = flip ^ neg_imag;
558
@@ -XXX,XX +XXX,XX @@ static uint64_t float64_acgt(float64 op1, float64 op2, float_status *stat)
559
return -float64_lt(float64_abs(op2), float64_abs(op1), stat);
560
}
561
562
-static int16_t vfp_tosszh(float16 x, void *fpstp)
563
+static int16_t vfp_tosszh(float16 x, float_status *fpst)
564
{
565
- float_status *fpst = fpstp;
566
if (float16_is_any_nan(x)) {
567
float_raise(float_flag_invalid, fpst);
568
return 0;
569
@@ -XXX,XX +XXX,XX @@ static int16_t vfp_tosszh(float16 x, void *fpstp)
570
return float16_to_int16_round_to_zero(x, fpst);
571
}
572
573
-static uint16_t vfp_touszh(float16 x, void *fpstp)
574
+static uint16_t vfp_touszh(float16 x, float_status *fpst)
575
{
576
- float_status *fpst = fpstp;
577
if (float16_is_any_nan(x)) {
578
float_raise(float_flag_invalid, fpst);
579
return 0;
580
@@ -XXX,XX +XXX,XX @@ static uint16_t vfp_touszh(float16 x, void *fpstp)
581
}
582
583
#define DO_2OP(NAME, FUNC, TYPE) \
584
-void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
585
+void HELPER(NAME)(void *vd, void *vn, float_status *stat, uint32_t desc) \
586
{ \
587
intptr_t i, oprsz = simd_oprsz(desc); \
588
TYPE *d = vd, *n = vn; \
589
@@ -XXX,XX +XXX,XX @@ static float32 float32_rsqrts_nf(float32 op1, float32 op2, float_status *stat)
 }
 
 #define DO_3OP(NAME, FUNC, TYPE) \
-void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, \
+                  float_status *stat, uint32_t desc) \
 { \
     intptr_t i, oprsz = simd_oprsz(desc); \
     TYPE *d = vd, *n = vn, *m = vm; \
@@ -XXX,XX +XXX,XX @@ static float64 float64_mulsub_f(float64 dest, float64 op1, float64 op2,
     return float64_muladd(float64_chs(op1), op2, dest, 0, stat);
 }
 
-#define DO_MULADD(NAME, FUNC, TYPE) \
-void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+#define DO_MULADD(NAME, FUNC, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, \
+                  float_status *stat, uint32_t desc) \
 { \
     intptr_t i, oprsz = simd_oprsz(desc); \
     TYPE *d = vd, *n = vn, *m = vm; \
@@ -XXX,XX +XXX,XX @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, H8)
 #undef DO_MLA_IDX
 
 #define DO_FMUL_IDX(NAME, ADD, MUL, TYPE, H) \
-void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, \
+                  float_status *stat, uint32_t desc) \
 { \
     intptr_t i, j, oprsz = simd_oprsz(desc); \
     intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
@@ -XXX,XX +XXX,XX @@ DO_FMUL_IDX(gvec_fmls_nf_idx_s, float32_sub, float32_mul, float32, H4)
 
 #define DO_FMLA_IDX(NAME, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
-                  void *stat, uint32_t desc) \
+                  float_status *stat, uint32_t desc) \
 { \
     intptr_t i, j, oprsz = simd_oprsz(desc); \
     intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
@@ -XXX,XX +XXX,XX @@ DO_ABA(gvec_uaba_d, uint64_t)
 #undef DO_ABA
 
 #define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
-void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, \
+                  float_status *stat, uint32_t desc) \
 { \
     ARMVectorReg scratch; \
     intptr_t oprsz = simd_oprsz(desc); \
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_uminp_s, MIN, uint32_t, H4)
 #undef DO_3OP_PAIR
 
 #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
-    void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
+    void HELPER(NAME)(void *vd, void *vn, float_status *stat, uint32_t desc) \
     { \
         intptr_t i, oprsz = simd_oprsz(desc); \
         int shift = simd_data(desc); \
@@ -XXX,XX +XXX,XX @@ DO_VCVT_FIXED(gvec_vcvt_rz_hu, helper_vfp_touhh_round_to_zero, uint16_t)
 #undef DO_VCVT_FIXED
 
 #define DO_VCVT_RMODE(NAME, FUNC, TYPE) \
-    void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
+    void HELPER(NAME)(void *vd, void *vn, float_status *fpst, uint32_t desc) \
     { \
-        float_status *fpst = stat; \
         intptr_t i, oprsz = simd_oprsz(desc); \
         uint32_t rmode = simd_data(desc); \
         uint32_t prev_rmode = get_float_rounding_mode(fpst); \
@@ -XXX,XX +XXX,XX @@ DO_VCVT_RMODE(gvec_vcvt_rm_uh, helper_vfp_touhh, uint16_t)
 #undef DO_VCVT_RMODE
 
 #define DO_VRINT_RMODE(NAME, FUNC, TYPE) \
-    void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
+    void HELPER(NAME)(void *vd, void *vn, float_status *fpst, uint32_t desc) \
     { \
-        float_status *fpst = stat; \
         intptr_t i, oprsz = simd_oprsz(desc); \
         uint32_t rmode = simd_data(desc); \
         uint32_t prev_rmode = get_float_rounding_mode(fpst); \
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va,
 }
 
 void HELPER(gvec_bfmlal)(void *vd, void *vn, void *vm, void *va,
-                         void *stat, uint32_t desc)
+                         float_status *stat, uint32_t desc)
 {
     intptr_t i, opr_sz = simd_oprsz(desc);
     intptr_t sel = simd_data(desc);
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_bfmlal)(void *vd, void *vn, void *vm, void *va,
 }
 
 void HELPER(gvec_bfmlal_idx)(void *vd, void *vn, void *vm,
-                             void *va, void *stat, uint32_t desc)
+                             void *va, float_status *stat, uint32_t desc)
 {
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     intptr_t sel = extract32(desc, SIMD_DATA_SHIFT, 1);
-- 
2.34.1
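[Note: the following is an illustrative sketch, not code from the patch; "example" is a placeholder helper name.] The shape of the conversion above is that a helper which previously took an opaque pointer and cast it internally now receives the float_status pointer directly:

    /* Before: opaque 'void *' parameter, cast inside the helper. */
    void HELPER(example)(void *vd, void *vn, void *stat, uint32_t desc)
    {
        float_status *fpst = stat;
        /* ... operate using fpst ... */
    }

    /* After: typed parameter, no cast needed. */
    void HELPER(example)(void *vd, void *vn, float_status *fpst, uint32_t desc)
    {
        /* ... operate using fpst ... */
    }

The matching DEF_HELPER declarations switch the argument type from 'ptr' to the 'fpst' alias so that the generated prototypes agree with the new definitions.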
From: Richard Henderson <richard.henderson@linaro.org>

Instead of copying a constant into a temporary with dupi,
use a vector constant directly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-sve.c | 128 +++++++++++++--------------
 1 file changed, 49 insertions(+), 79 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241206031224.78525-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h          | 14 +++++++-------
 target/arm/tcg/neon_helper.c | 21 +++++++--------------
 2 files changed, 14 insertions(+), 21 deletions(-)

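To make the translate-sve.c change concrete, the recurring pattern it rewrites looks roughly like this (a sketch only; d, n, vece and halfbits stand in for whatever operands a given expander uses, they are not tied to one particular hunk):

    /* Before: materialise the constant in a scratch vector with dupi. */
    TCGv_vec t = tcg_temp_new_vec_matching(d);
    tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
    tcg_gen_and_vec(vece, d, n, t);

    /* After: feed a vector constant straight into the operation. */
    tcg_gen_and_vec(vece, d, n,
                    tcg_constant_vec_matching(d, vece,
                                              MAKE_64BIT_MASK(0, halfbits)));

The constant form avoids allocating and writing a temporary for a value that never changes.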
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
12
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/tcg/translate-sve.c
14
--- a/target/arm/helper.h
17
+++ b/target/arm/tcg/translate-sve.c
15
+++ b/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ static void gen_sshll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm)
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qneg_s16, TCG_CALL_NO_RWG, i32, env, i32)
19
17
DEF_HELPER_FLAGS_2(neon_qneg_s32, TCG_CALL_NO_RWG, i32, env, i32)
20
if (top) {
18
DEF_HELPER_FLAGS_2(neon_qneg_s64, TCG_CALL_NO_RWG, i64, env, i64)
21
if (shl == halfbits) {
19
22
- TCGv_vec t = tcg_temp_new_vec_matching(d);
20
-DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, ptr)
23
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(halfbits, halfbits));
21
-DEF_HELPER_3(neon_cge_f32, i32, i32, i32, ptr)
24
- tcg_gen_and_vec(vece, d, n, t);
22
-DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
25
+ tcg_gen_and_vec(vece, d, n,
23
-DEF_HELPER_3(neon_acge_f32, i32, i32, i32, ptr)
26
+ tcg_constant_vec_matching(d, vece,
24
-DEF_HELPER_3(neon_acgt_f32, i32, i32, i32, ptr)
27
+ MAKE_64BIT_MASK(halfbits, halfbits)));
25
-DEF_HELPER_3(neon_acge_f64, i64, i64, i64, ptr)
28
} else {
26
-DEF_HELPER_3(neon_acgt_f64, i64, i64, i64, ptr)
29
tcg_gen_sari_vec(vece, d, n, halfbits);
27
+DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, fpst)
30
tcg_gen_shli_vec(vece, d, d, shl);
28
+DEF_HELPER_3(neon_cge_f32, i32, i32, i32, fpst)
31
@@ -XXX,XX +XXX,XX @@ static void gen_ushll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm)
29
+DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, fpst)
32
30
+DEF_HELPER_3(neon_acge_f32, i32, i32, i32, fpst)
33
if (top) {
31
+DEF_HELPER_3(neon_acgt_f32, i32, i32, i32, fpst)
34
if (shl == halfbits) {
32
+DEF_HELPER_3(neon_acge_f64, i64, i64, i64, fpst)
35
- TCGv_vec t = tcg_temp_new_vec_matching(d);
33
+DEF_HELPER_3(neon_acgt_f64, i64, i64, i64, fpst)
36
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(halfbits, halfbits));
34
37
- tcg_gen_and_vec(vece, d, n, t);
35
/* iwmmxt_helper.c */
38
+ tcg_gen_and_vec(vece, d, n,
36
DEF_HELPER_2(iwmmxt_maddsq, i64, i64, i64)
39
+ tcg_constant_vec_matching(d, vece,
37
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
40
+ MAKE_64BIT_MASK(halfbits, halfbits)));
38
index XXXXXXX..XXXXXXX 100644
41
} else {
39
--- a/target/arm/tcg/neon_helper.c
42
tcg_gen_shri_vec(vece, d, n, halfbits);
40
+++ b/target/arm/tcg/neon_helper.c
43
tcg_gen_shli_vec(vece, d, d, shl);
41
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qneg_s64)(CPUARMState *env, uint64_t x)
44
}
42
* Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
45
} else {
43
* Softfloat routines return 0/1, which we convert to the 0/-1 Neon requires.
46
if (shl == 0) {
44
*/
47
- TCGv_vec t = tcg_temp_new_vec_matching(d);
45
-uint32_t HELPER(neon_ceq_f32)(uint32_t a, uint32_t b, void *fpstp)
48
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
46
+uint32_t HELPER(neon_ceq_f32)(uint32_t a, uint32_t b, float_status *fpst)
49
- tcg_gen_and_vec(vece, d, n, t);
50
+ tcg_gen_and_vec(vece, d, n,
51
+ tcg_constant_vec_matching(d, vece,
52
+ MAKE_64BIT_MASK(0, halfbits)));
53
} else {
54
tcg_gen_shli_vec(vece, d, n, halfbits);
55
tcg_gen_shri_vec(vece, d, d, halfbits - shl);
56
@@ -XXX,XX +XXX,XX @@ static const TCGOpcode sqxtn_list[] = {
57
58
static void gen_sqxtnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n)
59
{
47
{
60
- TCGv_vec t = tcg_temp_new_vec_matching(d);
48
- float_status *fpst = fpstp;
61
int halfbits = 4 << vece;
49
return -float32_eq_quiet(make_float32(a), make_float32(b), fpst);
62
int64_t mask = (1ull << halfbits) - 1;
63
int64_t min = -1ull << (halfbits - 1);
64
int64_t max = -min - 1;
65
66
- tcg_gen_dupi_vec(vece, t, min);
67
- tcg_gen_smax_vec(vece, d, n, t);
68
- tcg_gen_dupi_vec(vece, t, max);
69
- tcg_gen_smin_vec(vece, d, d, t);
70
- tcg_gen_dupi_vec(vece, t, mask);
71
- tcg_gen_and_vec(vece, d, d, t);
72
+ tcg_gen_smax_vec(vece, d, n, tcg_constant_vec_matching(d, vece, min));
73
+ tcg_gen_smin_vec(vece, d, d, tcg_constant_vec_matching(d, vece, max));
74
+ tcg_gen_and_vec(vece, d, d, tcg_constant_vec_matching(d, vece, mask));
75
}
50
}
76
51
77
static const GVecGen2 sqxtnb_ops[3] = {
52
-uint32_t HELPER(neon_cge_f32)(uint32_t a, uint32_t b, void *fpstp)
78
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQXTNB, aa64_sve2, do_narrow_extract, a, sqxtnb_ops)
53
+uint32_t HELPER(neon_cge_f32)(uint32_t a, uint32_t b, float_status *fpst)
79
80
static void gen_sqxtnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n)
81
{
54
{
82
- TCGv_vec t = tcg_temp_new_vec_matching(d);
55
- float_status *fpst = fpstp;
83
int halfbits = 4 << vece;
56
return -float32_le(make_float32(b), make_float32(a), fpst);
84
int64_t mask = (1ull << halfbits) - 1;
85
int64_t min = -1ull << (halfbits - 1);
86
int64_t max = -min - 1;
87
88
- tcg_gen_dupi_vec(vece, t, min);
89
- tcg_gen_smax_vec(vece, n, n, t);
90
- tcg_gen_dupi_vec(vece, t, max);
91
- tcg_gen_smin_vec(vece, n, n, t);
92
+ tcg_gen_smax_vec(vece, n, n, tcg_constant_vec_matching(d, vece, min));
93
+ tcg_gen_smin_vec(vece, n, n, tcg_constant_vec_matching(d, vece, max));
94
tcg_gen_shli_vec(vece, n, n, halfbits);
95
- tcg_gen_dupi_vec(vece, t, mask);
96
- tcg_gen_bitsel_vec(vece, d, t, d, n);
97
+ tcg_gen_bitsel_vec(vece, d, tcg_constant_vec_matching(d, vece, mask), d, n);
98
}
57
}
99
58
100
static const GVecGen2 sqxtnt_ops[3] = {
59
-uint32_t HELPER(neon_cgt_f32)(uint32_t a, uint32_t b, void *fpstp)
101
@@ -XXX,XX +XXX,XX @@ static const TCGOpcode uqxtn_list[] = {
60
+uint32_t HELPER(neon_cgt_f32)(uint32_t a, uint32_t b, float_status *fpst)
102
103
static void gen_uqxtnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n)
104
{
61
{
105
- TCGv_vec t = tcg_temp_new_vec_matching(d);
62
- float_status *fpst = fpstp;
106
int halfbits = 4 << vece;
63
return -float32_lt(make_float32(b), make_float32(a), fpst);
107
int64_t max = (1ull << halfbits) - 1;
108
109
- tcg_gen_dupi_vec(vece, t, max);
110
- tcg_gen_umin_vec(vece, d, n, t);
111
+ tcg_gen_umin_vec(vece, d, n, tcg_constant_vec_matching(d, vece, max));
112
}
64
}
113
65
114
static const GVecGen2 uqxtnb_ops[3] = {
66
-uint32_t HELPER(neon_acge_f32)(uint32_t a, uint32_t b, void *fpstp)
115
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(UQXTNB, aa64_sve2, do_narrow_extract, a, uqxtnb_ops)
67
+uint32_t HELPER(neon_acge_f32)(uint32_t a, uint32_t b, float_status *fpst)
116
117
static void gen_uqxtnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n)
118
{
68
{
119
- TCGv_vec t = tcg_temp_new_vec_matching(d);
69
- float_status *fpst = fpstp;
120
int halfbits = 4 << vece;
70
float32 f0 = float32_abs(make_float32(a));
121
int64_t max = (1ull << halfbits) - 1;
71
float32 f1 = float32_abs(make_float32(b));
122
+ TCGv_vec maxv = tcg_constant_vec_matching(d, vece, max);
72
return -float32_le(f1, f0, fpst);
123
124
- tcg_gen_dupi_vec(vece, t, max);
125
- tcg_gen_umin_vec(vece, n, n, t);
126
+ tcg_gen_umin_vec(vece, n, n, maxv);
127
tcg_gen_shli_vec(vece, n, n, halfbits);
128
- tcg_gen_bitsel_vec(vece, d, t, d, n);
129
+ tcg_gen_bitsel_vec(vece, d, maxv, d, n);
130
}
73
}
131
74
132
static const GVecGen2 uqxtnt_ops[3] = {
75
-uint32_t HELPER(neon_acgt_f32)(uint32_t a, uint32_t b, void *fpstp)
133
@@ -XXX,XX +XXX,XX @@ static const TCGOpcode sqxtun_list[] = {
76
+uint32_t HELPER(neon_acgt_f32)(uint32_t a, uint32_t b, float_status *fpst)
134
135
static void gen_sqxtunb_vec(unsigned vece, TCGv_vec d, TCGv_vec n)
136
{
77
{
137
- TCGv_vec t = tcg_temp_new_vec_matching(d);
78
- float_status *fpst = fpstp;
138
int halfbits = 4 << vece;
79
float32 f0 = float32_abs(make_float32(a));
139
int64_t max = (1ull << halfbits) - 1;
80
float32 f1 = float32_abs(make_float32(b));
140
81
return -float32_lt(f1, f0, fpst);
141
- tcg_gen_dupi_vec(vece, t, 0);
142
- tcg_gen_smax_vec(vece, d, n, t);
143
- tcg_gen_dupi_vec(vece, t, max);
144
- tcg_gen_umin_vec(vece, d, d, t);
145
+ tcg_gen_smax_vec(vece, d, n, tcg_constant_vec_matching(d, vece, 0));
146
+ tcg_gen_umin_vec(vece, d, d, tcg_constant_vec_matching(d, vece, max));
147
}
82
}
148
83
149
static const GVecGen2 sqxtunb_ops[3] = {
84
-uint64_t HELPER(neon_acge_f64)(uint64_t a, uint64_t b, void *fpstp)
150
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQXTUNB, aa64_sve2, do_narrow_extract, a, sqxtunb_ops)
85
+uint64_t HELPER(neon_acge_f64)(uint64_t a, uint64_t b, float_status *fpst)
151
152
static void gen_sqxtunt_vec(unsigned vece, TCGv_vec d, TCGv_vec n)
153
{
86
{
154
- TCGv_vec t = tcg_temp_new_vec_matching(d);
87
- float_status *fpst = fpstp;
155
int halfbits = 4 << vece;
88
float64 f0 = float64_abs(make_float64(a));
156
int64_t max = (1ull << halfbits) - 1;
89
float64 f1 = float64_abs(make_float64(b));
157
+ TCGv_vec maxv = tcg_constant_vec_matching(d, vece, max);
90
return -float64_le(f1, f0, fpst);
158
159
- tcg_gen_dupi_vec(vece, t, 0);
160
- tcg_gen_smax_vec(vece, n, n, t);
161
- tcg_gen_dupi_vec(vece, t, max);
162
- tcg_gen_umin_vec(vece, n, n, t);
163
+ tcg_gen_smax_vec(vece, n, n, tcg_constant_vec_matching(d, vece, 0));
164
+ tcg_gen_umin_vec(vece, n, n, maxv);
165
tcg_gen_shli_vec(vece, n, n, halfbits);
166
- tcg_gen_bitsel_vec(vece, d, t, d, n);
167
+ tcg_gen_bitsel_vec(vece, d, maxv, d, n);
168
}
91
}
169
92
170
static const GVecGen2 sqxtunt_ops[3] = {
93
-uint64_t HELPER(neon_acgt_f64)(uint64_t a, uint64_t b, void *fpstp)
171
@@ -XXX,XX +XXX,XX @@ static void gen_shrnb64_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr)
94
+uint64_t HELPER(neon_acgt_f64)(uint64_t a, uint64_t b, float_status *fpst)
172
173
static void gen_shrnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr)
174
{
95
{
175
- TCGv_vec t = tcg_temp_new_vec_matching(d);
96
- float_status *fpst = fpstp;
176
int halfbits = 4 << vece;
97
float64 f0 = float64_abs(make_float64(a));
177
uint64_t mask = MAKE_64BIT_MASK(0, halfbits);
98
float64 f1 = float64_abs(make_float64(b));
178
99
return -float64_lt(f1, f0, fpst);
179
tcg_gen_shri_vec(vece, n, n, shr);
180
- tcg_gen_dupi_vec(vece, t, mask);
181
- tcg_gen_and_vec(vece, d, n, t);
182
+ tcg_gen_and_vec(vece, d, n, tcg_constant_vec_matching(d, vece, mask));
183
}
184
185
static const TCGOpcode shrnb_vec_list[] = { INDEX_op_shri_vec, 0 };
186
@@ -XXX,XX +XXX,XX @@ static void gen_shrnt64_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr)
187
188
static void gen_shrnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr)
189
{
190
- TCGv_vec t = tcg_temp_new_vec_matching(d);
191
int halfbits = 4 << vece;
192
uint64_t mask = MAKE_64BIT_MASK(0, halfbits);
193
194
tcg_gen_shli_vec(vece, n, n, halfbits - shr);
195
- tcg_gen_dupi_vec(vece, t, mask);
196
- tcg_gen_bitsel_vec(vece, d, t, d, n);
197
+ tcg_gen_bitsel_vec(vece, d, tcg_constant_vec_matching(d, vece, mask), d, n);
198
}
199
200
static const TCGOpcode shrnt_vec_list[] = { INDEX_op_shli_vec, 0 };
201
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(RSHRNT, aa64_sve2, do_shr_narrow, a, rshrnt_ops)
202
static void gen_sqshrunb_vec(unsigned vece, TCGv_vec d,
203
TCGv_vec n, int64_t shr)
204
{
205
- TCGv_vec t = tcg_temp_new_vec_matching(d);
206
int halfbits = 4 << vece;
207
+ uint64_t max = MAKE_64BIT_MASK(0, halfbits);
208
209
tcg_gen_sari_vec(vece, n, n, shr);
210
- tcg_gen_dupi_vec(vece, t, 0);
211
- tcg_gen_smax_vec(vece, n, n, t);
212
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
213
- tcg_gen_umin_vec(vece, d, n, t);
214
+ tcg_gen_smax_vec(vece, n, n, tcg_constant_vec_matching(d, vece, 0));
215
+ tcg_gen_umin_vec(vece, d, n, tcg_constant_vec_matching(d, vece, max));
216
}
217
218
static const TCGOpcode sqshrunb_vec_list[] = {
219
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQSHRUNB, aa64_sve2, do_shr_narrow, a, sqshrunb_ops)
220
static void gen_sqshrunt_vec(unsigned vece, TCGv_vec d,
221
TCGv_vec n, int64_t shr)
222
{
223
- TCGv_vec t = tcg_temp_new_vec_matching(d);
224
int halfbits = 4 << vece;
225
+ uint64_t max = MAKE_64BIT_MASK(0, halfbits);
226
+ TCGv_vec maxv = tcg_constant_vec_matching(d, vece, max);
227
228
tcg_gen_sari_vec(vece, n, n, shr);
229
- tcg_gen_dupi_vec(vece, t, 0);
230
- tcg_gen_smax_vec(vece, n, n, t);
231
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
232
- tcg_gen_umin_vec(vece, n, n, t);
233
+ tcg_gen_smax_vec(vece, n, n, tcg_constant_vec_matching(d, vece, 0));
234
+ tcg_gen_umin_vec(vece, n, n, maxv);
235
tcg_gen_shli_vec(vece, n, n, halfbits);
236
- tcg_gen_bitsel_vec(vece, d, t, d, n);
237
+ tcg_gen_bitsel_vec(vece, d, maxv, d, n);
238
}
239
240
static const TCGOpcode sqshrunt_vec_list[] = {
241
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQRSHRUNT, aa64_sve2, do_shr_narrow, a, sqrshrunt_ops)
242
static void gen_sqshrnb_vec(unsigned vece, TCGv_vec d,
243
TCGv_vec n, int64_t shr)
244
{
245
- TCGv_vec t = tcg_temp_new_vec_matching(d);
246
int halfbits = 4 << vece;
247
int64_t max = MAKE_64BIT_MASK(0, halfbits - 1);
248
int64_t min = -max - 1;
249
+ int64_t mask = MAKE_64BIT_MASK(0, halfbits);
250
251
tcg_gen_sari_vec(vece, n, n, shr);
252
- tcg_gen_dupi_vec(vece, t, min);
253
- tcg_gen_smax_vec(vece, n, n, t);
254
- tcg_gen_dupi_vec(vece, t, max);
255
- tcg_gen_smin_vec(vece, n, n, t);
256
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
257
- tcg_gen_and_vec(vece, d, n, t);
258
+ tcg_gen_smax_vec(vece, n, n, tcg_constant_vec_matching(d, vece, min));
259
+ tcg_gen_smin_vec(vece, n, n, tcg_constant_vec_matching(d, vece, max));
260
+ tcg_gen_and_vec(vece, d, n, tcg_constant_vec_matching(d, vece, mask));
261
}
262
263
static const TCGOpcode sqshrnb_vec_list[] = {
264
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQSHRNB, aa64_sve2, do_shr_narrow, a, sqshrnb_ops)
265
static void gen_sqshrnt_vec(unsigned vece, TCGv_vec d,
266
TCGv_vec n, int64_t shr)
267
{
268
- TCGv_vec t = tcg_temp_new_vec_matching(d);
269
int halfbits = 4 << vece;
270
int64_t max = MAKE_64BIT_MASK(0, halfbits - 1);
271
int64_t min = -max - 1;
272
+ int64_t mask = MAKE_64BIT_MASK(0, halfbits);
273
274
tcg_gen_sari_vec(vece, n, n, shr);
275
- tcg_gen_dupi_vec(vece, t, min);
276
- tcg_gen_smax_vec(vece, n, n, t);
277
- tcg_gen_dupi_vec(vece, t, max);
278
- tcg_gen_smin_vec(vece, n, n, t);
279
+ tcg_gen_smax_vec(vece, n, n, tcg_constant_vec_matching(d, vece, min));
280
+ tcg_gen_smin_vec(vece, n, n, tcg_constant_vec_matching(d, vece, max));
281
tcg_gen_shli_vec(vece, n, n, halfbits);
282
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
283
- tcg_gen_bitsel_vec(vece, d, t, d, n);
284
+ tcg_gen_bitsel_vec(vece, d, tcg_constant_vec_matching(d, vece, mask), d, n);
285
}
286
287
static const TCGOpcode sqshrnt_vec_list[] = {
288
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQRSHRNT, aa64_sve2, do_shr_narrow, a, sqrshrnt_ops)
289
static void gen_uqshrnb_vec(unsigned vece, TCGv_vec d,
290
TCGv_vec n, int64_t shr)
291
{
292
- TCGv_vec t = tcg_temp_new_vec_matching(d);
293
int halfbits = 4 << vece;
294
+ int64_t max = MAKE_64BIT_MASK(0, halfbits);
295
296
tcg_gen_shri_vec(vece, n, n, shr);
297
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
298
- tcg_gen_umin_vec(vece, d, n, t);
299
+ tcg_gen_umin_vec(vece, d, n, tcg_constant_vec_matching(d, vece, max));
300
}
301
302
static const TCGOpcode uqshrnb_vec_list[] = {
303
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(UQSHRNB, aa64_sve2, do_shr_narrow, a, uqshrnb_ops)
304
static void gen_uqshrnt_vec(unsigned vece, TCGv_vec d,
305
TCGv_vec n, int64_t shr)
306
{
307
- TCGv_vec t = tcg_temp_new_vec_matching(d);
308
int halfbits = 4 << vece;
309
+ int64_t max = MAKE_64BIT_MASK(0, halfbits);
310
+ TCGv_vec maxv = tcg_constant_vec_matching(d, vece, max);
311
312
tcg_gen_shri_vec(vece, n, n, shr);
313
- tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits));
314
- tcg_gen_umin_vec(vece, n, n, t);
315
+ tcg_gen_umin_vec(vece, n, n, maxv);
316
tcg_gen_shli_vec(vece, n, n, halfbits);
317
- tcg_gen_bitsel_vec(vece, d, t, d, n);
318
+ tcg_gen_bitsel_vec(vece, d, maxv, d, n);
319
}
320
321
static const TCGOpcode uqshrnt_vec_list[] = {
322
-- 
2.34.1
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Instead of cmp+and or cmp+andc, use cmpsel. This will
be better for hosts that use predicate registers for cmp.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/gengvec.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

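In TCG terms the substitution is roughly the following (a generic sketch, not lifted from the hunks below; val, shift, bound, zero and mask are placeholder operands):

    /* cmp + and(c): build an all-ones/all-zeroes mask, then apply it. */
    tcg_gen_cmp_vec(TCG_COND_GEU, vece, mask, shift, bound);
    tcg_gen_andc_vec(vece, val, val, mask);

    /* cmpsel: compare and select in a single operation. */
    tcg_gen_cmpsel_vec(TCG_COND_GEU, vece, val, shift, bound, zero, val);

On hosts whose vector compares produce predicate registers rather than full-width masks, the fused form saves converting the predicate into a vector just to AND with it.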
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     TCGv_vec rval = tcg_temp_new_vec_matching(dst);
     TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
     TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
-    TCGv_vec max;
+    TCGv_vec max, zero;
 
     tcg_gen_neg_vec(vece, rsh, shift);
     if (vece == MO_8) {
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     tcg_gen_shrv_vec(vece, rval, src, rsh);
 
     /*
-     * The choice of LT (signed) and GEU (unsigned) are biased toward
+     * The choice of GE (signed) and GEU (unsigned) are biased toward
      * the instructions of the x86_64 host. For MO_8, the whole byte
      * is significant so we must use an unsigned compare; otherwise we
      * have already masked to a byte and so a signed compare works.
      * Other tcg hosts have a full set of comparisons and do not care.
      */
+    zero = tcg_constant_vec_matching(dst, vece, 0);
     max = tcg_constant_vec_matching(dst, vece, 8 << vece);
     if (vece == MO_8) {
-        tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
-        tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
-        tcg_gen_andc_vec(vece, lval, lval, lsh);
-        tcg_gen_andc_vec(vece, rval, rval, rsh);
+        tcg_gen_cmpsel_vec(TCG_COND_GEU, vece, lval, lsh, max, zero, lval);
+        tcg_gen_cmpsel_vec(TCG_COND_GEU, vece, rval, rsh, max, zero, rval);
     } else {
-        tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
-        tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
-        tcg_gen_and_vec(vece, lval, lval, lsh);
-        tcg_gen_and_vec(vece, rval, rval, rsh);
+        tcg_gen_cmpsel_vec(TCG_COND_GE, vece, lval, lsh, max, zero, lval);
+        tcg_gen_cmpsel_vec(TCG_COND_GE, vece, rval, rsh, max, zero, rval);
     }
     tcg_gen_or_vec(vece, dst, lval, rval);
 }
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 {
     static const TCGOpcode vecop_list[] = {
         INDEX_op_neg_vec, INDEX_op_shlv_vec,
-        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+        INDEX_op_shrv_vec, INDEX_op_cmpsel_vec, 0
     };
     static const GVecGen3 ops[4] = {
         { .fniv = gen_ushl_vec,
-- 
2.34.1
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Instead of cmp+and or cmp+andc, use cmpsel. This will
be better for hosts that use predicate registers for cmp.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/gengvec.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     TCGv_vec rval = tcg_temp_new_vec_matching(dst);
     TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
     TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
-    TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
     TCGv_vec max, zero;
 
     /*
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     /* Bound rsh so out of bound right shift gets -1. */
     max = tcg_constant_vec_matching(dst, vece, (8 << vece) - 1);
     tcg_gen_umin_vec(vece, rsh, rsh, max);
-    tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, max);
 
     tcg_gen_shlv_vec(vece, lval, src, lsh);
     tcg_gen_sarv_vec(vece, rval, src, rsh);
 
     /* Select in-bound left shift. */
-    tcg_gen_andc_vec(vece, lval, lval, tmp);
+    zero = tcg_constant_vec_matching(dst, vece, 0);
+    tcg_gen_cmpsel_vec(TCG_COND_GT, vece, lval, lsh, max, zero, lval);
 
     /* Select between left and right shift. */
-    zero = tcg_constant_vec_matching(dst, vece, 0);
     if (vece == MO_8) {
         tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, zero, rval, lval);
     } else {
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 {
     static const TCGOpcode vecop_list[] = {
         INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
-        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+        INDEX_op_sarv_vec, INDEX_op_cmpsel_vec, 0
     };
     static const GVecGen3 ops[4] = {
         { .fniv = gen_sshl_vec,
-- 
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

This includes SSHR, USHR, SSRA, USRA, SRSHR, URSHR,
SRSRA, URSRA, SRI.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  16 ++++
 target/arm/tcg/translate-a64.c | 140 ++++++++++++++++-----------------
 2 files changed, 86 insertions(+), 70 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241206031224.78525-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/helper-sve.h | 414 ++++++++++++++++++------------------
 target/arm/tcg/sve_helper.c |  96 +++++----
 2 files changed, 258 insertions(+), 252 deletions(-)

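As background for the helper-sve.h hunks that follow (an explanatory sketch, not part of the patch; it assumes the 'fpst' argument type maps to a float_status pointer in the generated prototypes, which is what the sve_helper.c hunks rely on, and the parameter names are illustrative):

    /* helper-sve.h declaration after the conversion: */
    DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, fpst, i32)

    /* ...and the C prototype it corresponds to, so the helper body
     * receives the float_status pointer without a cast from void *: */
    uint64_t helper_sve_faddv_s(void *vn, void *vg,
                                float_status *status, uint32_t desc);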
11
15
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
12
diff --git a/target/arm/tcg/helper-sve.h b/target/arm/tcg/helper-sve.h
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/tcg/a64.decode
14
--- a/target/arm/tcg/helper-sve.h
18
+++ b/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/helper-sve.h
19
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
20
&rri_sf rd rn imm sf
17
void, ptr, ptr, ptr, fpst, i32)
21
&i imm
18
22
&rr_e rd rn esz
19
DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG,
23
+&rri_e rd rn imm esz
20
- i64, ptr, ptr, ptr, i32)
24
&rrr_e rd rn rm esz
21
+ i64, ptr, ptr, fpst, i32)
25
&rrx_e rd rn rm idx esz
22
DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG,
26
&rrrr_e rd rn rm ra esz
23
- i64, ptr, ptr, ptr, i32)
27
@@ -XXX,XX +XXX,XX @@ SHRN_v 0.00 11110 .... ... 10000 1 ..... ..... @q_shri_s
24
+ i64, ptr, ptr, fpst, i32)
28
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_b
25
DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG,
29
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_h
26
- i64, ptr, ptr, ptr, i32)
30
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_s
27
+ i64, ptr, ptr, fpst, i32)
31
+
28
32
+# Advanced SIMD scalar shift by immediate
29
DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG,
33
+
30
- i64, ptr, ptr, ptr, i32)
34
+@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
31
+ i64, ptr, ptr, fpst, i32)
35
+ &rri_e esz=3 imm=%neon_rshift_i6
32
DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG,
36
+
33
- i64, ptr, ptr, ptr, i32)
37
+SSHR_s 0101 11110 .... ... 00000 1 ..... ..... @shri_d
34
+ i64, ptr, ptr, fpst, i32)
38
+USHR_s 0111 11110 .... ... 00000 1 ..... ..... @shri_d
35
DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG,
39
+SSRA_s 0101 11110 .... ... 00010 1 ..... ..... @shri_d
36
- i64, ptr, ptr, ptr, i32)
40
+USRA_s 0111 11110 .... ... 00010 1 ..... ..... @shri_d
37
+ i64, ptr, ptr, fpst, i32)
41
+SRSHR_s 0101 11110 .... ... 00100 1 ..... ..... @shri_d
38
42
+URSHR_s 0111 11110 .... ... 00100 1 ..... ..... @shri_d
39
DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG,
43
+SRSRA_s 0101 11110 .... ... 00110 1 ..... ..... @shri_d
40
- i64, ptr, ptr, ptr, i32)
44
+URSRA_s 0111 11110 .... ... 00110 1 ..... ..... @shri_d
41
+ i64, ptr, ptr, fpst, i32)
45
+SRI_s 0111 11110 .... ... 01000 1 ..... ..... @shri_d
42
DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG,
46
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
43
- i64, ptr, ptr, ptr, i32)
44
+ i64, ptr, ptr, fpst, i32)
45
DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG,
46
- i64, ptr, ptr, ptr, i32)
47
+ i64, ptr, ptr, fpst, i32)
48
49
DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG,
50
- i64, ptr, ptr, ptr, i32)
51
+ i64, ptr, ptr, fpst, i32)
52
DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG,
53
- i64, ptr, ptr, ptr, i32)
54
+ i64, ptr, ptr, fpst, i32)
55
DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG,
56
- i64, ptr, ptr, ptr, i32)
57
+ i64, ptr, ptr, fpst, i32)
58
59
DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG,
60
- i64, ptr, ptr, ptr, i32)
61
+ i64, ptr, ptr, fpst, i32)
62
DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG,
63
- i64, ptr, ptr, ptr, i32)
64
+ i64, ptr, ptr, fpst, i32)
65
DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG,
66
- i64, ptr, ptr, ptr, i32)
67
+ i64, ptr, ptr, fpst, i32)
68
69
DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
70
- i64, i64, ptr, ptr, ptr, i32)
71
+ i64, i64, ptr, ptr, fpst, i32)
72
DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
73
- i64, i64, ptr, ptr, ptr, i32)
74
+ i64, i64, ptr, ptr, fpst, i32)
75
DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
76
- i64, i64, ptr, ptr, ptr, i32)
77
+ i64, i64, ptr, ptr, fpst, i32)
78
79
DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG,
80
- void, ptr, ptr, ptr, ptr, i32)
81
+ void, ptr, ptr, ptr, fpst, i32)
82
DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG,
83
- void, ptr, ptr, ptr, ptr, i32)
84
+ void, ptr, ptr, ptr, fpst, i32)
85
DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG,
86
- void, ptr, ptr, ptr, ptr, i32)
87
+ void, ptr, ptr, ptr, fpst, i32)
88
89
DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG,
90
- void, ptr, ptr, ptr, ptr, i32)
91
+ void, ptr, ptr, ptr, fpst, i32)
92
DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG,
93
- void, ptr, ptr, ptr, ptr, i32)
94
+ void, ptr, ptr, ptr, fpst, i32)
95
DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG,
96
- void, ptr, ptr, ptr, ptr, i32)
97
+ void, ptr, ptr, ptr, fpst, i32)
98
99
DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG,
100
- void, ptr, ptr, ptr, ptr, i32)
101
+ void, ptr, ptr, ptr, fpst, i32)
102
DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG,
103
- void, ptr, ptr, ptr, ptr, i32)
104
+ void, ptr, ptr, ptr, fpst, i32)
105
DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG,
106
- void, ptr, ptr, ptr, ptr, i32)
107
+ void, ptr, ptr, ptr, fpst, i32)
108
109
DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG,
110
- void, ptr, ptr, ptr, ptr, i32)
111
+ void, ptr, ptr, ptr, fpst, i32)
112
DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG,
113
- void, ptr, ptr, ptr, ptr, i32)
114
+ void, ptr, ptr, ptr, fpst, i32)
115
DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG,
116
- void, ptr, ptr, ptr, ptr, i32)
117
+ void, ptr, ptr, ptr, fpst, i32)
118
119
DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG,
120
- void, ptr, ptr, ptr, ptr, i32)
121
+ void, ptr, ptr, ptr, fpst, i32)
122
DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG,
123
- void, ptr, ptr, ptr, ptr, i32)
124
+ void, ptr, ptr, ptr, fpst, i32)
125
DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG,
126
- void, ptr, ptr, ptr, ptr, i32)
127
+ void, ptr, ptr, ptr, fpst, i32)
128
129
DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG,
130
- void, ptr, ptr, ptr, ptr, i32)
131
+ void, ptr, ptr, ptr, fpst, i32)
132
DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG,
133
- void, ptr, ptr, ptr, ptr, i32)
134
+ void, ptr, ptr, ptr, fpst, i32)
135
DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG,
136
- void, ptr, ptr, ptr, ptr, i32)
137
+ void, ptr, ptr, ptr, fpst, i32)
138
139
DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
140
- void, ptr, ptr, ptr, ptr, ptr, i32)
141
+ void, ptr, ptr, ptr, ptr, fpst, i32)
142
DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
143
- void, ptr, ptr, ptr, ptr, ptr, i32)
144
+ void, ptr, ptr, ptr, ptr, fpst, i32)
145
DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG,
146
- void, ptr, ptr, ptr, ptr, ptr, i32)
147
+ void, ptr, ptr, ptr, ptr, fpst, i32)
148
149
DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG,
150
- void, ptr, ptr, ptr, ptr, ptr, i32)
151
+ void, ptr, ptr, ptr, ptr, fpst, i32)
152
DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG,
153
- void, ptr, ptr, ptr, ptr, ptr, i32)
154
+ void, ptr, ptr, ptr, ptr, fpst, i32)
155
DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG,
156
- void, ptr, ptr, ptr, ptr, ptr, i32)
157
+ void, ptr, ptr, ptr, ptr, fpst, i32)
158
159
DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG,
160
- void, ptr, ptr, ptr, ptr, ptr, i32)
161
+ void, ptr, ptr, ptr, ptr, fpst, i32)
162
DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG,
163
- void, ptr, ptr, ptr, ptr, ptr, i32)
164
+ void, ptr, ptr, ptr, ptr, fpst, i32)
165
DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG,
166
- void, ptr, ptr, ptr, ptr, ptr, i32)
167
+ void, ptr, ptr, ptr, ptr, fpst, i32)
168
169
DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG,
170
- void, ptr, ptr, ptr, ptr, ptr, i32)
171
+ void, ptr, ptr, ptr, ptr, fpst, i32)
172
DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG,
173
- void, ptr, ptr, ptr, ptr, ptr, i32)
174
+ void, ptr, ptr, ptr, ptr, fpst, i32)
175
DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG,
176
- void, ptr, ptr, ptr, ptr, ptr, i32)
177
+ void, ptr, ptr, ptr, ptr, fpst, i32)
178
179
DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG,
180
- void, ptr, ptr, ptr, ptr, ptr, i32)
181
+ void, ptr, ptr, ptr, ptr, fpst, i32)
182
DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG,
183
- void, ptr, ptr, ptr, ptr, ptr, i32)
184
+ void, ptr, ptr, ptr, ptr, fpst, i32)
185
DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG,
186
- void, ptr, ptr, ptr, ptr, ptr, i32)
187
+ void, ptr, ptr, ptr, ptr, fpst, i32)
188
189
DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG,
190
- void, ptr, ptr, ptr, ptr, ptr, i32)
191
+ void, ptr, ptr, ptr, ptr, fpst, i32)
192
DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG,
193
- void, ptr, ptr, ptr, ptr, ptr, i32)
194
+ void, ptr, ptr, ptr, ptr, fpst, i32)
195
DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG,
196
- void, ptr, ptr, ptr, ptr, ptr, i32)
197
+ void, ptr, ptr, ptr, ptr, fpst, i32)
198
199
DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG,
200
- void, ptr, ptr, ptr, ptr, ptr, i32)
201
+ void, ptr, ptr, ptr, ptr, fpst, i32)
202
DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG,
203
- void, ptr, ptr, ptr, ptr, ptr, i32)
204
+ void, ptr, ptr, ptr, ptr, fpst, i32)
205
DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG,
206
- void, ptr, ptr, ptr, ptr, ptr, i32)
207
+ void, ptr, ptr, ptr, ptr, fpst, i32)
208
209
DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG,
210
- void, ptr, ptr, ptr, ptr, ptr, i32)
211
+ void, ptr, ptr, ptr, ptr, fpst, i32)
212
DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG,
213
- void, ptr, ptr, ptr, ptr, ptr, i32)
214
+ void, ptr, ptr, ptr, ptr, fpst, i32)
215
DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG,
216
- void, ptr, ptr, ptr, ptr, ptr, i32)
217
+ void, ptr, ptr, ptr, ptr, fpst, i32)
218
219
DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG,
220
- void, ptr, ptr, ptr, ptr, ptr, i32)
221
+ void, ptr, ptr, ptr, ptr, fpst, i32)
222
DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG,
223
- void, ptr, ptr, ptr, ptr, ptr, i32)
224
+ void, ptr, ptr, ptr, ptr, fpst, i32)
225
DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG,
226
- void, ptr, ptr, ptr, ptr, ptr, i32)
227
+ void, ptr, ptr, ptr, ptr, fpst, i32)
228
229
DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG,
230
- void, ptr, ptr, ptr, ptr, ptr, i32)
231
+ void, ptr, ptr, ptr, ptr, fpst, i32)
232
DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG,
233
- void, ptr, ptr, ptr, ptr, ptr, i32)
234
+ void, ptr, ptr, ptr, ptr, fpst, i32)
235
DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG,
236
- void, ptr, ptr, ptr, ptr, ptr, i32)
237
+ void, ptr, ptr, ptr, ptr, fpst, i32)
238
239
DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG,
240
- void, ptr, ptr, ptr, ptr, ptr, i32)
241
+ void, ptr, ptr, ptr, ptr, fpst, i32)
242
DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
243
- void, ptr, ptr, ptr, ptr, ptr, i32)
244
+ void, ptr, ptr, ptr, ptr, fpst, i32)
245
DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
246
- void, ptr, ptr, ptr, ptr, ptr, i32)
247
+ void, ptr, ptr, ptr, ptr, fpst, i32)
248
249
DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG,
250
- void, ptr, ptr, ptr, i64, ptr, i32)
251
+ void, ptr, ptr, ptr, i64, fpst, i32)
252
DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG,
253
- void, ptr, ptr, ptr, i64, ptr, i32)
254
+ void, ptr, ptr, ptr, i64, fpst, i32)
255
DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG,
256
- void, ptr, ptr, ptr, i64, ptr, i32)
257
+ void, ptr, ptr, ptr, i64, fpst, i32)
258
259
DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG,
260
- void, ptr, ptr, ptr, i64, ptr, i32)
261
+ void, ptr, ptr, ptr, i64, fpst, i32)
262
DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG,
263
- void, ptr, ptr, ptr, i64, ptr, i32)
264
+ void, ptr, ptr, ptr, i64, fpst, i32)
265
DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG,
266
- void, ptr, ptr, ptr, i64, ptr, i32)
267
+ void, ptr, ptr, ptr, i64, fpst, i32)
268
269
DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG,
270
- void, ptr, ptr, ptr, i64, ptr, i32)
271
+ void, ptr, ptr, ptr, i64, fpst, i32)
272
DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG,
273
- void, ptr, ptr, ptr, i64, ptr, i32)
274
+ void, ptr, ptr, ptr, i64, fpst, i32)
275
DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG,
276
- void, ptr, ptr, ptr, i64, ptr, i32)
277
+ void, ptr, ptr, ptr, i64, fpst, i32)
278
279
DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG,
280
- void, ptr, ptr, ptr, i64, ptr, i32)
281
+ void, ptr, ptr, ptr, i64, fpst, i32)
282
DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG,
283
- void, ptr, ptr, ptr, i64, ptr, i32)
284
+ void, ptr, ptr, ptr, i64, fpst, i32)
285
DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG,
286
- void, ptr, ptr, ptr, i64, ptr, i32)
287
+ void, ptr, ptr, ptr, i64, fpst, i32)
288
289
DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG,
290
- void, ptr, ptr, ptr, i64, ptr, i32)
291
+ void, ptr, ptr, ptr, i64, fpst, i32)
292
DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG,
293
- void, ptr, ptr, ptr, i64, ptr, i32)
294
+ void, ptr, ptr, ptr, i64, fpst, i32)
295
DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG,
296
- void, ptr, ptr, ptr, i64, ptr, i32)
297
+ void, ptr, ptr, ptr, i64, fpst, i32)
298
299
DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG,
300
- void, ptr, ptr, ptr, i64, ptr, i32)
301
+ void, ptr, ptr, ptr, i64, fpst, i32)
302
DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG,
303
- void, ptr, ptr, ptr, i64, ptr, i32)
304
+ void, ptr, ptr, ptr, i64, fpst, i32)
305
DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG,
306
- void, ptr, ptr, ptr, i64, ptr, i32)
307
+ void, ptr, ptr, ptr, i64, fpst, i32)
308
309
DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG,
310
- void, ptr, ptr, ptr, i64, ptr, i32)
311
+ void, ptr, ptr, ptr, i64, fpst, i32)
312
DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG,
313
- void, ptr, ptr, ptr, i64, ptr, i32)
314
+ void, ptr, ptr, ptr, i64, fpst, i32)
315
DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG,
316
- void, ptr, ptr, ptr, i64, ptr, i32)
317
+ void, ptr, ptr, ptr, i64, fpst, i32)
318
319
DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG,
320
- void, ptr, ptr, ptr, i64, ptr, i32)
321
+ void, ptr, ptr, ptr, i64, fpst, i32)
322
DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
323
- void, ptr, ptr, ptr, i64, ptr, i32)
324
+ void, ptr, ptr, ptr, i64, fpst, i32)
325
DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
326
- void, ptr, ptr, ptr, i64, ptr, i32)
327
+ void, ptr, ptr, ptr, i64, fpst, i32)
328
329
DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG,
330
- void, ptr, ptr, ptr, ptr, i32)
331
+ void, ptr, ptr, ptr, fpst, i32)
332
DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG,
333
- void, ptr, ptr, ptr, ptr, i32)
334
+ void, ptr, ptr, ptr, fpst, i32)
335
DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG,
336
- void, ptr, ptr, ptr, ptr, i32)
337
+ void, ptr, ptr, ptr, fpst, i32)
338
DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG,
339
- void, ptr, ptr, ptr, ptr, i32)
340
+ void, ptr, ptr, ptr, fpst, i32)
341
DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
342
- void, ptr, ptr, ptr, ptr, i32)
343
+ void, ptr, ptr, ptr, fpst, i32)
344
DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
345
- void, ptr, ptr, ptr, ptr, i32)
346
+ void, ptr, ptr, ptr, fpst, i32)
347
DEF_HELPER_FLAGS_5(sve_bfcvt, TCG_CALL_NO_RWG,
348
- void, ptr, ptr, ptr, ptr, i32)
349
+ void, ptr, ptr, ptr, fpst, i32)
350
351
DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG,
352
- void, ptr, ptr, ptr, ptr, i32)
353
+ void, ptr, ptr, ptr, fpst, i32)
354
DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG,
355
- void, ptr, ptr, ptr, ptr, i32)
356
+ void, ptr, ptr, ptr, fpst, i32)
357
DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG,
358
- void, ptr, ptr, ptr, ptr, i32)
359
+ void, ptr, ptr, ptr, fpst, i32)
360
DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG,
361
- void, ptr, ptr, ptr, ptr, i32)
362
+ void, ptr, ptr, ptr, fpst, i32)
363
DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG,
364
- void, ptr, ptr, ptr, ptr, i32)
365
+ void, ptr, ptr, ptr, fpst, i32)
366
DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG,
367
- void, ptr, ptr, ptr, ptr, i32)
368
+ void, ptr, ptr, ptr, fpst, i32)
369
DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG,
370
- void, ptr, ptr, ptr, ptr, i32)
371
+ void, ptr, ptr, ptr, fpst, i32)
372
373
DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG,
374
- void, ptr, ptr, ptr, ptr, i32)
375
+ void, ptr, ptr, ptr, fpst, i32)
376
DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG,
377
- void, ptr, ptr, ptr, ptr, i32)
378
+ void, ptr, ptr, ptr, fpst, i32)
379
DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG,
380
- void, ptr, ptr, ptr, ptr, i32)
381
+ void, ptr, ptr, ptr, fpst, i32)
382
DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG,
383
- void, ptr, ptr, ptr, ptr, i32)
384
+ void, ptr, ptr, ptr, fpst, i32)
385
DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG,
386
- void, ptr, ptr, ptr, ptr, i32)
387
+ void, ptr, ptr, ptr, fpst, i32)
388
DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
389
- void, ptr, ptr, ptr, ptr, i32)
390
+ void, ptr, ptr, ptr, fpst, i32)
391
DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
392
- void, ptr, ptr, ptr, ptr, i32)
393
+ void, ptr, ptr, ptr, fpst, i32)
394
395
DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG,
396
- void, ptr, ptr, ptr, ptr, i32)
397
+ void, ptr, ptr, ptr, fpst, i32)
398
DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG,
399
- void, ptr, ptr, ptr, ptr, i32)
400
+ void, ptr, ptr, ptr, fpst, i32)
401
DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG,
402
- void, ptr, ptr, ptr, ptr, i32)
403
+ void, ptr, ptr, ptr, fpst, i32)
404
405
DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG,
406
- void, ptr, ptr, ptr, ptr, i32)
407
+ void, ptr, ptr, ptr, fpst, i32)
408
DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
409
- void, ptr, ptr, ptr, ptr, i32)
410
+ void, ptr, ptr, ptr, fpst, i32)
411
DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
412
- void, ptr, ptr, ptr, ptr, i32)
413
+ void, ptr, ptr, ptr, fpst, i32)
414
415
DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
416
- void, ptr, ptr, ptr, ptr, i32)
417
+ void, ptr, ptr, ptr, fpst, i32)
418
DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
419
- void, ptr, ptr, ptr, ptr, i32)
420
+ void, ptr, ptr, ptr, fpst, i32)
421
DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG,
422
- void, ptr, ptr, ptr, ptr, i32)
423
+ void, ptr, ptr, ptr, fpst, i32)
424
425
DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG,
426
- void, ptr, ptr, ptr, ptr, i32)
427
+ void, ptr, ptr, ptr, fpst, i32)
428
DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG,
429
- void, ptr, ptr, ptr, ptr, i32)
430
+ void, ptr, ptr, ptr, fpst, i32)
431
DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG,
432
- void, ptr, ptr, ptr, ptr, i32)
433
+ void, ptr, ptr, ptr, fpst, i32)
434
435
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
436
- void, ptr, ptr, ptr, ptr, i32)
437
+ void, ptr, ptr, ptr, fpst, i32)
438
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
439
- void, ptr, ptr, ptr, ptr, i32)
440
+ void, ptr, ptr, ptr, fpst, i32)
441
DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG,
442
- void, ptr, ptr, ptr, ptr, i32)
443
+ void, ptr, ptr, ptr, fpst, i32)
444
DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG,
445
- void, ptr, ptr, ptr, ptr, i32)
446
+ void, ptr, ptr, ptr, fpst, i32)
447
DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG,
448
- void, ptr, ptr, ptr, ptr, i32)
449
+ void, ptr, ptr, ptr, fpst, i32)
450
DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG,
451
- void, ptr, ptr, ptr, ptr, i32)
452
+ void, ptr, ptr, ptr, fpst, i32)
453
DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG,
454
- void, ptr, ptr, ptr, ptr, i32)
455
+ void, ptr, ptr, ptr, fpst, i32)
456
457
DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG,
458
- void, ptr, ptr, ptr, ptr, i32)
459
+ void, ptr, ptr, ptr, fpst, i32)
460
DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG,
461
- void, ptr, ptr, ptr, ptr, i32)
462
+ void, ptr, ptr, ptr, fpst, i32)
463
DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG,
464
- void, ptr, ptr, ptr, ptr, i32)
465
+ void, ptr, ptr, ptr, fpst, i32)
466
DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG,
467
- void, ptr, ptr, ptr, ptr, i32)
468
+ void, ptr, ptr, ptr, fpst, i32)
469
DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG,
470
- void, ptr, ptr, ptr, ptr, i32)
471
+ void, ptr, ptr, ptr, fpst, i32)
472
DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
473
- void, ptr, ptr, ptr, ptr, i32)
474
+ void, ptr, ptr, ptr, fpst, i32)
475
DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
476
- void, ptr, ptr, ptr, ptr, i32)
477
+ void, ptr, ptr, ptr, fpst, i32)
478
479
DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG,
480
- void, ptr, ptr, ptr, ptr, ptr, i32)
481
+ void, ptr, ptr, ptr, ptr, fpst, i32)
482
DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG,
483
- void, ptr, ptr, ptr, ptr, ptr, i32)
484
+ void, ptr, ptr, ptr, ptr, fpst, i32)
485
DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG,
486
- void, ptr, ptr, ptr, ptr, ptr, i32)
487
+ void, ptr, ptr, ptr, ptr, fpst, i32)
488
489
DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG,
490
- void, ptr, ptr, ptr, ptr, ptr, i32)
491
+ void, ptr, ptr, ptr, ptr, fpst, i32)
492
DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG,
493
- void, ptr, ptr, ptr, ptr, ptr, i32)
494
+ void, ptr, ptr, ptr, ptr, fpst, i32)
495
DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG,
496
- void, ptr, ptr, ptr, ptr, ptr, i32)
497
+ void, ptr, ptr, ptr, ptr, fpst, i32)
498
499
DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG,
500
- void, ptr, ptr, ptr, ptr, ptr, i32)
501
+ void, ptr, ptr, ptr, ptr, fpst, i32)
502
DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG,
503
- void, ptr, ptr, ptr, ptr, ptr, i32)
504
+ void, ptr, ptr, ptr, ptr, fpst, i32)
505
DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG,
506
- void, ptr, ptr, ptr, ptr, ptr, i32)
507
+ void, ptr, ptr, ptr, ptr, fpst, i32)
508
509
DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG,
510
- void, ptr, ptr, ptr, ptr, ptr, i32)
511
+ void, ptr, ptr, ptr, ptr, fpst, i32)
512
DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG,
513
- void, ptr, ptr, ptr, ptr, ptr, i32)
514
+ void, ptr, ptr, ptr, ptr, fpst, i32)
515
DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG,
516
- void, ptr, ptr, ptr, ptr, ptr, i32)
517
+ void, ptr, ptr, ptr, ptr, fpst, i32)
518
519
DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG,
520
- void, ptr, ptr, ptr, ptr, ptr, i32)
521
+ void, ptr, ptr, ptr, ptr, fpst, i32)
522
DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG,
523
- void, ptr, ptr, ptr, ptr, ptr, i32)
524
+ void, ptr, ptr, ptr, ptr, fpst, i32)
525
DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG,
526
- void, ptr, ptr, ptr, ptr, ptr, i32)
527
+ void, ptr, ptr, ptr, ptr, fpst, i32)
528
529
DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG,
530
- void, ptr, ptr, ptr, ptr, ptr, i32)
531
+ void, ptr, ptr, ptr, ptr, fpst, i32)
532
DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG,
533
- void, ptr, ptr, ptr, ptr, ptr, i32)
534
+ void, ptr, ptr, ptr, ptr, fpst, i32)
535
DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG,
536
- void, ptr, ptr, ptr, ptr, ptr, i32)
537
+ void, ptr, ptr, ptr, ptr, fpst, i32)
538
539
DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG,
540
- void, ptr, ptr, ptr, ptr, ptr, i32)
541
+ void, ptr, ptr, ptr, ptr, fpst, i32)
542
DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
543
- void, ptr, ptr, ptr, ptr, ptr, i32)
544
+ void, ptr, ptr, ptr, ptr, fpst, i32)
545
DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
546
- void, ptr, ptr, ptr, ptr, ptr, i32)
547
+ void, ptr, ptr, ptr, ptr, fpst, i32)
548
549
DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG,
550
- void, ptr, ptr, ptr, ptr, ptr, i32)
551
+ void, ptr, ptr, ptr, ptr, fpst, i32)
552
DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG,
553
- void, ptr, ptr, ptr, ptr, ptr, i32)
554
+ void, ptr, ptr, ptr, ptr, fpst, i32)
555
DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG,
556
- void, ptr, ptr, ptr, ptr, ptr, i32)
557
+ void, ptr, ptr, ptr, ptr, fpst, i32)
558
559
DEF_HELPER_FLAGS_7(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG,
560
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
561
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
562
DEF_HELPER_FLAGS_7(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG,
563
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
564
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
565
DEF_HELPER_FLAGS_7(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG,
566
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
567
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
568
569
DEF_HELPER_FLAGS_7(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG,
570
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
571
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
572
DEF_HELPER_FLAGS_7(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG,
573
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
574
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
575
DEF_HELPER_FLAGS_7(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG,
576
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
577
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
578
579
DEF_HELPER_FLAGS_7(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG,
580
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
581
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
582
DEF_HELPER_FLAGS_7(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG,
583
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
584
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
585
DEF_HELPER_FLAGS_7(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG,
586
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
587
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
588
589
DEF_HELPER_FLAGS_7(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG,
590
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
591
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
592
DEF_HELPER_FLAGS_7(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG,
593
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
594
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
595
DEF_HELPER_FLAGS_7(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG,
596
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
597
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
598
599
DEF_HELPER_FLAGS_7(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG,
600
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
601
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
602
DEF_HELPER_FLAGS_7(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG,
603
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
604
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
605
DEF_HELPER_FLAGS_7(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG,
606
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
607
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
608
609
-DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
610
-DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
611
-DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
612
+DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
613
+DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
614
+DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
615
616
DEF_HELPER_FLAGS_4(sve2_saddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
617
DEF_HELPER_FLAGS_4(sve2_saddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
618
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve2_xar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
619
DEF_HELPER_FLAGS_4(sve2_xar_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
620
621
DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG,
622
- void, ptr, ptr, ptr, ptr, ptr, i32)
623
+ void, ptr, ptr, ptr, ptr, fpst, i32)
624
DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG,
625
- void, ptr, ptr, ptr, ptr, ptr, i32)
626
+ void, ptr, ptr, ptr, ptr, fpst, i32)
627
DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_d, TCG_CALL_NO_RWG,
628
- void, ptr, ptr, ptr, ptr, ptr, i32)
629
+ void, ptr, ptr, ptr, ptr, fpst, i32)
630
631
DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_h, TCG_CALL_NO_RWG,
632
- void, ptr, ptr, ptr, ptr, ptr, i32)
633
+ void, ptr, ptr, ptr, ptr, fpst, i32)
634
DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_s, TCG_CALL_NO_RWG,
635
- void, ptr, ptr, ptr, ptr, ptr, i32)
636
+ void, ptr, ptr, ptr, ptr, fpst, i32)
637
DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_d, TCG_CALL_NO_RWG,
638
- void, ptr, ptr, ptr, ptr, ptr, i32)
639
+ void, ptr, ptr, ptr, ptr, fpst, i32)
640
641
DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_h, TCG_CALL_NO_RWG,
642
- void, ptr, ptr, ptr, ptr, ptr, i32)
643
+ void, ptr, ptr, ptr, ptr, fpst, i32)
644
DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_s, TCG_CALL_NO_RWG,
645
- void, ptr, ptr, ptr, ptr, ptr, i32)
646
+ void, ptr, ptr, ptr, ptr, fpst, i32)
647
DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_d, TCG_CALL_NO_RWG,
648
- void, ptr, ptr, ptr, ptr, ptr, i32)
649
+ void, ptr, ptr, ptr, ptr, fpst, i32)
650
651
DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_h, TCG_CALL_NO_RWG,
652
- void, ptr, ptr, ptr, ptr, ptr, i32)
653
+ void, ptr, ptr, ptr, ptr, fpst, i32)
654
DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_s, TCG_CALL_NO_RWG,
655
- void, ptr, ptr, ptr, ptr, ptr, i32)
656
+ void, ptr, ptr, ptr, ptr, fpst, i32)
657
DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_d, TCG_CALL_NO_RWG,
658
- void, ptr, ptr, ptr, ptr, ptr, i32)
659
+ void, ptr, ptr, ptr, ptr, fpst, i32)
660
661
DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_h, TCG_CALL_NO_RWG,
662
- void, ptr, ptr, ptr, ptr, ptr, i32)
663
+ void, ptr, ptr, ptr, ptr, fpst, i32)
664
DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_s, TCG_CALL_NO_RWG,
665
- void, ptr, ptr, ptr, ptr, ptr, i32)
666
+ void, ptr, ptr, ptr, ptr, fpst, i32)
667
DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_d, TCG_CALL_NO_RWG,
668
- void, ptr, ptr, ptr, ptr, ptr, i32)
669
+ void, ptr, ptr, ptr, ptr, fpst, i32)
670
671
DEF_HELPER_FLAGS_5(sve2_eor3, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
672
DEF_HELPER_FLAGS_5(sve2_bcax, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
673
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_s, TCG_CALL_NO_RWG,
674
DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG,
675
void, ptr, ptr, ptr, ptr, i32)
676
677
-DEF_HELPER_FLAGS_6(fmmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32)
678
-DEF_HELPER_FLAGS_6(fmmla_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32)
679
+DEF_HELPER_FLAGS_6(fmmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, fpst, i32)
680
+DEF_HELPER_FLAGS_6(fmmla_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, fpst, i32)
681
682
DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_h, TCG_CALL_NO_RWG,
683
void, ptr, ptr, ptr, ptr, i32)
684
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve2_cdot_idx_d, TCG_CALL_NO_RWG,
685
void, ptr, ptr, ptr, ptr, i32)
686
687
DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG,
688
- void, ptr, ptr, ptr, ptr, i32)
689
+ void, ptr, ptr, ptr, fpst, i32)
690
DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG,
691
- void, ptr, ptr, ptr, ptr, i32)
692
+ void, ptr, ptr, ptr, fpst, i32)
693
DEF_HELPER_FLAGS_5(sve_bfcvtnt, TCG_CALL_NO_RWG,
694
- void, ptr, ptr, ptr, ptr, i32)
695
+ void, ptr, ptr, ptr, fpst, i32)
696
697
DEF_HELPER_FLAGS_5(sve2_fcvtlt_hs, TCG_CALL_NO_RWG,
698
- void, ptr, ptr, ptr, ptr, i32)
699
+ void, ptr, ptr, ptr, fpst, i32)
700
DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG,
701
- void, ptr, ptr, ptr, ptr, i32)
702
+ void, ptr, ptr, ptr, fpst, i32)
703
704
-DEF_HELPER_FLAGS_5(flogb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
705
-DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
706
-DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
707
+DEF_HELPER_FLAGS_5(flogb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
708
+DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
709
+DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, fpst, i32)
710
711
DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_b, TCG_CALL_NO_RWG,
712
void, ptr, ptr, ptr, i32)
713
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
47
index XXXXXXX..XXXXXXX 100644
714
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/tcg/translate-a64.c
715
--- a/target/arm/tcg/sve_helper.c
49
+++ b/target/arm/tcg/translate-a64.c
716
+++ b/target/arm/tcg/sve_helper.c
50
@@ -XXX,XX +XXX,XX @@ static void gen_ushr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
717
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN)
718
719
#define DO_ZPZZ_PAIR_FP(NAME, TYPE, H, OP) \
720
void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
721
- void *status, uint32_t desc) \
722
+ float_status *status, uint32_t desc) \
723
{ \
724
intptr_t i, opr_sz = simd_oprsz(desc); \
725
for (i = 0; i < opr_sz; ) { \
726
@@ -XXX,XX +XXX,XX @@ static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \
727
return TYPE##_##FUNC(lo, hi, status); \
728
} \
729
} \
730
-uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \
731
+uint64_t HELPER(NAME)(void *vn, void *vg, float_status *s, uint32_t desc) \
732
{ \
733
uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_data(desc); \
734
TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \
735
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \
736
for (; i < maxsz; i += sizeof(TYPE)) { \
737
*(TYPE *)((void *)data + i) = IDENT; \
738
} \
739
- return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \
740
+ return NAME##_reduce(data, s, maxsz / sizeof(TYPE)); \
741
}
742
743
DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero)
744
@@ -XXX,XX +XXX,XX @@ DO_REDUCE(sve_fmaxv_d, float64, H1_8, max, float64_chs(float64_infinity))
745
#undef DO_REDUCE
746
747
uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
748
- void *status, uint32_t desc)
749
+ float_status *status, uint32_t desc)
750
{
751
intptr_t i = 0, opr_sz = simd_oprsz(desc);
752
float16 result = nn;
753
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
754
}
755
756
uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg,
757
- void *status, uint32_t desc)
758
+ float_status *status, uint32_t desc)
759
{
760
intptr_t i = 0, opr_sz = simd_oprsz(desc);
761
float32 result = nn;
762
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg,
763
}
764
765
uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg,
766
- void *status, uint32_t desc)
767
+ float_status *status, uint32_t desc)
768
{
769
intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8;
770
uint64_t *m = vm;
771
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg,
772
*/
773
#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \
774
void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
775
- void *status, uint32_t desc) \
776
+ float_status *status, uint32_t desc) \
777
{ \
778
intptr_t i = simd_oprsz(desc); \
779
uint64_t *g = vg; \
780
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, H1_8, helper_vfp_mulxd)
781
*/
782
#define DO_ZPZS_FP(NAME, TYPE, H, OP) \
783
void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \
784
- void *status, uint32_t desc) \
785
+ float_status *status, uint32_t desc) \
786
{ \
787
intptr_t i = simd_oprsz(desc); \
788
uint64_t *g = vg; \
789
@@ -XXX,XX +XXX,XX @@ DO_ZPZS_FP(sve_fmins_d, float64, H1_8, float64_min)
790
* With the extra float_status parameter.
791
*/
792
#define DO_ZPZ_FP(NAME, TYPE, H, OP) \
793
-void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
794
+void HELPER(NAME)(void *vd, void *vn, void *vg, \
795
+ float_status *status, uint32_t desc) \
796
{ \
797
intptr_t i = simd_oprsz(desc); \
798
uint64_t *g = vg; \
799
@@ -XXX,XX +XXX,XX @@ static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg,
800
}
801
802
void HELPER(sve_fmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
803
- void *vg, void *status, uint32_t desc)
804
+ void *vg, float_status *status, uint32_t desc)
805
{
806
do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0);
807
}
808
809
void HELPER(sve_fmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
810
- void *vg, void *status, uint32_t desc)
811
+ void *vg, float_status *status, uint32_t desc)
812
{
813
do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0);
814
}
815
816
void HELPER(sve_fnmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
817
- void *vg, void *status, uint32_t desc)
818
+ void *vg, float_status *status, uint32_t desc)
819
{
820
do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0x8000);
821
}
822
823
void HELPER(sve_fnmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
824
- void *vg, void *status, uint32_t desc)
825
+ void *vg, float_status *status, uint32_t desc)
826
{
827
do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0x8000);
828
}
829
@@ -XXX,XX +XXX,XX @@ static void do_fmla_zpzzz_s(void *vd, void *vn, void *vm, void *va, void *vg,
830
}
831
832
void HELPER(sve_fmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
833
- void *vg, void *status, uint32_t desc)
834
+ void *vg, float_status *status, uint32_t desc)
835
{
836
do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0);
837
}
838
839
void HELPER(sve_fmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
840
- void *vg, void *status, uint32_t desc)
841
+ void *vg, float_status *status, uint32_t desc)
842
{
843
do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0);
844
}
845
846
void HELPER(sve_fnmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
847
- void *vg, void *status, uint32_t desc)
848
+ void *vg, float_status *status, uint32_t desc)
849
{
850
do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0x80000000);
851
}
852
853
void HELPER(sve_fnmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
854
- void *vg, void *status, uint32_t desc)
855
+ void *vg, float_status *status, uint32_t desc)
856
{
857
do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0x80000000);
858
}
859
@@ -XXX,XX +XXX,XX @@ static void do_fmla_zpzzz_d(void *vd, void *vn, void *vm, void *va, void *vg,
860
}
861
862
void HELPER(sve_fmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
863
- void *vg, void *status, uint32_t desc)
864
+ void *vg, float_status *status, uint32_t desc)
865
{
866
do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, 0);
867
}
868
869
void HELPER(sve_fmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
870
- void *vg, void *status, uint32_t desc)
871
+ void *vg, float_status *status, uint32_t desc)
872
{
873
do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, 0);
874
}
875
876
void HELPER(sve_fnmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
877
- void *vg, void *status, uint32_t desc)
878
+ void *vg, float_status *status, uint32_t desc)
879
{
880
do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, INT64_MIN);
881
}
882
883
void HELPER(sve_fnmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
884
- void *vg, void *status, uint32_t desc)
885
+ void *vg, float_status *status, uint32_t desc)
886
{
887
do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, INT64_MIN);
888
}
889
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fnmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
890
*/
891
#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \
892
void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
893
- void *status, uint32_t desc) \
894
+ float_status *status, uint32_t desc) \
895
{ \
896
intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \
897
uint64_t *d = vd, *g = vg; \
898
@@ -XXX,XX +XXX,XX @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
899
*/
900
#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \
901
void HELPER(NAME)(void *vd, void *vn, void *vg, \
902
- void *status, uint32_t desc) \
903
+ float_status *status, uint32_t desc) \
904
{ \
905
intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \
906
uint64_t *d = vd, *g = vg; \
907
@@ -XXX,XX +XXX,XX @@ DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)
908
909
/* FP Trig Multiply-Add. */
910
911
-void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
912
+void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm,
913
+ float_status *s, uint32_t desc)
914
{
915
static const float16 coeff[16] = {
916
0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
917
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
918
mm = float16_abs(mm);
919
xx += 8;
920
}
921
- d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs);
922
+ d[i] = float16_muladd(n[i], mm, coeff[xx], 0, s);
51
}
923
}
52
}
924
}
53
925
54
+static void gen_ssra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
926
-void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
55
+{
927
+void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm,
56
+ gen_sshr_d(src, src, shift);
928
+ float_status *s, uint32_t desc)
57
+ tcg_gen_add_i64(dst, dst, src);
929
{
58
+}
930
static const float32 coeff[16] = {
59
+
931
0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9,
60
+static void gen_usra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
932
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
61
+{
933
mm = float32_abs(mm);
62
+ gen_ushr_d(src, src, shift);
934
xx += 8;
63
+ tcg_gen_add_i64(dst, dst, src);
935
}
64
+}
936
- d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs);
65
+
937
+ d[i] = float32_muladd(n[i], mm, coeff[xx], 0, s);
66
static void gen_srshr_bhs(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
67
{
68
assert(shift >= 0 && shift <= 32);
69
@@ -XXX,XX +XXX,XX @@ static void gen_urshr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
70
}
938
}
71
}
939
}
72
940
73
+static void gen_srsra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
941
-void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
74
+{
942
+void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm,
75
+ gen_srshr_d(src, src, shift);
943
+ float_status *s, uint32_t desc)
76
+ tcg_gen_add_i64(dst, dst, src);
944
{
77
+}
945
static const float64 coeff[16] = {
78
+
946
0x3ff0000000000000ull, 0xbfc5555555555543ull,
79
+static void gen_ursra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
947
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
80
+{
948
mm = float64_abs(mm);
81
+ gen_urshr_d(src, src, shift);
949
xx += 8;
82
+ tcg_gen_add_i64(dst, dst, src);
950
}
83
+}
951
- d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs);
84
+
952
+ d[i] = float64_muladd(n[i], mm, coeff[xx], 0, s);
85
+static void gen_sri_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
86
+{
87
+ /* If shift is 64, dst is unchanged. */
88
+ if (shift != 64) {
89
+ tcg_gen_shri_i64(src, src, shift);
90
+ tcg_gen_deposit_i64(dst, dst, src, 0, 64 - shift);
91
+ }
92
+}
93
+
94
static bool do_vec_shift_imm_narrow(DisasContext *s, arg_qrri_e *a,
95
WideShiftImmFn * const fns[3], MemOp sign)
96
{
97
@@ -XXX,XX +XXX,XX @@ static WideShiftImmFn * const rshrn_fns[] = {
98
};
99
TRANS(RSHRN_v, do_vec_shift_imm_narrow, a, rshrn_fns, 0)
100
101
+/*
102
+ * Advanced SIMD Scalar Shift by Immediate
103
+ */
104
+
105
+static bool do_scalar_shift_imm(DisasContext *s, arg_rri_e *a,
106
+ WideShiftImmFn *fn, bool accumulate,
107
+ MemOp sign)
108
+{
109
+ if (fp_access_check(s)) {
110
+ TCGv_i64 rd = tcg_temp_new_i64();
111
+ TCGv_i64 rn = tcg_temp_new_i64();
112
+
113
+ read_vec_element(s, rn, a->rn, 0, a->esz | sign);
114
+ if (accumulate) {
115
+ read_vec_element(s, rd, a->rd, 0, a->esz | sign);
116
+ }
117
+ fn(rd, rn, a->imm);
118
+ write_fp_dreg(s, a->rd, rd);
119
+ }
120
+ return true;
121
+}
122
+
123
+TRANS(SSHR_s, do_scalar_shift_imm, a, gen_sshr_d, false, 0)
124
+TRANS(USHR_s, do_scalar_shift_imm, a, gen_ushr_d, false, 0)
125
+TRANS(SSRA_s, do_scalar_shift_imm, a, gen_ssra_d, true, 0)
126
+TRANS(USRA_s, do_scalar_shift_imm, a, gen_usra_d, true, 0)
127
+TRANS(SRSHR_s, do_scalar_shift_imm, a, gen_srshr_d, false, 0)
128
+TRANS(URSHR_s, do_scalar_shift_imm, a, gen_urshr_d, false, 0)
129
+TRANS(SRSRA_s, do_scalar_shift_imm, a, gen_srsra_d, true, 0)
130
+TRANS(URSRA_s, do_scalar_shift_imm, a, gen_ursra_d, true, 0)
131
+TRANS(SRI_s, do_scalar_shift_imm, a, gen_sri_d, true, 0)
132
+
133
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
134
* Note that it is the caller's responsibility to ensure that the
135
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
136
@@ -XXX,XX +XXX,XX @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
137
}
953
}
138
}
954
}
139
955
140
-/* SSHR[RA]/USHR[RA] - Scalar shift right (optional rounding/accumulate) */
956
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
141
-static void handle_scalar_simd_shri(DisasContext *s,
957
*/
142
- bool is_u, int immh, int immb,
958
143
- int opcode, int rn, int rd)
959
void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
144
-{
960
- void *vs, uint32_t desc)
145
- const int size = 3;
961
+ float_status *s, uint32_t desc)
146
- int immhb = immh << 3 | immb;
962
{
147
- int shift = 2 * (8 << size) - immhb;
963
intptr_t j, i = simd_oprsz(desc);
148
- bool accumulate = false;
964
uint64_t *g = vg;
149
- bool round = false;
965
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
150
- bool insert = false;
966
e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag;
151
- TCGv_i64 tcg_rn;
967
152
- TCGv_i64 tcg_rd;
968
if (likely((pg >> (i & 63)) & 1)) {
153
-
969
- *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs);
154
- if (!extract32(immh, 3, 1)) {
970
+ *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, s);
155
- unallocated_encoding(s);
971
}
156
- return;
972
if (likely((pg >> (j & 63)) & 1)) {
157
- }
973
- *(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs);
158
-
974
+ *(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, s);
159
- if (!fp_access_check(s)) {
975
}
160
- return;
976
} while (i & 63);
161
- }
977
} while (i != 0);
162
-
978
}
163
- switch (opcode) {
979
164
- case 0x02: /* SSRA / USRA (accumulate) */
980
void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
165
- accumulate = true;
981
- void *vs, uint32_t desc)
166
- break;
982
+ float_status *s, uint32_t desc)
167
- case 0x04: /* SRSHR / URSHR (rounding) */
983
{
168
- round = true;
984
intptr_t j, i = simd_oprsz(desc);
169
- break;
985
uint64_t *g = vg;
170
- case 0x06: /* SRSRA / URSRA (accum + rounding) */
986
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
171
- accumulate = round = true;
987
e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag;
172
- break;
988
173
- case 0x08: /* SRI */
989
if (likely((pg >> (i & 63)) & 1)) {
174
- insert = true;
990
- *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs);
175
- break;
991
+ *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, s);
176
- }
992
}
177
-
993
if (likely((pg >> (j & 63)) & 1)) {
178
- tcg_rn = read_fp_dreg(s, rn);
994
- *(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs);
179
- tcg_rd = (accumulate || insert) ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
995
+ *(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, s);
180
-
996
}
181
- if (insert) {
997
} while (i & 63);
182
- /* shift count same as element size is valid but does nothing;
998
} while (i != 0);
183
- * special case to avoid potential shift by 64.
999
}
184
- */
1000
185
- int esize = 8 << size;
1001
void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
186
- if (shift != esize) {
1002
- void *vs, uint32_t desc)
187
- tcg_gen_shri_i64(tcg_rn, tcg_rn, shift);
1003
+ float_status *s, uint32_t desc)
188
- tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_rn, 0, esize - shift);
1004
{
189
- }
1005
intptr_t j, i = simd_oprsz(desc);
190
- } else {
1006
uint64_t *g = vg;
191
- handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
1007
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
192
- accumulate, is_u, size, shift);
1008
e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag;
193
- }
1009
194
-
1010
if (likely((pg >> (i & 63)) & 1)) {
195
- write_fp_dreg(s, rd, tcg_rd);
1011
- *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs);
196
-}
1012
+ *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, s);
197
-
1013
}
198
/* SHL/SLI - Scalar shift left */
1014
if (likely((pg >> (j & 63)) & 1)) {
199
static void handle_scalar_simd_shli(DisasContext *s, bool insert,
1015
- *(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs);
200
int immh, int immb, int opcode,
1016
+ *(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, s);
201
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
1017
}
202
}
1018
} while (i & 63);
203
1019
} while (i != 0);
204
switch (opcode) {
1020
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
205
- case 0x08: /* SRI */
1021
*/
206
- if (!is_u) {
1022
207
- unallocated_encoding(s);
1023
void HELPER(sve_fcmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
208
- return;
1024
- void *vg, void *status, uint32_t desc)
209
- }
1025
+ void *vg, float_status *status, uint32_t desc)
210
- /* fall through */
1026
{
211
- case 0x00: /* SSHR / USHR */
1027
intptr_t j, i = simd_oprsz(desc);
212
- case 0x02: /* SSRA / USRA */
1028
unsigned rot = simd_data(desc);
213
- case 0x04: /* SRSHR / URSHR */
1029
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
214
- case 0x06: /* SRSRA / URSRA */
1030
}
215
- handle_scalar_simd_shri(s, is_u, immh, immb, opcode, rn, rd);
1031
216
- break;
1032
void HELPER(sve_fcmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
217
case 0x0a: /* SHL / SLI */
1033
- void *vg, void *status, uint32_t desc)
218
handle_scalar_simd_shli(s, is_u, immh, immb, opcode, rn, rd);
1034
+ void *vg, float_status *status, uint32_t desc)
219
break;
1035
{
220
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
1036
intptr_t j, i = simd_oprsz(desc);
221
handle_simd_shift_fpint_conv(s, true, false, is_u, immh, immb, rn, rd);
1037
unsigned rot = simd_data(desc);
222
break;
1038
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
223
default:
1039
}
224
+ case 0x00: /* SSHR / USHR */
1040
225
+ case 0x02: /* SSRA / USRA */
1041
void HELPER(sve_fcmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
226
+ case 0x04: /* SRSHR / URSHR */
1042
- void *vg, void *status, uint32_t desc)
227
+ case 0x06: /* SRSRA / URSRA */
1043
+ void *vg, float_status *status, uint32_t desc)
228
+ case 0x08: /* SRI */
1044
{
229
unallocated_encoding(s);
1045
intptr_t j, i = simd_oprsz(desc);
230
break;
1046
unsigned rot = simd_data(desc);
231
}
1047
@@ -XXX,XX +XXX,XX @@ void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc)
1048
}
1049
1050
void HELPER(fmmla_s)(void *vd, void *vn, void *vm, void *va,
1051
- void *status, uint32_t desc)
1052
+ float_status *status, uint32_t desc)
1053
{
1054
intptr_t s, opr_sz = simd_oprsz(desc) / (sizeof(float32) * 4);
1055
1056
@@ -XXX,XX +XXX,XX @@ void HELPER(fmmla_s)(void *vd, void *vn, void *vm, void *va,
1057
}
1058
1059
void HELPER(fmmla_d)(void *vd, void *vn, void *vm, void *va,
1060
- void *status, uint32_t desc)
1061
+ float_status *status, uint32_t desc)
1062
{
1063
intptr_t s, opr_sz = simd_oprsz(desc) / (sizeof(float64) * 4);
1064
1065
@@ -XXX,XX +XXX,XX @@ void HELPER(fmmla_d)(void *vd, void *vn, void *vm, void *va,
1066
}
1067
1068
#define DO_FCVTNT(NAME, TYPEW, TYPEN, HW, HN, OP) \
1069
-void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
1070
+void HELPER(NAME)(void *vd, void *vn, void *vg, \
1071
+ float_status *status, uint32_t desc) \
1072
{ \
1073
intptr_t i = simd_oprsz(desc); \
1074
uint64_t *g = vg; \
1075
@@ -XXX,XX +XXX,XX @@ DO_FCVTNT(sve2_fcvtnt_sh, uint32_t, uint16_t, H1_4, H1_2, sve_f32_to_f16)
1076
DO_FCVTNT(sve2_fcvtnt_ds, uint64_t, uint32_t, H1_8, H1_4, float64_to_float32)
1077
1078
#define DO_FCVTLT(NAME, TYPEW, TYPEN, HW, HN, OP) \
1079
-void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
1080
+void HELPER(NAME)(void *vd, void *vn, void *vg, \
1081
+ float_status *status, uint32_t desc) \
1082
{ \
1083
intptr_t i = simd_oprsz(desc); \
1084
uint64_t *g = vg; \
232
--
2.34.1

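Note: the sve_helper.c conversion above is purely a signature change; every helper that previously received its float_status through a void * now takes it directly, matching the 'fpst' type in the helper declarations. Reduced to one concrete instance from the hunks above (sve_fadda_h; bodies elided, sketch only):

    /* before */
    uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
                                 void *status, uint32_t desc);

    /* after: no cast needed inside the body any more */
    uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
                                 float_status *status, uint32_t desc);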
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This includes SHL and SLI.
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241206031224.78525-8-richard.henderson@linaro.org
7
Message-id: 20240912024114.1097832-25-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
7
---
10
target/arm/tcg/a64.decode | 4 ++++
8
target/arm/tcg/helper-sme.h | 4 ++--
11
target/arm/tcg/translate-a64.c | 44 +++++++---------------------------
9
target/arm/tcg/sme_helper.c | 8 ++++----
12
2 files changed, 13 insertions(+), 35 deletions(-)
10
2 files changed, 6 insertions(+), 6 deletions(-)
13
11
14
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
12
diff --git a/target/arm/tcg/helper-sme.h b/target/arm/tcg/helper-sme.h
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/tcg/a64.decode
14
--- a/target/arm/tcg/helper-sme.h
17
+++ b/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/helper-sme.h
18
@@ -XXX,XX +XXX,XX @@ RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_s
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sme_addva_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
17
DEF_HELPER_FLAGS_7(sme_fmopa_h, TCG_CALL_NO_RWG,
20
@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
18
void, ptr, ptr, ptr, ptr, ptr, env, i32)
21
&rri_e esz=3 imm=%neon_rshift_i6
19
DEF_HELPER_FLAGS_7(sme_fmopa_s, TCG_CALL_NO_RWG,
22
+@shli_d .... ..... 1 imm:6 ..... . rn:5 rd:5 &rri_e esz=3
20
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
23
21
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
24
SSHR_s 0101 11110 .... ... 00000 1 ..... ..... @shri_d
22
DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
25
USHR_s 0111 11110 .... ... 00000 1 ..... ..... @shri_d
23
- void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
26
@@ -XXX,XX +XXX,XX @@ URSHR_s 0111 11110 .... ... 00100 1 ..... ..... @shri_d
24
+ void, ptr, ptr, ptr, ptr, ptr, fpst, i32)
27
SRSRA_s 0101 11110 .... ... 00110 1 ..... ..... @shri_d
25
DEF_HELPER_FLAGS_7(sme_bfmopa, TCG_CALL_NO_RWG,
28
URSRA_s 0111 11110 .... ... 00110 1 ..... ..... @shri_d
26
void, ptr, ptr, ptr, ptr, ptr, env, i32)
29
SRI_s 0111 11110 .... ... 01000 1 ..... ..... @shri_d
27
DEF_HELPER_FLAGS_6(sme_smopa_s, TCG_CALL_NO_RWG,
30
+
28
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
31
+SHL_s 0101 11110 .... ... 01010 1 ..... ..... @shli_d
32
+SLI_s 0111 11110 .... ... 01010 1 ..... ..... @shli_d
33
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
34
index XXXXXXX..XXXXXXX 100644
29
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/tcg/translate-a64.c
30
--- a/target/arm/tcg/sme_helper.c
36
+++ b/target/arm/tcg/translate-a64.c
31
+++ b/target/arm/tcg/sme_helper.c
37
@@ -XXX,XX +XXX,XX @@ static void gen_sri_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
32
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_addva_d)(void *vzda, void *vzn, void *vpn,
38
}
39
}
33
}
40
34
41
+static void gen_sli_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
35
void HELPER(sme_fmopa_s)(void *vza, void *vzn, void *vzm, void *vpn,
42
+{
36
- void *vpm, void *vst, uint32_t desc)
43
+ tcg_gen_deposit_i64(dst, dst, src, shift, 64 - shift);
37
+ void *vpm, float_status *fpst_in, uint32_t desc)
44
+}
45
+
46
static bool do_vec_shift_imm_narrow(DisasContext *s, arg_qrri_e *a,
47
WideShiftImmFn * const fns[3], MemOp sign)
48
{
38
{
49
@@ -XXX,XX +XXX,XX @@ TRANS(SRSRA_s, do_scalar_shift_imm, a, gen_srsra_d, true, 0)
39
intptr_t row, col, oprsz = simd_maxsz(desc);
50
TRANS(URSRA_s, do_scalar_shift_imm, a, gen_ursra_d, true, 0)
40
uint32_t neg = simd_data(desc) << 31;
51
TRANS(SRI_s, do_scalar_shift_imm, a, gen_sri_d, true, 0)
41
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_fmopa_s)(void *vza, void *vzn, void *vzm, void *vpn,
52
42
* update the cumulative fp exception status. It also produces
53
+TRANS(SHL_s, do_scalar_shift_imm, a, tcg_gen_shli_i64, false, 0)
43
* default nans.
54
+TRANS(SLI_s, do_scalar_shift_imm, a, gen_sli_d, true, 0)
44
*/
55
+
45
- fpst = *(float_status *)vst;
56
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
46
+ fpst = *fpst_in;
57
* Note that it is the caller's responsibility to ensure that the
47
set_default_nan_mode(true, &fpst);
58
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
48
59
@@ -XXX,XX +XXX,XX @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
49
for (row = 0; row < oprsz; ) {
60
}
50
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_fmopa_s)(void *vza, void *vzn, void *vzm, void *vpn,
61
}
51
}
62
52
63
-/* SHL/SLI - Scalar shift left */
53
void HELPER(sme_fmopa_d)(void *vza, void *vzn, void *vzm, void *vpn,
64
-static void handle_scalar_simd_shli(DisasContext *s, bool insert,
54
- void *vpm, void *vst, uint32_t desc)
65
- int immh, int immb, int opcode,
55
+ void *vpm, float_status *fpst_in, uint32_t desc)
66
- int rn, int rd)
56
{
67
-{
57
intptr_t row, col, oprsz = simd_oprsz(desc) / 8;
68
- int size = 32 - clz32(immh) - 1;
58
uint64_t neg = (uint64_t)simd_data(desc) << 63;
69
- int immhb = immh << 3 | immb;
59
uint64_t *za = vza, *zn = vzn, *zm = vzm;
70
- int shift = immhb - (8 << size);
60
uint8_t *pn = vpn, *pm = vpm;
71
- TCGv_i64 tcg_rn;
61
- float_status fpst = *(float_status *)vst;
72
- TCGv_i64 tcg_rd;
62
+ float_status fpst = *fpst_in;
73
-
63
74
- if (!extract32(immh, 3, 1)) {
64
set_default_nan_mode(true, &fpst);
75
- unallocated_encoding(s);
65
76
- return;
77
- }
78
-
79
- if (!fp_access_check(s)) {
80
- return;
81
- }
82
-
83
- tcg_rn = read_fp_dreg(s, rn);
84
- tcg_rd = insert ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
85
-
86
- if (insert) {
87
- tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_rn, shift, 64 - shift);
88
- } else {
89
- tcg_gen_shli_i64(tcg_rd, tcg_rn, shift);
90
- }
91
-
92
- write_fp_dreg(s, rd, tcg_rd);
93
-}
94
-
95
/* SQSHRN/SQSHRUN - Saturating (signed/unsigned) shift right with
96
* (signed/unsigned) narrowing */
97
static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
98
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
99
}
100
101
switch (opcode) {
102
- case 0x0a: /* SHL / SLI */
103
- handle_scalar_simd_shli(s, is_u, immh, immb, opcode, rn, rd);
104
- break;
105
case 0x1c: /* SCVTF, UCVTF */
106
handle_simd_shift_intfp_conv(s, true, false, is_u, immh, immb,
107
opcode, rn, rd);
108
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
109
case 0x04: /* SRSHR / URSHR */
110
case 0x06: /* SRSRA / URSRA */
111
case 0x08: /* SRI */
112
+ case 0x0a: /* SHL / SLI */
113
unallocated_encoding(s);
114
break;
115
}
116
--
66
--
117
2.34.1
67
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Allow the helpers to receive CPUARMState* directly
4
instead of via void*.
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20241206031224.78525-9-richard.henderson@linaro.org
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20240912024114.1097832-7-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
10
---
9
target/arm/tcg/a64.decode | 5 ++
11
target/arm/helper.h | 12 ++++++------
10
target/arm/tcg/translate-a64.c | 121 +++++++++++++--------------------
12
target/arm/tcg/helper-a64.h | 2 +-
11
2 files changed, 53 insertions(+), 73 deletions(-)
13
target/arm/tcg/vec_helper.c | 21 +++++++--------------
14
3 files changed, 14 insertions(+), 21 deletions(-)
12
15
13
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
14
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/tcg/a64.decode
18
--- a/target/arm/helper.h
16
+++ b/target/arm/tcg/a64.decode
19
+++ b/target/arm/helper.h
17
@@ -XXX,XX +XXX,XX @@ FMADD 0001 1111 .. 0 ..... 0 ..... ..... ..... @rrrr_hsd
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_suqadd_d, TCG_CALL_NO_RWG,
18
FMSUB 0001 1111 .. 0 ..... 1 ..... ..... ..... @rrrr_hsd
21
void, ptr, ptr, ptr, ptr, i32)
19
FNMADD 0001 1111 .. 1 ..... 0 ..... ..... ..... @rrrr_hsd
22
20
FNMSUB 0001 1111 .. 1 ..... 1 ..... ..... ..... @rrrr_hsd
23
DEF_HELPER_FLAGS_5(gvec_fmlal_a32, TCG_CALL_NO_RWG,
21
+
24
- void, ptr, ptr, ptr, ptr, i32)
22
+# Advanced SIMD Extract
25
+ void, ptr, ptr, ptr, env, i32)
23
+
26
DEF_HELPER_FLAGS_5(gvec_fmlal_a64, TCG_CALL_NO_RWG,
24
+EXT_d 0010 1110 00 0 rm:5 00 imm:3 0 rn:5 rd:5
27
- void, ptr, ptr, ptr, ptr, i32)
25
+EXT_q 0110 1110 00 0 rm:5 0 imm:4 0 rn:5 rd:5
28
+ void, ptr, ptr, ptr, env, i32)
26
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
29
DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a32, TCG_CALL_NO_RWG,
30
- void, ptr, ptr, ptr, ptr, i32)
31
+ void, ptr, ptr, ptr, env, i32)
32
DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a64, TCG_CALL_NO_RWG,
33
- void, ptr, ptr, ptr, ptr, i32)
34
+ void, ptr, ptr, ptr, env, i32)
35
36
DEF_HELPER_FLAGS_2(frint32_s, TCG_CALL_NO_RWG, f32, f32, fpst)
37
DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, fpst)
38
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_d, TCG_CALL_NO_RWG,
39
void, ptr, ptr, ptr, i32)
40
41
DEF_HELPER_FLAGS_6(sve2_fmlal_zzzw_s, TCG_CALL_NO_RWG,
42
- void, ptr, ptr, ptr, ptr, ptr, i32)
43
+ void, ptr, ptr, ptr, ptr, env, i32)
44
DEF_HELPER_FLAGS_6(sve2_fmlal_zzxw_s, TCG_CALL_NO_RWG,
45
- void, ptr, ptr, ptr, ptr, ptr, i32)
46
+ void, ptr, ptr, ptr, ptr, env, i32)
47
48
DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
49
50
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
27
index XXXXXXX..XXXXXXX 100644
51
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/tcg/translate-a64.c
52
--- a/target/arm/tcg/helper-a64.h
29
+++ b/target/arm/tcg/translate-a64.c
53
+++ b/target/arm/tcg/helper-a64.h
30
@@ -XXX,XX +XXX,XX @@ static bool trans_FCSEL(DisasContext *s, arg_FCSEL *a)
54
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, fpst)
31
return true;
55
DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, fpst)
56
DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, fpst)
57
DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, fpst)
58
-DEF_HELPER_FLAGS_4(simd_tblx, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
59
+DEF_HELPER_FLAGS_4(simd_tblx, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
60
DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, fpst)
61
DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, fpst)
62
DEF_HELPER_FLAGS_3(neon_ceq_f64, TCG_CALL_NO_RWG, i64, i64, i64, fpst)
63
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/tcg/vec_helper.c
66
+++ b/target/arm/tcg/vec_helper.c
67
@@ -XXX,XX +XXX,XX @@ static void do_fmlal(float32 *d, void *vn, void *vm, float_status *fpst,
32
}
68
}
33
69
34
+/*
70
void HELPER(gvec_fmlal_a32)(void *vd, void *vn, void *vm,
35
+ * Advanced SIMD Extract
71
- void *venv, uint32_t desc)
36
+ */
72
+ CPUARMState *env, uint32_t desc)
37
+
73
{
38
+static bool trans_EXT_d(DisasContext *s, arg_EXT_d *a)
74
- CPUARMState *env = venv;
39
+{
75
do_fmlal(vd, vn, vm, &env->vfp.standard_fp_status, desc,
40
+ if (fp_access_check(s)) {
76
get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
41
+ TCGv_i64 lo = read_fp_dreg(s, a->rn);
42
+ if (a->imm != 0) {
43
+ TCGv_i64 hi = read_fp_dreg(s, a->rm);
44
+ tcg_gen_extract2_i64(lo, lo, hi, a->imm * 8);
45
+ }
46
+ write_fp_dreg(s, a->rd, lo);
47
+ }
48
+ return true;
49
+}
50
+
51
+static bool trans_EXT_q(DisasContext *s, arg_EXT_q *a)
52
+{
53
+ TCGv_i64 lo, hi;
54
+ int pos = (a->imm & 7) * 8;
55
+ int elt = a->imm >> 3;
56
+
57
+ if (!fp_access_check(s)) {
58
+ return true;
59
+ }
60
+
61
+ lo = tcg_temp_new_i64();
62
+ hi = tcg_temp_new_i64();
63
+
64
+ read_vec_element(s, lo, a->rn, elt, MO_64);
65
+ elt++;
66
+ read_vec_element(s, hi, elt & 2 ? a->rm : a->rn, elt & 1, MO_64);
67
+ elt++;
68
+
69
+ if (pos != 0) {
70
+ TCGv_i64 hh = tcg_temp_new_i64();
71
+ tcg_gen_extract2_i64(lo, lo, hi, pos);
72
+ read_vec_element(s, hh, a->rm, elt & 1, MO_64);
73
+ tcg_gen_extract2_i64(hi, hi, hh, pos);
74
+ }
75
+
76
+ write_vec_element(s, lo, a->rd, 0, MO_64);
77
+ write_vec_element(s, hi, a->rd, 1, MO_64);
78
+ clear_vec_high(s, true, a->rd);
79
+ return true;
80
+}
81
+
82
/*
83
* Floating-point data-processing (3 source)
84
*/
85
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
86
}
87
}
77
}
88
78
89
-/* EXT
79
void HELPER(gvec_fmlal_a64)(void *vd, void *vn, void *vm,
90
- * 31 30 29 24 23 22 21 20 16 15 14 11 10 9 5 4 0
80
- void *venv, uint32_t desc)
91
- * +---+---+-------------+-----+---+------+---+------+---+------+------+
81
+ CPUARMState *env, uint32_t desc)
92
- * | 0 | Q | 1 0 1 1 1 0 | op2 | 0 | Rm | 0 | imm4 | 0 | Rn | Rd |
82
{
93
- * +---+---+-------------+-----+---+------+---+------+---+------+------+
83
- CPUARMState *env = venv;
94
- */
84
do_fmlal(vd, vn, vm, &env->vfp.fp_status, desc,
95
-static void disas_simd_ext(DisasContext *s, uint32_t insn)
85
get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
96
-{
86
}
97
- int is_q = extract32(insn, 30, 1);
87
98
- int op2 = extract32(insn, 22, 2);
88
void HELPER(sve2_fmlal_zzzw_s)(void *vd, void *vn, void *vm, void *va,
99
- int imm4 = extract32(insn, 11, 4);
89
- void *venv, uint32_t desc)
100
- int rm = extract32(insn, 16, 5);
90
+ CPUARMState *env, uint32_t desc)
101
- int rn = extract32(insn, 5, 5);
91
{
102
- int rd = extract32(insn, 0, 5);
92
intptr_t i, oprsz = simd_oprsz(desc);
103
- int pos = imm4 << 3;
93
uint16_t negn = extract32(desc, SIMD_DATA_SHIFT, 1) << 15;
104
- TCGv_i64 tcg_resl, tcg_resh;
94
intptr_t sel = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(float16);
105
-
95
- CPUARMState *env = venv;
106
- if (op2 != 0 || (!is_q && extract32(imm4, 3, 1))) {
96
float_status *status = &env->vfp.fp_status;
107
- unallocated_encoding(s);
97
bool fz16 = get_flush_inputs_to_zero(&env->vfp.fp_status_f16);
108
- return;
98
109
- }
99
@@ -XXX,XX +XXX,XX @@ static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst,
110
-
100
}
111
- if (!fp_access_check(s)) {
101
112
- return;
102
void HELPER(gvec_fmlal_idx_a32)(void *vd, void *vn, void *vm,
113
- }
103
- void *venv, uint32_t desc)
114
-
104
+ CPUARMState *env, uint32_t desc)
115
- tcg_resh = tcg_temp_new_i64();
105
{
116
- tcg_resl = tcg_temp_new_i64();
106
- CPUARMState *env = venv;
117
-
107
do_fmlal_idx(vd, vn, vm, &env->vfp.standard_fp_status, desc,
118
- /* Vd gets bits starting at pos bits into Vm:Vn. This is
108
get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
119
- * either extracting 128 bits from a 128:128 concatenation, or
109
}
120
- * extracting 64 bits from a 64:64 concatenation.
110
121
- */
111
void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm,
122
- if (!is_q) {
112
- void *venv, uint32_t desc)
123
- read_vec_element(s, tcg_resl, rn, 0, MO_64);
113
+ CPUARMState *env, uint32_t desc)
124
- if (pos != 0) {
114
{
125
- read_vec_element(s, tcg_resh, rm, 0, MO_64);
115
- CPUARMState *env = venv;
126
- tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
116
do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc,
127
- }
117
get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
128
- } else {
118
}
129
- TCGv_i64 tcg_hh;
119
130
- typedef struct {
120
void HELPER(sve2_fmlal_zzxw_s)(void *vd, void *vn, void *vm, void *va,
131
- int reg;
121
- void *venv, uint32_t desc)
132
- int elt;
122
+ CPUARMState *env, uint32_t desc)
133
- } EltPosns;
123
{
134
- EltPosns eltposns[] = { {rn, 0}, {rn, 1}, {rm, 0}, {rm, 1} };
124
intptr_t i, j, oprsz = simd_oprsz(desc);
135
- EltPosns *elt = eltposns;
125
uint16_t negn = extract32(desc, SIMD_DATA_SHIFT, 1) << 15;
136
-
126
intptr_t sel = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(float16);
137
- if (pos >= 64) {
127
intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 2, 3) * sizeof(float16);
138
- elt++;
128
- CPUARMState *env = venv;
139
- pos -= 64;
129
float_status *status = &env->vfp.fp_status;
140
- }
130
bool fz16 = get_flush_inputs_to_zero(&env->vfp.fp_status_f16);
141
-
131
142
- read_vec_element(s, tcg_resl, elt->reg, elt->elt, MO_64);
132
@@ -XXX,XX +XXX,XX @@ DO_VRINT_RMODE(gvec_vrint_rm_s, helper_rints, uint32_t)
143
- elt++;
133
#undef DO_VRINT_RMODE
144
- read_vec_element(s, tcg_resh, elt->reg, elt->elt, MO_64);
134
145
- elt++;
135
#ifdef TARGET_AARCH64
146
- if (pos != 0) {
136
-void HELPER(simd_tblx)(void *vd, void *vm, void *venv, uint32_t desc)
147
- tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
137
+void HELPER(simd_tblx)(void *vd, void *vm, CPUARMState *env, uint32_t desc)
148
- tcg_hh = tcg_temp_new_i64();
138
{
149
- read_vec_element(s, tcg_hh, elt->reg, elt->elt, MO_64);
139
const uint8_t *indices = vm;
150
- tcg_gen_extract2_i64(tcg_resh, tcg_resh, tcg_hh, pos);
140
- CPUARMState *env = venv;
151
- }
141
size_t oprsz = simd_oprsz(desc);
152
- }
142
uint32_t rn = extract32(desc, SIMD_DATA_SHIFT, 5);
153
-
143
bool is_tbx = extract32(desc, SIMD_DATA_SHIFT + 5, 1);
154
- write_vec_element(s, tcg_resl, rd, 0, MO_64);
155
- if (is_q) {
156
- write_vec_element(s, tcg_resh, rd, 1, MO_64);
157
- }
158
- clear_vec_high(s, is_q, rd);
159
-}
160
-
161
/* TBL/TBX
162
* 31 30 29 24 23 22 21 20 16 15 14 13 12 11 10 9 5 4 0
163
* +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
164
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
165
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
166
{ 0x0e000000, 0xbf208c00, disas_simd_tb },
167
{ 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
168
- { 0x2e000000, 0xbf208400, disas_simd_ext },
169
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
170
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
171
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
172
--
144
--
173
2.34.1
145
2.34.1
174
146
175
147
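Note: the helper conversions in this pull request follow one of two mechanical patterns, float_status * instead of void *, or CPUARMState * instead of void *. One instance of the latter, taken from the vec_helper.c conversion in this series, with the body elided (sketch only):

    /* before */
    void HELPER(gvec_fmlal_a32)(void *vd, void *vn, void *vm,
                                void *venv, uint32_t desc)
    {
        CPUARMState *env = venv;
        /* ... */
    }

    /* after: env arrives already typed, the local cast goes away */
    void HELPER(gvec_fmlal_a32)(void *vd, void *vn, void *vm,
                                CPUARMState *env, uint32_t desc)
    {
        /* ... */
    }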
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-26-richard.henderson@linaro.org
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
Message-id: 20241206031224.78525-10-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
7
---
8
target/arm/helper.h | 12 ++++
8
target/arm/helper.h | 56 ++++++++++++++++++------------------
9
target/arm/tcg/translate.h | 7 ++
9
target/arm/tcg/neon_helper.c | 6 ++--
10
target/arm/tcg/neon-dp.decode | 6 +-
10
2 files changed, 30 insertions(+), 32 deletions(-)
11
target/arm/tcg/gengvec.c | 36 +++++++++++
12
target/arm/tcg/neon_helper.c | 33 ++++++++++
13
target/arm/tcg/translate-neon.c | 110 +-------------------------------
14
6 files changed, 94 insertions(+), 110 deletions(-)
15
11
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
12
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.h
14
--- a/target/arm/helper.h
19
+++ b/target/arm/helper.h
15
+++ b/target/arm/helper.h
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(neon_uqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(neon_qrshl_u32, i32, env, i32, i32)
21
DEF_HELPER_FLAGS_5(neon_uqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
17
DEF_HELPER_3(neon_qrshl_s32, i32, env, i32, i32)
22
DEF_HELPER_FLAGS_5(neon_uqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
DEF_HELPER_3(neon_qrshl_u64, i64, env, i64, i64)
23
DEF_HELPER_FLAGS_5(neon_uqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
DEF_HELPER_3(neon_qrshl_s64, i64, env, i64, i64)
24
+DEF_HELPER_FLAGS_4(neon_sqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
-DEF_HELPER_FLAGS_5(neon_sqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_4(neon_sqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
-DEF_HELPER_FLAGS_5(neon_sqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(neon_sqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
-DEF_HELPER_FLAGS_5(neon_sqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_4(neon_sqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
-DEF_HELPER_FLAGS_5(neon_sqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(neon_uqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
-DEF_HELPER_FLAGS_5(neon_uqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(neon_uqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
-DEF_HELPER_FLAGS_5(neon_uqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(neon_uqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
-DEF_HELPER_FLAGS_5(neon_uqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(neon_uqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
-DEF_HELPER_FLAGS_5(neon_uqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(neon_sqshlui_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
-DEF_HELPER_FLAGS_5(neon_sqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(neon_sqshlui_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
-DEF_HELPER_FLAGS_5(neon_sqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(neon_sqshlui_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
-DEF_HELPER_FLAGS_5(neon_sqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(neon_sqshlui_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
-DEF_HELPER_FLAGS_5(neon_sqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
-DEF_HELPER_FLAGS_5(neon_uqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
-DEF_HELPER_FLAGS_5(neon_uqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
-DEF_HELPER_FLAGS_5(neon_uqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
-DEF_HELPER_FLAGS_5(neon_uqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
36
-DEF_HELPER_FLAGS_4(neon_sqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
-DEF_HELPER_FLAGS_4(neon_sqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
-DEF_HELPER_FLAGS_4(neon_sqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
-DEF_HELPER_FLAGS_4(neon_sqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
40
-DEF_HELPER_FLAGS_4(neon_uqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
41
-DEF_HELPER_FLAGS_4(neon_uqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
-DEF_HELPER_FLAGS_4(neon_uqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
43
-DEF_HELPER_FLAGS_4(neon_uqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
-DEF_HELPER_FLAGS_4(neon_sqshlui_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
45
-DEF_HELPER_FLAGS_4(neon_sqshlui_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
46
-DEF_HELPER_FLAGS_4(neon_sqshlui_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
-DEF_HELPER_FLAGS_4(neon_sqshlui_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_5(neon_sqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
49
+DEF_HELPER_FLAGS_5(neon_sqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
50
+DEF_HELPER_FLAGS_5(neon_sqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
51
+DEF_HELPER_FLAGS_5(neon_sqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
52
+DEF_HELPER_FLAGS_5(neon_uqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
53
+DEF_HELPER_FLAGS_5(neon_uqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
54
+DEF_HELPER_FLAGS_5(neon_uqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
55
+DEF_HELPER_FLAGS_5(neon_uqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
56
+DEF_HELPER_FLAGS_5(neon_sqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
57
+DEF_HELPER_FLAGS_5(neon_sqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
58
+DEF_HELPER_FLAGS_5(neon_sqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
59
+DEF_HELPER_FLAGS_5(neon_sqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
60
+DEF_HELPER_FLAGS_5(neon_uqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
61
+DEF_HELPER_FLAGS_5(neon_uqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
62
+DEF_HELPER_FLAGS_5(neon_uqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
63
+DEF_HELPER_FLAGS_5(neon_uqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
64
+DEF_HELPER_FLAGS_4(neon_sqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
65
+DEF_HELPER_FLAGS_4(neon_sqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
66
+DEF_HELPER_FLAGS_4(neon_sqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
67
+DEF_HELPER_FLAGS_4(neon_sqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
68
+DEF_HELPER_FLAGS_4(neon_uqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
69
+DEF_HELPER_FLAGS_4(neon_uqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
70
+DEF_HELPER_FLAGS_4(neon_uqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
71
+DEF_HELPER_FLAGS_4(neon_uqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
72
+DEF_HELPER_FLAGS_4(neon_sqshlui_b, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
73
+DEF_HELPER_FLAGS_4(neon_sqshlui_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
74
+DEF_HELPER_FLAGS_4(neon_sqshlui_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
75
+DEF_HELPER_FLAGS_4(neon_sqshlui_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
36
76
37
DEF_HELPER_FLAGS_4(gvec_srshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
77
DEF_HELPER_FLAGS_4(gvec_srshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_4(gvec_srshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
78
DEF_HELPER_FLAGS_4(gvec_srshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/tcg/translate.h
42
+++ b/target/arm/tcg/translate.h
43
@@ -XXX,XX +XXX,XX @@ void gen_neon_sqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
44
void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
45
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
46
47
+void gen_neon_sqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
48
+ int64_t c, uint32_t opr_sz, uint32_t max_sz);
49
+void gen_neon_uqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
50
+ int64_t c, uint32_t opr_sz, uint32_t max_sz);
51
+void gen_neon_sqshlui(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
52
+ int64_t c, uint32_t opr_sz, uint32_t max_sz);
53
+
54
void gen_gvec_shadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
55
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
56
void gen_gvec_uhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
57
diff --git a/target/arm/tcg/neon-dp.decode b/target/arm/tcg/neon-dp.decode
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/tcg/neon-dp.decode
60
+++ b/target/arm/tcg/neon-dp.decode
61
@@ -XXX,XX +XXX,XX @@ VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
62
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
63
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
64
65
-VQSHLU_64_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_d
66
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_d
67
VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_s
68
VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_h
69
VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_b
70
71
-VQSHL_S_64_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
72
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
73
VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
74
VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
75
VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
76
77
-VQSHL_U_64_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
78
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
79
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
80
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
81
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
82
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
83
index XXXXXXX..XXXXXXX 100644
84
--- a/target/arm/tcg/gengvec.c
85
+++ b/target/arm/tcg/gengvec.c
86
@@ -XXX,XX +XXX,XX @@ void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
87
opr_sz, max_sz, 0, fns[vece]);
88
}
89
90
+void gen_neon_sqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
91
+ int64_t c, uint32_t opr_sz, uint32_t max_sz)
92
+{
93
+ static gen_helper_gvec_2_ptr * const fns[] = {
94
+ gen_helper_neon_sqshli_b, gen_helper_neon_sqshli_h,
95
+ gen_helper_neon_sqshli_s, gen_helper_neon_sqshli_d,
96
+ };
97
+ tcg_debug_assert(vece <= MO_64);
98
+ tcg_debug_assert(c >= 0 && c <= (8 << vece));
99
+ tcg_gen_gvec_2_ptr(rd_ofs, rn_ofs, tcg_env, opr_sz, max_sz, c, fns[vece]);
100
+}
101
+
102
+void gen_neon_uqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
103
+ int64_t c, uint32_t opr_sz, uint32_t max_sz)
104
+{
105
+ static gen_helper_gvec_2_ptr * const fns[] = {
106
+ gen_helper_neon_uqshli_b, gen_helper_neon_uqshli_h,
107
+ gen_helper_neon_uqshli_s, gen_helper_neon_uqshli_d,
108
+ };
109
+ tcg_debug_assert(vece <= MO_64);
110
+ tcg_debug_assert(c >= 0 && c <= (8 << vece));
111
+ tcg_gen_gvec_2_ptr(rd_ofs, rn_ofs, tcg_env, opr_sz, max_sz, c, fns[vece]);
112
+}
113
+
114
+void gen_neon_sqshlui(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
115
+ int64_t c, uint32_t opr_sz, uint32_t max_sz)
116
+{
117
+ static gen_helper_gvec_2_ptr * const fns[] = {
118
+ gen_helper_neon_sqshlui_b, gen_helper_neon_sqshlui_h,
119
+ gen_helper_neon_sqshlui_s, gen_helper_neon_sqshlui_d,
120
+ };
121
+ tcg_debug_assert(vece <= MO_64);
122
+ tcg_debug_assert(c >= 0 && c <= (8 << vece));
123
+ tcg_gen_gvec_2_ptr(rd_ofs, rn_ofs, tcg_env, opr_sz, max_sz, c, fns[vece]);
124
+}
125
+
126
void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz)
127
{
128
uint64_t max = MAKE_64BIT_MASK(0, 8 << esz);
129
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
79
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
130
index XXXXXXX..XXXXXXX 100644
80
index XXXXXXX..XXXXXXX 100644
131
--- a/target/arm/tcg/neon_helper.c
81
--- a/target/arm/tcg/neon_helper.c
132
+++ b/target/arm/tcg/neon_helper.c
82
+++ b/target/arm/tcg/neon_helper.c
83
@@ -XXX,XX +XXX,XX @@ void HELPER(name)(void *vd, void *vn, void *vm, uint32_t desc) \
84
}
85
86
#define NEON_GVEC_VOP2_ENV(name, vtype) \
87
-void HELPER(name)(void *vd, void *vn, void *vm, void *venv, uint32_t desc) \
88
+void HELPER(name)(void *vd, void *vn, void *vm, CPUARMState *env, uint32_t desc) \
89
{ \
90
intptr_t i, opr_sz = simd_oprsz(desc); \
91
vtype *d = vd, *n = vn, *m = vm; \
92
- CPUARMState *env = venv; \
93
for (i = 0; i < opr_sz / sizeof(vtype); i++) { \
94
NEON_FN(d[i], n[i], m[i]); \
95
} \
133
@@ -XXX,XX +XXX,XX @@ void HELPER(name)(void *vd, void *vn, void *vm, void *venv, uint32_t desc) \
96
@@ -XXX,XX +XXX,XX @@ void HELPER(name)(void *vd, void *vn, void *vm, void *venv, uint32_t desc) \
134
clear_tail(d, opr_sz, simd_maxsz(desc)); \
135
}
97
}
136
98
137
+#define NEON_GVEC_VOP2i_ENV(name, vtype) \
99
#define NEON_GVEC_VOP2i_ENV(name, vtype) \
138
+void HELPER(name)(void *vd, void *vn, void *venv, uint32_t desc) \
100
-void HELPER(name)(void *vd, void *vn, void *venv, uint32_t desc) \
139
+{ \
101
+void HELPER(name)(void *vd, void *vn, CPUARMState *env, uint32_t desc) \
140
+ intptr_t i, opr_sz = simd_oprsz(desc); \
102
{ \
141
+ int imm = simd_data(desc); \
103
intptr_t i, opr_sz = simd_oprsz(desc); \
142
+ vtype *d = vd, *n = vn; \
104
int imm = simd_data(desc); \
143
+ CPUARMState *env = venv; \
105
vtype *d = vd, *n = vn; \
144
+ for (i = 0; i < opr_sz / sizeof(vtype); i++) { \
106
- CPUARMState *env = venv; \
145
+ NEON_FN(d[i], n[i], imm); \
107
for (i = 0; i < opr_sz / sizeof(vtype); i++) { \
146
+ } \
108
NEON_FN(d[i], n[i], imm); \
147
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
109
} \
148
+}
149
+
150
/* Pairwise operations. */
151
/* For 32-bit elements each segment only contains a single element, so
152
the elementwise and pairwise operations are the same. */
153
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift)
154
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc))
155
NEON_VOP_ENV(qshl_u8, neon_u8, 4)
156
NEON_GVEC_VOP2_ENV(neon_uqshl_b, uint8_t)
157
+NEON_GVEC_VOP2i_ENV(neon_uqshli_b, uint8_t)
158
#undef NEON_FN
159
160
#define NEON_FN(dest, src1, src2) \
161
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc))
162
NEON_VOP_ENV(qshl_u16, neon_u16, 2)
163
NEON_GVEC_VOP2_ENV(neon_uqshl_h, uint16_t)
164
+NEON_GVEC_VOP2i_ENV(neon_uqshli_h, uint16_t)
165
#undef NEON_FN
166
167
#define NEON_FN(dest, src1, src2) \
168
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc))
169
NEON_GVEC_VOP2_ENV(neon_uqshl_s, uint32_t)
170
+NEON_GVEC_VOP2i_ENV(neon_uqshli_s, uint32_t)
171
#undef NEON_FN
172
173
#define NEON_FN(dest, src1, src2) \
174
(dest = do_uqrshl_d(src1, (int8_t)src2, false, env->vfp.qc))
175
NEON_GVEC_VOP2_ENV(neon_uqshl_d, uint64_t)
176
+NEON_GVEC_VOP2i_ENV(neon_uqshli_d, uint64_t)
177
#undef NEON_FN
178
179
uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift)
180
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift)
181
(dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc))
182
NEON_VOP_ENV(qshl_s8, neon_s8, 4)
183
NEON_GVEC_VOP2_ENV(neon_sqshl_b, int8_t)
184
+NEON_GVEC_VOP2i_ENV(neon_sqshli_b, int8_t)
185
#undef NEON_FN
186
187
#define NEON_FN(dest, src1, src2) \
188
(dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc))
189
NEON_VOP_ENV(qshl_s16, neon_s16, 2)
190
NEON_GVEC_VOP2_ENV(neon_sqshl_h, int16_t)
191
+NEON_GVEC_VOP2i_ENV(neon_sqshli_h, int16_t)
192
#undef NEON_FN
193
194
#define NEON_FN(dest, src1, src2) \
195
(dest = do_sqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc))
196
NEON_GVEC_VOP2_ENV(neon_sqshl_s, int32_t)
197
+NEON_GVEC_VOP2i_ENV(neon_sqshli_s, int32_t)
198
#undef NEON_FN
199
200
#define NEON_FN(dest, src1, src2) \
201
(dest = do_sqrshl_d(src1, (int8_t)src2, false, env->vfp.qc))
202
NEON_GVEC_VOP2_ENV(neon_sqshl_d, int64_t)
203
+NEON_GVEC_VOP2i_ENV(neon_sqshli_d, int64_t)
204
#undef NEON_FN
205
206
uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
207
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
208
#define NEON_FN(dest, src1, src2) \
209
(dest = do_suqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc))
210
NEON_VOP_ENV(qshlu_s8, neon_s8, 4)
211
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_b, int8_t)
212
#undef NEON_FN
213
214
#define NEON_FN(dest, src1, src2) \
215
(dest = do_suqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc))
216
NEON_VOP_ENV(qshlu_s16, neon_s16, 2)
217
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_h, int16_t)
218
#undef NEON_FN
219
220
uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
221
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
222
return do_suqrshl_d(val, (int8_t)shift, false, env->vfp.qc);
223
}
224
225
+#define NEON_FN(dest, src1, src2) \
226
+ (dest = do_suqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc))
227
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_s, int32_t)
228
+#undef NEON_FN
229
+
230
+#define NEON_FN(dest, src1, src2) \
231
+ (dest = do_suqrshl_d(src1, (int8_t)src2, false, env->vfp.qc))
232
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_d, int64_t)
233
+#undef NEON_FN
234
+
235
#define NEON_FN(dest, src1, src2) \
236
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc))
237
NEON_VOP_ENV(qrshl_u8, neon_u8, 4)
238
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
239
index XXXXXXX..XXXXXXX 100644
240
--- a/target/arm/tcg/translate-neon.c
241
+++ b/target/arm/tcg/translate-neon.c
242
@@ -XXX,XX +XXX,XX @@ DO_2SH(VRSRA_S, gen_gvec_srsra)
243
DO_2SH(VRSRA_U, gen_gvec_ursra)
244
DO_2SH(VSHR_S, gen_gvec_sshr)
245
DO_2SH(VSHR_U, gen_gvec_ushr)
246
-
247
-static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
248
- NeonGenTwo64OpEnvFn *fn)
249
-{
250
- /*
251
- * 2-reg-and-shift operations, size == 3 case, where the
252
- * function needs to be passed tcg_env.
253
- */
254
- TCGv_i64 constimm;
255
- int pass;
256
-
257
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
258
- return false;
259
- }
260
-
261
- /* UNDEF accesses to D16-D31 if they don't exist. */
262
- if (!dc_isar_feature(aa32_simd_r32, s) &&
263
- ((a->vd | a->vm) & 0x10)) {
264
- return false;
265
- }
266
-
267
- if ((a->vm | a->vd) & a->q) {
268
- return false;
269
- }
270
-
271
- if (!vfp_access_check(s)) {
272
- return true;
273
- }
274
-
275
- /*
276
- * To avoid excessive duplication of ops we implement shift
277
- * by immediate using the variable shift operations.
278
- */
279
- constimm = tcg_constant_i64(dup_const(a->size, a->shift));
280
-
281
- for (pass = 0; pass < a->q + 1; pass++) {
282
- TCGv_i64 tmp = tcg_temp_new_i64();
283
-
284
- read_neon_element64(tmp, a->vm, pass, MO_64);
285
- fn(tmp, tcg_env, tmp, constimm);
286
- write_neon_element64(tmp, a->vd, pass, MO_64);
287
- }
288
- return true;
289
-}
290
-
291
-static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
292
- NeonGenTwoOpEnvFn *fn)
293
-{
294
- /*
295
- * 2-reg-and-shift operations, size < 3 case, where the
296
- * helper needs to be passed tcg_env.
297
- */
298
- TCGv_i32 constimm, tmp;
299
- int pass;
300
-
301
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
302
- return false;
303
- }
304
-
305
- /* UNDEF accesses to D16-D31 if they don't exist. */
306
- if (!dc_isar_feature(aa32_simd_r32, s) &&
307
- ((a->vd | a->vm) & 0x10)) {
308
- return false;
309
- }
310
-
311
- if ((a->vm | a->vd) & a->q) {
312
- return false;
313
- }
314
-
315
- if (!vfp_access_check(s)) {
316
- return true;
317
- }
318
-
319
- /*
320
- * To avoid excessive duplication of ops we implement shift
321
- * by immediate using the variable shift operations.
322
- */
323
- constimm = tcg_constant_i32(dup_const(a->size, a->shift));
324
- tmp = tcg_temp_new_i32();
325
-
326
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
327
- read_neon_element32(tmp, a->vm, pass, MO_32);
328
- fn(tmp, tcg_env, tmp, constimm);
329
- write_neon_element32(tmp, a->vd, pass, MO_32);
330
- }
331
- return true;
332
-}
333
-
334
-#define DO_2SHIFT_ENV(INSN, FUNC) \
335
- static bool trans_##INSN##_64_2sh(DisasContext *s, arg_2reg_shift *a) \
336
- { \
337
- return do_2shift_env_64(s, a, gen_helper_neon_##FUNC##64); \
338
- } \
339
- static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
340
- { \
341
- static NeonGenTwoOpEnvFn * const fns[] = { \
342
- gen_helper_neon_##FUNC##8, \
343
- gen_helper_neon_##FUNC##16, \
344
- gen_helper_neon_##FUNC##32, \
345
- }; \
346
- assert(a->size < ARRAY_SIZE(fns)); \
347
- return do_2shift_env_32(s, a, fns[a->size]); \
348
- }
349
-
350
-DO_2SHIFT_ENV(VQSHLU, qshlu_s)
351
-DO_2SHIFT_ENV(VQSHL_U, qshl_u)
352
-DO_2SHIFT_ENV(VQSHL_S, qshl_s)
353
+DO_2SH(VQSHLU, gen_neon_sqshlui)
354
+DO_2SH(VQSHL_U, gen_neon_uqshli)
355
+DO_2SH(VQSHL_S, gen_neon_sqshli)
356
357
static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
358
NeonGenTwo64OpFn *shiftfn,
359
--
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The extract2 tcg op performs the same operation
3
Pass float_status not env to match other functions.
4
as the do_ext64 function.
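
A minimal sketch of the equivalence, using the same TCG ops as the hunk
below (the dst/lo/hi/t0/t1 temporaries are illustrative, not names from
the patch):

    /* Extract 64 bits starting 'pos' bits into the concatenation hi:lo ... */
    tcg_gen_extract2_i64(dst, lo, hi, pos);

    /* ... which, for 0 < pos < 64, is the open-coded sequence: */
    tcg_gen_shri_i64(t0, lo, pos);
    tcg_gen_shli_i64(t1, hi, 64 - pos);
    tcg_gen_or_i64(dst, t0, t1);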
5
4
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20241206031952.78776-2-richard.henderson@linaro.org
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20240912024114.1097832-6-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
9
---
12
target/arm/tcg/translate-a64.c | 23 +++--------------------
10
target/arm/tcg/helper-a64.h | 2 +-
13
1 file changed, 3 insertions(+), 20 deletions(-)
11
target/arm/tcg/helper-a64.c | 3 +--
12
target/arm/tcg/translate-a64.c | 2 +-
13
3 files changed, 3 insertions(+), 4 deletions(-)
14
14
15
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/tcg/helper-a64.h
18
+++ b/target/arm/tcg/helper-a64.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(rsqrtsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, fpst)
20
DEF_HELPER_FLAGS_2(frecpx_f64, TCG_CALL_NO_RWG, f64, f64, fpst)
21
DEF_HELPER_FLAGS_2(frecpx_f32, TCG_CALL_NO_RWG, f32, f32, fpst)
22
DEF_HELPER_FLAGS_2(frecpx_f16, TCG_CALL_NO_RWG, f16, f16, fpst)
23
-DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, env)
24
+DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, fpst)
25
DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
26
DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
27
DEF_HELPER_FLAGS_3(advsimd_maxh, TCG_CALL_NO_RWG, f16, f16, f16, fpst)
28
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/tcg/helper-a64.c
31
+++ b/target/arm/tcg/helper-a64.c
32
@@ -XXX,XX +XXX,XX @@ float64 HELPER(frecpx_f64)(float64 a, float_status *fpst)
33
}
34
}
35
36
-float32 HELPER(fcvtx_f64_to_f32)(float64 a, CPUARMState *env)
37
+float32 HELPER(fcvtx_f64_to_f32)(float64 a, float_status *fpst)
38
{
39
float32 r;
40
- float_status *fpst = &env->vfp.fp_status;
41
int old = get_float_rounding_mode(fpst);
42
43
set_float_rounding_mode(float_round_to_odd, fpst);
15
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
44
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
16
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/tcg/translate-a64.c
46
--- a/target/arm/tcg/translate-a64.c
18
+++ b/target/arm/tcg/translate-a64.c
47
+++ b/target/arm/tcg/translate-a64.c
19
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
48
@@ -XXX,XX +XXX,XX @@ static void gen_fcvtxn_sd(TCGv_i64 d, TCGv_i64 n)
20
}
49
* with von Neumann rounding (round to odd)
50
*/
51
TCGv_i32 tmp = tcg_temp_new_i32();
52
- gen_helper_fcvtx_f64_to_f32(tmp, n, tcg_env);
53
+ gen_helper_fcvtx_f64_to_f32(tmp, n, fpstatus_ptr(FPST_FPCR));
54
tcg_gen_extu_i32_i64(d, tmp);
21
}
55
}
22
23
-static void do_ext64(DisasContext *s, TCGv_i64 tcg_left, TCGv_i64 tcg_right,
24
- int pos)
25
-{
26
- /* Extract 64 bits from the middle of two concatenated 64 bit
27
- * vector register slices left:right. The extracted bits start
28
- * at 'pos' bits into the right (least significant) side.
29
- * We return the result in tcg_right, and guarantee not to
30
- * trash tcg_left.
31
- */
32
- TCGv_i64 tcg_tmp = tcg_temp_new_i64();
33
- assert(pos > 0 && pos < 64);
34
-
35
- tcg_gen_shri_i64(tcg_right, tcg_right, pos);
36
- tcg_gen_shli_i64(tcg_tmp, tcg_left, 64 - pos);
37
- tcg_gen_or_i64(tcg_right, tcg_right, tcg_tmp);
38
-}
39
-
40
/* EXT
41
* 31 30 29 24 23 22 21 20 16 15 14 11 10 9 5 4 0
42
* +---+---+-------------+-----+---+------+---+------+---+------+------+
43
@@ -XXX,XX +XXX,XX @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
44
read_vec_element(s, tcg_resl, rn, 0, MO_64);
45
if (pos != 0) {
46
read_vec_element(s, tcg_resh, rm, 0, MO_64);
47
- do_ext64(s, tcg_resh, tcg_resl, pos);
48
+ tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
49
}
50
} else {
51
TCGv_i64 tcg_hh;
52
@@ -XXX,XX +XXX,XX @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
53
read_vec_element(s, tcg_resh, elt->reg, elt->elt, MO_64);
54
elt++;
55
if (pos != 0) {
56
- do_ext64(s, tcg_resh, tcg_resl, pos);
57
+ tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
58
tcg_hh = tcg_temp_new_i64();
59
read_vec_element(s, tcg_hh, elt->reg, elt->elt, MO_64);
60
- do_ext64(s, tcg_hh, tcg_resh, pos);
61
+ tcg_gen_extract2_i64(tcg_resh, tcg_resh, tcg_hh, pos);
62
}
63
}
64
56
65
--
57
--
66
2.34.1
58
2.34.1
67
59
68
60
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
While these functions really do return a 32-bit value,
3
Pass float_status not env to match other functions.
4
widening the return type means that we need do less
5
marshalling between TCG types.
6
7
Remove NeonGenNarrowEnvFn typedef; add NeonGenOne64OpEnvFn.
8
4
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Message-id: 20240912024114.1097832-27-richard.henderson@linaro.org
7
Message-id: 20241206031952.78776-3-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
9
---
14
target/arm/helper.h | 22 ++++++------
10
target/arm/helper.h | 4 ++--
15
target/arm/tcg/translate.h | 2 +-
11
target/arm/tcg/translate-a64.c | 15 ++++++++++-----
16
target/arm/tcg/neon_helper.c | 43 ++++++++++++++---------
12
target/arm/tcg/translate-vfp.c | 4 ++--
17
target/arm/tcg/translate-a64.c | 60 ++++++++++++++++++---------------
13
target/arm/vfp_helper.c | 8 ++++----
18
target/arm/tcg/translate-neon.c | 44 ++++++++++++------------
14
4 files changed, 18 insertions(+), 13 deletions(-)
19
5 files changed, 93 insertions(+), 78 deletions(-)
20
15
21
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
22
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/helper.h
18
--- a/target/arm/helper.h
24
+++ b/target/arm/helper.h
19
+++ b/target/arm/helper.h
25
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(neon_qrdmulh_s32, i32, env, i32, i32)
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_cmpeh, void, f16, f16, env)
26
DEF_HELPER_4(neon_qrdmlah_s32, i32, env, s32, s32, s32)
21
DEF_HELPER_3(vfp_cmpes, void, f32, f32, env)
27
DEF_HELPER_4(neon_qrdmlsh_s32, i32, env, s32, s32, s32)
22
DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
28
23
29
-DEF_HELPER_1(neon_narrow_u8, i32, i64)
24
-DEF_HELPER_2(vfp_fcvtds, f64, f32, env)
30
-DEF_HELPER_1(neon_narrow_u16, i32, i64)
25
-DEF_HELPER_2(vfp_fcvtsd, f32, f64, env)
31
-DEF_HELPER_2(neon_unarrow_sat8, i32, env, i64)
26
+DEF_HELPER_2(vfp_fcvtds, f64, f32, fpst)
32
-DEF_HELPER_2(neon_narrow_sat_u8, i32, env, i64)
27
+DEF_HELPER_2(vfp_fcvtsd, f32, f64, fpst)
33
-DEF_HELPER_2(neon_narrow_sat_s8, i32, env, i64)
28
DEF_HELPER_FLAGS_2(bfcvt, TCG_CALL_NO_RWG, i32, f32, fpst)
34
-DEF_HELPER_2(neon_unarrow_sat16, i32, env, i64)
29
DEF_HELPER_FLAGS_2(bfcvt_pair, TCG_CALL_NO_RWG, i32, i64, fpst)
35
-DEF_HELPER_2(neon_narrow_sat_u16, i32, env, i64)
30
36
-DEF_HELPER_2(neon_narrow_sat_s16, i32, env, i64)
37
-DEF_HELPER_2(neon_unarrow_sat32, i32, env, i64)
38
-DEF_HELPER_2(neon_narrow_sat_u32, i32, env, i64)
39
-DEF_HELPER_2(neon_narrow_sat_s32, i32, env, i64)
40
+DEF_HELPER_1(neon_narrow_u8, i64, i64)
41
+DEF_HELPER_1(neon_narrow_u16, i64, i64)
42
+DEF_HELPER_2(neon_unarrow_sat8, i64, env, i64)
43
+DEF_HELPER_2(neon_narrow_sat_u8, i64, env, i64)
44
+DEF_HELPER_2(neon_narrow_sat_s8, i64, env, i64)
45
+DEF_HELPER_2(neon_unarrow_sat16, i64, env, i64)
46
+DEF_HELPER_2(neon_narrow_sat_u16, i64, env, i64)
47
+DEF_HELPER_2(neon_narrow_sat_s16, i64, env, i64)
48
+DEF_HELPER_2(neon_unarrow_sat32, i64, env, i64)
49
+DEF_HELPER_2(neon_narrow_sat_u32, i64, env, i64)
50
+DEF_HELPER_2(neon_narrow_sat_s32, i64, env, i64)
51
DEF_HELPER_1(neon_narrow_high_u8, i32, i64)
52
DEF_HELPER_1(neon_narrow_high_u16, i32, i64)
53
DEF_HELPER_1(neon_narrow_round_high_u8, i32, i64)
54
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
55
index XXXXXXX..XXXXXXX 100644
56
--- a/target/arm/tcg/translate.h
57
+++ b/target/arm/tcg/translate.h
58
@@ -XXX,XX +XXX,XX @@ typedef void NeonGenThreeOpEnvFn(TCGv_i32, TCGv_env, TCGv_i32,
59
typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
60
typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
61
typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
62
-typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
63
typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
64
typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
65
typedef void NeonGenOneSingleOpFn(TCGv_i32, TCGv_i32, TCGv_ptr);
66
typedef void NeonGenTwoSingleOpFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
67
typedef void NeonGenTwoDoubleOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
68
typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
69
+typedef void NeonGenOne64OpEnvFn(TCGv_i64, TCGv_env, TCGv_i64);
70
typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
71
typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
72
typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
73
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/target/arm/tcg/neon_helper.c
76
+++ b/target/arm/tcg/neon_helper.c
77
@@ -XXX,XX +XXX,XX @@ NEON_VOP_ENV(qrdmulh_s32, neon_s32, 1)
78
#undef NEON_FN
79
#undef NEON_QDMULH32
80
81
-uint32_t HELPER(neon_narrow_u8)(uint64_t x)
82
+/* Only the low 32-bits of output are significant. */
83
+uint64_t HELPER(neon_narrow_u8)(uint64_t x)
84
{
85
return (x & 0xffu) | ((x >> 8) & 0xff00u) | ((x >> 16) & 0xff0000u)
86
| ((x >> 24) & 0xff000000u);
87
}
88
89
-uint32_t HELPER(neon_narrow_u16)(uint64_t x)
90
+/* Only the low 32-bits of output are significant. */
91
+uint64_t HELPER(neon_narrow_u16)(uint64_t x)
92
{
93
return (x & 0xffffu) | ((x >> 16) & 0xffff0000u);
94
}
95
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_round_high_u16)(uint64_t x)
96
return ((x >> 16) & 0xffff) | ((x >> 32) & 0xffff0000);
97
}
98
99
-uint32_t HELPER(neon_unarrow_sat8)(CPUARMState *env, uint64_t x)
100
+/* Only the low 32-bits of output are significant. */
101
+uint64_t HELPER(neon_unarrow_sat8)(CPUARMState *env, uint64_t x)
102
{
103
uint16_t s;
104
uint8_t d;
105
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_unarrow_sat8)(CPUARMState *env, uint64_t x)
106
return res;
107
}
108
109
-uint32_t HELPER(neon_narrow_sat_u8)(CPUARMState *env, uint64_t x)
110
+/* Only the low 32-bits of output are significant. */
111
+uint64_t HELPER(neon_narrow_sat_u8)(CPUARMState *env, uint64_t x)
112
{
113
uint16_t s;
114
uint8_t d;
115
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_u8)(CPUARMState *env, uint64_t x)
116
return res;
117
}
118
119
-uint32_t HELPER(neon_narrow_sat_s8)(CPUARMState *env, uint64_t x)
120
+/* Only the low 32-bits of output are significant. */
121
+uint64_t HELPER(neon_narrow_sat_s8)(CPUARMState *env, uint64_t x)
122
{
123
int16_t s;
124
uint8_t d;
125
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_s8)(CPUARMState *env, uint64_t x)
126
return res;
127
}
128
129
-uint32_t HELPER(neon_unarrow_sat16)(CPUARMState *env, uint64_t x)
130
+/* Only the low 32-bits of output are significant. */
131
+uint64_t HELPER(neon_unarrow_sat16)(CPUARMState *env, uint64_t x)
132
{
133
uint32_t high;
134
uint32_t low;
135
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_unarrow_sat16)(CPUARMState *env, uint64_t x)
136
high = 0xffff;
137
SET_QC();
138
}
139
- return low | (high << 16);
140
+ return deposit32(low, 16, 16, high);
141
}
142
143
-uint32_t HELPER(neon_narrow_sat_u16)(CPUARMState *env, uint64_t x)
144
+/* Only the low 32-bits of output are significant. */
145
+uint64_t HELPER(neon_narrow_sat_u16)(CPUARMState *env, uint64_t x)
146
{
147
uint32_t high;
148
uint32_t low;
149
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_u16)(CPUARMState *env, uint64_t x)
150
high = 0xffff;
151
SET_QC();
152
}
153
- return low | (high << 16);
154
+ return deposit32(low, 16, 16, high);
155
}
156
157
-uint32_t HELPER(neon_narrow_sat_s16)(CPUARMState *env, uint64_t x)
158
+/* Only the low 32-bits of output are significant. */
159
+uint64_t HELPER(neon_narrow_sat_s16)(CPUARMState *env, uint64_t x)
160
{
161
int32_t low;
162
int32_t high;
163
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_s16)(CPUARMState *env, uint64_t x)
164
high = (high >> 31) ^ 0x7fff;
165
SET_QC();
166
}
167
- return (uint16_t)low | (high << 16);
168
+ return deposit32(low, 16, 16, high);
169
}
170
171
-uint32_t HELPER(neon_unarrow_sat32)(CPUARMState *env, uint64_t x)
172
+/* Only the low 32-bits of output are significant. */
173
+uint64_t HELPER(neon_unarrow_sat32)(CPUARMState *env, uint64_t x)
174
{
175
if (x & 0x8000000000000000ull) {
176
SET_QC();
177
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_unarrow_sat32)(CPUARMState *env, uint64_t x)
178
return x;
179
}
180
181
-uint32_t HELPER(neon_narrow_sat_u32)(CPUARMState *env, uint64_t x)
182
+/* Only the low 32-bits of output are significant. */
183
+uint64_t HELPER(neon_narrow_sat_u32)(CPUARMState *env, uint64_t x)
184
{
185
if (x > 0xffffffffu) {
186
SET_QC();
187
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_u32)(CPUARMState *env, uint64_t x)
188
return x;
189
}
190
191
-uint32_t HELPER(neon_narrow_sat_s32)(CPUARMState *env, uint64_t x)
192
+/* Only the low 32-bits of output are significant. */
193
+uint64_t HELPER(neon_narrow_sat_s32)(CPUARMState *env, uint64_t x)
194
{
195
if ((int64_t)x != (int32_t)x) {
196
SET_QC();
197
- return ((int64_t)x >> 63) ^ 0x7fffffff;
198
+ return (uint32_t)((int64_t)x >> 63) ^ 0x7fffffff;
199
}
200
- return x;
201
+ return (uint32_t)x;
202
}
203
204
uint64_t HELPER(neon_widen_u8)(uint32_t x)
205
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
31
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
206
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
207
--- a/target/arm/tcg/translate-a64.c
33
--- a/target/arm/tcg/translate-a64.c
208
+++ b/target/arm/tcg/translate-a64.c
34
+++ b/target/arm/tcg/translate-a64.c
209
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
35
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVT_s_ds(DisasContext *s, arg_rr *a)
210
int elements = is_scalar ? 1 : (64 / esize);
36
if (fp_access_check(s)) {
211
bool round = extract32(opcode, 0, 1);
37
TCGv_i32 tcg_rn = read_fp_sreg(s, a->rn);
212
MemOp ldop = (size + 1) | (is_u_shift ? 0 : MO_SIGN);
38
TCGv_i64 tcg_rd = tcg_temp_new_i64();
213
- TCGv_i64 tcg_rn, tcg_rd;
39
+ TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
214
- TCGv_i32 tcg_rd_narrowed;
40
215
- TCGv_i64 tcg_final;
41
- gen_helper_vfp_fcvtds(tcg_rd, tcg_rn, tcg_env);
216
+ TCGv_i64 tcg_rn, tcg_rd, tcg_final;
42
+ gen_helper_vfp_fcvtds(tcg_rd, tcg_rn, fpst);
217
43
write_fp_dreg(s, a->rd, tcg_rd);
218
- static NeonGenNarrowEnvFn * const signed_narrow_fns[4][2] = {
44
}
219
+ static NeonGenOne64OpEnvFn * const signed_narrow_fns[4][2] = {
45
return true;
220
{ gen_helper_neon_narrow_sat_s8,
46
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVT_s_sd(DisasContext *s, arg_rr *a)
221
gen_helper_neon_unarrow_sat8 },
47
if (fp_access_check(s)) {
222
{ gen_helper_neon_narrow_sat_s16,
48
TCGv_i64 tcg_rn = read_fp_dreg(s, a->rn);
223
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
49
TCGv_i32 tcg_rd = tcg_temp_new_i32();
224
gen_helper_neon_unarrow_sat32 },
50
+ TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
225
{ NULL, NULL },
51
226
};
52
- gen_helper_vfp_fcvtsd(tcg_rd, tcg_rn, tcg_env);
227
- static NeonGenNarrowEnvFn * const unsigned_narrow_fns[4] = {
53
+ gen_helper_vfp_fcvtsd(tcg_rd, tcg_rn, fpst);
228
+ static NeonGenOne64OpEnvFn * const unsigned_narrow_fns[4] = {
54
write_fp_sreg(s, a->rd, tcg_rd);
229
gen_helper_neon_narrow_sat_u8,
55
}
230
gen_helper_neon_narrow_sat_u16,
56
return true;
231
gen_helper_neon_narrow_sat_u32,
57
@@ -XXX,XX +XXX,XX @@ static void gen_fcvtn_hs(TCGv_i64 d, TCGv_i64 n)
232
NULL
58
static void gen_fcvtn_sd(TCGv_i64 d, TCGv_i64 n)
233
};
59
{
234
- NeonGenNarrowEnvFn *narrowfn;
60
TCGv_i32 tmp = tcg_temp_new_i32();
235
+ NeonGenOne64OpEnvFn *narrowfn;
61
- gen_helper_vfp_fcvtsd(tmp, n, tcg_env);
236
62
+ TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
237
int i;
63
+
238
64
+ gen_helper_vfp_fcvtsd(tmp, n, fpst);
239
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
65
tcg_gen_extu_i32_i64(d, tmp);
240
66
}
241
tcg_rn = tcg_temp_new_i64();
67
242
tcg_rd = tcg_temp_new_i64();
68
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVTL_v(DisasContext *s, arg_qrr_e *a)
243
- tcg_rd_narrowed = tcg_temp_new_i32();
69
* The only instruction like this is FCVTL.
244
tcg_final = tcg_temp_new_i64();
245
246
for (i = 0; i < elements; i++) {
247
read_vec_element(s, tcg_rn, rn, i, ldop);
248
handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
249
false, is_u_shift, size+1, shift);
250
- narrowfn(tcg_rd_narrowed, tcg_env, tcg_rd);
251
- tcg_gen_extu_i32_i64(tcg_rd, tcg_rd_narrowed);
252
+ narrowfn(tcg_rd, tcg_env, tcg_rd);
253
if (i == 0) {
254
tcg_gen_extract_i64(tcg_final, tcg_rd, 0, esize);
255
} else {
256
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
257
* in the source becomes a size element in the destination).
258
*/
70
*/
259
int pass;
71
int pass;
260
- TCGv_i32 tcg_res[2];
72
+ TCGv_ptr fpst;
261
+ TCGv_i64 tcg_res[2];
73
262
int destelt = is_q ? 2 : 0;
74
if (!fp_access_check(s)) {
263
int passes = scalar ? 1 : 2;
75
return true;
264
265
if (scalar) {
266
- tcg_res[1] = tcg_constant_i32(0);
267
+ tcg_res[1] = tcg_constant_i64(0);
268
}
76
}
269
77
270
for (pass = 0; pass < passes; pass++) {
78
+ fpst = fpstatus_ptr(FPST_FPCR);
271
TCGv_i64 tcg_op = tcg_temp_new_i64();
79
if (a->esz == MO_64) {
272
- NeonGenNarrowFn *genfn = NULL;
80
/* 32 -> 64 bit fp conversion */
273
- NeonGenNarrowEnvFn *genenvfn = NULL;
81
TCGv_i64 tcg_res[2];
274
+ NeonGenOne64OpFn *genfn = NULL;
82
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVTL_v(DisasContext *s, arg_qrr_e *a)
275
+ NeonGenOne64OpEnvFn *genenvfn = NULL;
83
for (pass = 0; pass < 2; pass++) {
276
84
tcg_res[pass] = tcg_temp_new_i64();
277
if (scalar) {
85
read_vec_element_i32(s, tcg_op, a->rn, srcelt + pass, MO_32);
278
read_vec_element(s, tcg_op, rn, pass, size + 1);
86
- gen_helper_vfp_fcvtds(tcg_res[pass], tcg_op, tcg_env);
279
} else {
87
+ gen_helper_vfp_fcvtds(tcg_res[pass], tcg_op, fpst);
280
read_vec_element(s, tcg_op, rn, pass, MO_64);
281
}
88
}
282
- tcg_res[pass] = tcg_temp_new_i32();
89
for (pass = 0; pass < 2; pass++) {
283
+ tcg_res[pass] = tcg_temp_new_i64();
90
write_vec_element(s, tcg_res[pass], a->rd, pass, MO_64);
284
91
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVTL_v(DisasContext *s, arg_qrr_e *a)
285
switch (opcode) {
92
/* 16 -> 32 bit fp conversion */
286
case 0x12: /* XTN, SQXTUN */
93
int srcelt = a->q ? 4 : 0;
287
{
94
TCGv_i32 tcg_res[4];
288
- static NeonGenNarrowFn * const xtnfns[3] = {
95
- TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
289
+ static NeonGenOne64OpFn * const xtnfns[3] = {
96
TCGv_i32 ahp = get_ahp_flag();
290
gen_helper_neon_narrow_u8,
97
291
gen_helper_neon_narrow_u16,
98
for (pass = 0; pass < 4; pass++) {
292
- tcg_gen_extrl_i64_i32,
99
diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c
293
+ tcg_gen_ext32u_i64,
294
};
295
- static NeonGenNarrowEnvFn * const sqxtunfns[3] = {
296
+ static NeonGenOne64OpEnvFn * const sqxtunfns[3] = {
297
gen_helper_neon_unarrow_sat8,
298
gen_helper_neon_unarrow_sat16,
299
gen_helper_neon_unarrow_sat32,
300
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
301
}
302
case 0x14: /* SQXTN, UQXTN */
303
{
304
- static NeonGenNarrowEnvFn * const fns[3][2] = {
305
+ static NeonGenOne64OpEnvFn * const fns[3][2] = {
306
{ gen_helper_neon_narrow_sat_s8,
307
gen_helper_neon_narrow_sat_u8 },
308
{ gen_helper_neon_narrow_sat_s16,
309
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
310
case 0x16: /* FCVTN, FCVTN2 */
311
/* 32 bit to 16 bit or 64 bit to 32 bit float conversion */
312
if (size == 2) {
313
- gen_helper_vfp_fcvtsd(tcg_res[pass], tcg_op, tcg_env);
314
+ TCGv_i32 tmp = tcg_temp_new_i32();
315
+ gen_helper_vfp_fcvtsd(tmp, tcg_op, tcg_env);
316
+ tcg_gen_extu_i32_i64(tcg_res[pass], tmp);
317
} else {
318
TCGv_i32 tcg_lo = tcg_temp_new_i32();
319
TCGv_i32 tcg_hi = tcg_temp_new_i32();
320
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
321
tcg_gen_extr_i64_i32(tcg_lo, tcg_hi, tcg_op);
322
gen_helper_vfp_fcvt_f32_to_f16(tcg_lo, tcg_lo, fpst, ahp);
323
gen_helper_vfp_fcvt_f32_to_f16(tcg_hi, tcg_hi, fpst, ahp);
324
- tcg_gen_deposit_i32(tcg_res[pass], tcg_lo, tcg_hi, 16, 16);
325
+ tcg_gen_deposit_i32(tcg_lo, tcg_lo, tcg_hi, 16, 16);
326
+ tcg_gen_extu_i32_i64(tcg_res[pass], tcg_lo);
327
}
328
break;
329
case 0x36: /* BFCVTN, BFCVTN2 */
330
{
331
TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
332
- gen_helper_bfcvt_pair(tcg_res[pass], tcg_op, fpst);
333
+ TCGv_i32 tmp = tcg_temp_new_i32();
334
+ gen_helper_bfcvt_pair(tmp, tcg_op, fpst);
335
+ tcg_gen_extu_i32_i64(tcg_res[pass], tmp);
336
}
337
break;
338
case 0x56: /* FCVTXN, FCVTXN2 */
339
- /* 64 bit to 32 bit float conversion
340
- * with von Neumann rounding (round to odd)
341
- */
342
- assert(size == 2);
343
- gen_helper_fcvtx_f64_to_f32(tcg_res[pass], tcg_op, tcg_env);
344
+ {
345
+ /*
346
+ * 64 bit to 32 bit float conversion
347
+ * with von Neumann rounding (round to odd)
348
+ */
349
+ TCGv_i32 tmp = tcg_temp_new_i32();
350
+ assert(size == 2);
351
+ gen_helper_fcvtx_f64_to_f32(tmp, tcg_op, tcg_env);
352
+ tcg_gen_extu_i32_i64(tcg_res[pass], tmp);
353
+ }
354
break;
355
default:
356
g_assert_not_reached();
357
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
358
}
359
360
for (pass = 0; pass < 2; pass++) {
361
- write_vec_element_i32(s, tcg_res[pass], rd, destelt + pass, MO_32);
362
+ write_vec_element(s, tcg_res[pass], rd, destelt + pass, MO_32);
363
}
364
clear_vec_high(s, is_q, rd);
365
}
366
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
367
index XXXXXXX..XXXXXXX 100644
100
index XXXXXXX..XXXXXXX 100644
368
--- a/target/arm/tcg/translate-neon.c
101
--- a/target/arm/tcg/translate-vfp.c
369
+++ b/target/arm/tcg/translate-neon.c
102
+++ b/target/arm/tcg/translate-vfp.c
370
@@ -XXX,XX +XXX,XX @@ DO_2SH(VQSHL_S, gen_neon_sqshli)
103
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
371
104
vm = tcg_temp_new_i32();
372
static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
105
vd = tcg_temp_new_i64();
373
NeonGenTwo64OpFn *shiftfn,
106
vfp_load_reg32(vm, a->vm);
374
- NeonGenNarrowEnvFn *narrowfn)
107
- gen_helper_vfp_fcvtds(vd, vm, tcg_env);
375
+ NeonGenOne64OpEnvFn *narrowfn)
108
+ gen_helper_vfp_fcvtds(vd, vm, fpstatus_ptr(FPST_FPCR));
376
{
109
vfp_store_reg64(vd, a->vd);
377
/* 2-reg-and-shift narrowing-shift operations, size == 3 case */
378
- TCGv_i64 constimm, rm1, rm2;
379
- TCGv_i32 rd;
380
+ TCGv_i64 constimm, rm1, rm2, rd;
381
382
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
383
return false;
384
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
385
constimm = tcg_constant_i64(-a->shift);
386
rm1 = tcg_temp_new_i64();
387
rm2 = tcg_temp_new_i64();
388
- rd = tcg_temp_new_i32();
389
+ rd = tcg_temp_new_i64();
390
391
/* Load both inputs first to avoid potential overwrite if rm == rd */
392
read_neon_element64(rm1, a->vm, 0, MO_64);
393
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
394
395
shiftfn(rm1, rm1, constimm);
396
narrowfn(rd, tcg_env, rm1);
397
- write_neon_element32(rd, a->vd, 0, MO_32);
398
+ write_neon_element64(rd, a->vd, 0, MO_32);
399
400
shiftfn(rm2, rm2, constimm);
401
narrowfn(rd, tcg_env, rm2);
402
- write_neon_element32(rd, a->vd, 1, MO_32);
403
+ write_neon_element64(rd, a->vd, 1, MO_32);
404
405
return true;
110
return true;
406
}
111
}
407
112
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
408
static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
113
vd = tcg_temp_new_i32();
409
NeonGenTwoOpFn *shiftfn,
114
vm = tcg_temp_new_i64();
410
- NeonGenNarrowEnvFn *narrowfn)
115
vfp_load_reg64(vm, a->vm);
411
+ NeonGenOne64OpEnvFn *narrowfn)
116
- gen_helper_vfp_fcvtsd(vd, vm, tcg_env);
412
{
117
+ gen_helper_vfp_fcvtsd(vd, vm, fpstatus_ptr(FPST_FPCR));
413
/* 2-reg-and-shift narrowing-shift operations, size < 3 case */
118
vfp_store_reg32(vd, a->vd);
414
TCGv_i32 constimm, rm1, rm2, rm3, rm4;
415
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
416
417
tcg_gen_concat_i32_i64(rtmp, rm1, rm2);
418
419
- narrowfn(rm1, tcg_env, rtmp);
420
- write_neon_element32(rm1, a->vd, 0, MO_32);
421
+ narrowfn(rtmp, tcg_env, rtmp);
422
+ write_neon_element64(rtmp, a->vd, 0, MO_32);
423
424
shiftfn(rm3, rm3, constimm);
425
shiftfn(rm4, rm4, constimm);
426
427
tcg_gen_concat_i32_i64(rtmp, rm3, rm4);
428
429
- narrowfn(rm3, tcg_env, rtmp);
430
- write_neon_element32(rm3, a->vd, 1, MO_32);
431
+ narrowfn(rtmp, tcg_env, rtmp);
432
+ write_neon_element64(rtmp, a->vd, 1, MO_32);
433
return true;
119
return true;
434
}
120
}
435
121
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
436
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
122
index XXXXXXX..XXXXXXX 100644
437
return do_2shift_narrow_32(s, a, FUNC, NARROWFUNC); \
123
--- a/target/arm/vfp_helper.c
438
}
124
+++ b/target/arm/vfp_helper.c
439
125
@@ -XXX,XX +XXX,XX @@ FLOAT_CONVS(ui, d, float64, 64, u)
440
-static void gen_neon_narrow_u32(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
126
#undef FLOAT_CONVS
441
+static void gen_neon_narrow_u32(TCGv_i64 dest, TCGv_ptr env, TCGv_i64 src)
127
128
/* floating point conversion */
129
-float64 VFP_HELPER(fcvtd, s)(float32 x, CPUARMState *env)
130
+float64 VFP_HELPER(fcvtd, s)(float32 x, float_status *status)
442
{
131
{
443
- tcg_gen_extrl_i64_i32(dest, src);
132
- return float32_to_float64(x, &env->vfp.fp_status);
444
+ tcg_gen_ext32u_i64(dest, src);
133
+ return float32_to_float64(x, status);
445
}
134
}
446
135
447
-static void gen_neon_narrow_u16(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
136
-float32 VFP_HELPER(fcvts, d)(float64 x, CPUARMState *env)
448
+static void gen_neon_narrow_u16(TCGv_i64 dest, TCGv_ptr env, TCGv_i64 src)
137
+float32 VFP_HELPER(fcvts, d)(float64 x, float_status *status)
449
{
138
{
450
gen_helper_neon_narrow_u16(dest, src);
139
- return float64_to_float32(x, &env->vfp.fp_status);
140
+ return float64_to_float32(x, status);
451
}
141
}
452
142
453
-static void gen_neon_narrow_u8(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
143
uint32_t HELPER(bfcvt)(float32 x, float_status *status)
454
+static void gen_neon_narrow_u8(TCGv_i64 dest, TCGv_ptr env, TCGv_i64 src)
455
{
456
gen_helper_neon_narrow_u8(dest, src);
457
}
458
@@ -XXX,XX +XXX,XX @@ static bool trans_VZIP(DisasContext *s, arg_2misc *a)
459
}
460
461
static bool do_vmovn(DisasContext *s, arg_2misc *a,
462
- NeonGenNarrowEnvFn *narrowfn)
463
+ NeonGenOne64OpEnvFn *narrowfn)
464
{
465
- TCGv_i64 rm;
466
- TCGv_i32 rd0, rd1;
467
+ TCGv_i64 rm, rd0, rd1;
468
469
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
470
return false;
471
@@ -XXX,XX +XXX,XX @@ static bool do_vmovn(DisasContext *s, arg_2misc *a,
472
}
473
474
rm = tcg_temp_new_i64();
475
- rd0 = tcg_temp_new_i32();
476
- rd1 = tcg_temp_new_i32();
477
+ rd0 = tcg_temp_new_i64();
478
+ rd1 = tcg_temp_new_i64();
479
480
read_neon_element64(rm, a->vm, 0, MO_64);
481
narrowfn(rd0, tcg_env, rm);
482
read_neon_element64(rm, a->vm, 1, MO_64);
483
narrowfn(rd1, tcg_env, rm);
484
- write_neon_element32(rd0, a->vd, 0, MO_32);
485
- write_neon_element32(rd1, a->vd, 1, MO_32);
486
+ write_neon_element64(rd0, a->vd, 0, MO_32);
487
+ write_neon_element64(rd1, a->vd, 1, MO_32);
488
return true;
489
}
490
491
#define DO_VMOVN(INSN, FUNC) \
492
static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
493
{ \
494
- static NeonGenNarrowEnvFn * const narrowfn[] = { \
495
+ static NeonGenOne64OpEnvFn * const narrowfn[] = { \
496
FUNC##8, \
497
FUNC##16, \
498
FUNC##32, \
499
--
144
--
500
2.34.1
145
2.34.1
146
147
1
From: Richard Henderson <richard.henderson@linaro.org>
1
FEAT_XS introduces a set of new TLBI maintenance instructions with an
2
"nXS" qualifier. These behave like the stardard ones except that
3
they do not wait for memory accesses with the XS attribute to
4
complete. They have an interaction with the fine-grained-trap
5
handling: the FGT bits that a hypervisor can use to trap TLBI
6
maintenance instructions normally also trap the nXS variants, but the
7
hypervisor can elect to not trap the nXS variants by setting
8
HCRX_EL2.FGTnXS to 1.
2
9
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Add support to our FGT mechanism for these TLBI bits. For each
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
TLBI-trapping FGT bit we define, for example:
5
Message-id: 20240912024114.1097832-29-richard.henderson@linaro.org
12
* FGT_TLBIVAE1 -- the same value we do at present for the
13
normal variant of the insn
14
* FGT_TLBIVAE1NXS -- for the nXS qualified insn; the value of
15
this enum has an NXS bit ORed into it
16
17
In access_check_cp_reg() we can then ignore the trap bit for an
18
access where ri->fgt has the NXS bit set and HCRX_EL2.FGTnXS is 1.
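
In code, the check added to access_check_cp_reg() in the op_helper.c
hunk below boils down to roughly this (sketch using the names from the
patch):

    bool nxs = FIELD_EX32(ri->fgt, FGT, NXS);
    bool trapbit;

    if (nxs && (arm_hcrx_el2_eff(env) & HCRX_FGTNXS)) {
        /* nXS variant with HCRX_EL2.FGTnXS == 1: the FGT does not apply */
        trapbit = 0;
    } else {
        trapbit = extract64(trapword, bitpos, 1);
    }
    if (trapbit != rev) {
        res = CP_ACCESS_TRAP_EL2;
    }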
19
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
22
Message-id: 20241211144440.2700268-2-peter.maydell@linaro.org
7
---
23
---
8
target/arm/tcg/a64.decode | 24 +++++
24
target/arm/cpregs.h | 72 ++++++++++++++++++++++----------------
9
target/arm/tcg/translate-a64.c | 176 ++++++++++++++++++++++++++++++---
25
target/arm/cpu-features.h | 5 +++
10
2 files changed, 186 insertions(+), 14 deletions(-)
26
target/arm/helper.c | 5 ++-
27
target/arm/tcg/op_helper.c | 11 +++++-
28
4 files changed, 61 insertions(+), 32 deletions(-)
11
29
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
30
diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
13
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
32
--- a/target/arm/cpregs.h
15
+++ b/target/arm/tcg/a64.decode
33
+++ b/target/arm/cpregs.h
16
@@ -XXX,XX +XXX,XX @@ SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_h
34
@@ -XXX,XX +XXX,XX @@ FIELD(HDFGWTR_EL2, NBRBCTL, 60, 1)
17
SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_s
35
FIELD(HDFGWTR_EL2, NBRBDATA, 61, 1)
18
SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_d
36
FIELD(HDFGWTR_EL2, NPMSNEVFR_EL1, 62, 1)
19
37
20
+SQSHRN_v 0.00 11110 .... ... 10010 1 ..... ..... @q_shri_b
38
+FIELD(FGT, NXS, 13, 1) /* Honour HCR_EL2.FGTnXS to suppress FGT */
21
+SQSHRN_v 0.00 11110 .... ... 10010 1 ..... ..... @q_shri_h
39
/* Which fine-grained trap bit register to check, if any */
22
+SQSHRN_v 0.00 11110 .... ... 10010 1 ..... ..... @q_shri_s
40
FIELD(FGT, TYPE, 10, 3)
41
FIELD(FGT, REV, 9, 1) /* Is bit sense reversed? */
42
@@ -XXX,XX +XXX,XX @@ FIELD(FGT, BITPOS, 0, 6) /* Bit position within the uint64_t */
43
#define DO_REV_BIT(REG, BITNAME) \
44
FGT_##BITNAME = FGT_##REG | FGT_REV | R_##REG##_EL2_##BITNAME##_SHIFT
45
46
+/*
47
+ * The FGT bits for TLBI maintenance instructions accessible at EL1 always
48
+ * affect the "normal" TLBI insns; they affect the corresponding TLBI insns
49
+ * with the nXS qualifier only if HCRX_EL2.FGTnXS is 0. We define e.g.
50
+ * FGT_TLBIVAE1 to use for the normal insn, and FGT_TLBIVAE1NXS to use
51
+ * for the nXS qualified insn.
52
+ */
53
+#define DO_TLBINXS_BIT(REG, BITNAME) \
54
+ FGT_##BITNAME = FGT_##REG | R_##REG##_EL2_##BITNAME##_SHIFT, \
55
+ FGT_##BITNAME##NXS = FGT_##BITNAME | R_FGT_NXS_MASK
23
+
56
+
24
+UQSHRN_v 0.10 11110 .... ... 10010 1 ..... ..... @q_shri_b
57
typedef enum FGTBit {
25
+UQSHRN_v 0.10 11110 .... ... 10010 1 ..... ..... @q_shri_h
58
/*
26
+UQSHRN_v 0.10 11110 .... ... 10010 1 ..... ..... @q_shri_s
59
* These bits tell us which register arrays to use:
27
+
60
@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
28
+SQSHRUN_v 0.10 11110 .... ... 10000 1 ..... ..... @q_shri_b
61
DO_BIT(HFGITR, ATS1E0W),
29
+SQSHRUN_v 0.10 11110 .... ... 10000 1 ..... ..... @q_shri_h
62
DO_BIT(HFGITR, ATS1E1RP),
30
+SQSHRUN_v 0.10 11110 .... ... 10000 1 ..... ..... @q_shri_s
63
DO_BIT(HFGITR, ATS1E1WP),
31
+
64
- DO_BIT(HFGITR, TLBIVMALLE1OS),
32
+SQRSHRN_v 0.00 11110 .... ... 10011 1 ..... ..... @q_shri_b
65
- DO_BIT(HFGITR, TLBIVAE1OS),
33
+SQRSHRN_v 0.00 11110 .... ... 10011 1 ..... ..... @q_shri_h
66
- DO_BIT(HFGITR, TLBIASIDE1OS),
34
+SQRSHRN_v 0.00 11110 .... ... 10011 1 ..... ..... @q_shri_s
67
- DO_BIT(HFGITR, TLBIVAAE1OS),
35
+
68
- DO_BIT(HFGITR, TLBIVALE1OS),
36
+UQRSHRN_v 0.10 11110 .... ... 10011 1 ..... ..... @q_shri_b
69
- DO_BIT(HFGITR, TLBIVAALE1OS),
37
+UQRSHRN_v 0.10 11110 .... ... 10011 1 ..... ..... @q_shri_h
70
- DO_BIT(HFGITR, TLBIRVAE1OS),
38
+UQRSHRN_v 0.10 11110 .... ... 10011 1 ..... ..... @q_shri_s
71
- DO_BIT(HFGITR, TLBIRVAAE1OS),
39
+
72
- DO_BIT(HFGITR, TLBIRVALE1OS),
40
+SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_b
73
- DO_BIT(HFGITR, TLBIRVAALE1OS),
41
+SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_h
74
- DO_BIT(HFGITR, TLBIVMALLE1IS),
42
+SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_s
75
- DO_BIT(HFGITR, TLBIVAE1IS),
43
+
76
- DO_BIT(HFGITR, TLBIASIDE1IS),
44
# Advanced SIMD scalar shift by immediate
77
- DO_BIT(HFGITR, TLBIVAAE1IS),
45
78
- DO_BIT(HFGITR, TLBIVALE1IS),
46
@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
79
- DO_BIT(HFGITR, TLBIVAALE1IS),
47
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
80
- DO_BIT(HFGITR, TLBIRVAE1IS),
81
- DO_BIT(HFGITR, TLBIRVAAE1IS),
82
- DO_BIT(HFGITR, TLBIRVALE1IS),
83
- DO_BIT(HFGITR, TLBIRVAALE1IS),
84
- DO_BIT(HFGITR, TLBIRVAE1),
85
- DO_BIT(HFGITR, TLBIRVAAE1),
86
- DO_BIT(HFGITR, TLBIRVALE1),
87
- DO_BIT(HFGITR, TLBIRVAALE1),
88
- DO_BIT(HFGITR, TLBIVMALLE1),
89
- DO_BIT(HFGITR, TLBIVAE1),
90
- DO_BIT(HFGITR, TLBIASIDE1),
91
- DO_BIT(HFGITR, TLBIVAAE1),
92
- DO_BIT(HFGITR, TLBIVALE1),
93
- DO_BIT(HFGITR, TLBIVAALE1),
94
+ DO_TLBINXS_BIT(HFGITR, TLBIVMALLE1OS),
95
+ DO_TLBINXS_BIT(HFGITR, TLBIVAE1OS),
96
+ DO_TLBINXS_BIT(HFGITR, TLBIASIDE1OS),
97
+ DO_TLBINXS_BIT(HFGITR, TLBIVAAE1OS),
98
+ DO_TLBINXS_BIT(HFGITR, TLBIVALE1OS),
99
+ DO_TLBINXS_BIT(HFGITR, TLBIVAALE1OS),
100
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAE1OS),
101
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAAE1OS),
102
+ DO_TLBINXS_BIT(HFGITR, TLBIRVALE1OS),
103
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAALE1OS),
104
+ DO_TLBINXS_BIT(HFGITR, TLBIVMALLE1IS),
105
+ DO_TLBINXS_BIT(HFGITR, TLBIVAE1IS),
106
+ DO_TLBINXS_BIT(HFGITR, TLBIASIDE1IS),
107
+ DO_TLBINXS_BIT(HFGITR, TLBIVAAE1IS),
108
+ DO_TLBINXS_BIT(HFGITR, TLBIVALE1IS),
109
+ DO_TLBINXS_BIT(HFGITR, TLBIVAALE1IS),
110
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAE1IS),
111
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAAE1IS),
112
+ DO_TLBINXS_BIT(HFGITR, TLBIRVALE1IS),
113
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAALE1IS),
114
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAE1),
115
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAAE1),
116
+ DO_TLBINXS_BIT(HFGITR, TLBIRVALE1),
117
+ DO_TLBINXS_BIT(HFGITR, TLBIRVAALE1),
118
+ DO_TLBINXS_BIT(HFGITR, TLBIVMALLE1),
119
+ DO_TLBINXS_BIT(HFGITR, TLBIVAE1),
120
+ DO_TLBINXS_BIT(HFGITR, TLBIASIDE1),
121
+ DO_TLBINXS_BIT(HFGITR, TLBIVAAE1),
122
+ DO_TLBINXS_BIT(HFGITR, TLBIVALE1),
123
+ DO_TLBINXS_BIT(HFGITR, TLBIVAALE1),
124
DO_BIT(HFGITR, CFPRCTX),
125
DO_BIT(HFGITR, DVPRCTX),
126
DO_BIT(HFGITR, CPPRCTX),
127
diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
48
index XXXXXXX..XXXXXXX 100644
128
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/tcg/translate-a64.c
129
--- a/target/arm/cpu-features.h
50
+++ b/target/arm/tcg/translate-a64.c
130
+++ b/target/arm/cpu-features.h
51
@@ -XXX,XX +XXX,XX @@ static bool do_vec_shift_imm_narrow(DisasContext *s, arg_qrri_e *a,
131
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
52
return true;
132
return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
53
}
133
}
54
134
55
+static void gen_sqshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
135
+static inline bool isar_feature_aa64_xs(const ARMISARegisters *id)
56
+{
136
+{
57
+ tcg_gen_sari_i64(d, s, i);
137
+ return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, XS) != 0;
58
+ tcg_gen_ext16u_i64(d, d);
59
+ gen_helper_neon_narrow_sat_s8(d, tcg_env, d);
60
+}
138
+}
61
+
139
+
62
+static void gen_sqshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
63
+{
64
+ tcg_gen_sari_i64(d, s, i);
65
+ tcg_gen_ext32u_i64(d, d);
66
+ gen_helper_neon_narrow_sat_s16(d, tcg_env, d);
67
+}
68
+
69
+static void gen_sqshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
70
+{
71
+ gen_sshr_d(d, s, i);
72
+ gen_helper_neon_narrow_sat_s32(d, tcg_env, d);
73
+}
74
+
75
+static void gen_uqshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
76
+{
77
+ tcg_gen_shri_i64(d, s, i);
78
+ gen_helper_neon_narrow_sat_u8(d, tcg_env, d);
79
+}
80
+
81
+static void gen_uqshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
82
+{
83
+ tcg_gen_shri_i64(d, s, i);
84
+ gen_helper_neon_narrow_sat_u16(d, tcg_env, d);
85
+}
86
+
87
+static void gen_uqshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
88
+{
89
+ gen_ushr_d(d, s, i);
90
+ gen_helper_neon_narrow_sat_u32(d, tcg_env, d);
91
+}
92
+
93
+static void gen_sqshrun_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
94
+{
95
+ tcg_gen_sari_i64(d, s, i);
96
+ tcg_gen_ext16u_i64(d, d);
97
+ gen_helper_neon_unarrow_sat8(d, tcg_env, d);
98
+}
99
+
100
+static void gen_sqshrun_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
101
+{
102
+ tcg_gen_sari_i64(d, s, i);
103
+ tcg_gen_ext32u_i64(d, d);
104
+ gen_helper_neon_unarrow_sat16(d, tcg_env, d);
105
+}
106
+
107
+static void gen_sqshrun_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
108
+{
109
+ gen_sshr_d(d, s, i);
110
+ gen_helper_neon_unarrow_sat32(d, tcg_env, d);
111
+}
112
+
113
+static void gen_sqrshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
114
+{
115
+ gen_srshr_bhs(d, s, i);
116
+ tcg_gen_ext16u_i64(d, d);
117
+ gen_helper_neon_narrow_sat_s8(d, tcg_env, d);
118
+}
119
+
120
+static void gen_sqrshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
121
+{
122
+ gen_srshr_bhs(d, s, i);
123
+ tcg_gen_ext32u_i64(d, d);
124
+ gen_helper_neon_narrow_sat_s16(d, tcg_env, d);
125
+}
126
+
127
+static void gen_sqrshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
128
+{
129
+ gen_srshr_d(d, s, i);
130
+ gen_helper_neon_narrow_sat_s32(d, tcg_env, d);
131
+}
132
+
133
+static void gen_uqrshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
134
+{
135
+ gen_urshr_bhs(d, s, i);
136
+ gen_helper_neon_narrow_sat_u8(d, tcg_env, d);
137
+}
138
+
139
+static void gen_uqrshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
140
+{
141
+ gen_urshr_bhs(d, s, i);
142
+ gen_helper_neon_narrow_sat_u16(d, tcg_env, d);
143
+}
144
+
145
+static void gen_uqrshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
146
+{
147
+ gen_urshr_d(d, s, i);
148
+ gen_helper_neon_narrow_sat_u32(d, tcg_env, d);
149
+}
150
+
151
+static void gen_sqrshrun_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
152
+{
153
+ gen_srshr_bhs(d, s, i);
154
+ tcg_gen_ext16u_i64(d, d);
155
+ gen_helper_neon_unarrow_sat8(d, tcg_env, d);
156
+}
157
+
158
+static void gen_sqrshrun_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
159
+{
160
+ gen_srshr_bhs(d, s, i);
161
+ tcg_gen_ext32u_i64(d, d);
162
+ gen_helper_neon_unarrow_sat16(d, tcg_env, d);
163
+}
164
+
165
+static void gen_sqrshrun_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
166
+{
167
+ gen_srshr_d(d, s, i);
168
+ gen_helper_neon_unarrow_sat32(d, tcg_env, d);
169
+}
170
+
171
static WideShiftImmFn * const shrn_fns[] = {
172
tcg_gen_shri_i64,
173
tcg_gen_shri_i64,
174
@@ -XXX,XX +XXX,XX @@ static WideShiftImmFn * const rshrn_fns[] = {
175
};
176
TRANS(RSHRN_v, do_vec_shift_imm_narrow, a, rshrn_fns, 0)
177
178
+static WideShiftImmFn * const sqshrn_fns[] = {
179
+ gen_sqshrn_b,
180
+ gen_sqshrn_h,
181
+ gen_sqshrn_s,
182
+};
183
+TRANS(SQSHRN_v, do_vec_shift_imm_narrow, a, sqshrn_fns, MO_SIGN)
184
+
185
+static WideShiftImmFn * const uqshrn_fns[] = {
186
+ gen_uqshrn_b,
187
+ gen_uqshrn_h,
188
+ gen_uqshrn_s,
189
+};
190
+TRANS(UQSHRN_v, do_vec_shift_imm_narrow, a, uqshrn_fns, 0)
191
+
192
+static WideShiftImmFn * const sqshrun_fns[] = {
193
+ gen_sqshrun_b,
194
+ gen_sqshrun_h,
195
+ gen_sqshrun_s,
196
+};
197
+TRANS(SQSHRUN_v, do_vec_shift_imm_narrow, a, sqshrun_fns, MO_SIGN)
198
+
199
+static WideShiftImmFn * const sqrshrn_fns[] = {
200
+ gen_sqrshrn_b,
201
+ gen_sqrshrn_h,
202
+ gen_sqrshrn_s,
203
+};
204
+TRANS(SQRSHRN_v, do_vec_shift_imm_narrow, a, sqrshrn_fns, MO_SIGN)
205
+
206
+static WideShiftImmFn * const uqrshrn_fns[] = {
207
+ gen_uqrshrn_b,
208
+ gen_uqrshrn_h,
209
+ gen_uqrshrn_s,
210
+};
211
+TRANS(UQRSHRN_v, do_vec_shift_imm_narrow, a, uqrshrn_fns, 0)
212
+
213
+static WideShiftImmFn * const sqrshrun_fns[] = {
214
+ gen_sqrshrun_b,
215
+ gen_sqrshrun_h,
216
+ gen_sqrshrun_s,
217
+};
218
+TRANS(SQRSHRUN_v, do_vec_shift_imm_narrow, a, sqrshrun_fns, MO_SIGN)
219
+
220
/*
140
/*
221
* Advanced SIMD Scalar Shift by Immediate
141
* These are the values from APA/API/APA3.
222
*/
142
* In general these must be compared '>=', per the normal Arm ARM
223
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
143
diff --git a/target/arm/helper.c b/target/arm/helper.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/target/arm/helper.c
146
+++ b/target/arm/helper.c
147
@@ -XXX,XX +XXX,XX @@ static void hcrx_write(CPUARMState *env, const ARMCPRegInfo *ri,
148
valid_mask |= HCRX_TALLINT | HCRX_VINMI | HCRX_VFNMI;
224
}
149
}
225
150
/* FEAT_CMOW adds CMOW */
226
switch (opcode) {
151
-
227
- case 0x10: /* SHRN / SQSHRUN */
152
if (cpu_isar_feature(aa64_cmow, cpu)) {
228
- case 0x11: /* RSHRN / SQRSHRUN */
153
valid_mask |= HCRX_CMOW;
229
- if (is_u) {
154
}
230
- handle_vec_simd_sqshrn(s, false, is_q, false, true, immh, immb,
155
+ /* FEAT_XS adds FGTnXS, FnXS */
231
- opcode, rn, rd);
156
+ if (cpu_isar_feature(aa64_xs, cpu)) {
232
- } else {
157
+ valid_mask |= HCRX_FGTNXS | HCRX_FNXS;
233
- unallocated_encoding(s);
158
+ }
234
- }
159
235
- break;
160
/* Clear RES0 bits. */
236
- case 0x12: /* SQSHRN / UQSHRN */
161
env->cp15.hcrx_el2 = value & valid_mask;
237
- case 0x13: /* SQRSHRN / UQRSHRN */
162
diff --git a/target/arm/tcg/op_helper.c b/target/arm/tcg/op_helper.c
238
- handle_vec_simd_sqshrn(s, false, is_q, is_u, is_u, immh, immb,
163
index XXXXXXX..XXXXXXX 100644
239
- opcode, rn, rd);
164
--- a/target/arm/tcg/op_helper.c
240
- break;
165
+++ b/target/arm/tcg/op_helper.c
241
case 0x1c: /* SCVTF / UCVTF */
166
@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
242
handle_simd_shift_intfp_conv(s, false, is_q, is_u, immh, immb,
167
unsigned int idx = FIELD_EX32(ri->fgt, FGT, IDX);
243
opcode, rn, rd);
168
unsigned int bitpos = FIELD_EX32(ri->fgt, FGT, BITPOS);
244
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
169
bool rev = FIELD_EX32(ri->fgt, FGT, REV);
245
case 0x0a: /* SHL / SLI */
170
+ bool nxs = FIELD_EX32(ri->fgt, FGT, NXS);
246
case 0x0c: /* SQSHLU */
171
bool trapbit;
247
case 0x0e: /* SQSHL, UQSHL */
172
248
+ case 0x10: /* SHRN / SQSHRUN */
173
if (ri->fgt & FGT_EXEC) {
249
+ case 0x11: /* RSHRN / SQRSHRUN */
174
@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
250
+ case 0x12: /* SQSHRN / UQSHRN */
175
trapword = env->cp15.fgt_write[idx];
251
+ case 0x13: /* SQRSHRN / UQRSHRN */
176
}
252
case 0x14: /* SSHLL / USHLL */
177
253
unallocated_encoding(s);
178
- trapbit = extract64(trapword, bitpos, 1);
254
return;
179
+ if (nxs && (arm_hcrx_el2_eff(env) & HCRX_FGTNXS)) {
180
+ /*
181
+ * If HCRX_EL2.FGTnXS is 1 then the fine-grained trap for
182
+ * TLBI maintenance insns does *not* apply to the nXS variant.
183
+ */
184
+ trapbit = 0;
185
+ } else {
186
+ trapbit = extract64(trapword, bitpos, 1);
187
+ }
188
if (trapbit != rev) {
189
res = CP_ACCESS_TRAP_EL2;
190
goto fail;
255
--
191
--
256
2.34.1
192
2.34.1
1
docs/devel/nested-papr.txt is entirely (apart from the initial
1
All of the TLBI insns with an NXS variant put that variant at the
2
paragraph) a partial copy of the kernel documentation
2
same encoding but with a CRn field that is one greater than for the
3
https://docs.kernel.org/arch/powerpc/kvm-nested.html
3
original TLBI insn. To avoid having to define every TLBI insn
4
effectively twice, once in the normal way and once in a set of cpreg
5
arrays that are only registered when FEAT_XS is present, we define a
6
new ARM_CP_ADD_TLBI_NXS type flag for cpregs. When this flag is set
7
in a cpreg struct and FEAT_XS is present,
8
define_one_arm_cp_reg_with_opaque() will automatically add a second
9
cpreg to the hash table for the TLBI NXS insn with:
10
* the crn+1 encoding
11
* an FGT field that indicates that it should honour HCRX_EL2.FGTnXS
12
* a name with the "NXS" suffix
4
13
5
There's no benefit to the QEMU docs to converting this to rST,
14
(If there are future TLBI NXS insns that don't use this same
6
so instead delete it. Anybody needing to know the API and
15
encoding convention, it is also possible to define them manually.)
7
protocol for the guest to communicate with the hypervisor
8
to created nested VMs should refer to the authoratitative
9
documentation in the kernel docs.
10
16
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
18
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Message-id: 20240816133318.3603114-1-peter.maydell@linaro.org
19
Message-id: 20241211144440.2700268-3-peter.maydell@linaro.org
14
---
20
---
15
docs/devel/nested-papr.txt | 119 -------------------------------------
21
target/arm/cpregs.h | 8 ++++++++
16
1 file changed, 119 deletions(-)
22
target/arm/helper.c | 25 +++++++++++++++++++++++++
17
delete mode 100644 docs/devel/nested-papr.txt
23
2 files changed, 33 insertions(+)
18
24
19
diff --git a/docs/devel/nested-papr.txt b/docs/devel/nested-papr.txt
25
diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
20
deleted file mode 100644
26
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX
27
--- a/target/arm/cpregs.h
22
--- a/docs/devel/nested-papr.txt
28
+++ b/target/arm/cpregs.h
23
+++ /dev/null
29
@@ -XXX,XX +XXX,XX @@ enum {
24
@@ -XXX,XX +XXX,XX @@
30
* equivalent EL1 register when FEAT_NV2 is enabled.
25
-Nested PAPR API (aka KVM on PowerVM)
31
*/
26
-====================================
32
ARM_CP_NV2_REDIRECT = 1 << 20,
27
-
33
+ /*
28
-This API aims at providing support to enable nested virtualization with
34
+ * Flag: this is a TLBI insn which (when FEAT_XS is present) also has
29
-KVM on PowerVM. While the existing support for nested KVM on PowerNV was
35
+ * an NXS variant at the same encoding except that crn is 1 greater,
30
-introduced with cap-nested-hv option, however, with a slight design change,
36
+ * so when registering this cpreg automatically also register one
31
-to enable this on papr/pseries, a new cap-nested-papr option is added. eg:
37
+ * for the TLBI NXS variant. (For QEMU the NXS variant behaves
32
-
38
+ * identically to the normal one, other than FGT trapping handling.)
33
- qemu-system-ppc64 -cpu POWER10 -machine pseries,cap-nested-papr=true ...
39
+ */
34
-
40
+ ARM_CP_ADD_TLBI_NXS = 1 << 21,
35
-Work by:
41
};
36
- Michael Neuling <mikey@neuling.org>
42
37
- Vaibhav Jain <vaibhav@linux.ibm.com>
43
/*
38
- Jordan Niethe <jniethe5@gmail.com>
44
diff --git a/target/arm/helper.c b/target/arm/helper.c
39
- Harsh Prateek Bora <harshpb@linux.ibm.com>
45
index XXXXXXX..XXXXXXX 100644
40
- Shivaprasad G Bhat <sbhat@linux.ibm.com>
46
--- a/target/arm/helper.c
41
- Kautuk Consul <kconsul@linux.vnet.ibm.com>
47
+++ b/target/arm/helper.c
42
-
48
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
43
-Below taken from the kernel documentation:
49
if (r->state != state && r->state != ARM_CP_STATE_BOTH) {
44
-
50
continue;
45
-Introduction
51
}
46
-============
52
+ if ((r->type & ARM_CP_ADD_TLBI_NXS) &&
47
-
53
+ cpu_isar_feature(aa64_xs, cpu)) {
48
-This document explains how a guest operating system can act as a
54
+ /*
49
-hypervisor and run nested guests through the use of hypercalls, if the
55
+ * This is a TLBI insn which has an NXS variant. The
50
-hypervisor has implemented them. The terms L0, L1, and L2 are used to
56
+ * NXS variant is at the same encoding except that
51
-refer to different software entities. L0 is the hypervisor mode entity
57
+ * crn is +1, and has the same behaviour except for
52
-that would normally be called the "host" or "hypervisor". L1 is a
58
+ * fine-grained trapping. Add the NXS insn here and
53
-guest virtual machine that is directly run under L0 and is initiated
59
+ * then fall through to add the normal register.
54
-and controlled by L0. L2 is a guest virtual machine that is initiated
60
+ * add_cpreg_to_hashtable() copies the cpreg struct
55
-and controlled by L1 acting as a hypervisor. A significant design change
61
+ * and name that it is passed, so it's OK to use
56
-wrt existing API is that now the entire L2 state is maintained within L0.
62
+ * a local struct here.
57
-
63
+ */
58
-Existing Nested-HV API
64
+ ARMCPRegInfo nxs_ri = *r;
59
-======================
65
+ g_autofree char *name = g_strdup_printf("%sNXS", r->name);
60
-
66
+
61
-Linux/KVM has had support for Nesting as an L0 or L1 since 2018
67
+ assert(state == ARM_CP_STATE_AA64);
62
-
68
+ assert(nxs_ri.crn < 0xf);
63
-The L0 code was added::
69
+ nxs_ri.crn++;
64
-
70
+ if (nxs_ri.fgt) {
65
- commit 8e3f5fc1045dc49fd175b978c5457f5f51e7a2ce
71
+ nxs_ri.fgt |= R_FGT_NXS_MASK;
66
- Author: Paul Mackerras <paulus@ozlabs.org>
72
+ }
67
- Date: Mon Oct 8 16:31:03 2018 +1100
73
+ add_cpreg_to_hashtable(cpu, &nxs_ri, opaque, state,
68
- KVM: PPC: Book3S HV: Framework and hcall stubs for nested virtualization
74
+ ARM_CP_SECSTATE_NS,
69
-
75
+ crm, opc1, opc2, name);
70
-The L1 code was added::
76
+ }
71
-
77
if (state == ARM_CP_STATE_AA32) {
72
- commit 360cae313702cdd0b90f82c261a8302fecef030a
78
/*
73
- Author: Paul Mackerras <paulus@ozlabs.org>
79
* Under AArch32 CP registers can be common
74
- Date: Mon Oct 8 16:31:04 2018 +1100
75
- KVM: PPC: Book3S HV: Nested guest entry via hypercall
76
-
77
-This API works primarily using a signal hcall h_enter_nested(). This
78
-call made by the L1 to tell the L0 to start an L2 vCPU with the given
79
-state. The L0 then starts this L2 and runs until an L2 exit condition
80
-is reached. Once the L2 exits, the state of the L2 is given back to
81
-the L1 by the L0. The full L2 vCPU state is always transferred from
82
-and to L1 when the L2 is run. The L0 doesn't keep any state on the L2
83
-vCPU (except in the short sequence in the L0 on L1 -> L2 entry and L2
84
--> L1 exit).
85
-
86
-The only state kept by the L0 is the partition table. The L1 registers
87
-it's partition table using the h_set_partition_table() hcall. All
88
-other state held by the L0 about the L2s is cached state (such as
89
-shadow page tables).
90
-
91
-The L1 may run any L2 or vCPU without first informing the L0. It
92
-simply starts the vCPU using h_enter_nested(). The creation of L2s and
93
-vCPUs is done implicitly whenever h_enter_nested() is called.
94
-
95
-In this document, we call this existing API the v1 API.
96
-
97
-New PAPR API
98
-===============
99
-
100
-The new PAPR API changes from the v1 API such that the creating L2 and
101
-associated vCPUs is explicit. In this document, we call this the v2
102
-API.
103
-
104
-h_enter_nested() is replaced with H_GUEST_VCPU_RUN(). Before this can
105
-be called the L1 must explicitly create the L2 using h_guest_create()
106
-and any associated vCPUs() created with h_guest_create_vCPU(). Getting
107
-and setting vCPU state can also be performed using h_guest_{g|s}et
108
-hcall.
109
-
110
-The basic execution flow is for an L1 to create an L2, run it, and
111
-delete it is:
112
-
113
-- L1 and L0 negotiate capabilities with H_GUEST_{G,S}ET_CAPABILITIES()
114
- (normally at L1 boot time).
115
-
116
-- L1 requests the L0 to create an L2 with H_GUEST_CREATE() and receives a token
117
-
118
-- L1 requests the L0 to create an L2 vCPU with H_GUEST_CREATE_VCPU()
119
-
120
-- L1 and L0 communicate the vCPU state using the H_GUEST_{G,S}ET() hcall
121
-
122
-- L1 requests the L0 to run the vCPU using H_GUEST_RUN_VCPU() hcall
123
-
124
-- L1 deletes L2 with H_GUEST_DELETE()
125
-
126
-For more details, please refer:
127
-
128
-[1] Linux Kernel documentation (upstream documentation commit):
129
-
130
-commit 476652297f94a2e5e5ef29e734b0da37ade94110
131
-Author: Michael Neuling <mikey@neuling.org>
132
-Date: Thu Sep 14 13:06:00 2023 +1000
133
-
134
- docs: powerpc: Document nested KVM on POWER
135
-
136
- Document support for nested KVM on POWER using the existing API as well
137
- as the new PAPR API. This includes the new HCALL interface and how it
138
- used by KVM.
139
-
140
- Signed-off-by: Michael Neuling <mikey@neuling.org>
141
- Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
142
- Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
143
- Link: https://msgid.link/20230914030600.16993-12-jniethe5@gmail.com
144
--
80
--
145
2.34.1
81
2.34.1
1
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
1
Add the ARM_CP_ADD_TLBI_NXS flag to the TLBI insns with an NXS variant.
2
This is every AArch64 TLBI encoding except for the four FEAT_RME TLBI
3
insns.
2
4
3
'Test might timeout' means nothing. Replace it with useful information
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
explaining that it is the emulation of pointer authentication that makes this test run
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
too long.
7
Message-id: 20241211144440.2700268-4-peter.maydell@linaro.org
8
---
9
target/arm/tcg/tlb-insns.c | 202 +++++++++++++++++++++++--------------
10
1 file changed, 124 insertions(+), 78 deletions(-)
6
11
7
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
12
diff --git a/target/arm/tcg/tlb-insns.c b/target/arm/tcg/tlb-insns.c
8
Message-id: 20240910-b4-move-to-freebsd-v5-3-0fb66d803c93@linaro.org
13
index XXXXXXX..XXXXXXX 100644
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
--- a/target/arm/tcg/tlb-insns.c
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
+++ b/target/arm/tcg/tlb-insns.c
11
---
16
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbi_v8_cp_reginfo[] = {
12
tests/functional/test_aarch64_sbsaref.py | 15 ++++++++++-----
17
/* AArch64 TLBI operations */
13
1 file changed, 10 insertions(+), 5 deletions(-)
18
{ .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
14
19
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
15
diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
20
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
16
index XXXXXXX..XXXXXXX 100755
21
+ .access = PL1_W, .accessfn = access_ttlbis,
17
--- a/tests/functional/test_aarch64_sbsaref.py
22
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
18
+++ b/tests/functional/test_aarch64_sbsaref.py
23
.fgt = FGT_TLBIVMALLE1IS,
19
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_alpine_linux_max_pauth_off(self):
24
.writefn = tlbi_aa64_vmalle1is_write },
20
def test_sbsaref_alpine_linux_max_pauth_impdef(self):
25
{ .name = "TLBI_VAE1IS", .state = ARM_CP_STATE_AA64,
21
self.boot_alpine_linux("max,pauth-impdef=on")
26
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
22
27
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
23
- @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
28
+ .access = PL1_W, .accessfn = access_ttlbis,
24
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
29
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
25
+ 'Test might timeout due to PAuth emulation')
30
.fgt = FGT_TLBIVAE1IS,
26
def test_sbsaref_alpine_linux_max(self):
31
.writefn = tlbi_aa64_vae1is_write },
27
self.boot_alpine_linux("max")
32
{ .name = "TLBI_ASIDE1IS", .state = ARM_CP_STATE_AA64,
28
33
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
29
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_openbsd73_default_cpu(self):
34
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
30
def test_sbsaref_openbsd73_max_pauth_off(self):
35
+ .access = PL1_W, .accessfn = access_ttlbis,
31
self.boot_openbsd73("max,pauth=off")
36
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
32
37
.fgt = FGT_TLBIASIDE1IS,
33
- @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
38
.writefn = tlbi_aa64_vmalle1is_write },
34
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
39
{ .name = "TLBI_VAAE1IS", .state = ARM_CP_STATE_AA64,
35
+ 'Test might timeout due to PAuth emulation')
40
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
36
def test_sbsaref_openbsd73_max_pauth_impdef(self):
41
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
37
self.boot_openbsd73("max,pauth-impdef=on")
42
+ .access = PL1_W, .accessfn = access_ttlbis,
38
43
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
39
- @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
44
.fgt = FGT_TLBIVAAE1IS,
40
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
45
.writefn = tlbi_aa64_vae1is_write },
41
+ 'Test might timeout due to PAuth emulation')
46
{ .name = "TLBI_VALE1IS", .state = ARM_CP_STATE_AA64,
42
def test_sbsaref_openbsd73_max(self):
47
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
43
self.boot_openbsd73("max")
48
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
44
49
+ .access = PL1_W, .accessfn = access_ttlbis,
45
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_freebsd14_default_cpu(self):
50
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
46
def test_sbsaref_freebsd14_max_pauth_off(self):
51
.fgt = FGT_TLBIVALE1IS,
47
self.boot_freebsd14("max,pauth=off")
52
.writefn = tlbi_aa64_vae1is_write },
48
53
{ .name = "TLBI_VAALE1IS", .state = ARM_CP_STATE_AA64,
49
- @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
54
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
50
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
55
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
51
+ 'Test might timeout due to PAuth emulation')
56
+ .access = PL1_W, .accessfn = access_ttlbis,
52
def test_sbsaref_freebsd14_max_pauth_impdef(self):
57
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
53
self.boot_freebsd14("max,pauth-impdef=on")
58
.fgt = FGT_TLBIVAALE1IS,
54
59
.writefn = tlbi_aa64_vae1is_write },
55
- @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
60
{ .name = "TLBI_VMALLE1", .state = ARM_CP_STATE_AA64,
56
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'),
61
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
57
+ 'Test might timeout due to PAuth emulation')
62
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
58
def test_sbsaref_freebsd14_max(self):
63
+ .access = PL1_W, .accessfn = access_ttlb,
59
self.boot_freebsd14("max")
64
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
65
.fgt = FGT_TLBIVMALLE1,
66
.writefn = tlbi_aa64_vmalle1_write },
67
{ .name = "TLBI_VAE1", .state = ARM_CP_STATE_AA64,
68
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 1,
69
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
70
+ .access = PL1_W, .accessfn = access_ttlb,
71
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
72
.fgt = FGT_TLBIVAE1,
73
.writefn = tlbi_aa64_vae1_write },
74
{ .name = "TLBI_ASIDE1", .state = ARM_CP_STATE_AA64,
75
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 2,
76
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
77
+ .access = PL1_W, .accessfn = access_ttlb,
78
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
79
.fgt = FGT_TLBIASIDE1,
80
.writefn = tlbi_aa64_vmalle1_write },
81
{ .name = "TLBI_VAAE1", .state = ARM_CP_STATE_AA64,
82
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
83
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
84
+ .access = PL1_W, .accessfn = access_ttlb,
85
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
86
.fgt = FGT_TLBIVAAE1,
87
.writefn = tlbi_aa64_vae1_write },
88
{ .name = "TLBI_VALE1", .state = ARM_CP_STATE_AA64,
89
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
90
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
91
+ .access = PL1_W, .accessfn = access_ttlb,
92
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
93
.fgt = FGT_TLBIVALE1,
94
.writefn = tlbi_aa64_vae1_write },
95
{ .name = "TLBI_VAALE1", .state = ARM_CP_STATE_AA64,
96
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 7,
97
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
98
+ .access = PL1_W, .accessfn = access_ttlb,
99
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
100
.fgt = FGT_TLBIVAALE1,
101
.writefn = tlbi_aa64_vae1_write },
102
{ .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
103
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
104
- .access = PL2_W, .type = ARM_CP_NO_RAW,
105
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
106
.writefn = tlbi_aa64_ipas2e1is_write },
107
{ .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
108
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
109
- .access = PL2_W, .type = ARM_CP_NO_RAW,
110
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
111
.writefn = tlbi_aa64_ipas2e1is_write },
112
{ .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
113
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
114
- .access = PL2_W, .type = ARM_CP_NO_RAW,
115
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
116
.writefn = tlbi_aa64_alle1is_write },
117
{ .name = "TLBI_VMALLS12E1IS", .state = ARM_CP_STATE_AA64,
118
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 6,
119
- .access = PL2_W, .type = ARM_CP_NO_RAW,
120
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
121
.writefn = tlbi_aa64_alle1is_write },
122
{ .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
123
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
124
- .access = PL2_W, .type = ARM_CP_NO_RAW,
125
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
126
.writefn = tlbi_aa64_ipas2e1_write },
127
{ .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
128
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
129
- .access = PL2_W, .type = ARM_CP_NO_RAW,
130
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
131
.writefn = tlbi_aa64_ipas2e1_write },
132
{ .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
133
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
134
- .access = PL2_W, .type = ARM_CP_NO_RAW,
135
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
136
.writefn = tlbi_aa64_alle1_write },
137
{ .name = "TLBI_VMALLS12E1", .state = ARM_CP_STATE_AA64,
138
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 6,
139
- .access = PL2_W, .type = ARM_CP_NO_RAW,
140
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
141
.writefn = tlbi_aa64_alle1is_write },
142
};
143
144
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbi_el2_cp_reginfo[] = {
145
.writefn = tlbimva_hyp_is_write },
146
{ .name = "TLBI_ALLE2", .state = ARM_CP_STATE_AA64,
147
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 0,
148
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
149
+ .access = PL2_W,
150
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
151
.writefn = tlbi_aa64_alle2_write },
152
{ .name = "TLBI_VAE2", .state = ARM_CP_STATE_AA64,
153
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 1,
154
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
155
+ .access = PL2_W,
156
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
157
.writefn = tlbi_aa64_vae2_write },
158
{ .name = "TLBI_VALE2", .state = ARM_CP_STATE_AA64,
159
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 5,
160
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
161
+ .access = PL2_W,
162
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
163
.writefn = tlbi_aa64_vae2_write },
164
{ .name = "TLBI_ALLE2IS", .state = ARM_CP_STATE_AA64,
165
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 0,
166
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
167
+ .access = PL2_W,
168
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
169
.writefn = tlbi_aa64_alle2is_write },
170
{ .name = "TLBI_VAE2IS", .state = ARM_CP_STATE_AA64,
171
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 1,
172
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
173
+ .access = PL2_W,
174
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
175
.writefn = tlbi_aa64_vae2is_write },
176
{ .name = "TLBI_VALE2IS", .state = ARM_CP_STATE_AA64,
177
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 5,
178
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
179
+ .access = PL2_W,
180
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
181
.writefn = tlbi_aa64_vae2is_write },
182
};
183
184
static const ARMCPRegInfo tlbi_el3_cp_reginfo[] = {
185
{ .name = "TLBI_ALLE3IS", .state = ARM_CP_STATE_AA64,
186
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 3, .opc2 = 0,
187
- .access = PL3_W, .type = ARM_CP_NO_RAW,
188
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
189
.writefn = tlbi_aa64_alle3is_write },
190
{ .name = "TLBI_VAE3IS", .state = ARM_CP_STATE_AA64,
191
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 3, .opc2 = 1,
192
- .access = PL3_W, .type = ARM_CP_NO_RAW,
193
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
194
.writefn = tlbi_aa64_vae3is_write },
195
{ .name = "TLBI_VALE3IS", .state = ARM_CP_STATE_AA64,
196
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 3, .opc2 = 5,
197
- .access = PL3_W, .type = ARM_CP_NO_RAW,
198
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
199
.writefn = tlbi_aa64_vae3is_write },
200
{ .name = "TLBI_ALLE3", .state = ARM_CP_STATE_AA64,
201
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 0,
202
- .access = PL3_W, .type = ARM_CP_NO_RAW,
203
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
204
.writefn = tlbi_aa64_alle3_write },
205
{ .name = "TLBI_VAE3", .state = ARM_CP_STATE_AA64,
206
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 1,
207
- .access = PL3_W, .type = ARM_CP_NO_RAW,
208
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
209
.writefn = tlbi_aa64_vae3_write },
210
{ .name = "TLBI_VALE3", .state = ARM_CP_STATE_AA64,
211
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 5,
212
- .access = PL3_W, .type = ARM_CP_NO_RAW,
213
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
214
.writefn = tlbi_aa64_vae3_write },
215
};
216
217
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_ripas2e1is_write(CPUARMState *env,
218
static const ARMCPRegInfo tlbirange_reginfo[] = {
219
{ .name = "TLBI_RVAE1IS", .state = ARM_CP_STATE_AA64,
220
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 1,
221
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
222
+ .access = PL1_W, .accessfn = access_ttlbis,
223
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
224
.fgt = FGT_TLBIRVAE1IS,
225
.writefn = tlbi_aa64_rvae1is_write },
226
{ .name = "TLBI_RVAAE1IS", .state = ARM_CP_STATE_AA64,
227
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 3,
228
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
229
+ .access = PL1_W, .accessfn = access_ttlbis,
230
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
231
.fgt = FGT_TLBIRVAAE1IS,
232
.writefn = tlbi_aa64_rvae1is_write },
233
{ .name = "TLBI_RVALE1IS", .state = ARM_CP_STATE_AA64,
234
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 5,
235
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
236
+ .access = PL1_W, .accessfn = access_ttlbis,
237
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
238
.fgt = FGT_TLBIRVALE1IS,
239
.writefn = tlbi_aa64_rvae1is_write },
240
{ .name = "TLBI_RVAALE1IS", .state = ARM_CP_STATE_AA64,
241
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 7,
242
- .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
243
+ .access = PL1_W, .accessfn = access_ttlbis,
244
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
245
.fgt = FGT_TLBIRVAALE1IS,
246
.writefn = tlbi_aa64_rvae1is_write },
247
{ .name = "TLBI_RVAE1OS", .state = ARM_CP_STATE_AA64,
248
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 1,
249
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
250
+ .access = PL1_W, .accessfn = access_ttlbos,
251
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
252
.fgt = FGT_TLBIRVAE1OS,
253
.writefn = tlbi_aa64_rvae1is_write },
254
{ .name = "TLBI_RVAAE1OS", .state = ARM_CP_STATE_AA64,
255
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 3,
256
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
257
+ .access = PL1_W, .accessfn = access_ttlbos,
258
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
259
.fgt = FGT_TLBIRVAAE1OS,
260
.writefn = tlbi_aa64_rvae1is_write },
261
{ .name = "TLBI_RVALE1OS", .state = ARM_CP_STATE_AA64,
262
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 5,
263
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
264
+ .access = PL1_W, .accessfn = access_ttlbos,
265
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
266
.fgt = FGT_TLBIRVALE1OS,
267
.writefn = tlbi_aa64_rvae1is_write },
268
{ .name = "TLBI_RVAALE1OS", .state = ARM_CP_STATE_AA64,
269
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 7,
270
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
271
+ .access = PL1_W, .accessfn = access_ttlbos,
272
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
273
.fgt = FGT_TLBIRVAALE1OS,
274
.writefn = tlbi_aa64_rvae1is_write },
275
{ .name = "TLBI_RVAE1", .state = ARM_CP_STATE_AA64,
276
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 1,
277
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
278
+ .access = PL1_W, .accessfn = access_ttlb,
279
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
280
.fgt = FGT_TLBIRVAE1,
281
.writefn = tlbi_aa64_rvae1_write },
282
{ .name = "TLBI_RVAAE1", .state = ARM_CP_STATE_AA64,
283
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 3,
284
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
285
+ .access = PL1_W, .accessfn = access_ttlb,
286
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
287
.fgt = FGT_TLBIRVAAE1,
288
.writefn = tlbi_aa64_rvae1_write },
289
{ .name = "TLBI_RVALE1", .state = ARM_CP_STATE_AA64,
290
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 5,
291
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
292
+ .access = PL1_W, .accessfn = access_ttlb,
293
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
294
.fgt = FGT_TLBIRVALE1,
295
.writefn = tlbi_aa64_rvae1_write },
296
{ .name = "TLBI_RVAALE1", .state = ARM_CP_STATE_AA64,
297
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 7,
298
- .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
299
+ .access = PL1_W, .accessfn = access_ttlb,
300
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
301
.fgt = FGT_TLBIRVAALE1,
302
.writefn = tlbi_aa64_rvae1_write },
303
{ .name = "TLBI_RIPAS2E1IS", .state = ARM_CP_STATE_AA64,
304
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 2,
305
- .access = PL2_W, .type = ARM_CP_NO_RAW,
306
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
307
.writefn = tlbi_aa64_ripas2e1is_write },
308
{ .name = "TLBI_RIPAS2LE1IS", .state = ARM_CP_STATE_AA64,
309
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 6,
310
- .access = PL2_W, .type = ARM_CP_NO_RAW,
311
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
312
.writefn = tlbi_aa64_ripas2e1is_write },
313
{ .name = "TLBI_RVAE2IS", .state = ARM_CP_STATE_AA64,
314
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 2, .opc2 = 1,
315
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
316
+ .access = PL2_W,
317
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
318
.writefn = tlbi_aa64_rvae2is_write },
319
{ .name = "TLBI_RVALE2IS", .state = ARM_CP_STATE_AA64,
320
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 2, .opc2 = 5,
321
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
322
+ .access = PL2_W,
323
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
324
.writefn = tlbi_aa64_rvae2is_write },
325
{ .name = "TLBI_RIPAS2E1", .state = ARM_CP_STATE_AA64,
326
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 2,
327
- .access = PL2_W, .type = ARM_CP_NO_RAW,
328
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
329
.writefn = tlbi_aa64_ripas2e1_write },
330
{ .name = "TLBI_RIPAS2LE1", .state = ARM_CP_STATE_AA64,
331
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 6,
332
- .access = PL2_W, .type = ARM_CP_NO_RAW,
333
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
334
.writefn = tlbi_aa64_ripas2e1_write },
335
{ .name = "TLBI_RVAE2OS", .state = ARM_CP_STATE_AA64,
336
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 5, .opc2 = 1,
337
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
338
+ .access = PL2_W,
339
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
340
.writefn = tlbi_aa64_rvae2is_write },
341
{ .name = "TLBI_RVALE2OS", .state = ARM_CP_STATE_AA64,
342
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 5, .opc2 = 5,
343
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
344
+ .access = PL2_W,
345
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
346
.writefn = tlbi_aa64_rvae2is_write },
347
{ .name = "TLBI_RVAE2", .state = ARM_CP_STATE_AA64,
348
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 6, .opc2 = 1,
349
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
350
+ .access = PL2_W,
351
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
352
.writefn = tlbi_aa64_rvae2_write },
353
{ .name = "TLBI_RVALE2", .state = ARM_CP_STATE_AA64,
354
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 6, .opc2 = 5,
355
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
356
+ .access = PL2_W,
357
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
358
.writefn = tlbi_aa64_rvae2_write },
359
{ .name = "TLBI_RVAE3IS", .state = ARM_CP_STATE_AA64,
360
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 2, .opc2 = 1,
361
- .access = PL3_W, .type = ARM_CP_NO_RAW,
362
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
363
.writefn = tlbi_aa64_rvae3is_write },
364
{ .name = "TLBI_RVALE3IS", .state = ARM_CP_STATE_AA64,
365
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 2, .opc2 = 5,
366
- .access = PL3_W, .type = ARM_CP_NO_RAW,
367
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
368
.writefn = tlbi_aa64_rvae3is_write },
369
{ .name = "TLBI_RVAE3OS", .state = ARM_CP_STATE_AA64,
370
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 5, .opc2 = 1,
371
- .access = PL3_W, .type = ARM_CP_NO_RAW,
372
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
373
.writefn = tlbi_aa64_rvae3is_write },
374
{ .name = "TLBI_RVALE3OS", .state = ARM_CP_STATE_AA64,
375
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 5, .opc2 = 5,
376
- .access = PL3_W, .type = ARM_CP_NO_RAW,
377
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
378
.writefn = tlbi_aa64_rvae3is_write },
379
{ .name = "TLBI_RVAE3", .state = ARM_CP_STATE_AA64,
380
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 6, .opc2 = 1,
381
- .access = PL3_W, .type = ARM_CP_NO_RAW,
382
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
383
.writefn = tlbi_aa64_rvae3_write },
384
{ .name = "TLBI_RVALE3", .state = ARM_CP_STATE_AA64,
385
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 6, .opc2 = 5,
386
- .access = PL3_W, .type = ARM_CP_NO_RAW,
387
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
388
.writefn = tlbi_aa64_rvae3_write },
389
};
390
391
static const ARMCPRegInfo tlbios_reginfo[] = {
392
{ .name = "TLBI_VMALLE1OS", .state = ARM_CP_STATE_AA64,
393
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 0,
394
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
395
+ .access = PL1_W, .accessfn = access_ttlbos,
396
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
397
.fgt = FGT_TLBIVMALLE1OS,
398
.writefn = tlbi_aa64_vmalle1is_write },
399
{ .name = "TLBI_VAE1OS", .state = ARM_CP_STATE_AA64,
400
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 1,
401
.fgt = FGT_TLBIVAE1OS,
402
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
403
+ .access = PL1_W, .accessfn = access_ttlbos,
404
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
405
.writefn = tlbi_aa64_vae1is_write },
406
{ .name = "TLBI_ASIDE1OS", .state = ARM_CP_STATE_AA64,
407
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 2,
408
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
409
+ .access = PL1_W, .accessfn = access_ttlbos,
410
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
411
.fgt = FGT_TLBIASIDE1OS,
412
.writefn = tlbi_aa64_vmalle1is_write },
413
{ .name = "TLBI_VAAE1OS", .state = ARM_CP_STATE_AA64,
414
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 3,
415
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
416
+ .access = PL1_W, .accessfn = access_ttlbos,
417
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
418
.fgt = FGT_TLBIVAAE1OS,
419
.writefn = tlbi_aa64_vae1is_write },
420
{ .name = "TLBI_VALE1OS", .state = ARM_CP_STATE_AA64,
421
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 5,
422
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
423
+ .access = PL1_W, .accessfn = access_ttlbos,
424
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
425
.fgt = FGT_TLBIVALE1OS,
426
.writefn = tlbi_aa64_vae1is_write },
427
{ .name = "TLBI_VAALE1OS", .state = ARM_CP_STATE_AA64,
428
.opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 7,
429
- .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
430
+ .access = PL1_W, .accessfn = access_ttlbos,
431
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
432
.fgt = FGT_TLBIVAALE1OS,
433
.writefn = tlbi_aa64_vae1is_write },
434
{ .name = "TLBI_ALLE2OS", .state = ARM_CP_STATE_AA64,
435
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 0,
436
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
437
+ .access = PL2_W,
438
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
439
.writefn = tlbi_aa64_alle2is_write },
440
{ .name = "TLBI_VAE2OS", .state = ARM_CP_STATE_AA64,
441
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 1,
442
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
443
+ .access = PL2_W,
444
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
445
.writefn = tlbi_aa64_vae2is_write },
446
{ .name = "TLBI_ALLE1OS", .state = ARM_CP_STATE_AA64,
447
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 4,
448
- .access = PL2_W, .type = ARM_CP_NO_RAW,
449
+ .access = PL2_W,
450
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
451
.writefn = tlbi_aa64_alle1is_write },
452
{ .name = "TLBI_VALE2OS", .state = ARM_CP_STATE_AA64,
453
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 5,
454
- .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
455
+ .access = PL2_W,
456
+ .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS | ARM_CP_EL3_NO_EL2_UNDEF,
457
.writefn = tlbi_aa64_vae2is_write },
458
{ .name = "TLBI_VMALLS12E1OS", .state = ARM_CP_STATE_AA64,
459
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 6,
460
- .access = PL2_W, .type = ARM_CP_NO_RAW,
461
+ .access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
462
.writefn = tlbi_aa64_alle1is_write },
463
{ .name = "TLBI_IPAS2E1OS", .state = ARM_CP_STATE_AA64,
464
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 0,
465
- .access = PL2_W, .type = ARM_CP_NOP },
466
+ .access = PL2_W, .type = ARM_CP_NOP | ARM_CP_ADD_TLBI_NXS },
467
{ .name = "TLBI_RIPAS2E1OS", .state = ARM_CP_STATE_AA64,
468
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 3,
469
- .access = PL2_W, .type = ARM_CP_NOP },
470
+ .access = PL2_W, .type = ARM_CP_NOP | ARM_CP_ADD_TLBI_NXS },
471
{ .name = "TLBI_IPAS2LE1OS", .state = ARM_CP_STATE_AA64,
472
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 4,
473
- .access = PL2_W, .type = ARM_CP_NOP },
474
+ .access = PL2_W, .type = ARM_CP_NOP | ARM_CP_ADD_TLBI_NXS },
475
{ .name = "TLBI_RIPAS2LE1OS", .state = ARM_CP_STATE_AA64,
476
.opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 7,
477
- .access = PL2_W, .type = ARM_CP_NOP },
478
+ .access = PL2_W, .type = ARM_CP_NOP | ARM_CP_ADD_TLBI_NXS },
479
{ .name = "TLBI_ALLE3OS", .state = ARM_CP_STATE_AA64,
480
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 0,
481
- .access = PL3_W, .type = ARM_CP_NO_RAW,
482
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
483
.writefn = tlbi_aa64_alle3is_write },
484
{ .name = "TLBI_VAE3OS", .state = ARM_CP_STATE_AA64,
485
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 1,
486
- .access = PL3_W, .type = ARM_CP_NO_RAW,
487
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
488
.writefn = tlbi_aa64_vae3is_write },
489
{ .name = "TLBI_VALE3OS", .state = ARM_CP_STATE_AA64,
490
.opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 5,
491
- .access = PL3_W, .type = ARM_CP_NO_RAW,
492
+ .access = PL3_W, .type = ARM_CP_NO_RAW | ARM_CP_ADD_TLBI_NXS,
493
.writefn = tlbi_aa64_vae3is_write },
494
};
60
495
61
--
496
--
62
2.34.1
497
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
The DSB nXS variant always has a reads-and-writes request type (MBReqTypes_All).
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Ignore the domain field like we do in plain DSB and perform a full
5
Message-id: 20240912024114.1097832-9-richard.henderson@linaro.org
5
system barrier operation.
6
7
The DSB nXS variant is part of FEAT_XS, which is mandatory from Armv8.7.
8
9
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241211144440.2700268-5-peter.maydell@linaro.org
13
[PMM: added missing "UNDEF unless feature present" check]
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
15
---
8
target/arm/tcg/a64.decode | 9 ++
16
target/arm/tcg/a64.decode | 3 +++
9
target/arm/tcg/translate-a64.c | 158 ++++++++++++++-------------------
17
target/arm/tcg/translate-a64.c | 9 +++++++++
10
2 files changed, 77 insertions(+), 90 deletions(-)
18
2 files changed, 12 insertions(+)
11
19
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
20
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
22
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
23
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ EXT_q 0110 1110 00 0 rm:5 0 imm:4 0 rn:5 rd:5
24
@@ -XXX,XX +XXX,XX @@ WFIT 1101 0101 0000 0011 0001 0000 001 rd:5
17
# Advanced SIMD Table Lookup
25
18
26
CLREX 1101 0101 0000 0011 0011 ---- 010 11111
19
TBL_TBX 0 q:1 00 1110 000 rm:5 0 len:2 tbx:1 00 rn:5 rd:5
27
DSB_DMB 1101 0101 0000 0011 0011 domain:2 types:2 10- 11111
20
+
28
+# For the DSB nXS variant, types always equals MBReqTypes_All and we ignore the
21
+# Advanced SIMD Permute
29
+# domain bits.
22
+
30
+DSB_nXS 1101 0101 0000 0011 0011 -- 10 001 11111
23
+UZP1 0.00 1110 .. 0 ..... 0 001 10 ..... ..... @qrrr_e
31
ISB 1101 0101 0000 0011 0011 ---- 110 11111
24
+UZP2 0.00 1110 .. 0 ..... 0 101 10 ..... ..... @qrrr_e
32
SB 1101 0101 0000 0011 0011 0000 111 11111
25
+TRN1 0.00 1110 .. 0 ..... 0 010 10 ..... ..... @qrrr_e
33
26
+TRN2 0.00 1110 .. 0 ..... 0 110 10 ..... ..... @qrrr_e
27
+ZIP1 0.00 1110 .. 0 ..... 0 011 10 ..... ..... @qrrr_e
28
+ZIP2 0.00 1110 .. 0 ..... 0 111 10 ..... ..... @qrrr_e
29
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
34
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
30
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/tcg/translate-a64.c
36
--- a/target/arm/tcg/translate-a64.c
32
+++ b/target/arm/tcg/translate-a64.c
37
+++ b/target/arm/tcg/translate-a64.c
33
@@ -XXX,XX +XXX,XX @@ static bool trans_TBL_TBX(DisasContext *s, arg_TBL_TBX *a)
38
@@ -XXX,XX +XXX,XX @@ static bool trans_DSB_DMB(DisasContext *s, arg_DSB_DMB *a)
34
return true;
39
return true;
35
}
40
}
36
41
37
+typedef int simd_permute_idx_fn(int i, int part, int elements);
42
+static bool trans_DSB_nXS(DisasContext *s, arg_DSB_nXS *a)
38
+
39
+static bool do_simd_permute(DisasContext *s, arg_qrrr_e *a,
40
+ simd_permute_idx_fn *fn, int part)
41
+{
43
+{
42
+ MemOp esz = a->esz;
44
+ if (!dc_isar_feature(aa64_xs, s)) {
43
+ int datasize = a->q ? 16 : 8;
44
+ int elements = datasize >> esz;
45
+ TCGv_i64 tcg_res[2], tcg_ele;
46
+
47
+ if (esz == MO_64 && !a->q) {
48
+ return false;
45
+ return false;
49
+ }
46
+ }
50
+ if (!fp_access_check(s)) {
47
+ tcg_gen_mb(TCG_BAR_SC | TCG_MO_ALL);
51
+ return true;
52
+ }
53
+
54
+ tcg_res[0] = tcg_temp_new_i64();
55
+ tcg_res[1] = a->q ? tcg_temp_new_i64() : NULL;
56
+ tcg_ele = tcg_temp_new_i64();
57
+
58
+ for (int i = 0; i < elements; i++) {
59
+ int o, w, idx;
60
+
61
+ idx = fn(i, part, elements);
62
+ read_vec_element(s, tcg_ele, (idx & elements ? a->rm : a->rn),
63
+ idx & (elements - 1), esz);
64
+
65
+ w = (i << (esz + 3)) / 64;
66
+ o = (i << (esz + 3)) % 64;
67
+ if (o == 0) {
68
+ tcg_gen_mov_i64(tcg_res[w], tcg_ele);
69
+ } else {
70
+ tcg_gen_deposit_i64(tcg_res[w], tcg_res[w], tcg_ele, o, 8 << esz);
71
+ }
72
+ }
73
+
74
+ for (int i = a->q; i >= 0; --i) {
75
+ write_vec_element(s, tcg_res[i], a->rd, i, MO_64);
76
+ }
77
+ clear_vec_high(s, a->q, a->rd);
78
+ return true;
48
+ return true;
79
+}
49
+}
80
+
50
+
81
+static int permute_load_uzp(int i, int part, int elements)
51
static bool trans_ISB(DisasContext *s, arg_ISB *a)
82
+{
52
{
83
+ return 2 * i + part;
53
/*
84
+}
85
+
86
+TRANS(UZP1, do_simd_permute, a, permute_load_uzp, 0)
87
+TRANS(UZP2, do_simd_permute, a, permute_load_uzp, 1)
88
+
89
+static int permute_load_trn(int i, int part, int elements)
90
+{
91
+ return (i & 1) * elements + (i & ~1) + part;
92
+}
93
+
94
+TRANS(TRN1, do_simd_permute, a, permute_load_trn, 0)
95
+TRANS(TRN2, do_simd_permute, a, permute_load_trn, 1)
96
+
97
+static int permute_load_zip(int i, int part, int elements)
98
+{
99
+ return (i & 1) * elements + ((part * elements + i) >> 1);
100
+}
101
+
102
+TRANS(ZIP1, do_simd_permute, a, permute_load_zip, 0)
103
+TRANS(ZIP2, do_simd_permute, a, permute_load_zip, 1)
104
+
105
/*
106
* Cryptographic AES, SHA, SHA512
107
*/
108
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
109
}
110
}
111
112
-/* ZIP/UZP/TRN
113
- * 31 30 29 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
114
- * +---+---+-------------+------+---+------+---+------------------+------+
115
- * | 0 | Q | 0 0 1 1 1 0 | size | 0 | Rm | 0 | opc | 1 0 | Rn | Rd |
116
- * +---+---+-------------+------+---+------+---+------------------+------+
117
- */
118
-static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
119
-{
120
- int rd = extract32(insn, 0, 5);
121
- int rn = extract32(insn, 5, 5);
122
- int rm = extract32(insn, 16, 5);
123
- int size = extract32(insn, 22, 2);
124
- /* opc field bits [1:0] indicate ZIP/UZP/TRN;
125
- * bit 2 indicates 1 vs 2 variant of the insn.
126
- */
127
- int opcode = extract32(insn, 12, 2);
128
- bool part = extract32(insn, 14, 1);
129
- bool is_q = extract32(insn, 30, 1);
130
- int esize = 8 << size;
131
- int i;
132
- int datasize = is_q ? 128 : 64;
133
- int elements = datasize / esize;
134
- TCGv_i64 tcg_res[2], tcg_ele;
135
-
136
- if (opcode == 0 || (size == 3 && !is_q)) {
137
- unallocated_encoding(s);
138
- return;
139
- }
140
-
141
- if (!fp_access_check(s)) {
142
- return;
143
- }
144
-
145
- tcg_res[0] = tcg_temp_new_i64();
146
- tcg_res[1] = is_q ? tcg_temp_new_i64() : NULL;
147
- tcg_ele = tcg_temp_new_i64();
148
-
149
- for (i = 0; i < elements; i++) {
150
- int o, w;
151
-
152
- switch (opcode) {
153
- case 1: /* UZP1/2 */
154
- {
155
- int midpoint = elements / 2;
156
- if (i < midpoint) {
157
- read_vec_element(s, tcg_ele, rn, 2 * i + part, size);
158
- } else {
159
- read_vec_element(s, tcg_ele, rm,
160
- 2 * (i - midpoint) + part, size);
161
- }
162
- break;
163
- }
164
- case 2: /* TRN1/2 */
165
- if (i & 1) {
166
- read_vec_element(s, tcg_ele, rm, (i & ~1) + part, size);
167
- } else {
168
- read_vec_element(s, tcg_ele, rn, (i & ~1) + part, size);
169
- }
170
- break;
171
- case 3: /* ZIP1/2 */
172
- {
173
- int base = part * elements / 2;
174
- if (i & 1) {
175
- read_vec_element(s, tcg_ele, rm, base + (i >> 1), size);
176
- } else {
177
- read_vec_element(s, tcg_ele, rn, base + (i >> 1), size);
178
- }
179
- break;
180
- }
181
- default:
182
- g_assert_not_reached();
183
- }
184
-
185
- w = (i * esize) / 64;
186
- o = (i * esize) % 64;
187
- if (o == 0) {
188
- tcg_gen_mov_i64(tcg_res[w], tcg_ele);
189
- } else {
190
- tcg_gen_shli_i64(tcg_ele, tcg_ele, o);
191
- tcg_gen_or_i64(tcg_res[w], tcg_res[w], tcg_ele);
192
- }
193
- }
194
-
195
- for (i = 0; i <= is_q; ++i) {
196
- write_vec_element(s, tcg_res[i], rd, i, MO_64);
197
- }
198
- clear_vec_high(s, is_q, rd);
199
-}
200
-
201
/*
202
* do_reduction_op helper
203
*
204
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
205
/* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
206
{ 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
207
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
208
- { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
209
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
210
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
211
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
212
--
54
--
213
2.34.1
55
2.34.1
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Use simple shift and add instead of ctpop, ctz, shift and mask.
4
Unlike SVE, there is no predicate to disable elements.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20240912024114.1097832-10-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/tcg/translate-a64.c | 40 +++++++++++-----------------------
12
1 file changed, 13 insertions(+), 27 deletions(-)
13
14
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/tcg/translate-a64.c
17
+++ b/target/arm/tcg/translate-a64.c
18
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
19
* important for correct NaN propagation that we do these
20
* operations in exactly the order specified by the pseudocode.
21
*
22
- * This is a recursive function, TCG temps should be freed by the
23
- * calling function once it is done with the values.
24
+ * This is a recursive function.
25
*/
26
static TCGv_i32 do_reduction_op(DisasContext *s, int fpopcode, int rn,
27
- int esize, int size, int vmap, TCGv_ptr fpst)
28
+ MemOp esz, int ebase, int ecount, TCGv_ptr fpst)
29
{
30
- if (esize == size) {
31
- int element;
32
- MemOp msize = esize == 16 ? MO_16 : MO_32;
33
- TCGv_i32 tcg_elem;
34
-
35
- /* We should have one register left here */
36
- assert(ctpop8(vmap) == 1);
37
- element = ctz32(vmap);
38
- assert(element < 8);
39
-
40
- tcg_elem = tcg_temp_new_i32();
41
- read_vec_element_i32(s, tcg_elem, rn, element, msize);
42
+ if (ecount == 1) {
43
+ TCGv_i32 tcg_elem = tcg_temp_new_i32();
44
+ read_vec_element_i32(s, tcg_elem, rn, ebase, esz);
45
return tcg_elem;
46
} else {
47
- int bits = size / 2;
48
- int shift = ctpop8(vmap) / 2;
49
- int vmap_lo = (vmap >> shift) & vmap;
50
- int vmap_hi = (vmap & ~vmap_lo);
51
+ int half = ecount >> 1;
52
TCGv_i32 tcg_hi, tcg_lo, tcg_res;
53
54
- tcg_hi = do_reduction_op(s, fpopcode, rn, esize, bits, vmap_hi, fpst);
55
- tcg_lo = do_reduction_op(s, fpopcode, rn, esize, bits, vmap_lo, fpst);
56
+ tcg_hi = do_reduction_op(s, fpopcode, rn, esz,
57
+ ebase + half, half, fpst);
58
+ tcg_lo = do_reduction_op(s, fpopcode, rn, esz,
59
+ ebase, half, fpst);
60
tcg_res = tcg_temp_new_i32();
61
62
switch (fpopcode) {
63
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
64
bool is_u = extract32(insn, 29, 1);
65
bool is_fp = false;
66
bool is_min = false;
67
- int esize;
68
int elements;
69
int i;
70
TCGv_i64 tcg_res, tcg_elt;
71
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
72
return;
73
}
74
75
- esize = 8 << size;
76
- elements = (is_q ? 128 : 64) / esize;
77
+ elements = (is_q ? 16 : 8) >> size;
78
79
tcg_res = tcg_temp_new_i64();
80
tcg_elt = tcg_temp_new_i64();
81
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
82
*/
83
TCGv_ptr fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
84
int fpopcode = opcode | is_min << 4 | is_u << 5;
85
- int vmap = (1 << elements) - 1;
86
- TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, esize,
87
- (is_q ? 128 : 64), vmap, fpst);
88
+ TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, size,
89
+ 0, elements, fpst);
90
tcg_gen_extu_i32_i64(tcg_res, tcg_res32);
91
}
92
93
--
94
2.34.1
diff view generated by jsdifflib
1
The Neoverse-V1 TRM is a bit confused about the layout of the
1
From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
2
ID_AA64ISAR1_EL1 register, and so its table 3-6 has the wrong value
3
for this ID register. Trust instead section 3.2.74's list of which
4
fields are set.
5
2
6
This means that we stop incorrectly reporting FEAT_XS as present, and
3
Add the FEAT_XS feature report value to the max cpu's ID_AA64ISAR1 sys register.
7
now report the presence of FEAT_BF16.
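
For reference, the old and new constants differ only in two fields of
ID_AA64ISAR1_EL1: XS (bits [59:56]) goes from 1 to 0 and BF16 (bits [47:44])
goes from 0 to 1. A standalone sanity check of that claim, illustrative only
and not part of the patch:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t old_isar1 = 0x0111000001211032ull;
        uint64_t new_isar1 = 0x0011100001211032ull;

        /* ID_AA64ISAR1_EL1.XS is bits [59:56], BF16 is bits [47:44] */
        printf("XS:   %u -> %u\n", (unsigned)((old_isar1 >> 56) & 0xf),
               (unsigned)((new_isar1 >> 56) & 0xf));
        printf("BF16: %u -> %u\n", (unsigned)((old_isar1 >> 44) & 0xf),
               (unsigned)((new_isar1 >> 44) & 0xf));
        return 0;
    }
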
8
4
9
Cc: qemu-stable@nongnu.org
5
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
10
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Message-id: 20240917161337.3012188-1-peter.maydell@linaro.org
8
Message-id: 20241211144440.2700268-6-peter.maydell@linaro.org
9
[PMM: Add entry for FEAT_XS to documentation]
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
---
11
---
15
target/arm/tcg/cpu64.c | 2 +-
12
docs/system/arm/emulation.rst | 1 +
16
1 file changed, 1 insertion(+), 1 deletion(-)
13
target/arm/tcg/cpu64.c | 1 +
14
2 files changed, 2 insertions(+)
17
15
16
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
17
index XXXXXXX..XXXXXXX 100644
18
--- a/docs/system/arm/emulation.rst
19
+++ b/docs/system/arm/emulation.rst
20
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
21
- FEAT_VMID16 (16-bit VMID)
22
- FEAT_WFxT (WFE and WFI instructions with timeout)
23
- FEAT_XNX (Translation table stage 2 Unprivileged Execute-never)
24
+- FEAT_XS (XS attribute)
25
26
For information on the specifics of these extensions, please refer
27
to the `Arm Architecture Reference Manual for A-profile architecture
18
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
28
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
19
index XXXXXXX..XXXXXXX 100644
29
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/tcg/cpu64.c
30
--- a/target/arm/tcg/cpu64.c
21
+++ b/target/arm/tcg/cpu64.c
31
+++ b/target/arm/tcg/cpu64.c
22
@@ -XXX,XX +XXX,XX @@ static void aarch64_neoverse_v1_initfn(Object *obj)
32
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
23
cpu->isar.id_aa64dfr0 = 0x000001f210305519ull;
33
t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 2); /* FEAT_BF16, FEAT_EBF16 */
24
cpu->isar.id_aa64dfr1 = 0x00000000;
34
t = FIELD_DP64(t, ID_AA64ISAR1, DGH, 1); /* FEAT_DGH */
25
cpu->isar.id_aa64isar0 = 0x1011111110212120ull; /* with FEAT_RNG */
35
t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1); /* FEAT_I8MM */
26
- cpu->isar.id_aa64isar1 = 0x0111000001211032ull;
36
+ t = FIELD_DP64(t, ID_AA64ISAR1, XS, 1); /* FEAT_XS */
27
+ cpu->isar.id_aa64isar1 = 0x0011100001211032ull;
37
cpu->isar.id_aa64isar1 = t;
28
cpu->isar.id_aa64mmfr0 = 0x0000000000101125ull;
38
29
cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
39
t = cpu->isar.id_aa64isar2;
30
cpu->isar.id_aa64mmfr2 = 0x0220011102101011ull;
31
--
40
--
32
2.34.1
41
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Add system test to make sure FEAT_XS is enabled for max cpu emulation
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
and that QEMU doesn't crash when encountering an NXS instruction
5
Message-id: 20240912024114.1097832-11-richard.henderson@linaro.org
5
variant.
6
7
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20241211144440.2700268-7-peter.maydell@linaro.org
10
[PMM: In ISAR field test, mask with 0xf, not 0xff; use < rather
11
than an equality test to follow the standard ID register field
12
check guidelines]
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
14
---
8
target/arm/tcg/a64.decode | 12 +++
15
tests/tcg/aarch64/system/feat-xs.c | 27 +++++++++++++++++++++++++++
9
target/arm/tcg/translate-a64.c | 140 ++++++++++++---------------------
16
1 file changed, 27 insertions(+)
10
2 files changed, 61 insertions(+), 91 deletions(-)
17
create mode 100644 tests/tcg/aarch64/system/feat-xs.c
11
18
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
19
diff --git a/tests/tcg/aarch64/system/feat-xs.c b/tests/tcg/aarch64/system/feat-xs.c
13
index XXXXXXX..XXXXXXX 100644
20
new file mode 100644
14
--- a/target/arm/tcg/a64.decode
21
index XXXXXXX..XXXXXXX
15
+++ b/target/arm/tcg/a64.decode
22
--- /dev/null
23
+++ b/tests/tcg/aarch64/system/feat-xs.c
16
@@ -XXX,XX +XXX,XX @@
24
@@ -XXX,XX +XXX,XX @@
17
@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
18
@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
19
20
+@qrr_e . q:1 ...... esz:2 ...... ...... rn:5 rd:5 &qrr_e
21
+
22
@qrrr_b . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=0
23
@qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1
24
@qrrr_s . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=2
25
@@ -XXX,XX +XXX,XX @@ TRN1 0.00 1110 .. 0 ..... 0 010 10 ..... ..... @qrrr_e
26
TRN2 0.00 1110 .. 0 ..... 0 110 10 ..... ..... @qrrr_e
27
ZIP1 0.00 1110 .. 0 ..... 0 011 10 ..... ..... @qrrr_e
28
ZIP2 0.00 1110 .. 0 ..... 0 111 10 ..... ..... @qrrr_e
29
+
30
+# Advanced SIMD Across Lanes
31
+
32
+ADDV 0.00 1110 .. 11000 11011 10 ..... ..... @qrr_e
33
+SADDLV 0.00 1110 .. 11000 00011 10 ..... ..... @qrr_e
34
+UADDLV 0.10 1110 .. 11000 00011 10 ..... ..... @qrr_e
35
+SMAXV 0.00 1110 .. 11000 01010 10 ..... ..... @qrr_e
36
+UMAXV 0.10 1110 .. 11000 01010 10 ..... ..... @qrr_e
37
+SMINV 0.00 1110 .. 11000 11010 10 ..... ..... @qrr_e
38
+UMINV 0.10 1110 .. 11000 11010 10 ..... ..... @qrr_e
39
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/tcg/translate-a64.c
42
+++ b/target/arm/tcg/translate-a64.c
43
@@ -XXX,XX +XXX,XX @@ TRANS(FNMADD, do_fmadd, a, true, true)
44
TRANS(FMSUB, do_fmadd, a, false, true)
45
TRANS(FNMSUB, do_fmadd, a, true, false)
46
47
+/*
25
+/*
48
+ * Advanced SIMD Across Lanes
26
+ * FEAT_XS Test
27
+ *
28
+ * Copyright (c) 2024 Linaro Ltd
29
+ *
30
+ * SPDX-License-Identifier: GPL-2.0-or-later
49
+ */
31
+ */
50
+
32
+
51
+static bool do_int_reduction(DisasContext *s, arg_qrr_e *a, bool widen,
33
+#include <minilib.h>
52
+ MemOp src_sign, NeonGenTwo64OpFn *fn)
34
+#include <stdint.h>
35
+
36
+int main(void)
53
+{
37
+{
54
+ TCGv_i64 tcg_res, tcg_elt;
38
+ uint64_t isar1;
55
+ MemOp src_mop = a->esz | src_sign;
56
+ int elements = (a->q ? 16 : 8) >> a->esz;
57
+
39
+
58
+ /* Reject MO_64, and MO_32 without Q: a minimum of 4 elements. */
40
+ asm volatile ("mrs %0, id_aa64isar1_el1" : "=r"(isar1));
59
+ if (elements < 4) {
41
+ if (((isar1 >> 56) & 0xf) < 1) {
60
+ return false;
42
+ ml_printf("FEAT_XS not supported by CPU");
43
+ return 1;
61
+ }
44
+ }
62
+ if (!fp_access_check(s)) {
45
+ /* VMALLE1NXS */
63
+ return true;
46
+ asm volatile (".inst 0xd508971f");
64
+ }
47
+ /* VMALLE1OSNXS */
48
+ asm volatile (".inst 0xd508911f");
65
+
49
+
66
+ tcg_res = tcg_temp_new_i64();
50
+ return 0;
67
+ tcg_elt = tcg_temp_new_i64();
68
+
69
+ read_vec_element(s, tcg_res, a->rn, 0, src_mop);
70
+ for (int i = 1; i < elements; i++) {
71
+ read_vec_element(s, tcg_elt, a->rn, i, src_mop);
72
+ fn(tcg_res, tcg_res, tcg_elt);
73
+ }
74
+
75
+ tcg_gen_ext_i64(tcg_res, tcg_res, a->esz + widen);
76
+ write_fp_dreg(s, a->rd, tcg_res);
77
+ return true;
78
+}
51
+}
79
+
80
+TRANS(ADDV, do_int_reduction, a, false, 0, tcg_gen_add_i64)
81
+TRANS(SADDLV, do_int_reduction, a, true, MO_SIGN, tcg_gen_add_i64)
82
+TRANS(UADDLV, do_int_reduction, a, true, 0, tcg_gen_add_i64)
83
+TRANS(SMAXV, do_int_reduction, a, false, MO_SIGN, tcg_gen_smax_i64)
84
+TRANS(UMAXV, do_int_reduction, a, false, 0, tcg_gen_umax_i64)
85
+TRANS(SMINV, do_int_reduction, a, false, MO_SIGN, tcg_gen_smin_i64)
86
+TRANS(UMINV, do_int_reduction, a, false, 0, tcg_gen_umin_i64)
87
+
88
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
89
* Note that it is the caller's responsibility to ensure that the
90
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
91
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
92
int opcode = extract32(insn, 12, 5);
93
bool is_q = extract32(insn, 30, 1);
94
bool is_u = extract32(insn, 29, 1);
95
- bool is_fp = false;
96
bool is_min = false;
97
int elements;
98
- int i;
99
- TCGv_i64 tcg_res, tcg_elt;
100
101
switch (opcode) {
102
- case 0x1b: /* ADDV */
103
- if (is_u) {
104
- unallocated_encoding(s);
105
- return;
106
- }
107
- /* fall through */
108
- case 0x3: /* SADDLV, UADDLV */
109
- case 0xa: /* SMAXV, UMAXV */
110
- case 0x1a: /* SMINV, UMINV */
111
- if (size == 3 || (size == 2 && !is_q)) {
112
- unallocated_encoding(s);
113
- return;
114
- }
115
- break;
116
case 0xc: /* FMAXNMV, FMINNMV */
117
case 0xf: /* FMAXV, FMINV */
118
/* Bit 1 of size field encodes min vs max and the actual size
119
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
120
* precision.
121
*/
122
is_min = extract32(size, 1, 1);
123
- is_fp = true;
124
if (!is_u && dc_isar_feature(aa64_fp16, s)) {
125
size = 1;
126
} else if (!is_u || !is_q || extract32(size, 0, 1)) {
127
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
128
}
129
break;
130
default:
131
+ case 0x3: /* SADDLV, UADDLV */
132
+ case 0xa: /* SMAXV, UMAXV */
133
+ case 0x1a: /* SMINV, UMINV */
134
+ case 0x1b: /* ADDV */
135
unallocated_encoding(s);
136
return;
137
}
138
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
139
140
elements = (is_q ? 16 : 8) >> size;
141
142
- tcg_res = tcg_temp_new_i64();
143
- tcg_elt = tcg_temp_new_i64();
144
-
145
- /* These instructions operate across all lanes of a vector
146
- * to produce a single result. We can guarantee that a 64
147
- * bit intermediate is sufficient:
148
- * + for [US]ADDLV the maximum element size is 32 bits, and
149
- * the result type is 64 bits
150
- * + for FMAX*V, FMIN*V, ADDV the intermediate type is the
151
- * same as the element size, which is 32 bits at most
152
- * For the integer operations we can choose to work at 64
153
- * or 32 bits and truncate at the end; for simplicity
154
- * we use 64 bits always. The floating point
155
- * ops do require 32 bit intermediates, though.
156
- */
157
- if (!is_fp) {
158
- read_vec_element(s, tcg_res, rn, 0, size | (is_u ? 0 : MO_SIGN));
159
-
160
- for (i = 1; i < elements; i++) {
161
- read_vec_element(s, tcg_elt, rn, i, size | (is_u ? 0 : MO_SIGN));
162
-
163
- switch (opcode) {
164
- case 0x03: /* SADDLV / UADDLV */
165
- case 0x1b: /* ADDV */
166
- tcg_gen_add_i64(tcg_res, tcg_res, tcg_elt);
167
- break;
168
- case 0x0a: /* SMAXV / UMAXV */
169
- if (is_u) {
170
- tcg_gen_umax_i64(tcg_res, tcg_res, tcg_elt);
171
- } else {
172
- tcg_gen_smax_i64(tcg_res, tcg_res, tcg_elt);
173
- }
174
- break;
175
- case 0x1a: /* SMINV / UMINV */
176
- if (is_u) {
177
- tcg_gen_umin_i64(tcg_res, tcg_res, tcg_elt);
178
- } else {
179
- tcg_gen_smin_i64(tcg_res, tcg_res, tcg_elt);
180
- }
181
- break;
182
- default:
183
- g_assert_not_reached();
184
- }
185
-
186
- }
187
- } else {
188
+ {
189
/* Floating point vector reduction ops which work across 32
190
* bit (single) or 16 bit (half-precision) intermediates.
191
* Note that correct NaN propagation requires that we do these
192
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
193
*/
194
TCGv_ptr fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
195
int fpopcode = opcode | is_min << 4 | is_u << 5;
196
- TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, size,
197
- 0, elements, fpst);
198
- tcg_gen_extu_i32_i64(tcg_res, tcg_res32);
199
+ TCGv_i32 tcg_res = do_reduction_op(s, fpopcode, rn, size,
200
+ 0, elements, fpst);
201
+ write_fp_sreg(s, rd, tcg_res);
202
}
203
-
204
- /* Now truncate the result to the width required for the final output */
205
- if (opcode == 0x03) {
206
- /* SADDLV, UADDLV: result is 2*esize */
207
- size++;
208
- }
209
-
210
- switch (size) {
211
- case 0:
212
- tcg_gen_ext8u_i64(tcg_res, tcg_res);
213
- break;
214
- case 1:
215
- tcg_gen_ext16u_i64(tcg_res, tcg_res);
216
- break;
217
- case 2:
218
- tcg_gen_ext32u_i64(tcg_res, tcg_res);
219
- break;
220
- case 3:
221
- break;
222
- default:
223
- g_assert_not_reached();
224
- }
225
-
226
- write_fp_dreg(s, rd, tcg_res);
227
}
228
229
/* AdvSIMD modified immediate
230
--
52
--
231
2.34.1
53
2.34.1
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-12-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 14 +++
9
target/arm/tcg/translate-a64.c | 176 ++++++++++-----------------------
10
2 files changed, 67 insertions(+), 123 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@
17
@rrx_d ........ .. . rm:5 .... idx:1 . rn:5 rd:5 &rrx_e esz=3
18
19
@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
20
+@rr_q1e2 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=2
21
@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
22
@rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
23
@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
24
@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
25
26
+@qrr_h . q:1 ...... .. ...... ...... rn:5 rd:5 &qrr_e esz=1
27
@qrr_e . q:1 ...... esz:2 ...... ...... rn:5 rd:5 &qrr_e
28
29
@qrrr_b . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=0
30
@@ -XXX,XX +XXX,XX @@ SMAXV 0.00 1110 .. 11000 01010 10 ..... ..... @qrr_e
31
UMAXV 0.10 1110 .. 11000 01010 10 ..... ..... @qrr_e
32
SMINV 0.00 1110 .. 11000 11010 10 ..... ..... @qrr_e
33
UMINV 0.10 1110 .. 11000 11010 10 ..... ..... @qrr_e
34
+
35
+FMAXNMV_h 0.00 1110 00 11000 01100 10 ..... ..... @qrr_h
36
+FMAXNMV_s 0110 1110 00 11000 01100 10 ..... ..... @rr_q1e2
37
+
38
+FMINNMV_h 0.00 1110 10 11000 01100 10 ..... ..... @qrr_h
39
+FMINNMV_s 0110 1110 10 11000 01100 10 ..... ..... @rr_q1e2
40
+
41
+FMAXV_h 0.00 1110 00 11000 01111 10 ..... ..... @qrr_h
42
+FMAXV_s 0110 1110 00 11000 01111 10 ..... ..... @rr_q1e2
43
+
44
+FMINV_h 0.00 1110 10 11000 01111 10 ..... ..... @qrr_h
45
+FMINV_s 0110 1110 10 11000 01111 10 ..... ..... @rr_q1e2
46
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/tcg/translate-a64.c
49
+++ b/target/arm/tcg/translate-a64.c
50
@@ -XXX,XX +XXX,XX @@ TRANS(UMAXV, do_int_reduction, a, false, 0, tcg_gen_umax_i64)
51
TRANS(SMINV, do_int_reduction, a, false, MO_SIGN, tcg_gen_smin_i64)
52
TRANS(UMINV, do_int_reduction, a, false, 0, tcg_gen_umin_i64)
53
54
+/*
55
+ * do_fp_reduction helper
56
+ *
57
+ * This mirrors the Reduce() pseudocode in the ARM ARM. It is
58
+ * important for correct NaN propagation that we do these
59
+ * operations in exactly the order specified by the pseudocode.
60
+ *
61
+ * This is a recursive function.
62
+ */
63
+static TCGv_i32 do_reduction_op(DisasContext *s, int rn, MemOp esz,
64
+ int ebase, int ecount, TCGv_ptr fpst,
65
+ NeonGenTwoSingleOpFn *fn)
66
+{
67
+ if (ecount == 1) {
68
+ TCGv_i32 tcg_elem = tcg_temp_new_i32();
69
+ read_vec_element_i32(s, tcg_elem, rn, ebase, esz);
70
+ return tcg_elem;
71
+ } else {
72
+ int half = ecount >> 1;
73
+ TCGv_i32 tcg_hi, tcg_lo, tcg_res;
74
+
75
+ tcg_hi = do_reduction_op(s, rn, esz, ebase + half, half, fpst, fn);
76
+ tcg_lo = do_reduction_op(s, rn, esz, ebase, half, fpst, fn);
77
+ tcg_res = tcg_temp_new_i32();
78
+
79
+ fn(tcg_res, tcg_lo, tcg_hi, fpst);
80
+ return tcg_res;
81
+ }
82
+}
83
+
84
+static bool do_fp_reduction(DisasContext *s, arg_qrr_e *a,
85
+ NeonGenTwoSingleOpFn *fn)
86
+{
87
+ if (fp_access_check(s)) {
88
+ MemOp esz = a->esz;
89
+ int elts = (a->q ? 16 : 8) >> esz;
90
+ TCGv_ptr fpst = fpstatus_ptr(esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
91
+ TCGv_i32 res = do_reduction_op(s, a->rn, esz, 0, elts, fpst, fn);
92
+ write_fp_sreg(s, a->rd, res);
93
+ }
94
+ return true;
95
+}
96
+
97
+TRANS_FEAT(FMAXNMV_h, aa64_fp16, do_fp_reduction, a, gen_helper_advsimd_maxnumh)
98
+TRANS_FEAT(FMINNMV_h, aa64_fp16, do_fp_reduction, a, gen_helper_advsimd_minnumh)
99
+TRANS_FEAT(FMAXV_h, aa64_fp16, do_fp_reduction, a, gen_helper_advsimd_maxh)
100
+TRANS_FEAT(FMINV_h, aa64_fp16, do_fp_reduction, a, gen_helper_advsimd_minh)
101
+
102
+TRANS(FMAXNMV_s, do_fp_reduction, a, gen_helper_vfp_maxnums)
103
+TRANS(FMINNMV_s, do_fp_reduction, a, gen_helper_vfp_minnums)
104
+TRANS(FMAXV_s, do_fp_reduction, a, gen_helper_vfp_maxs)
105
+TRANS(FMINV_s, do_fp_reduction, a, gen_helper_vfp_mins)
106
+
107
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
108
* Note that it is the caller's responsibility to ensure that the
109
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
110
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
111
}
112
}
113
114
-/*
115
- * do_reduction_op helper
116
- *
117
- * This mirrors the Reduce() pseudocode in the ARM ARM. It is
118
- * important for correct NaN propagation that we do these
119
- * operations in exactly the order specified by the pseudocode.
120
- *
121
- * This is a recursive function.
122
- */
123
-static TCGv_i32 do_reduction_op(DisasContext *s, int fpopcode, int rn,
124
- MemOp esz, int ebase, int ecount, TCGv_ptr fpst)
125
-{
126
- if (ecount == 1) {
127
- TCGv_i32 tcg_elem = tcg_temp_new_i32();
128
- read_vec_element_i32(s, tcg_elem, rn, ebase, esz);
129
- return tcg_elem;
130
- } else {
131
- int half = ecount >> 1;
132
- TCGv_i32 tcg_hi, tcg_lo, tcg_res;
133
-
134
- tcg_hi = do_reduction_op(s, fpopcode, rn, esz,
135
- ebase + half, half, fpst);
136
- tcg_lo = do_reduction_op(s, fpopcode, rn, esz,
137
- ebase, half, fpst);
138
- tcg_res = tcg_temp_new_i32();
139
-
140
- switch (fpopcode) {
141
- case 0x0c: /* fmaxnmv half-precision */
142
- gen_helper_advsimd_maxnumh(tcg_res, tcg_lo, tcg_hi, fpst);
143
- break;
144
- case 0x0f: /* fmaxv half-precision */
145
- gen_helper_advsimd_maxh(tcg_res, tcg_lo, tcg_hi, fpst);
146
- break;
147
- case 0x1c: /* fminnmv half-precision */
148
- gen_helper_advsimd_minnumh(tcg_res, tcg_lo, tcg_hi, fpst);
149
- break;
150
- case 0x1f: /* fminv half-precision */
151
- gen_helper_advsimd_minh(tcg_res, tcg_lo, tcg_hi, fpst);
152
- break;
153
- case 0x2c: /* fmaxnmv */
154
- gen_helper_vfp_maxnums(tcg_res, tcg_lo, tcg_hi, fpst);
155
- break;
156
- case 0x2f: /* fmaxv */
157
- gen_helper_vfp_maxs(tcg_res, tcg_lo, tcg_hi, fpst);
158
- break;
159
- case 0x3c: /* fminnmv */
160
- gen_helper_vfp_minnums(tcg_res, tcg_lo, tcg_hi, fpst);
161
- break;
162
- case 0x3f: /* fminv */
163
- gen_helper_vfp_mins(tcg_res, tcg_lo, tcg_hi, fpst);
164
- break;
165
- default:
166
- g_assert_not_reached();
167
- }
168
- return tcg_res;
169
- }
170
-}
171
-
172
-/* AdvSIMD across lanes
173
- * 31 30 29 28 24 23 22 21 17 16 12 11 10 9 5 4 0
174
- * +---+---+---+-----------+------+-----------+--------+-----+------+------+
175
- * | 0 | Q | U | 0 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 | Rn | Rd |
176
- * +---+---+---+-----------+------+-----------+--------+-----+------+------+
177
- */
178
-static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
179
-{
180
- int rd = extract32(insn, 0, 5);
181
- int rn = extract32(insn, 5, 5);
182
- int size = extract32(insn, 22, 2);
183
- int opcode = extract32(insn, 12, 5);
184
- bool is_q = extract32(insn, 30, 1);
185
- bool is_u = extract32(insn, 29, 1);
186
- bool is_min = false;
187
- int elements;
188
-
189
- switch (opcode) {
190
- case 0xc: /* FMAXNMV, FMINNMV */
191
- case 0xf: /* FMAXV, FMINV */
192
- /* Bit 1 of size field encodes min vs max and the actual size
193
- * depends on the encoding of the U bit. If not set (and FP16
194
- * enabled) then we do half-precision float instead of single
195
- * precision.
196
- */
197
- is_min = extract32(size, 1, 1);
198
- if (!is_u && dc_isar_feature(aa64_fp16, s)) {
199
- size = 1;
200
- } else if (!is_u || !is_q || extract32(size, 0, 1)) {
201
- unallocated_encoding(s);
202
- return;
203
- } else {
204
- size = 2;
205
- }
206
- break;
207
- default:
208
- case 0x3: /* SADDLV, UADDLV */
209
- case 0xa: /* SMAXV, UMAXV */
210
- case 0x1a: /* SMINV, UMINV */
211
- case 0x1b: /* ADDV */
212
- unallocated_encoding(s);
213
- return;
214
- }
215
-
216
- if (!fp_access_check(s)) {
217
- return;
218
- }
219
-
220
- elements = (is_q ? 16 : 8) >> size;
221
-
222
- {
223
- /* Floating point vector reduction ops which work across 32
224
- * bit (single) or 16 bit (half-precision) intermediates.
225
- * Note that correct NaN propagation requires that we do these
226
- * operations in exactly the order specified by the pseudocode.
227
- */
228
- TCGv_ptr fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
229
- int fpopcode = opcode | is_min << 4 | is_u << 5;
230
- TCGv_i32 tcg_res = do_reduction_op(s, fpopcode, rn, size,
231
- 0, elements, fpst);
232
- write_fp_sreg(s, rd, tcg_res);
233
- }
234
-}
235
-
236
/* AdvSIMD modified immediate
237
* 31 30 29 28 19 18 16 15 12 11 10 9 5 4 0
238
* +---+---+----+---------------------+-----+-------+----+---+-------+------+
239
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
240
static const AArch64DecodeTable data_proc_simd[] = {
241
/* pattern , mask , fn */
242
{ 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
243
- { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
244
/* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
245
{ 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
246
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
247
--
248
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-13-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 4 ++
9
target/arm/tcg/translate-a64.c | 74 ++++++++++++----------------------
10
2 files changed, 30 insertions(+), 48 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ FMAXV_s 0110 1110 00 11000 01111 10 ..... ..... @rr_q1e2
17
18
FMINV_h 0.00 1110 10 11000 01111 10 ..... ..... @qrr_h
19
FMINV_s 0110 1110 10 11000 01111 10 ..... ..... @rr_q1e2
20
+
21
+# Floating-point Immediate
22
+
23
+FMOVI_s 0001 1110 .. 1 imm:8 100 00000 rd:5 esz=%esz_hsd
24
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
25
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/tcg/translate-a64.c
27
+++ b/target/arm/tcg/translate-a64.c
28
@@ -XXX,XX +XXX,XX @@ TRANS(FMINNMV_s, do_fp_reduction, a, gen_helper_vfp_minnums)
29
TRANS(FMAXV_s, do_fp_reduction, a, gen_helper_vfp_maxs)
30
TRANS(FMINV_s, do_fp_reduction, a, gen_helper_vfp_mins)
31
32
+/*
33
+ * Floating-point Immediate
34
+ */
35
+
36
+static bool trans_FMOVI_s(DisasContext *s, arg_FMOVI_s *a)
37
+{
38
+ switch (a->esz) {
39
+ case MO_32:
40
+ case MO_64:
41
+ break;
42
+ case MO_16:
43
+ if (!dc_isar_feature(aa64_fp16, s)) {
44
+ return false;
45
+ }
46
+ break;
47
+ default:
48
+ return false;
49
+ }
50
+ if (fp_access_check(s)) {
51
+ uint64_t imm = vfp_expand_imm(a->esz, a->imm);
52
+ write_fp_dreg(s, a->rd, tcg_constant_i64(imm));
53
+ }
54
+ return true;
55
+}
56
+
57
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
58
* Note that it is the caller's responsibility to ensure that the
59
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
60
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
61
}
62
}
63
64
-/* Floating point immediate
65
- * 31 30 29 28 24 23 22 21 20 13 12 10 9 5 4 0
66
- * +---+---+---+-----------+------+---+------------+-------+------+------+
67
- * | M | 0 | S | 1 1 1 1 0 | type | 1 | imm8 | 1 0 0 | imm5 | Rd |
68
- * +---+---+---+-----------+------+---+------------+-------+------+------+
69
- */
70
-static void disas_fp_imm(DisasContext *s, uint32_t insn)
71
-{
72
- int rd = extract32(insn, 0, 5);
73
- int imm5 = extract32(insn, 5, 5);
74
- int imm8 = extract32(insn, 13, 8);
75
- int type = extract32(insn, 22, 2);
76
- int mos = extract32(insn, 29, 3);
77
- uint64_t imm;
78
- MemOp sz;
79
-
80
- if (mos || imm5) {
81
- unallocated_encoding(s);
82
- return;
83
- }
84
-
85
- switch (type) {
86
- case 0:
87
- sz = MO_32;
88
- break;
89
- case 1:
90
- sz = MO_64;
91
- break;
92
- case 3:
93
- sz = MO_16;
94
- if (dc_isar_feature(aa64_fp16, s)) {
95
- break;
96
- }
97
- /* fallthru */
98
- default:
99
- unallocated_encoding(s);
100
- return;
101
- }
102
-
103
- if (!fp_access_check(s)) {
104
- return;
105
- }
106
-
107
- imm = vfp_expand_imm(sz, imm8);
108
- write_fp_dreg(s, rd, tcg_constant_i64(imm));
109
-}
110
-
111
/* Handle floating point <=> fixed point conversions. Note that we can
112
* also deal with fp <=> integer conversions as a special case (scale == 64)
113
* OPTME: consider handling that special case specially or at least skipping
114
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
115
switch (ctz32(extract32(insn, 12, 4))) {
116
case 0: /* [15:12] == xxx1 */
117
/* Floating point immediate */
118
- disas_fp_imm(s, insn);
119
+ unallocated_encoding(s); /* in decodetree */
120
break;
121
case 1: /* [15:12] == xx10 */
122
/* Floating point compare */
123
--
124
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-14-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 9 +++
9
target/arm/tcg/translate-a64.c | 117 ++++++++++++++-------------------
10
2 files changed, 59 insertions(+), 67 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ FMINV_s 0110 1110 10 11000 01111 10 ..... ..... @rr_q1e2
17
# Floating-point Immediate
18
19
FMOVI_s 0001 1110 .. 1 imm:8 100 00000 rd:5 esz=%esz_hsd
20
+
21
+# Advanced SIMD Modified Immediate
22
+
23
+%abcdefgh 16:3 5:5
24
+
25
+FMOVI_v_h 0 q:1 00 1111 00000 ... 1111 11 ..... rd:5 %abcdefgh
26
+
27
+# MOVI, MVNI, ORR, BIC, FMOV are all intermixed via cmode.
28
+Vimm 0 q:1 op:1 0 1111 00000 ... cmode:4 01 ..... rd:5 %abcdefgh
29
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/tcg/translate-a64.c
32
+++ b/target/arm/tcg/translate-a64.c
33
@@ -XXX,XX +XXX,XX @@ static bool trans_FMOVI_s(DisasContext *s, arg_FMOVI_s *a)
34
return true;
35
}
36
37
+/*
38
+ * Advanced SIMD Modified Immediate
39
+ */
40
+
41
+static bool trans_FMOVI_v_h(DisasContext *s, arg_FMOVI_v_h *a)
42
+{
43
+ if (!dc_isar_feature(aa64_fp16, s)) {
44
+ return false;
45
+ }
46
+ if (fp_access_check(s)) {
47
+ tcg_gen_gvec_dup_imm(MO_16, vec_full_reg_offset(s, a->rd),
48
+ a->q ? 16 : 8, vec_full_reg_size(s),
49
+ vfp_expand_imm(MO_16, a->abcdefgh));
50
+ }
51
+ return true;
52
+}
53
+
54
+static void gen_movi(unsigned vece, uint32_t dofs, uint32_t aofs,
55
+ int64_t c, uint32_t oprsz, uint32_t maxsz)
56
+{
57
+ tcg_gen_gvec_dup_imm(MO_64, dofs, oprsz, maxsz, c);
58
+}
59
+
60
+static bool trans_Vimm(DisasContext *s, arg_Vimm *a)
61
+{
62
+ GVecGen2iFn *fn;
63
+
64
+ /* Handle decode of cmode/op here between ORR/BIC/MOVI */
65
+ if ((a->cmode & 1) && a->cmode < 12) {
66
+ /* For op=1, the imm will be inverted, so BIC becomes AND. */
67
+ fn = a->op ? tcg_gen_gvec_andi : tcg_gen_gvec_ori;
68
+ } else {
69
+ /* There is one unallocated cmode/op combination in this space */
70
+ if (a->cmode == 15 && a->op == 1 && a->q == 0) {
71
+ return false;
72
+ }
73
+ fn = gen_movi;
74
+ }
75
+
76
+ if (fp_access_check(s)) {
77
+ uint64_t imm = asimd_imm_const(a->abcdefgh, a->cmode, a->op);
78
+ gen_gvec_fn2i(s, a->q, a->rd, a->rd, imm, fn, MO_64);
79
+ }
80
+ return true;
81
+}
82
+
83
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
84
* Note that it is the caller's responsibility to ensure that the
85
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
86
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
87
}
88
}
89
90
-/* AdvSIMD modified immediate
91
- * 31 30 29 28 19 18 16 15 12 11 10 9 5 4 0
92
- * +---+---+----+---------------------+-----+-------+----+---+-------+------+
93
- * | 0 | Q | op | 0 1 1 1 1 0 0 0 0 0 | abc | cmode | o2 | 1 | defgh | Rd |
94
- * +---+---+----+---------------------+-----+-------+----+---+-------+------+
95
- *
96
- * There are a number of operations that can be carried out here:
97
- * MOVI - move (shifted) imm into register
98
- * MVNI - move inverted (shifted) imm into register
99
- * ORR - bitwise OR of (shifted) imm with register
100
- * BIC - bitwise clear of (shifted) imm with register
101
- * With ARMv8.2 we also have:
102
- * FMOV half-precision
103
- */
104
-static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
105
-{
106
- int rd = extract32(insn, 0, 5);
107
- int cmode = extract32(insn, 12, 4);
108
- int o2 = extract32(insn, 11, 1);
109
- uint64_t abcdefgh = extract32(insn, 5, 5) | (extract32(insn, 16, 3) << 5);
110
- bool is_neg = extract32(insn, 29, 1);
111
- bool is_q = extract32(insn, 30, 1);
112
- uint64_t imm = 0;
113
-
114
- if (o2) {
115
- if (cmode != 0xf || is_neg) {
116
- unallocated_encoding(s);
117
- return;
118
- }
119
- /* FMOV (vector, immediate) - half-precision */
120
- if (!dc_isar_feature(aa64_fp16, s)) {
121
- unallocated_encoding(s);
122
- return;
123
- }
124
- imm = vfp_expand_imm(MO_16, abcdefgh);
125
- /* now duplicate across the lanes */
126
- imm = dup_const(MO_16, imm);
127
- } else {
128
- if (cmode == 0xf && is_neg && !is_q) {
129
- unallocated_encoding(s);
130
- return;
131
- }
132
- imm = asimd_imm_const(abcdefgh, cmode, is_neg);
133
- }
134
-
135
- if (!fp_access_check(s)) {
136
- return;
137
- }
138
-
139
- if (!((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9)) {
140
- /* MOVI or MVNI, with MVNI negation handled above. */
141
- tcg_gen_gvec_dup_imm(MO_64, vec_full_reg_offset(s, rd), is_q ? 16 : 8,
142
- vec_full_reg_size(s), imm);
143
- } else {
144
- /* ORR or BIC, with BIC negation to AND handled above. */
145
- if (is_neg) {
146
- gen_gvec_fn2i(s, is_q, rd, rd, imm, tcg_gen_gvec_andi, MO_64);
147
- } else {
148
- gen_gvec_fn2i(s, is_q, rd, rd, imm, tcg_gen_gvec_ori, MO_64);
149
- }
150
- }
151
-}
152
-
153
/*
154
* Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate)
155
*
156
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
157
bool is_u = extract32(insn, 29, 1);
158
bool is_q = extract32(insn, 30, 1);
159
160
- /* data_proc_simd[] has sent immh == 0 to disas_simd_mod_imm. */
161
- assert(immh != 0);
162
+ if (immh == 0) {
163
+ unallocated_encoding(s);
164
+ return;
165
+ }
166
167
switch (opcode) {
168
case 0x08: /* SRI */
169
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
170
static const AArch64DecodeTable data_proc_simd[] = {
171
/* pattern , mask , fn */
172
{ 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
173
- /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
174
- { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
175
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
176
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
177
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
178
--
179
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Handle the two special cases within these new
4
functions instead of higher in the call stack.
5
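A plain-C model of the out-of-range behaviour the new gen_gvec_sshr/gen_gvec_ushr
helpers centralise (illustrative sketch only, byte elements, not part of the patch;
assumes arithmetic >> on signed values, as QEMU does):

    #include <stdint.h>

    /* SSHR by the full element width acts like a shift by esize-1:
     * every result bit is a copy of the sign bit. */
    static int8_t model_sshr8(int8_t x, unsigned shift)
    {
        return x >> (shift > 7 ? 7 : shift);
    }

    /* USHR by the full element width is architecturally valid and
     * produces zero. */
    static uint8_t model_ushr8(uint8_t x, unsigned shift)
    {
        return shift >= 8 ? 0 : (uint8_t)(x >> shift);
    }

The vector helpers in the gengvec.c hunk below apply the same per-lane rule via
tcg_gen_gvec_sari and tcg_gen_gvec_shri/tcg_gen_gvec_dup_imm.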
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20240912024114.1097832-15-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/tcg/translate.h | 5 +++++
12
target/arm/tcg/gengvec.c | 19 +++++++++++++++++++
13
target/arm/tcg/translate-a64.c | 16 +---------------
14
target/arm/tcg/translate-neon.c | 25 ++-----------------------
15
4 files changed, 27 insertions(+), 38 deletions(-)
16
17
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/tcg/translate.h
20
+++ b/target/arm/tcg/translate.h
21
@@ -XXX,XX +XXX,XX @@ void gen_sqsub_d(TCGv_i64 d, TCGv_i64 q, TCGv_i64 a, TCGv_i64 b);
22
void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
23
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
24
25
+void gen_gvec_sshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
26
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
27
+void gen_gvec_ushr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
28
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
29
+
30
void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
31
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
32
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
33
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/tcg/gengvec.c
36
+++ b/target/arm/tcg/gengvec.c
37
@@ -XXX,XX +XXX,XX @@ GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT)
38
39
#undef GEN_CMP0
40
41
+void gen_gvec_sshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
42
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
43
+{
44
+ /* Signed shift out of range results in all-sign-bits */
45
+ shift = MIN(shift, (8 << vece) - 1);
46
+ tcg_gen_gvec_sari(vece, rd_ofs, rm_ofs, shift, opr_sz, max_sz);
47
+}
48
+
49
+void gen_gvec_ushr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
50
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
51
+{
52
+ /* Unsigned shift out of range results in all-zero-bits */
53
+ if (shift >= (8 << vece)) {
54
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
55
+ } else {
56
+ tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift, opr_sz, max_sz);
57
+ }
58
+}
59
+
60
static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
61
{
62
tcg_gen_vec_sar8i_i64(a, a, shift);
63
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/tcg/translate-a64.c
66
+++ b/target/arm/tcg/translate-a64.c
67
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
68
break;
69
70
case 0x00: /* SSHR / USHR */
71
- if (is_u) {
72
- if (shift == 8 << size) {
73
- /* Shift count the same size as element size produces zero. */
74
- tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
75
- is_q ? 16 : 8, vec_full_reg_size(s), 0);
76
- return;
77
- }
78
- gvec_fn = tcg_gen_gvec_shri;
79
- } else {
80
- /* Shift count the same size as element size produces all sign. */
81
- if (shift == 8 << size) {
82
- shift -= 1;
83
- }
84
- gvec_fn = tcg_gen_gvec_sari;
85
- }
86
+ gvec_fn = is_u ? gen_gvec_ushr : gen_gvec_sshr;
87
break;
88
89
case 0x04: /* SRSHR / URSHR (rounding) */
90
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/tcg/translate-neon.c
93
+++ b/target/arm/tcg/translate-neon.c
94
@@ -XXX,XX +XXX,XX @@ DO_2SH(VRSHR_S, gen_gvec_srshr)
95
DO_2SH(VRSHR_U, gen_gvec_urshr)
96
DO_2SH(VRSRA_S, gen_gvec_srsra)
97
DO_2SH(VRSRA_U, gen_gvec_ursra)
98
-
99
-static bool trans_VSHR_S_2sh(DisasContext *s, arg_2reg_shift *a)
100
-{
101
- /* Signed shift out of range results in all-sign-bits */
102
- a->shift = MIN(a->shift, (8 << a->size) - 1);
103
- return do_vector_2sh(s, a, tcg_gen_gvec_sari);
104
-}
105
-
106
-static void gen_zero_rd_2sh(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
107
- int64_t shift, uint32_t oprsz, uint32_t maxsz)
108
-{
109
- tcg_gen_gvec_dup_imm(vece, rd_ofs, oprsz, maxsz, 0);
110
-}
111
-
112
-static bool trans_VSHR_U_2sh(DisasContext *s, arg_2reg_shift *a)
113
-{
114
- /* Shift out of range is architecturally valid and results in zero. */
115
- if (a->shift >= (8 << a->size)) {
116
- return do_vector_2sh(s, a, gen_zero_rd_2sh);
117
- } else {
118
- return do_vector_2sh(s, a, tcg_gen_gvec_shri);
119
- }
120
-}
121
+DO_2SH(VSHR_S, gen_gvec_sshr)
122
+DO_2SH(VSHR_U, gen_gvec_ushr)
123
124
static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
125
NeonGenTwo64OpEnvFn *fn)
126
--
127
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
This includes SSHR, USHR, SSRA, USRA, SRSHR, URSHR, SRSRA, URSRA, SRI.
4
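For reference, the @q_shri_* formats below store the immediate as N - shift and
recover the real shift with rsub_* conversion functions. A minimal sketch of what
those conversions compute, assuming the usual decodetree !function signature
(the actual helpers live alongside the generated decoder in translate-a64.c):

    static int rsub_64(DisasContext *s, int x) { return 64 - x; }
    static int rsub_32(DisasContext *s, int x) { return 32 - x; }
    static int rsub_16(DisasContext *s, int x) { return 16 - x; }
    static int rsub_8(DisasContext *s, int x)  { return 8 - x; }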
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20240912024114.1097832-17-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/tcg/a64.decode | 63 ++++++++++++++++++++++++-
11
target/arm/tcg/translate-a64.c | 86 +++++++++++-----------------------
12
2 files changed, 89 insertions(+), 60 deletions(-)
13
14
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/tcg/a64.decode
17
+++ b/target/arm/tcg/a64.decode
18
@@ -XXX,XX +XXX,XX @@
19
&rrx_e rd rn rm idx esz
20
&rrrr_e rd rn rm ra esz
21
&qrr_e q rd rn esz
22
+&qrri_e q rd rn imm esz
23
&qrrr_e q rd rn rm esz
24
&qrrx_e q rd rn rm idx esz
25
&qrrrr_e q rd rn rm ra esz
26
@@ -XXX,XX +XXX,XX @@ FMINV_s 0110 1110 10 11000 01111 10 ..... ..... @rr_q1e2
27
28
FMOVI_s 0001 1110 .. 1 imm:8 100 00000 rd:5 esz=%esz_hsd
29
30
-# Advanced SIMD Modified Immediate
31
+# Advanced SIMD Modified Immediate / Shift by Immediate
32
33
%abcdefgh 16:3 5:5
34
35
+# Right shifts are encoded as N - shift, where N is the element size in bits.
36
+%neon_rshift_i6 16:6 !function=rsub_64
37
+%neon_rshift_i5 16:5 !function=rsub_32
38
+%neon_rshift_i4 16:4 !function=rsub_16
39
+%neon_rshift_i3 16:3 !function=rsub_8
40
+
41
+@q_shri_b . q:1 .. ..... 0001 ... ..... . rn:5 rd:5 \
42
+ &qrri_e esz=0 imm=%neon_rshift_i3
43
+@q_shri_h . q:1 .. ..... 001 .... ..... . rn:5 rd:5 \
44
+ &qrri_e esz=1 imm=%neon_rshift_i4
45
+@q_shri_s . q:1 .. ..... 01 ..... ..... . rn:5 rd:5 \
46
+ &qrri_e esz=2 imm=%neon_rshift_i5
47
+@q_shri_d . 1 .. ..... 1 ...... ..... . rn:5 rd:5 \
48
+ &qrri_e esz=3 imm=%neon_rshift_i6 q=1
49
+
50
FMOVI_v_h 0 q:1 00 1111 00000 ... 1111 11 ..... rd:5 %abcdefgh
51
52
# MOVI, MVNI, ORR, BIC, FMOV are all intermixed via cmode.
53
Vimm 0 q:1 op:1 0 1111 00000 ... cmode:4 01 ..... rd:5 %abcdefgh
54
+
55
+SSHR_v 0.00 11110 .... ... 00000 1 ..... ..... @q_shri_b
56
+SSHR_v 0.00 11110 .... ... 00000 1 ..... ..... @q_shri_h
57
+SSHR_v 0.00 11110 .... ... 00000 1 ..... ..... @q_shri_s
58
+SSHR_v 0.00 11110 .... ... 00000 1 ..... ..... @q_shri_d
59
+
60
+USHR_v 0.10 11110 .... ... 00000 1 ..... ..... @q_shri_b
61
+USHR_v 0.10 11110 .... ... 00000 1 ..... ..... @q_shri_h
62
+USHR_v 0.10 11110 .... ... 00000 1 ..... ..... @q_shri_s
63
+USHR_v 0.10 11110 .... ... 00000 1 ..... ..... @q_shri_d
64
+
65
+SSRA_v 0.00 11110 .... ... 00010 1 ..... ..... @q_shri_b
66
+SSRA_v 0.00 11110 .... ... 00010 1 ..... ..... @q_shri_h
67
+SSRA_v 0.00 11110 .... ... 00010 1 ..... ..... @q_shri_s
68
+SSRA_v 0.00 11110 .... ... 00010 1 ..... ..... @q_shri_d
69
+
70
+USRA_v 0.10 11110 .... ... 00010 1 ..... ..... @q_shri_b
71
+USRA_v 0.10 11110 .... ... 00010 1 ..... ..... @q_shri_h
72
+USRA_v 0.10 11110 .... ... 00010 1 ..... ..... @q_shri_s
73
+USRA_v 0.10 11110 .... ... 00010 1 ..... ..... @q_shri_d
74
+
75
+SRSHR_v 0.00 11110 .... ... 00100 1 ..... ..... @q_shri_b
76
+SRSHR_v 0.00 11110 .... ... 00100 1 ..... ..... @q_shri_h
77
+SRSHR_v 0.00 11110 .... ... 00100 1 ..... ..... @q_shri_s
78
+SRSHR_v 0.00 11110 .... ... 00100 1 ..... ..... @q_shri_d
79
+
80
+URSHR_v 0.10 11110 .... ... 00100 1 ..... ..... @q_shri_b
81
+URSHR_v 0.10 11110 .... ... 00100 1 ..... ..... @q_shri_h
82
+URSHR_v 0.10 11110 .... ... 00100 1 ..... ..... @q_shri_s
83
+URSHR_v 0.10 11110 .... ... 00100 1 ..... ..... @q_shri_d
84
+
85
+SRSRA_v 0.00 11110 .... ... 00110 1 ..... ..... @q_shri_b
86
+SRSRA_v 0.00 11110 .... ... 00110 1 ..... ..... @q_shri_h
87
+SRSRA_v 0.00 11110 .... ... 00110 1 ..... ..... @q_shri_s
88
+SRSRA_v 0.00 11110 .... ... 00110 1 ..... ..... @q_shri_d
89
+
90
+URSRA_v 0.10 11110 .... ... 00110 1 ..... ..... @q_shri_b
91
+URSRA_v 0.10 11110 .... ... 00110 1 ..... ..... @q_shri_h
92
+URSRA_v 0.10 11110 .... ... 00110 1 ..... ..... @q_shri_s
93
+URSRA_v 0.10 11110 .... ... 00110 1 ..... ..... @q_shri_d
94
+
95
+SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_b
96
+SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_h
97
+SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_s
98
+SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_d
99
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
100
index XXXXXXX..XXXXXXX 100644
101
--- a/target/arm/tcg/translate-a64.c
102
+++ b/target/arm/tcg/translate-a64.c
103
@@ -XXX,XX +XXX,XX @@ static bool trans_Vimm(DisasContext *s, arg_Vimm *a)
104
return true;
105
}
106
107
+/*
108
+ * Advanced SIMD Shift by Immediate
109
+ */
110
+
111
+static bool do_vec_shift_imm(DisasContext *s, arg_qrri_e *a, GVecGen2iFn *fn)
112
+{
113
+ if (fp_access_check(s)) {
114
+ gen_gvec_fn2i(s, a->q, a->rd, a->rn, a->imm, fn, a->esz);
115
+ }
116
+ return true;
117
+}
118
+
119
+TRANS(SSHR_v, do_vec_shift_imm, a, gen_gvec_sshr)
120
+TRANS(USHR_v, do_vec_shift_imm, a, gen_gvec_ushr)
121
+TRANS(SSRA_v, do_vec_shift_imm, a, gen_gvec_ssra)
122
+TRANS(USRA_v, do_vec_shift_imm, a, gen_gvec_usra)
123
+TRANS(SRSHR_v, do_vec_shift_imm, a, gen_gvec_srshr)
124
+TRANS(URSHR_v, do_vec_shift_imm, a, gen_gvec_urshr)
125
+TRANS(SRSRA_v, do_vec_shift_imm, a, gen_gvec_srsra)
126
+TRANS(URSRA_v, do_vec_shift_imm, a, gen_gvec_ursra)
127
+TRANS(SRI_v, do_vec_shift_imm, a, gen_gvec_sri)
128
+
129
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
130
* Note that it is the caller's responsibility to ensure that the
131
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
132
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
133
}
134
}
135
136
-/* SSHR[RA]/USHR[RA] - Vector shift right (optional rounding/accumulate) */
137
-static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
138
- int immh, int immb, int opcode, int rn, int rd)
139
-{
140
- int size = 32 - clz32(immh) - 1;
141
- int immhb = immh << 3 | immb;
142
- int shift = 2 * (8 << size) - immhb;
143
- GVecGen2iFn *gvec_fn;
144
-
145
- if (extract32(immh, 3, 1) && !is_q) {
146
- unallocated_encoding(s);
147
- return;
148
- }
149
- tcg_debug_assert(size <= 3);
150
-
151
- if (!fp_access_check(s)) {
152
- return;
153
- }
154
-
155
- switch (opcode) {
156
- case 0x02: /* SSRA / USRA (accumulate) */
157
- gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
158
- break;
159
-
160
- case 0x08: /* SRI */
161
- gvec_fn = gen_gvec_sri;
162
- break;
163
-
164
- case 0x00: /* SSHR / USHR */
165
- gvec_fn = is_u ? gen_gvec_ushr : gen_gvec_sshr;
166
- break;
167
-
168
- case 0x04: /* SRSHR / URSHR (rounding) */
169
- gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
170
- break;
171
-
172
- case 0x06: /* SRSRA / URSRA (accum + rounding) */
173
- gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
174
- break;
175
-
176
- default:
177
- g_assert_not_reached();
178
- }
179
-
180
- gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
181
-}
182
-
183
/* SHL/SLI - Vector shift left */
184
static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
185
int immh, int immb, int opcode, int rn, int rd)
186
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
187
}
188
189
switch (opcode) {
190
- case 0x08: /* SRI */
191
- if (!is_u) {
192
- unallocated_encoding(s);
193
- return;
194
- }
195
- /* fall through */
196
- case 0x00: /* SSHR / USHR */
197
- case 0x02: /* SSRA / USRA (accumulate) */
198
- case 0x04: /* SRSHR / URSHR (rounding) */
199
- case 0x06: /* SRSRA / URSRA (accum + rounding) */
200
- handle_vec_simd_shri(s, is_q, is_u, immh, immb, opcode, rn, rd);
201
- break;
202
case 0x0a: /* SHL / SLI */
203
handle_vec_simd_shli(s, is_q, is_u, immh, immb, opcode, rn, rd);
204
break;
205
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
206
handle_simd_shift_fpint_conv(s, false, is_q, is_u, immh, immb, rn, rd);
207
return;
208
default:
209
+ case 0x00: /* SSHR / USHR */
210
+ case 0x02: /* SSRA / USRA (accumulate) */
211
+ case 0x04: /* SRSHR / URSHR (rounding) */
212
+ case 0x06: /* SRSRA / URSRA (accum + rounding) */
213
+ case 0x08: /* SRI */
214
unallocated_encoding(s);
215
return;
216
}
217
--
218
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
This includes SHL and SLI.
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20240912024114.1097832-18-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/tcg/a64.decode | 15 +++++++++++++++
11
target/arm/tcg/translate-a64.c | 33 +++------------------------------
12
2 files changed, 18 insertions(+), 30 deletions(-)
13
14
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/tcg/a64.decode
17
+++ b/target/arm/tcg/a64.decode
18
@@ -XXX,XX +XXX,XX @@ FMOVI_s 0001 1110 .. 1 imm:8 100 00000 rd:5 esz=%esz_hsd
19
@q_shri_d . 1 .. ..... 1 ...... ..... . rn:5 rd:5 \
20
&qrri_e esz=3 imm=%neon_rshift_i6 q=1
21
22
+@q_shli_b . q:1 .. ..... 0001 imm:3 ..... . rn:5 rd:5 &qrri_e esz=0
23
+@q_shli_h . q:1 .. ..... 001 imm:4 ..... . rn:5 rd:5 &qrri_e esz=1
24
+@q_shli_s . q:1 .. ..... 01 imm:5 ..... . rn:5 rd:5 &qrri_e esz=2
25
+@q_shli_d . 1 .. ..... 1 imm:6 ..... . rn:5 rd:5 &qrri_e esz=3 q=1
26
+
27
FMOVI_v_h 0 q:1 00 1111 00000 ... 1111 11 ..... rd:5 %abcdefgh
28
29
# MOVI, MVNI, ORR, BIC, FMOV are all intermixed via cmode.
30
@@ -XXX,XX +XXX,XX @@ SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_b
31
SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_h
32
SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_s
33
SRI_v 0.10 11110 .... ... 01000 1 ..... ..... @q_shri_d
34
+
35
+SHL_v 0.00 11110 .... ... 01010 1 ..... ..... @q_shli_b
36
+SHL_v 0.00 11110 .... ... 01010 1 ..... ..... @q_shli_h
37
+SHL_v 0.00 11110 .... ... 01010 1 ..... ..... @q_shli_s
38
+SHL_v 0.00 11110 .... ... 01010 1 ..... ..... @q_shli_d
39
+
40
+SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_b
41
+SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_h
42
+SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_s
43
+SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_d
44
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
45
index XXXXXXX..XXXXXXX 100644
46
--- a/target/arm/tcg/translate-a64.c
47
+++ b/target/arm/tcg/translate-a64.c
48
@@ -XXX,XX +XXX,XX @@ TRANS(URSHR_v, do_vec_shift_imm, a, gen_gvec_urshr)
49
TRANS(SRSRA_v, do_vec_shift_imm, a, gen_gvec_srsra)
50
TRANS(URSRA_v, do_vec_shift_imm, a, gen_gvec_ursra)
51
TRANS(SRI_v, do_vec_shift_imm, a, gen_gvec_sri)
52
+TRANS(SHL_v, do_vec_shift_imm, a, tcg_gen_gvec_shli)
53
+TRANS(SLI_v, do_vec_shift_imm, a, gen_gvec_sli);
54
55
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
56
* Note that it is the caller's responsibility to ensure that the
57
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
58
}
59
}
60
61
-/* SHL/SLI - Vector shift left */
62
-static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
63
- int immh, int immb, int opcode, int rn, int rd)
64
-{
65
- int size = 32 - clz32(immh) - 1;
66
- int immhb = immh << 3 | immb;
67
- int shift = immhb - (8 << size);
68
-
69
- /* Range of size is limited by decode: immh is a non-zero 4 bit field */
70
- assert(size >= 0 && size <= 3);
71
-
72
- if (extract32(immh, 3, 1) && !is_q) {
73
- unallocated_encoding(s);
74
- return;
75
- }
76
-
77
- if (!fp_access_check(s)) {
78
- return;
79
- }
80
-
81
- if (insert) {
82
- gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
83
- } else {
84
- gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
85
- }
86
-}
87
-
88
/* USHLL/SHLL - Vector shift left with widening */
89
static void handle_vec_simd_wshli(DisasContext *s, bool is_q, bool is_u,
90
int immh, int immb, int opcode, int rn, int rd)
91
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
92
}
93
94
switch (opcode) {
95
- case 0x0a: /* SHL / SLI */
96
- handle_vec_simd_shli(s, is_q, is_u, immh, immb, opcode, rn, rd);
97
- break;
98
case 0x10: /* SHRN */
99
case 0x11: /* RSHRN / SQRSHRUN */
100
if (is_u) {
101
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
102
case 0x04: /* SRSHR / URSHR (rounding) */
103
case 0x06: /* SRSRA / URSRA (accum + rounding) */
104
case 0x08: /* SRI */
105
+ case 0x0a: /* SHL / SLI */
106
unallocated_encoding(s);
107
return;
108
}
109
--
110
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Combine the right shift with the extension via
4
the tcg extract operations.
5
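For reference, a C model of the two generic TCG extract operations that replace
the shift-then-extend pair (illustrative only, not from the patch):

    #include <stdint.h>

    /* tcg_gen_extract_i64(d, s, ofs, len): zero-extended bitfield. */
    static uint64_t extract64_model(uint64_t s, unsigned ofs, unsigned len)
    {
        return (s >> ofs) & (~0ull >> (64 - len));
    }

    /* tcg_gen_sextract_i64(d, s, ofs, len): sign-extended bitfield. */
    static int64_t sextract64_model(uint64_t s, unsigned ofs, unsigned len)
    {
        return (int64_t)(s << (64 - ofs - len)) >> (64 - len);
    }

A single extract therefore performs both the element selection and the
zero/sign extension that previously needed tcg_gen_shri_i64 plus
ext_and_shift_reg.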
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20240912024114.1097832-19-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/tcg/translate-a64.c | 7 +++++--
12
1 file changed, 5 insertions(+), 2 deletions(-)
13
14
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/tcg/translate-a64.c
17
+++ b/target/arm/tcg/translate-a64.c
18
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_wshli(DisasContext *s, bool is_q, bool is_u,
19
read_vec_element(s, tcg_rn, rn, is_q ? 1 : 0, MO_64);
20
21
for (i = 0; i < elements; i++) {
22
- tcg_gen_shri_i64(tcg_rd, tcg_rn, i * esize);
23
- ext_and_shift_reg(tcg_rd, tcg_rd, size | (!is_u << 2), 0);
24
+ if (is_u) {
25
+ tcg_gen_extract_i64(tcg_rd, tcg_rn, i * esize, esize);
26
+ } else {
27
+ tcg_gen_sextract_i64(tcg_rd, tcg_rn, i * esize, esize);
28
+ }
29
tcg_gen_shli_i64(tcg_rd, tcg_rd, shift);
30
write_vec_element(s, tcg_rd, rd, i, size + 1);
31
}
32
--
33
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-20-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 8 ++++
9
target/arm/tcg/translate-a64.c | 81 ++++++++++++++++------------------
10
2 files changed, 45 insertions(+), 44 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_b
17
SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_h
18
SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_s
19
SLI_v 0.10 11110 .... ... 01010 1 ..... ..... @q_shli_d
20
+
21
+SSHLL_v 0.00 11110 .... ... 10100 1 ..... ..... @q_shli_b
22
+SSHLL_v 0.00 11110 .... ... 10100 1 ..... ..... @q_shli_h
23
+SSHLL_v 0.00 11110 .... ... 10100 1 ..... ..... @q_shli_s
24
+
25
+USHLL_v 0.10 11110 .... ... 10100 1 ..... ..... @q_shli_b
26
+USHLL_v 0.10 11110 .... ... 10100 1 ..... ..... @q_shli_h
27
+USHLL_v 0.10 11110 .... ... 10100 1 ..... ..... @q_shli_s
28
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/tcg/translate-a64.c
31
+++ b/target/arm/tcg/translate-a64.c
32
@@ -XXX,XX +XXX,XX @@ TRANS(SRI_v, do_vec_shift_imm, a, gen_gvec_sri)
33
TRANS(SHL_v, do_vec_shift_imm, a, tcg_gen_gvec_shli)
34
TRANS(SLI_v, do_vec_shift_imm, a, gen_gvec_sli);
35
36
+static bool do_vec_shift_imm_wide(DisasContext *s, arg_qrri_e *a, bool is_u)
37
+{
38
+ TCGv_i64 tcg_rn, tcg_rd;
39
+ int esz = a->esz;
40
+ int esize;
41
+
42
+ if (!fp_access_check(s)) {
43
+ return true;
44
+ }
45
+
46
+ /*
47
+ * For the LL variants the store is larger than the load,
48
+ * so if rd == rn we would overwrite parts of our input.
49
+ * So load everything right now and use shifts in the main loop.
50
+ */
51
+ tcg_rd = tcg_temp_new_i64();
52
+ tcg_rn = tcg_temp_new_i64();
53
+ read_vec_element(s, tcg_rn, a->rn, a->q, MO_64);
54
+
55
+ esize = 8 << esz;
56
+ for (int i = 0, elements = 8 >> esz; i < elements; i++) {
57
+ if (is_u) {
58
+ tcg_gen_extract_i64(tcg_rd, tcg_rn, i * esize, esize);
59
+ } else {
60
+ tcg_gen_sextract_i64(tcg_rd, tcg_rn, i * esize, esize);
61
+ }
62
+ tcg_gen_shli_i64(tcg_rd, tcg_rd, a->imm);
63
+ write_vec_element(s, tcg_rd, a->rd, i, esz + 1);
64
+ }
65
+ clear_vec_high(s, true, a->rd);
66
+ return true;
67
+}
68
+
69
+TRANS(SSHLL_v, do_vec_shift_imm_wide, a, false)
70
+TRANS(USHLL_v, do_vec_shift_imm_wide, a, true)
71
+
72
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
73
* Note that it is the caller's responsibility to ensure that the
74
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
75
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
76
}
77
}
78
79
-/* USHLL/SHLL - Vector shift left with widening */
80
-static void handle_vec_simd_wshli(DisasContext *s, bool is_q, bool is_u,
81
- int immh, int immb, int opcode, int rn, int rd)
82
-{
83
- int size = 32 - clz32(immh) - 1;
84
- int immhb = immh << 3 | immb;
85
- int shift = immhb - (8 << size);
86
- int dsize = 64;
87
- int esize = 8 << size;
88
- int elements = dsize/esize;
89
- TCGv_i64 tcg_rn = tcg_temp_new_i64();
90
- TCGv_i64 tcg_rd = tcg_temp_new_i64();
91
- int i;
92
-
93
- if (size >= 3) {
94
- unallocated_encoding(s);
95
- return;
96
- }
97
-
98
- if (!fp_access_check(s)) {
99
- return;
100
- }
101
-
102
- /* For the LL variants the store is larger than the load,
103
- * so if rd == rn we would overwrite parts of our input.
104
- * So load everything right now and use shifts in the main loop.
105
- */
106
- read_vec_element(s, tcg_rn, rn, is_q ? 1 : 0, MO_64);
107
-
108
- for (i = 0; i < elements; i++) {
109
- if (is_u) {
110
- tcg_gen_extract_i64(tcg_rd, tcg_rn, i * esize, esize);
111
- } else {
112
- tcg_gen_sextract_i64(tcg_rd, tcg_rn, i * esize, esize);
113
- }
114
- tcg_gen_shli_i64(tcg_rd, tcg_rd, shift);
115
- write_vec_element(s, tcg_rd, rd, i, size + 1);
116
- }
117
- clear_vec_high(s, true, rd);
118
-}
119
-
120
/* SHRN/RSHRN - Shift right with narrowing (and potential rounding) */
121
static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
122
int immh, int immb, int opcode, int rn, int rd)
123
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
124
handle_vec_simd_sqshrn(s, false, is_q, is_u, is_u, immh, immb,
125
opcode, rn, rd);
126
break;
127
- case 0x14: /* SSHLL / USHLL */
128
- handle_vec_simd_wshli(s, is_q, is_u, immh, immb, opcode, rn, rd);
129
- break;
130
case 0x1c: /* SCVTF / UCVTF */
131
handle_simd_shift_intfp_conv(s, false, is_q, is_u, immh, immb,
132
opcode, rn, rd);
133
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
134
case 0x06: /* SRSRA / URSRA (accum + rounding) */
135
case 0x08: /* SRI */
136
case 0x0a: /* SHL / SLI */
137
+ case 0x14: /* SSHLL / USHLL */
138
unallocated_encoding(s);
139
return;
140
}
141
--
142
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
There isn't a lot of commonality along the different paths of
4
handle_shri_with_rndacc. Split them out to separate functions,
5
which will be usable during the decodetree conversion.
6
7
Simplify 64-bit rounding operations to not require double-word arithmetic.
8
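The 64-bit simplification works by pulling the rounding bit out before the shift
instead of adding 1 << (shift - 1) up front, which could carry out of 64 bits and
previously needed tcg_gen_add2_i64. A plain-C sketch of the identity, for
1 <= shift <= 63 (illustrative only):

    #include <stdint.h>

    /* (src + (1ull << (shift - 1))) >> shift, computed without overflow:
     * the rounding increment is exactly bit (shift - 1) of src. */
    static uint64_t urshr64_model(uint64_t src, unsigned shift)
    {
        uint64_t rnd = (src >> (shift - 1)) & 1;
        return (src >> shift) + rnd;    /* cannot carry out of 64 bits */
    }

gen_srshr_d/gen_urshr_d below use tcg_gen_extract_i64 to grab that single bit and
add it after the shift; the shift == 0 and shift == 64 cases are handled specially.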
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20240912024114.1097832-22-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
target/arm/tcg/translate-a64.c | 138 ++++++++++++++++++++-------------
15
1 file changed, 82 insertions(+), 56 deletions(-)
16
17
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/tcg/translate-a64.c
20
+++ b/target/arm/tcg/translate-a64.c
21
@@ -XXX,XX +XXX,XX @@ static bool do_vec_shift_imm_wide(DisasContext *s, arg_qrri_e *a, bool is_u)
22
TRANS(SSHLL_v, do_vec_shift_imm_wide, a, false)
23
TRANS(USHLL_v, do_vec_shift_imm_wide, a, true)
24
25
+static void gen_sshr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
26
+{
27
+ assert(shift >= 0 && shift <= 64);
28
+ tcg_gen_sari_i64(dst, src, MIN(shift, 63));
29
+}
30
+
31
+static void gen_ushr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
32
+{
33
+ assert(shift >= 0 && shift <= 64);
34
+ if (shift == 64) {
35
+ tcg_gen_movi_i64(dst, 0);
36
+ } else {
37
+ tcg_gen_shri_i64(dst, src, shift);
38
+ }
39
+}
40
+
41
+static void gen_srshr_bhs(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
42
+{
43
+ assert(shift >= 0 && shift <= 32);
44
+ if (shift) {
45
+ TCGv_i64 rnd = tcg_constant_i64(1ull << (shift - 1));
46
+ tcg_gen_add_i64(dst, src, rnd);
47
+ tcg_gen_sari_i64(dst, dst, shift);
48
+ } else {
49
+ tcg_gen_mov_i64(dst, src);
50
+ }
51
+}
52
+
53
+static void gen_urshr_bhs(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
54
+{
55
+ assert(shift >= 0 && shift <= 32);
56
+ if (shift) {
57
+ TCGv_i64 rnd = tcg_constant_i64(1ull << (shift - 1));
58
+ tcg_gen_add_i64(dst, src, rnd);
59
+ tcg_gen_shri_i64(dst, dst, shift);
60
+ } else {
61
+ tcg_gen_mov_i64(dst, src);
62
+ }
63
+}
64
+
65
+static void gen_srshr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
66
+{
67
+ assert(shift >= 0 && shift <= 64);
68
+ if (shift == 0) {
69
+ tcg_gen_mov_i64(dst, src);
70
+ } else if (shift == 64) {
71
+ /* Extension of sign bit (0,-1) plus sign bit (0,1) is zero. */
72
+ tcg_gen_movi_i64(dst, 0);
73
+ } else {
74
+ TCGv_i64 rnd = tcg_temp_new_i64();
75
+ tcg_gen_extract_i64(rnd, src, shift - 1, 1);
76
+ tcg_gen_sari_i64(dst, src, shift);
77
+ tcg_gen_add_i64(dst, dst, rnd);
78
+ }
79
+}
80
+
81
+static void gen_urshr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
82
+{
83
+ assert(shift >= 0 && shift <= 64);
84
+ if (shift == 0) {
85
+ tcg_gen_mov_i64(dst, src);
86
+ } else if (shift == 64) {
87
+ /* Rounding will propagate bit 63 into bit 64. */
88
+ tcg_gen_shri_i64(dst, src, 63);
89
+ } else {
90
+ TCGv_i64 rnd = tcg_temp_new_i64();
91
+ tcg_gen_extract_i64(rnd, src, shift - 1, 1);
92
+ tcg_gen_shri_i64(dst, src, shift);
93
+ tcg_gen_add_i64(dst, dst, rnd);
94
+ }
95
+}
96
+
97
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
98
* Note that it is the caller's responsibility to ensure that the
99
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
100
@@ -XXX,XX +XXX,XX @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
101
bool round, bool accumulate,
102
bool is_u, int size, int shift)
103
{
104
- bool extended_result = false;
105
- int ext_lshift = 0;
106
- TCGv_i64 tcg_src_hi;
107
-
108
- if (round && size == 3) {
109
- extended_result = true;
110
- ext_lshift = 64 - shift;
111
- tcg_src_hi = tcg_temp_new_i64();
112
- } else if (shift == 64) {
113
- if (!accumulate && is_u) {
114
- /* result is zero */
115
- tcg_gen_movi_i64(tcg_res, 0);
116
- return;
117
- }
118
- }
119
-
120
- /* Deal with the rounding step */
121
- if (round) {
122
- TCGv_i64 tcg_rnd = tcg_constant_i64(1ull << (shift - 1));
123
- if (extended_result) {
124
- TCGv_i64 tcg_zero = tcg_constant_i64(0);
125
- if (!is_u) {
126
- /* take care of sign extending tcg_res */
127
- tcg_gen_sari_i64(tcg_src_hi, tcg_src, 63);
128
- tcg_gen_add2_i64(tcg_src, tcg_src_hi,
129
- tcg_src, tcg_src_hi,
130
- tcg_rnd, tcg_zero);
131
- } else {
132
- tcg_gen_add2_i64(tcg_src, tcg_src_hi,
133
- tcg_src, tcg_zero,
134
- tcg_rnd, tcg_zero);
135
- }
136
+ if (!round) {
137
+ if (is_u) {
138
+ gen_ushr_d(tcg_src, tcg_src, shift);
139
} else {
140
- tcg_gen_add_i64(tcg_src, tcg_src, tcg_rnd);
141
+ gen_sshr_d(tcg_src, tcg_src, shift);
142
}
143
- }
144
-
145
- /* Now do the shift right */
146
- if (round && extended_result) {
147
- /* extended case, >64 bit precision required */
148
- if (ext_lshift == 0) {
149
- /* special case, only high bits matter */
150
- tcg_gen_mov_i64(tcg_src, tcg_src_hi);
151
+ } else if (size == MO_64) {
152
+ if (is_u) {
153
+ gen_urshr_d(tcg_src, tcg_src, shift);
154
} else {
155
- tcg_gen_shri_i64(tcg_src, tcg_src, shift);
156
- tcg_gen_shli_i64(tcg_src_hi, tcg_src_hi, ext_lshift);
157
- tcg_gen_or_i64(tcg_src, tcg_src, tcg_src_hi);
158
+ gen_srshr_d(tcg_src, tcg_src, shift);
159
}
160
} else {
161
if (is_u) {
162
- if (shift == 64) {
163
- /* essentially shifting in 64 zeros */
164
- tcg_gen_movi_i64(tcg_src, 0);
165
- } else {
166
- tcg_gen_shri_i64(tcg_src, tcg_src, shift);
167
- }
168
+ gen_urshr_bhs(tcg_src, tcg_src, shift);
169
} else {
170
- if (shift == 64) {
171
- /* effectively extending the sign-bit */
172
- tcg_gen_sari_i64(tcg_src, tcg_src, 63);
173
- } else {
174
- tcg_gen_sari_i64(tcg_src, tcg_src, shift);
175
- }
176
+ gen_srshr_bhs(tcg_src, tcg_src, shift);
177
}
178
}
179
180
--
181
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-23-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 8 +++
9
target/arm/tcg/translate-a64.c | 95 +++++++++++++++++-----------------
10
2 files changed, 55 insertions(+), 48 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ SSHLL_v 0.00 11110 .... ... 10100 1 ..... ..... @q_shli_s
17
USHLL_v 0.10 11110 .... ... 10100 1 ..... ..... @q_shli_b
18
USHLL_v 0.10 11110 .... ... 10100 1 ..... ..... @q_shli_h
19
USHLL_v 0.10 11110 .... ... 10100 1 ..... ..... @q_shli_s
20
+
21
+SHRN_v 0.00 11110 .... ... 10000 1 ..... ..... @q_shri_b
22
+SHRN_v 0.00 11110 .... ... 10000 1 ..... ..... @q_shri_h
23
+SHRN_v 0.00 11110 .... ... 10000 1 ..... ..... @q_shri_s
24
+
25
+RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_b
26
+RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_h
27
+RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_s
28
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/tcg/translate-a64.c
31
+++ b/target/arm/tcg/translate-a64.c
32
@@ -XXX,XX +XXX,XX @@ static void gen_urshr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
33
}
34
}
35
36
+static bool do_vec_shift_imm_narrow(DisasContext *s, arg_qrri_e *a,
37
+ WideShiftImmFn * const fns[3], MemOp sign)
38
+{
39
+ TCGv_i64 tcg_rn, tcg_rd;
40
+ int esz = a->esz;
41
+ int esize;
42
+ WideShiftImmFn *fn;
43
+
44
+ tcg_debug_assert(esz >= MO_8 && esz <= MO_32);
45
+
46
+ if (!fp_access_check(s)) {
47
+ return true;
48
+ }
49
+
50
+ tcg_rn = tcg_temp_new_i64();
51
+ tcg_rd = tcg_temp_new_i64();
52
+ tcg_gen_movi_i64(tcg_rd, 0);
53
+
54
+ fn = fns[esz];
55
+ esize = 8 << esz;
56
+ for (int i = 0, elements = 8 >> esz; i < elements; i++) {
57
+ read_vec_element(s, tcg_rn, a->rn, i, (esz + 1) | sign);
58
+ fn(tcg_rn, tcg_rn, a->imm);
59
+ tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_rn, esize * i, esize);
60
+ }
61
+
62
+ write_vec_element(s, tcg_rd, a->rd, a->q, MO_64);
63
+ clear_vec_high(s, a->q, a->rd);
64
+ return true;
65
+}
66
+
67
+static WideShiftImmFn * const shrn_fns[] = {
68
+ tcg_gen_shri_i64,
69
+ tcg_gen_shri_i64,
70
+ gen_ushr_d,
71
+};
72
+TRANS(SHRN_v, do_vec_shift_imm_narrow, a, shrn_fns, 0)
73
+
74
+static WideShiftImmFn * const rshrn_fns[] = {
75
+ gen_urshr_bhs,
76
+ gen_urshr_bhs,
77
+ gen_urshr_d,
78
+};
79
+TRANS(RSHRN_v, do_vec_shift_imm_narrow, a, rshrn_fns, 0)
80
+
81
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
82
* Note that it is the caller's responsibility to ensure that the
83
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
84
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
85
}
86
}
87
88
-/* SHRN/RSHRN - Shift right with narrowing (and potential rounding) */
89
-static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
90
- int immh, int immb, int opcode, int rn, int rd)
91
-{
92
- int immhb = immh << 3 | immb;
93
- int size = 32 - clz32(immh) - 1;
94
- int dsize = 64;
95
- int esize = 8 << size;
96
- int elements = dsize/esize;
97
- int shift = (2 * esize) - immhb;
98
- bool round = extract32(opcode, 0, 1);
99
- TCGv_i64 tcg_rn, tcg_rd, tcg_final;
100
- int i;
101
-
102
- if (extract32(immh, 3, 1)) {
103
- unallocated_encoding(s);
104
- return;
105
- }
106
-
107
- if (!fp_access_check(s)) {
108
- return;
109
- }
110
-
111
- tcg_rn = tcg_temp_new_i64();
112
- tcg_rd = tcg_temp_new_i64();
113
- tcg_final = tcg_temp_new_i64();
114
- read_vec_element(s, tcg_final, rd, is_q ? 1 : 0, MO_64);
115
-
116
- for (i = 0; i < elements; i++) {
117
- read_vec_element(s, tcg_rn, rn, i, size+1);
118
- handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
119
- false, true, size+1, shift);
120
-
121
- tcg_gen_deposit_i64(tcg_final, tcg_final, tcg_rd, esize * i, esize);
122
- }
123
-
124
- if (!is_q) {
125
- write_vec_element(s, tcg_final, rd, 0, MO_64);
126
- } else {
127
- write_vec_element(s, tcg_final, rd, 1, MO_64);
128
- }
129
-
130
- clear_vec_high(s, is_q, rd);
131
-}
132
-
133
-
134
/* AdvSIMD shift by immediate
135
* 31 30 29 28 23 22 19 18 16 15 11 10 9 5 4 0
136
* +---+---+---+-------------+------+------+--------+---+------+------+
137
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
138
}
139
140
switch (opcode) {
141
- case 0x10: /* SHRN */
142
+ case 0x10: /* SHRN / SQSHRUN */
143
case 0x11: /* RSHRN / SQRSHRUN */
144
if (is_u) {
145
handle_vec_simd_sqshrn(s, false, is_q, false, true, immh, immb,
146
opcode, rn, rd);
147
} else {
148
- handle_vec_simd_shrn(s, is_q, immh, immb, opcode, rn, rd);
149
+ unallocated_encoding(s);
150
}
151
break;
152
case 0x12: /* SQSHRN / UQSHRN */
153
--
154
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-28-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 36 +++++-
9
target/arm/tcg/translate-a64.c | 223 ++++++++++++++-------------------
10
2 files changed, 128 insertions(+), 131 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_b
17
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_h
18
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_s
19
20
+SQSHL_vi 0.00 11110 .... ... 01110 1 ..... ..... @q_shli_b
21
+SQSHL_vi 0.00 11110 .... ... 01110 1 ..... ..... @q_shli_h
22
+SQSHL_vi 0.00 11110 .... ... 01110 1 ..... ..... @q_shli_s
23
+SQSHL_vi 0.00 11110 .... ... 01110 1 ..... ..... @q_shli_d
24
+
25
+UQSHL_vi 0.10 11110 .... ... 01110 1 ..... ..... @q_shli_b
26
+UQSHL_vi 0.10 11110 .... ... 01110 1 ..... ..... @q_shli_h
27
+UQSHL_vi 0.10 11110 .... ... 01110 1 ..... ..... @q_shli_s
28
+UQSHL_vi 0.10 11110 .... ... 01110 1 ..... ..... @q_shli_d
29
+
30
+SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_b
31
+SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_h
32
+SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_s
33
+SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_d
34
+
35
# Advanced SIMD scalar shift by immediate
36
37
@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
38
&rri_e esz=3 imm=%neon_rshift_i6
39
-@shli_d .... ..... 1 imm:6 ..... . rn:5 rd:5 &rri_e esz=3
40
+
41
+@shli_b .... ..... 0001 imm:3 ..... . rn:5 rd:5 &rri_e esz=0
42
+@shli_h .... ..... 001 imm:4 ..... . rn:5 rd:5 &rri_e esz=1
43
+@shli_s .... ..... 01 imm:5 ..... . rn:5 rd:5 &rri_e esz=2
44
+@shli_d .... ..... 1 imm:6 ..... . rn:5 rd:5 &rri_e esz=3
45
46
SSHR_s 0101 11110 .... ... 00000 1 ..... ..... @shri_d
47
USHR_s 0111 11110 .... ... 00000 1 ..... ..... @shri_d
48
@@ -XXX,XX +XXX,XX @@ SRI_s 0111 11110 .... ... 01000 1 ..... ..... @shri_d
49
50
SHL_s 0101 11110 .... ... 01010 1 ..... ..... @shli_d
51
SLI_s 0111 11110 .... ... 01010 1 ..... ..... @shli_d
52
+
53
+SQSHL_si 0101 11110 .... ... 01110 1 ..... ..... @shli_b
54
+SQSHL_si 0101 11110 .... ... 01110 1 ..... ..... @shli_h
55
+SQSHL_si 0101 11110 .... ... 01110 1 ..... ..... @shli_s
56
+SQSHL_si 0101 11110 .... ... 01110 1 ..... ..... @shli_d
57
+
58
+UQSHL_si 0111 11110 .... ... 01110 1 ..... ..... @shli_b
59
+UQSHL_si 0111 11110 .... ... 01110 1 ..... ..... @shli_h
60
+UQSHL_si 0111 11110 .... ... 01110 1 ..... ..... @shli_s
61
+UQSHL_si 0111 11110 .... ... 01110 1 ..... ..... @shli_d
62
+
63
+SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_b
64
+SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_h
65
+SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_s
66
+SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_d
67
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
68
index XXXXXXX..XXXXXXX 100644
69
--- a/target/arm/tcg/translate-a64.c
70
+++ b/target/arm/tcg/translate-a64.c
71
@@ -XXX,XX +XXX,XX @@ TRANS(URSRA_v, do_vec_shift_imm, a, gen_gvec_ursra)
72
TRANS(SRI_v, do_vec_shift_imm, a, gen_gvec_sri)
73
TRANS(SHL_v, do_vec_shift_imm, a, tcg_gen_gvec_shli)
74
TRANS(SLI_v, do_vec_shift_imm, a, gen_gvec_sli);
75
+TRANS(SQSHL_vi, do_vec_shift_imm, a, gen_neon_sqshli)
76
+TRANS(UQSHL_vi, do_vec_shift_imm, a, gen_neon_uqshli)
77
+TRANS(SQSHLU_vi, do_vec_shift_imm, a, gen_neon_sqshlui)
78
79
static bool do_vec_shift_imm_wide(DisasContext *s, arg_qrri_e *a, bool is_u)
80
{
81
@@ -XXX,XX +XXX,XX @@ TRANS(SRI_s, do_scalar_shift_imm, a, gen_sri_d, true, 0)
82
TRANS(SHL_s, do_scalar_shift_imm, a, tcg_gen_shli_i64, false, 0)
83
TRANS(SLI_s, do_scalar_shift_imm, a, gen_sli_d, true, 0)
84
85
+static void trunc_i64_env_imm(TCGv_i64 d, TCGv_i64 s, int64_t i,
86
+ NeonGenTwoOpEnvFn *fn)
87
+{
88
+ TCGv_i32 t = tcg_temp_new_i32();
89
+ tcg_gen_extrl_i64_i32(t, s);
90
+ fn(t, tcg_env, t, tcg_constant_i32(i));
91
+ tcg_gen_extu_i32_i64(d, t);
92
+}
93
+
94
+static void gen_sqshli_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
95
+{
96
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshl_s8);
97
+}
98
+
99
+static void gen_sqshli_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
100
+{
101
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshl_s16);
102
+}
103
+
104
+static void gen_sqshli_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
105
+{
106
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshl_s32);
107
+}
108
+
109
+static void gen_sqshli_d(TCGv_i64 d, TCGv_i64 s, int64_t i)
110
+{
111
+ gen_helper_neon_qshl_s64(d, tcg_env, s, tcg_constant_i64(i));
112
+}
113
+
114
+static void gen_uqshli_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
115
+{
116
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshl_u8);
117
+}
118
+
119
+static void gen_uqshli_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
120
+{
121
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshl_u16);
122
+}
123
+
124
+static void gen_uqshli_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
125
+{
126
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshl_u32);
127
+}
128
+
129
+static void gen_uqshli_d(TCGv_i64 d, TCGv_i64 s, int64_t i)
130
+{
131
+ gen_helper_neon_qshl_u64(d, tcg_env, s, tcg_constant_i64(i));
132
+}
133
+
134
+static void gen_sqshlui_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
135
+{
136
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshlu_s8);
137
+}
138
+
139
+static void gen_sqshlui_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
140
+{
141
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshlu_s16);
142
+}
143
+
144
+static void gen_sqshlui_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
145
+{
146
+ trunc_i64_env_imm(d, s, i, gen_helper_neon_qshlu_s32);
147
+}
148
+
149
+static void gen_sqshlui_d(TCGv_i64 d, TCGv_i64 s, int64_t i)
150
+{
151
+ gen_helper_neon_qshlu_s64(d, tcg_env, s, tcg_constant_i64(i));
152
+}
153
+
154
+static WideShiftImmFn * const f_scalar_sqshli[] = {
155
+ gen_sqshli_b, gen_sqshli_h, gen_sqshli_s, gen_sqshli_d
156
+};
157
+
158
+static WideShiftImmFn * const f_scalar_uqshli[] = {
159
+ gen_uqshli_b, gen_uqshli_h, gen_uqshli_s, gen_uqshli_d
160
+};
161
+
162
+static WideShiftImmFn * const f_scalar_sqshlui[] = {
163
+ gen_sqshlui_b, gen_sqshlui_h, gen_sqshlui_s, gen_sqshlui_d
164
+};
165
+
166
+/* Note that the helpers sign-extend their inputs, so don't do it here. */
167
+TRANS(SQSHL_si, do_scalar_shift_imm, a, f_scalar_sqshli[a->esz], false, 0)
168
+TRANS(UQSHL_si, do_scalar_shift_imm, a, f_scalar_uqshli[a->esz], false, 0)
169
+TRANS(SQSHLU_si, do_scalar_shift_imm, a, f_scalar_sqshlui[a->esz], false, 0)
170
+
171
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
172
* Note that it is the caller's responsibility to ensure that the
173
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
174
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
175
clear_vec_high(s, is_q, rd);
176
}
177
178
-/* SQSHLU, UQSHL, SQSHL: saturating left shifts */
179
-static void handle_simd_qshl(DisasContext *s, bool scalar, bool is_q,
180
- bool src_unsigned, bool dst_unsigned,
181
- int immh, int immb, int rn, int rd)
182
-{
183
- int immhb = immh << 3 | immb;
184
- int size = 32 - clz32(immh) - 1;
185
- int shift = immhb - (8 << size);
186
- int pass;
187
-
188
- assert(immh != 0);
189
- assert(!(scalar && is_q));
190
-
191
- if (!scalar) {
192
- if (!is_q && extract32(immh, 3, 1)) {
193
- unallocated_encoding(s);
194
- return;
195
- }
196
-
197
- /* Since we use the variable-shift helpers we must
198
- * replicate the shift count into each element of
199
- * the tcg_shift value.
200
- */
201
- switch (size) {
202
- case 0:
203
- shift |= shift << 8;
204
- /* fall through */
205
- case 1:
206
- shift |= shift << 16;
207
- break;
208
- case 2:
209
- case 3:
210
- break;
211
- default:
212
- g_assert_not_reached();
213
- }
214
- }
215
-
216
- if (!fp_access_check(s)) {
217
- return;
218
- }
219
-
220
- if (size == 3) {
221
- TCGv_i64 tcg_shift = tcg_constant_i64(shift);
222
- static NeonGenTwo64OpEnvFn * const fns[2][2] = {
223
- { gen_helper_neon_qshl_s64, gen_helper_neon_qshlu_s64 },
224
- { NULL, gen_helper_neon_qshl_u64 },
225
- };
226
- NeonGenTwo64OpEnvFn *genfn = fns[src_unsigned][dst_unsigned];
227
- int maxpass = is_q ? 2 : 1;
228
-
229
- for (pass = 0; pass < maxpass; pass++) {
230
- TCGv_i64 tcg_op = tcg_temp_new_i64();
231
-
232
- read_vec_element(s, tcg_op, rn, pass, MO_64);
233
- genfn(tcg_op, tcg_env, tcg_op, tcg_shift);
234
- write_vec_element(s, tcg_op, rd, pass, MO_64);
235
- }
236
- clear_vec_high(s, is_q, rd);
237
- } else {
238
- TCGv_i32 tcg_shift = tcg_constant_i32(shift);
239
- static NeonGenTwoOpEnvFn * const fns[2][2][3] = {
240
- {
241
- { gen_helper_neon_qshl_s8,
242
- gen_helper_neon_qshl_s16,
243
- gen_helper_neon_qshl_s32 },
244
- { gen_helper_neon_qshlu_s8,
245
- gen_helper_neon_qshlu_s16,
246
- gen_helper_neon_qshlu_s32 }
247
- }, {
248
- { NULL, NULL, NULL },
249
- { gen_helper_neon_qshl_u8,
250
- gen_helper_neon_qshl_u16,
251
- gen_helper_neon_qshl_u32 }
252
- }
253
- };
254
- NeonGenTwoOpEnvFn *genfn = fns[src_unsigned][dst_unsigned][size];
255
- MemOp memop = scalar ? size : MO_32;
256
- int maxpass = scalar ? 1 : is_q ? 4 : 2;
257
-
258
- for (pass = 0; pass < maxpass; pass++) {
259
- TCGv_i32 tcg_op = tcg_temp_new_i32();
260
-
261
- read_vec_element_i32(s, tcg_op, rn, pass, memop);
262
- genfn(tcg_op, tcg_env, tcg_op, tcg_shift);
263
- if (scalar) {
264
- switch (size) {
265
- case 0:
266
- tcg_gen_ext8u_i32(tcg_op, tcg_op);
267
- break;
268
- case 1:
269
- tcg_gen_ext16u_i32(tcg_op, tcg_op);
270
- break;
271
- case 2:
272
- break;
273
- default:
274
- g_assert_not_reached();
275
- }
276
- write_fp_sreg(s, rd, tcg_op);
277
- } else {
278
- write_vec_element_i32(s, tcg_op, rd, pass, MO_32);
279
- }
280
- }
281
-
282
- if (!scalar) {
283
- clear_vec_high(s, is_q, rd);
284
- }
285
- }
286
-}
287
-
288
/* Common vector code for handling integer to FP conversion */
289
static void handle_simd_intfp_conv(DisasContext *s, int rd, int rn,
290
int elements, int is_signed,
291
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
292
handle_vec_simd_sqshrn(s, true, false, is_u, is_u,
293
immh, immb, opcode, rn, rd);
294
break;
295
- case 0xc: /* SQSHLU */
296
- if (!is_u) {
297
- unallocated_encoding(s);
298
- return;
299
- }
300
- handle_simd_qshl(s, true, false, false, true, immh, immb, rn, rd);
301
- break;
302
- case 0xe: /* SQSHL, UQSHL */
303
- handle_simd_qshl(s, true, false, is_u, is_u, immh, immb, rn, rd);
304
- break;
305
case 0x1f: /* FCVTZS, FCVTZU */
306
handle_simd_shift_fpint_conv(s, true, false, is_u, immh, immb, rn, rd);
307
break;
308
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
309
case 0x06: /* SRSRA / URSRA */
310
case 0x08: /* SRI */
311
case 0x0a: /* SHL / SLI */
312
+ case 0x0c: /* SQSHLU */
313
+ case 0x0e: /* SQSHL, UQSHL */
314
unallocated_encoding(s);
315
break;
316
}
317
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
318
handle_simd_shift_intfp_conv(s, false, is_q, is_u, immh, immb,
319
opcode, rn, rd);
320
break;
321
- case 0xc: /* SQSHLU */
322
- if (!is_u) {
323
- unallocated_encoding(s);
324
- return;
325
- }
326
- handle_simd_qshl(s, false, is_q, false, true, immh, immb, rn, rd);
327
- break;
328
- case 0xe: /* SQSHL, UQSHL */
329
- handle_simd_qshl(s, false, is_q, is_u, is_u, immh, immb, rn, rd);
330
- break;
331
case 0x1f: /* FCVTZS/ FCVTZU */
332
handle_simd_shift_fpint_conv(s, false, is_q, is_u, immh, immb, rn, rd);
333
return;
334
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
335
case 0x06: /* SRSRA / URSRA (accum + rounding) */
336
case 0x08: /* SRI */
337
case 0x0a: /* SHL / SLI */
338
+ case 0x0c: /* SQSHLU */
339
+ case 0x0e: /* SQSHL, UQSHL */
340
case 0x14: /* SSHLL / USHLL */
341
unallocated_encoding(s);
342
return;
343
--
344
2.34.1
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240912024114.1097832-30-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/tcg/a64.decode | 30 +++++++
9
target/arm/tcg/translate-a64.c | 160 +++++++--------------------------
10
2 files changed, 63 insertions(+), 127 deletions(-)
11
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
15
+++ b/target/arm/tcg/a64.decode
16
@@ -XXX,XX +XXX,XX @@ SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_s
17
18
# Advanced SIMD scalar shift by immediate
19
20
+@shri_b .... ..... 0001 ... ..... . rn:5 rd:5 \
21
+ &rri_e esz=0 imm=%neon_rshift_i3
22
+@shri_h .... ..... 001 .... ..... . rn:5 rd:5 \
23
+ &rri_e esz=1 imm=%neon_rshift_i4
24
+@shri_s .... ..... 01 ..... ..... . rn:5 rd:5 \
25
+ &rri_e esz=2 imm=%neon_rshift_i5
26
@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
27
&rri_e esz=3 imm=%neon_rshift_i6
28
29
@@ -XXX,XX +XXX,XX @@ SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_b
30
SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_h
31
SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_s
32
SQSHLU_si 0111 11110 .... ... 01100 1 ..... ..... @shli_d
33
+
34
+SQSHRN_si 0101 11110 .... ... 10010 1 ..... ..... @shri_b
35
+SQSHRN_si 0101 11110 .... ... 10010 1 ..... ..... @shri_h
36
+SQSHRN_si 0101 11110 .... ... 10010 1 ..... ..... @shri_s
37
+
38
+UQSHRN_si 0111 11110 .... ... 10010 1 ..... ..... @shri_b
39
+UQSHRN_si 0111 11110 .... ... 10010 1 ..... ..... @shri_h
40
+UQSHRN_si 0111 11110 .... ... 10010 1 ..... ..... @shri_s
41
+
42
+SQSHRUN_si 0111 11110 .... ... 10000 1 ..... ..... @shri_b
43
+SQSHRUN_si 0111 11110 .... ... 10000 1 ..... ..... @shri_h
44
+SQSHRUN_si 0111 11110 .... ... 10000 1 ..... ..... @shri_s
45
+
46
+SQRSHRN_si 0101 11110 .... ... 10011 1 ..... ..... @shri_b
47
+SQRSHRN_si 0101 11110 .... ... 10011 1 ..... ..... @shri_h
48
+SQRSHRN_si 0101 11110 .... ... 10011 1 ..... ..... @shri_s
49
+
50
+UQRSHRN_si 0111 11110 .... ... 10011 1 ..... ..... @shri_b
51
+UQRSHRN_si 0111 11110 .... ... 10011 1 ..... ..... @shri_h
52
+UQRSHRN_si 0111 11110 .... ... 10011 1 ..... ..... @shri_s
53
+
54
+SQRSHRUN_si 0111 11110 .... ... 10001 1 ..... ..... @shri_b
55
+SQRSHRUN_si 0111 11110 .... ... 10001 1 ..... ..... @shri_h
56
+SQRSHRUN_si 0111 11110 .... ... 10001 1 ..... ..... @shri_s
57
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/tcg/translate-a64.c
60
+++ b/target/arm/tcg/translate-a64.c
61
@@ -XXX,XX +XXX,XX @@ TRANS(SQSHL_si, do_scalar_shift_imm, a, f_scalar_sqshli[a->esz], false, 0)
62
TRANS(UQSHL_si, do_scalar_shift_imm, a, f_scalar_uqshli[a->esz], false, 0)
63
TRANS(SQSHLU_si, do_scalar_shift_imm, a, f_scalar_sqshlui[a->esz], false, 0)
64
65
+static bool do_scalar_shift_imm_narrow(DisasContext *s, arg_rri_e *a,
66
+ WideShiftImmFn * const fns[3],
67
+ MemOp sign, bool zext)
68
+{
69
+ MemOp esz = a->esz;
70
+
71
+ tcg_debug_assert(esz >= MO_8 && esz <= MO_32);
72
+
73
+ if (fp_access_check(s)) {
74
+ TCGv_i64 rd = tcg_temp_new_i64();
75
+ TCGv_i64 rn = tcg_temp_new_i64();
76
+
77
+ read_vec_element(s, rn, a->rn, 0, (esz + 1) | sign);
78
+ fns[esz](rd, rn, a->imm);
79
+ if (zext) {
80
+ tcg_gen_ext_i64(rd, rd, esz);
81
+ }
82
+ write_fp_dreg(s, a->rd, rd);
83
+ }
84
+ return true;
85
+}
86
+
87
+TRANS(SQSHRN_si, do_scalar_shift_imm_narrow, a, sqshrn_fns, MO_SIGN, true)
88
+TRANS(SQRSHRN_si, do_scalar_shift_imm_narrow, a, sqrshrn_fns, MO_SIGN, true)
89
+TRANS(UQSHRN_si, do_scalar_shift_imm_narrow, a, uqshrn_fns, 0, false)
90
+TRANS(UQRSHRN_si, do_scalar_shift_imm_narrow, a, uqrshrn_fns, 0, false)
91
+TRANS(SQSHRUN_si, do_scalar_shift_imm_narrow, a, sqshrun_fns, MO_SIGN, false)
92
+TRANS(SQRSHRUN_si, do_scalar_shift_imm_narrow, a, sqrshrun_fns, MO_SIGN, false)
93
+
94
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
95
* Note that it is the caller's responsibility to ensure that the
96
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
97
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
98
}
99
}
100
101
-/*
102
- * Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate)
103
- *
104
- * This code is handles the common shifting code and is used by both
105
- * the vector and scalar code.
106
- */
107
-static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
108
- bool round, bool accumulate,
109
- bool is_u, int size, int shift)
110
-{
111
- if (!round) {
112
- if (is_u) {
113
- gen_ushr_d(tcg_src, tcg_src, shift);
114
- } else {
115
- gen_sshr_d(tcg_src, tcg_src, shift);
116
- }
117
- } else if (size == MO_64) {
118
- if (is_u) {
119
- gen_urshr_d(tcg_src, tcg_src, shift);
120
- } else {
121
- gen_srshr_d(tcg_src, tcg_src, shift);
122
- }
123
- } else {
124
- if (is_u) {
125
- gen_urshr_bhs(tcg_src, tcg_src, shift);
126
- } else {
127
- gen_srshr_bhs(tcg_src, tcg_src, shift);
128
- }
129
- }
130
-
131
- if (accumulate) {
132
- tcg_gen_add_i64(tcg_res, tcg_res, tcg_src);
133
- } else {
134
- tcg_gen_mov_i64(tcg_res, tcg_src);
135
- }
136
-}
137
-
138
-/* SQSHRN/SQSHRUN - Saturating (signed/unsigned) shift right with
139
- * (signed/unsigned) narrowing */
140
-static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
141
- bool is_u_shift, bool is_u_narrow,
142
- int immh, int immb, int opcode,
143
- int rn, int rd)
144
-{
145
- int immhb = immh << 3 | immb;
146
- int size = 32 - clz32(immh) - 1;
147
- int esize = 8 << size;
148
- int shift = (2 * esize) - immhb;
149
- int elements = is_scalar ? 1 : (64 / esize);
150
- bool round = extract32(opcode, 0, 1);
151
- MemOp ldop = (size + 1) | (is_u_shift ? 0 : MO_SIGN);
152
- TCGv_i64 tcg_rn, tcg_rd, tcg_final;
153
-
154
- static NeonGenOne64OpEnvFn * const signed_narrow_fns[4][2] = {
155
- { gen_helper_neon_narrow_sat_s8,
156
- gen_helper_neon_unarrow_sat8 },
157
- { gen_helper_neon_narrow_sat_s16,
158
- gen_helper_neon_unarrow_sat16 },
159
- { gen_helper_neon_narrow_sat_s32,
160
- gen_helper_neon_unarrow_sat32 },
161
- { NULL, NULL },
162
- };
163
- static NeonGenOne64OpEnvFn * const unsigned_narrow_fns[4] = {
164
- gen_helper_neon_narrow_sat_u8,
165
- gen_helper_neon_narrow_sat_u16,
166
- gen_helper_neon_narrow_sat_u32,
167
- NULL
168
- };
169
- NeonGenOne64OpEnvFn *narrowfn;
170
-
171
- int i;
172
-
173
- assert(size < 4);
174
-
175
- if (extract32(immh, 3, 1)) {
176
- unallocated_encoding(s);
177
- return;
178
- }
179
-
180
- if (!fp_access_check(s)) {
181
- return;
182
- }
183
-
184
- if (is_u_shift) {
185
- narrowfn = unsigned_narrow_fns[size];
186
- } else {
187
- narrowfn = signed_narrow_fns[size][is_u_narrow ? 1 : 0];
188
- }
189
-
190
- tcg_rn = tcg_temp_new_i64();
191
- tcg_rd = tcg_temp_new_i64();
192
- tcg_final = tcg_temp_new_i64();
193
-
194
- for (i = 0; i < elements; i++) {
195
- read_vec_element(s, tcg_rn, rn, i, ldop);
196
- handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
197
- false, is_u_shift, size+1, shift);
198
- narrowfn(tcg_rd, tcg_env, tcg_rd);
199
- if (i == 0) {
200
- tcg_gen_extract_i64(tcg_final, tcg_rd, 0, esize);
201
- } else {
202
- tcg_gen_deposit_i64(tcg_final, tcg_final, tcg_rd, esize * i, esize);
203
- }
204
- }
205
-
206
- if (!is_q) {
207
- write_vec_element(s, tcg_final, rd, 0, MO_64);
208
- } else {
209
- write_vec_element(s, tcg_final, rd, 1, MO_64);
210
- }
211
- clear_vec_high(s, is_q, rd);
212
-}
213
-
214
/* Common vector code for handling integer to FP conversion */
215
static void handle_simd_intfp_conv(DisasContext *s, int rd, int rn,
216
int elements, int is_signed,
217
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
218
handle_simd_shift_intfp_conv(s, true, false, is_u, immh, immb,
219
opcode, rn, rd);
220
break;
221
- case 0x10: /* SQSHRUN, SQSHRUN2 */
222
- case 0x11: /* SQRSHRUN, SQRSHRUN2 */
223
- if (!is_u) {
224
- unallocated_encoding(s);
225
- return;
226
- }
227
- handle_vec_simd_sqshrn(s, true, false, false, true,
228
- immh, immb, opcode, rn, rd);
229
- break;
230
- case 0x12: /* SQSHRN, SQSHRN2, UQSHRN */
231
- case 0x13: /* SQRSHRN, SQRSHRN2, UQRSHRN, UQRSHRN2 */
232
- handle_vec_simd_sqshrn(s, true, false, is_u, is_u,
233
- immh, immb, opcode, rn, rd);
234
- break;
235
case 0x1f: /* FCVTZS, FCVTZU */
236
handle_simd_shift_fpint_conv(s, true, false, is_u, immh, immb, rn, rd);
237
break;
238
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
239
case 0x0a: /* SHL / SLI */
240
case 0x0c: /* SQSHLU */
241
case 0x0e: /* SQSHL, UQSHL */
242
+ case 0x10: /* SQSHRUN */
243
+ case 0x11: /* SQRSHRUN */
244
+ case 0x12: /* SQSHRN, UQSHRN */
245
+ case 0x13: /* SQRSHRN, UQRSHRN */
246
unallocated_encoding(s);
247
break;
248
}
249
--
250
2.34.1
Deleted patch
1
From: Jacob Abrams <satur9nine@gmail.com>
2
1
3
SW modifying USART_CR1 TE bit should cause HW to respond by altering
4
USART_ISR TEACK bit, and likewise for RE and REACK bit.
5
6
This resolves some, but not all, of the issues needed for the official STM USART
7
HAL driver to function as is.
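
For illustration, the acknowledge behaviour amounts to mirroring the CR1
enable bits into the ISR acknowledge bits. A standalone sketch, not the QEMU
device model itself: the CR1 bit positions here are assumed from the STM32L4
reference manual, while the ISR positions match the test update below.

    #include <stdint.h>
    #include <stdio.h>

    #define CR1_RE    (1u << 2)   /* assumed CR1.RE position */
    #define CR1_TE    (1u << 3)   /* assumed CR1.TE position */
    #define ISR_TEACK (1u << 21)  /* matches the test below */
    #define ISR_REACK (1u << 22)  /* matches the test below */

    /* HW mirrors CR1.TE into ISR.TEACK and CR1.RE into ISR.REACK */
    static uint32_t update_isr(uint32_t cr1, uint32_t isr)
    {
        isr = (cr1 & CR1_TE) ? (isr | ISR_TEACK) : (isr & ~ISR_TEACK);
        isr = (cr1 & CR1_RE) ? (isr | ISR_REACK) : (isr & ~ISR_REACK);
        return isr;
    }

    int main(void)
    {
        uint32_t isr = update_isr(CR1_TE | CR1_RE, 0);
        printf("TEACK=%d REACK=%d\n", !!(isr & ISR_TEACK), !!(isr & ISR_REACK));
        return 0;
    }
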
8
9
Fixes: 87b77e6e01ca ("hw/char/stm32l4x5_usart: Enable serial read and write")
10
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2540
11
Signed-off-by: Jacob Abrams <satur9nine@gmail.com>
12
Message-id: 20240911043255.51966-1-satur9nine@gmail.com
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
16
hw/char/stm32l4x5_usart.c | 16 +++++++++++++
17
tests/qtest/stm32l4x5_usart-test.c | 36 +++++++++++++++++++++++++++++-
18
2 files changed, 51 insertions(+), 1 deletion(-)
19
20
diff --git a/hw/char/stm32l4x5_usart.c b/hw/char/stm32l4x5_usart.c
21
index XXXXXXX..XXXXXXX 100644
22
--- a/hw/char/stm32l4x5_usart.c
23
+++ b/hw/char/stm32l4x5_usart.c
24
@@ -XXX,XX +XXX,XX @@ REG32(RDR, 0x24)
25
REG32(TDR, 0x28)
26
FIELD(TDR, TDR, 0, 9)
27
28
+static void stm32l4x5_update_isr(Stm32l4x5UsartBaseState *s)
29
+{
30
+ if (s->cr1 & R_CR1_TE_MASK) {
31
+ s->isr |= R_ISR_TEACK_MASK;
32
+ } else {
33
+ s->isr &= ~R_ISR_TEACK_MASK;
34
+ }
35
+
36
+ if (s->cr1 & R_CR1_RE_MASK) {
37
+ s->isr |= R_ISR_REACK_MASK;
38
+ } else {
39
+ s->isr &= ~R_ISR_REACK_MASK;
40
+ }
41
+}
42
+
43
static void stm32l4x5_update_irq(Stm32l4x5UsartBaseState *s)
44
{
45
if (((s->isr & R_ISR_WUF_MASK) && (s->cr3 & R_CR3_WUFIE_MASK)) ||
46
@@ -XXX,XX +XXX,XX @@ static void stm32l4x5_usart_base_write(void *opaque, hwaddr addr,
47
case A_CR1:
48
s->cr1 = value;
49
stm32l4x5_update_params(s);
50
+ stm32l4x5_update_isr(s);
51
stm32l4x5_update_irq(s);
52
return;
53
case A_CR2:
54
diff --git a/tests/qtest/stm32l4x5_usart-test.c b/tests/qtest/stm32l4x5_usart-test.c
55
index XXXXXXX..XXXXXXX 100644
56
--- a/tests/qtest/stm32l4x5_usart-test.c
57
+++ b/tests/qtest/stm32l4x5_usart-test.c
58
@@ -XXX,XX +XXX,XX @@ REG32(GTPR, 0x10)
59
REG32(RTOR, 0x14)
60
REG32(RQR, 0x18)
61
REG32(ISR, 0x1C)
62
+ FIELD(ISR, REACK, 22, 1)
63
+ FIELD(ISR, TEACK, 21, 1)
64
FIELD(ISR, TXE, 7, 1)
65
FIELD(ISR, RXNE, 5, 1)
66
FIELD(ISR, ORE, 3, 1)
67
@@ -XXX,XX +XXX,XX @@ static void init_uart(QTestState *qts)
68
69
/* Enable the transmitter, the receiver and the USART. */
70
qtest_writel(qts, (USART1_BASE_ADDR + A_CR1),
71
- R_CR1_UE_MASK | R_CR1_RE_MASK | R_CR1_TE_MASK);
72
+ cr1 | R_CR1_UE_MASK | R_CR1_RE_MASK | R_CR1_TE_MASK);
73
}
74
75
static void test_write_read(void)
76
@@ -XXX,XX +XXX,XX @@ static void test_send_str(void)
77
qtest_quit(qts);
78
}
79
80
+static void test_ack(void)
81
+{
82
+ uint32_t cr1;
83
+ uint32_t isr;
84
+ QTestState *qts = qtest_init("-M b-l475e-iot01a");
85
+
86
+ init_uart(qts);
87
+
88
+ cr1 = qtest_readl(qts, (USART1_BASE_ADDR + A_CR1));
89
+
90
+ /* Disable the transmitter and receiver. */
91
+ qtest_writel(qts, (USART1_BASE_ADDR + A_CR1),
92
+ cr1 & ~(R_CR1_RE_MASK | R_CR1_TE_MASK));
93
+
94
+ /* Test ISR ACK for transmitter and receiver disabled */
95
+ isr = qtest_readl(qts, (USART1_BASE_ADDR + A_ISR));
96
+ g_assert_false(isr & R_ISR_TEACK_MASK);
97
+ g_assert_false(isr & R_ISR_REACK_MASK);
98
+
99
+ /* Enable the transmitter and receiver. */
100
+ qtest_writel(qts, (USART1_BASE_ADDR + A_CR1),
101
+ cr1 | (R_CR1_RE_MASK | R_CR1_TE_MASK));
102
+
103
+ /* Test ISR ACK for transmitter and receiver disabled */
104
+ isr = qtest_readl(qts, (USART1_BASE_ADDR + A_ISR));
105
+ g_assert_true(isr & R_ISR_TEACK_MASK);
106
+ g_assert_true(isr & R_ISR_REACK_MASK);
107
+
108
+ qtest_quit(qts);
109
+}
110
+
111
int main(int argc, char **argv)
112
{
113
int ret;
114
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
115
qtest_add_func("stm32l4x5/usart/send_char", test_send_char);
116
qtest_add_func("stm32l4x5/usart/receive_str", test_receive_str);
117
qtest_add_func("stm32l4x5/usart/send_str", test_send_str);
118
+ qtest_add_func("stm32l4x5/usart/ack", test_ack);
119
ret = g_test_run();
120
121
return ret;
122
--
123
2.34.1
1
In kvm_init_vcpu() and do_kvm_destroy_vcpu(), the return value from
1
In the GICv3 ITS model, we have a common coding pattern which has a
2
kvm_ioctl(..., KVM_GET_VCPU_MMAP_SIZE, ...)
2
local C struct like "DTEntry dte", which is a C representation of an
3
is an 'int', but we put it into a 'long' logal variable mmap_size.
3
in-guest-memory data structure, and we call a function such as
4
Coverity then complains that there might be a truncation when we copy
4
get_dte() to read guest memory and fill in the C struct. These
5
that value into the 'int ret' which we use for returning a value in
5
functions to read in the struct sometimes have cases where they will
6
an error-exit codepath. This can't ever actually overflow because
6
leave early and not fill in the whole struct (for instance get_dte()
7
the value was in an 'int' to start with, but it makes more sense
7
will set "dte->valid = false" and nothing else for the case where it
8
to use 'int' for mmap_size so we don't do the widen-then-narrow
8
is passed an entry_addr implying that there is no L2 table entry for
9
sequence in the first place.
9
the DTE). This then causes potential use of uninitialized memory
10
later, for instance when we call a trace event which prints all the
11
fields of the struct. Sufficiently advanced compilers may produce
12
-Wmaybe-uninitialized warnings about this, especially if LTO is
13
enabled.
10
14
11
Resolves: Coverity CID 1547515
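
As an aside, the widen-then-narrow sequence is easy to reproduce standalone;
in the sketch below the helper is only a stand-in for the
KVM_GET_VCPU_MMAP_SIZE ioctl and the value is made up:

    #include <stdio.h>

    /* Stand-in for the ioctl; the real call returns an int. */
    static int get_vcpu_mmap_size(void)
    {
        return 12288;   /* example value */
    }

    int main(void)
    {
        long mmap_size = get_vcpu_mmap_size();  /* int widened into a long... */
        int ret = mmap_size;                    /* ...then narrowed back to int:
                                                 * the copy Coverity flags */
        printf("ret = %d\n", ret);
        return 0;
    }
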
15
Rather than trying to carefully separate out these trace events into
16
"only the 'valid' field is initialized" and "all fields can be
17
printed", zero-init all the structs when we define them. None of
18
these structs are large (the biggest is 24 bytes) and having
19
consistent behaviour is less likely to be buggy.
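
For illustration, a standalone sketch of the hazard and of the zero-init fix;
the names (FakeDTEntry, fake_get_dte) and values are made up for the example
and are not the actual ITS code:

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        bool valid;
        unsigned size;              /* only set on the "entry present" path */
        unsigned long long ittaddr; /* likewise */
    } FakeDTEntry;

    static void fake_get_dte(FakeDTEntry *dte, bool have_l2_entry)
    {
        if (!have_l2_entry) {
            dte->valid = false;     /* early exit: size/ittaddr left untouched */
            return;
        }
        dte->valid = true;
        dte->size = 8;
        dte->ittaddr = 0x40000000ULL;
    }

    int main(void)
    {
        FakeDTEntry dte = {};       /* zero-init: every field has a defined value */

        fake_get_dte(&dte, false);
        /* Printing all fields, as a trace event would, is now well defined */
        printf("valid=%d size=%u ittaddr=0x%llx\n",
               dte.valid, dte.size, dte.ittaddr);
        return 0;
    }
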
20
21
Cc: qemu-stable@nongnu.org
22
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2718
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
25
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
14
Message-id: 20240815131206.3231819-2-peter.maydell@linaro.org
26
Message-id: 20241213182337.3343068-1-peter.maydell@linaro.org
15
---
27
---
16
accel/kvm/kvm-all.c | 4 ++--
28
hw/intc/arm_gicv3_its.c | 44 ++++++++++++++++++++---------------------
17
1 file changed, 2 insertions(+), 2 deletions(-)
29
1 file changed, 22 insertions(+), 22 deletions(-)
18
30
19
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
31
diff --git a/hw/intc/arm_gicv3_its.c b/hw/intc/arm_gicv3_its.c
20
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
21
--- a/accel/kvm/kvm-all.c
33
--- a/hw/intc/arm_gicv3_its.c
22
+++ b/accel/kvm/kvm-all.c
34
+++ b/hw/intc/arm_gicv3_its.c
23
@@ -XXX,XX +XXX,XX @@ int kvm_create_and_park_vcpu(CPUState *cpu)
35
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult lookup_vte(GICv3ITSState *s, const char *who,
24
static int do_kvm_destroy_vcpu(CPUState *cpu)
36
static ItsCmdResult process_its_cmd_phys(GICv3ITSState *s, const ITEntry *ite,
37
int irqlevel)
25
{
38
{
26
KVMState *s = kvm_state;
39
- CTEntry cte;
27
- long mmap_size;
40
+ CTEntry cte = {};
28
+ int mmap_size;
41
ItsCmdResult cmdres;
29
int ret = 0;
42
30
43
cmdres = lookup_cte(s, __func__, ite->icid, &cte);
31
trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
44
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_its_cmd_phys(GICv3ITSState *s, const ITEntry *ite,
32
@@ -XXX,XX +XXX,XX @@ void kvm_destroy_vcpu(CPUState *cpu)
45
static ItsCmdResult process_its_cmd_virt(GICv3ITSState *s, const ITEntry *ite,
33
int kvm_init_vcpu(CPUState *cpu, Error **errp)
46
int irqlevel)
34
{
47
{
35
KVMState *s = kvm_state;
48
- VTEntry vte;
36
- long mmap_size;
49
+ VTEntry vte = {};
37
+ int mmap_size;
50
ItsCmdResult cmdres;
38
int ret;
51
39
52
cmdres = lookup_vte(s, __func__, ite->vpeid, &vte);
40
trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
53
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_its_cmd_virt(GICv3ITSState *s, const ITEntry *ite,
54
static ItsCmdResult do_process_its_cmd(GICv3ITSState *s, uint32_t devid,
55
uint32_t eventid, ItsCmdType cmd)
56
{
57
- DTEntry dte;
58
- ITEntry ite;
59
+ DTEntry dte = {};
60
+ ITEntry ite = {};
61
ItsCmdResult cmdres;
62
int irqlevel;
63
64
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_mapti(GICv3ITSState *s, const uint64_t *cmdpkt,
65
uint32_t pIntid = 0;
66
uint64_t num_eventids;
67
uint16_t icid = 0;
68
- DTEntry dte;
69
- ITEntry ite;
70
+ DTEntry dte = {};
71
+ ITEntry ite = {};
72
73
devid = (cmdpkt[0] & DEVID_MASK) >> DEVID_SHIFT;
74
eventid = cmdpkt[1] & EVENTID_MASK;
75
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_vmapti(GICv3ITSState *s, const uint64_t *cmdpkt,
76
{
77
uint32_t devid, eventid, vintid, doorbell, vpeid;
78
uint32_t num_eventids;
79
- DTEntry dte;
80
- ITEntry ite;
81
+ DTEntry dte = {};
82
+ ITEntry ite = {};
83
84
if (!its_feature_virtual(s)) {
85
return CMD_CONTINUE;
86
@@ -XXX,XX +XXX,XX @@ static bool update_cte(GICv3ITSState *s, uint16_t icid, const CTEntry *cte)
87
static ItsCmdResult process_mapc(GICv3ITSState *s, const uint64_t *cmdpkt)
88
{
89
uint16_t icid;
90
- CTEntry cte;
91
+ CTEntry cte = {};
92
93
icid = cmdpkt[2] & ICID_MASK;
94
cte.valid = cmdpkt[2] & CMD_FIELD_VALID_MASK;
95
@@ -XXX,XX +XXX,XX @@ static bool update_dte(GICv3ITSState *s, uint32_t devid, const DTEntry *dte)
96
static ItsCmdResult process_mapd(GICv3ITSState *s, const uint64_t *cmdpkt)
97
{
98
uint32_t devid;
99
- DTEntry dte;
100
+ DTEntry dte = {};
101
102
devid = (cmdpkt[0] & DEVID_MASK) >> DEVID_SHIFT;
103
dte.size = cmdpkt[1] & SIZE_MASK;
104
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_movi(GICv3ITSState *s, const uint64_t *cmdpkt)
105
{
106
uint32_t devid, eventid;
107
uint16_t new_icid;
108
- DTEntry dte;
109
- CTEntry old_cte, new_cte;
110
- ITEntry old_ite;
111
+ DTEntry dte = {};
112
+ CTEntry old_cte = {}, new_cte = {};
113
+ ITEntry old_ite = {};
114
ItsCmdResult cmdres;
115
116
devid = FIELD_EX64(cmdpkt[0], MOVI_0, DEVICEID);
117
@@ -XXX,XX +XXX,XX @@ static bool update_vte(GICv3ITSState *s, uint32_t vpeid, const VTEntry *vte)
118
119
static ItsCmdResult process_vmapp(GICv3ITSState *s, const uint64_t *cmdpkt)
120
{
121
- VTEntry vte;
122
+ VTEntry vte = {};
123
uint32_t vpeid;
124
125
if (!its_feature_virtual(s)) {
126
@@ -XXX,XX +XXX,XX @@ static void vmovp_callback(gpointer data, gpointer opaque)
127
*/
128
GICv3ITSState *s = data;
129
VmovpCallbackData *cbdata = opaque;
130
- VTEntry vte;
131
+ VTEntry vte = {};
132
ItsCmdResult cmdres;
133
134
cmdres = lookup_vte(s, __func__, cbdata->vpeid, &vte);
135
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_vmovi(GICv3ITSState *s, const uint64_t *cmdpkt)
136
{
137
uint32_t devid, eventid, vpeid, doorbell;
138
bool doorbell_valid;
139
- DTEntry dte;
140
- ITEntry ite;
141
- VTEntry old_vte, new_vte;
142
+ DTEntry dte = {};
143
+ ITEntry ite = {};
144
+ VTEntry old_vte = {}, new_vte = {};
145
ItsCmdResult cmdres;
146
147
if (!its_feature_virtual(s)) {
148
@@ -XXX,XX +XXX,XX @@ static ItsCmdResult process_vinvall(GICv3ITSState *s, const uint64_t *cmdpkt)
149
static ItsCmdResult process_inv(GICv3ITSState *s, const uint64_t *cmdpkt)
150
{
151
uint32_t devid, eventid;
152
- ITEntry ite;
153
- DTEntry dte;
154
- CTEntry cte;
155
- VTEntry vte;
156
+ ITEntry ite = {};
157
+ DTEntry dte = {};
158
+ CTEntry cte = {};
159
+ VTEntry vte = {};
160
ItsCmdResult cmdres;
161
162
devid = FIELD_EX64(cmdpkt[0], INV_0, DEVICEID);
41
--
163
--
42
2.34.1
164
2.34.1
43
165
44
166
1
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
1
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
2
2
3
We want to run tests using default cpu without having to remember which
3
Update the URLs for the binaries we use for the firmware in the
4
Arm core it is.
4
sbsa-ref functional tests.
5
5
6
Change Neoverse-N1 (old default) test to use default cpu (Neoverse-N2 at
6
The firmware is built using Debian 'bookworm' cross toolchain (gcc
7
the moment).
7
12.2.0).
8
9
Used versions:
10
11
- Trusted Firmware v2.12.0
12
- Tianocore EDK2 stable202411
13
- Tianocore EDK2 Platforms code commit 4b3530d
14
15
This allows us to move away from "some git commit on trunk"
16
to a stable release for both TF-A and EDK2.
8
17
9
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
18
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
10
Message-id: 20240910-b4-move-to-freebsd-v5-1-0fb66d803c93@linaro.org
19
Message-id: 20241125125448.185504-1-marcin.juszkiewicz@linaro.org
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
20
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
22
---
14
tests/functional/test_aarch64_sbsaref.py | 18 ++++++++++--------
23
tests/functional/test_aarch64_sbsaref.py | 20 ++++++++++----------
15
1 file changed, 10 insertions(+), 8 deletions(-)
24
1 file changed, 10 insertions(+), 10 deletions(-)
16
25
17
diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
26
diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
18
index XXXXXXX..XXXXXXX 100755
27
index XXXXXXX..XXXXXXX 100755
19
--- a/tests/functional/test_aarch64_sbsaref.py
28
--- a/tests/functional/test_aarch64_sbsaref.py
20
+++ b/tests/functional/test_aarch64_sbsaref.py
29
+++ b/tests/functional/test_aarch64_sbsaref.py
30
@@ -XXX,XX +XXX,XX @@ def fetch_firmware(test):
31
32
Used components:
33
34
- - Trusted Firmware v2.11.0
35
- - Tianocore EDK2 4d4f569924
36
- - Tianocore EDK2-platforms 3f08401
37
+ - Trusted Firmware v2.12.0
38
+ - Tianocore EDK2 edk2-stable202411
39
+ - Tianocore EDK2-platforms 4b3530d
40
41
"""
42
43
@@ -XXX,XX +XXX,XX @@ class Aarch64SbsarefMachine(QemuSystemTest):
44
45
ASSET_FLASH0 = Asset(
46
('https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/'
47
- '20240619-148232/edk2/SBSA_FLASH0.fd.xz'),
48
- '0c954842a590988f526984de22e21ae0ab9cb351a0c99a8a58e928f0c7359cf7')
49
+ '20241122-189881/edk2/SBSA_FLASH0.fd.xz'),
50
+ '76eb89d42eebe324e4395329f47447cda9ac920aabcf99aca85424609c3384a5')
51
52
ASSET_FLASH1 = Asset(
53
('https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/'
54
- '20240619-148232/edk2/SBSA_FLASH1.fd.xz'),
55
- 'c6ec39374c4d79bb9e9cdeeb6db44732d90bb4a334cec92002b3f4b9cac4b5ee')
56
+ '20241122-189881/edk2/SBSA_FLASH1.fd.xz'),
57
+ 'f850f243bd8dbd49c51e061e0f79f1697546938f454aeb59ab7d93e5f0d412fc')
58
59
def test_sbsaref_edk2_firmware(self):
60
21
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_edk2_firmware(self):
61
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_edk2_firmware(self):
22
# This tests the whole boot chain from EFI to Userspace
62
23
# We only boot a whole OS for the current top level CPU and GIC
63
# AP Trusted ROM
24
# Other test profiles should use more minimal boots
64
wait_for_console_pattern(self, "Booting Trusted Firmware")
25
- def boot_alpine_linux(self, cpu):
65
- wait_for_console_pattern(self, "BL1: v2.11.0(release):")
26
+ def boot_alpine_linux(self, cpu=None):
66
+ wait_for_console_pattern(self, "BL1: v2.12.0(release):")
27
self.fetch_firmware()
67
wait_for_console_pattern(self, "BL1: Booting BL2")
28
68
29
iso_path = self.ASSET_ALPINE_ISO.fetch()
69
# Trusted Boot Firmware
30
70
- wait_for_console_pattern(self, "BL2: v2.11.0(release)")
31
self.vm.set_console()
71
+ wait_for_console_pattern(self, "BL2: v2.12.0(release)")
32
self.vm.add_args(
72
wait_for_console_pattern(self, "Booting BL31")
33
- "-cpu", cpu,
73
34
"-drive", f"file={iso_path},media=cdrom,format=raw",
74
# EL3 Runtime Software
35
)
75
- wait_for_console_pattern(self, "BL31: v2.11.0(release)")
36
+ if cpu:
76
+ wait_for_console_pattern(self, "BL31: v2.12.0(release)")
37
+ self.vm.add_args("-cpu", cpu)
77
38
78
# Non-trusted Firmware
39
self.vm.launch()
79
wait_for_console_pattern(self, "UEFI firmware (version 1.0")
40
wait_for_console_pattern(self, "Welcome to Alpine Linux 3.17")
41
@@ -XXX,XX +XXX,XX @@ def boot_alpine_linux(self, cpu):
42
def test_sbsaref_alpine_linux_cortex_a57(self):
43
self.boot_alpine_linux("cortex-a57")
44
45
- def test_sbsaref_alpine_linux_neoverse_n1(self):
46
- self.boot_alpine_linux("neoverse-n1")
47
+ def test_sbsaref_alpine_linux_default_cpu(self):
48
+ self.boot_alpine_linux()
49
50
def test_sbsaref_alpine_linux_max_pauth_off(self):
51
self.boot_alpine_linux("max,pauth=off")
52
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_alpine_linux_max(self):
53
# This tests the whole boot chain from EFI to Userspace
54
# We only boot a whole OS for the current top level CPU and GIC
55
# Other test profiles should use more minimal boots
56
- def boot_openbsd73(self, cpu):
57
+ def boot_openbsd73(self, cpu=None):
58
self.fetch_firmware()
59
60
img_path = self.ASSET_OPENBSD_ISO.fetch()
61
62
self.vm.set_console()
63
self.vm.add_args(
64
- "-cpu", cpu,
65
"-drive", f"file={img_path},format=raw,snapshot=on",
66
)
67
+ if cpu:
68
+ self.vm.add_args("-cpu", cpu)
69
70
self.vm.launch()
71
wait_for_console_pattern(self,
72
@@ -XXX,XX +XXX,XX @@ def boot_openbsd73(self, cpu):
73
def test_sbsaref_openbsd73_cortex_a57(self):
74
self.boot_openbsd73("cortex-a57")
75
76
- def test_sbsaref_openbsd73_neoverse_n1(self):
77
- self.boot_openbsd73("neoverse-n1")
78
+ def test_sbsaref_openbsd73_default_cpu(self):
79
+ self.boot_openbsd73()
80
81
def test_sbsaref_openbsd73_max_pauth_off(self):
82
self.boot_openbsd73("max,pauth=off")
83
--
80
--
84
2.34.1
81
2.34.1
Deleted patch
1
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
2
1
3
FreeBSD has a longer support cycle for its stable releases (14.x EoL in 2028)
4
than OpenBSD (the 7.3 release we use is already EoL). Bugfixes are also backported,
5
so we can stay on 14.x for longer.
6
7
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
8
Message-id: 20240910-b4-move-to-freebsd-v5-2-0fb66d803c93@linaro.org
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
tests/functional/test_aarch64_sbsaref.py | 43 +++++++++++++++++++++++-
13
1 file changed, 42 insertions(+), 1 deletion(-)
14
15
diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
16
index XXXXXXX..XXXXXXX 100755
17
--- a/tests/functional/test_aarch64_sbsaref.py
18
+++ b/tests/functional/test_aarch64_sbsaref.py
19
@@ -XXX,XX +XXX,XX @@
20
#!/usr/bin/env python3
21
#
22
-# Functional test that boots a Linux kernel and checks the console
23
+# Functional test that boots a kernel and checks the console
24
#
25
# SPDX-FileCopyrightText: 2023-2024 Linaro Ltd.
26
# SPDX-FileContributor: Philippe Mathieu-Daudé <philmd@linaro.org>
27
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_openbsd73_max(self):
28
self.boot_openbsd73("max")
29
30
31
+ ASSET_FREEBSD_ISO = Asset(
32
+ ('https://download.freebsd.org/releases/arm64/aarch64/ISO-IMAGES/'
33
+ '14.1/FreeBSD-14.1-RELEASE-arm64-aarch64-bootonly.iso'),
34
+ '44cdbae275ef1bb6dab1d5fbb59473d4f741e1c8ea8a80fd9e906b531d6ad461')
35
+
36
+ # This tests the whole boot chain from EFI to Userspace
37
+ # We only boot a whole OS for the current top level CPU and GIC
38
+ # Other test profiles should use more minimal boots
39
+ def boot_freebsd14(self, cpu=None):
40
+ self.fetch_firmware()
41
+
42
+ img_path = self.ASSET_FREEBSD_ISO.fetch()
43
+
44
+ self.vm.set_console()
45
+ self.vm.add_args(
46
+ "-drive", f"file={img_path},format=raw,snapshot=on",
47
+ )
48
+ if cpu:
49
+ self.vm.add_args("-cpu", cpu)
50
+
51
+ self.vm.launch()
52
+ wait_for_console_pattern(self, 'Welcome to FreeBSD!')
53
+
54
+ def test_sbsaref_freebsd14_cortex_a57(self):
55
+ self.boot_freebsd14("cortex-a57")
56
+
57
+ def test_sbsaref_freebsd14_default_cpu(self):
58
+ self.boot_freebsd14()
59
+
60
+ def test_sbsaref_freebsd14_max_pauth_off(self):
61
+ self.boot_freebsd14("max,pauth=off")
62
+
63
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
64
+ def test_sbsaref_freebsd14_max_pauth_impdef(self):
65
+ self.boot_freebsd14("max,pauth-impdef=on")
66
+
67
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
68
+ def test_sbsaref_freebsd14_max(self):
69
+ self.boot_freebsd14("max")
70
+
71
+
72
if __name__ == '__main__':
73
QemuSystemTest.main()
74
--
75
2.34.1
76
77
Deleted patch
1
The code at the tail end of the loop in kvm_dirty_ring_reaper_thread()
2
is unreachable, because there is no way for execution to leave the
3
loop. Replace it with a g_assert_not_reached().
4
1
5
(The code has always been unreachable, right from the start
6
when the function was added in commit b4420f198dd8.)
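
The shape of the change, as a standalone sketch rather than the actual QEMU
function:

    #include <glib.h>

    /* Worker loop with no exit path: anything placed after the loop can never
     * run, so say so explicitly instead of keeping dead cleanup-and-return code. */
    static gpointer reaper_thread(gpointer data)
    {
        (void)data;   /* unused in this sketch */

        for (;;) {
            /* sleep, reap the dirty rings, repeat -- nothing ever breaks out */
        }

        g_assert_not_reached();
    }
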
7
8
Resolves: Coverity CID 1547687
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Message-id: 20240815131206.3231819-3-peter.maydell@linaro.org
11
---
12
accel/kvm/kvm-all.c | 6 +-----
13
1 file changed, 1 insertion(+), 5 deletions(-)
14
15
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
16
index XXXXXXX..XXXXXXX 100644
17
--- a/accel/kvm/kvm-all.c
18
+++ b/accel/kvm/kvm-all.c
19
@@ -XXX,XX +XXX,XX @@ static void *kvm_dirty_ring_reaper_thread(void *data)
20
r->reaper_iteration++;
21
}
22
23
- trace_kvm_dirty_ring_reaper("exit");
24
-
25
- rcu_unregister_thread();
26
-
27
- return NULL;
28
+ g_assert_not_reached();
29
}
30
31
static void kvm_dirty_ring_reaper_init(KVMState *s)
32
--
33
2.34.1