First arm pullreq of the 8.0 series...

thanks
-- PMM

The following changes since commit 4747524f9f243ca5ff1f146d37e423c00e923ee1:

  Merge tag 'pull-qapi-2022-12-14-v2' of https://repo.or.cz/qemu/armbru into staging (2022-12-14 22:42:14 +0000)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20221215

for you to fetch changes up to 4f3ebdc33618e7c163f769047859d6f34373e3af:

  target/arm: Restrict arm_cpu_exec_interrupt() to TCG accelerator (2022-12-15 11:18:20 +0000)

----------------------------------------------------------------
target-arm queue:
 * hw/arm/virt: Add properties to allow more granular
   configuration of use of highmem space
 * target/arm: Add Cortex-A55 CPU
 * hw/intc/arm_gicv3: Fix GICD_TYPER ITLinesNumber advertisement
 * Implement FEAT_EVT
 * Some 3-phase-reset conversions for Arm GIC, SMMU
 * hw/arm/boot: set initrd with #address-cells type in fdt
 * align user-mode exposed ID registers with Linux
 * hw/misc: Move some arm-related files from specific_ss into softmmu_ss
 * Restrict arm_cpu_exec_interrupt() to TCG accelerator

----------------------------------------------------------------
Gavin Shan (7):
      hw/arm/virt: Introduce virt_set_high_memmap() helper
      hw/arm/virt: Rename variable size to region_size in virt_set_high_memmap()
      hw/arm/virt: Introduce variable region_base in virt_set_high_memmap()
      hw/arm/virt: Introduce virt_get_high_memmap_enabled() helper
      hw/arm/virt: Improve high memory region address assignment
      hw/arm/virt: Add 'compact-highmem' property
      hw/arm/virt: Add properties to disable high memory regions

Luke Starrett (1):
      hw/intc/arm_gicv3: Fix GICD_TYPER ITLinesNumber advertisement

Mihai Carabas (1):
      hw/arm/virt: build SMBIOS 19 table

Peter Maydell (15):
      target/arm: Allow relevant HCR bits to be written for FEAT_EVT
      target/arm: Implement HCR_EL2.TTLBIS traps
      target/arm: Implement HCR_EL2.TTLBOS traps
      target/arm: Implement HCR_EL2.TICAB,TOCU traps
      target/arm: Implement HCR_EL2.TID4 traps
      target/arm: Report FEAT_EVT for TCG '-cpu max'
      hw/arm: Convert TYPE_ARM_SMMU to 3-phase reset
      hw/arm: Convert TYPE_ARM_SMMUV3 to 3-phase reset
      hw/intc: Convert TYPE_ARM_GIC_COMMON to 3-phase reset
      hw/intc: Convert TYPE_ARM_GIC_KVM to 3-phase reset
      hw/intc: Convert TYPE_ARM_GICV3_COMMON to 3-phase reset
      hw/intc: Convert TYPE_KVM_ARM_GICV3 to 3-phase reset
      hw/intc: Convert TYPE_ARM_GICV3_ITS_COMMON to 3-phase reset
      hw/intc: Convert TYPE_ARM_GICV3_ITS to 3-phase reset
      hw/intc: Convert TYPE_KVM_ARM_ITS to 3-phase reset

Philippe Mathieu-Daudé (1):
      target/arm: Restrict arm_cpu_exec_interrupt() to TCG accelerator

Schspa Shi (1):
      hw/arm/boot: set initrd with #address-cells type in fdt

Thomas Huth (1):
      hw/misc: Move some arm-related files from specific_ss into softmmu_ss

Timofey Kutergin (1):
      target/arm: Add Cortex-A55 CPU

Zhuojia Shen (1):
      target/arm: align exposed ID registers with Linux

 docs/system/arm/emulation.rst          |   1 +
 docs/system/arm/virt.rst               |  18 +++
 include/hw/arm/smmuv3.h                |   2 +-
 include/hw/arm/virt.h                  |   2 +
 include/hw/misc/xlnx-zynqmp-apu-ctrl.h |   2 +-
 target/arm/cpu.h                       |  30 +++++
 target/arm/kvm-consts.h                |   8 +-
 hw/arm/boot.c                          |  10 +-
 hw/arm/smmu-common.c                   |   7 +-
 hw/arm/smmuv3.c                        |  12 +-
 hw/arm/virt.c                          | 202 +++++++++++++++++++++++-----
 hw/intc/arm_gic_common.c               |   7 +-
 hw/intc/arm_gic_kvm.c                  |  14 +-
 hw/intc/arm_gicv3_common.c             |   7 +-
 hw/intc/arm_gicv3_dist.c               |   4 +-
 hw/intc/arm_gicv3_its.c                |  14 +-
 hw/intc/arm_gicv3_its_common.c         |   7 +-
 hw/intc/arm_gicv3_its_kvm.c            |  14 +-
 hw/intc/arm_gicv3_kvm.c                |  14 +-
 hw/misc/imx6_src.c                     |   2 +-
 hw/misc/iotkit-sysctl.c                |   1 -
 target/arm/cpu.c                       |   5 +-
 target/arm/cpu64.c                     |  70 ++++++++++
 target/arm/cpu_tcg.c                   |   1 +
 target/arm/helper.c                    | 231 ++++++++++++++++++---------
 hw/misc/meson.build                    |  11 +-
 26 files changed, 538 insertions(+), 158 deletions(-)

Arm queue; the bulk of this is the VFP decodetree conversion...

thanks
-- PMM

The following changes since commit ae2b87341b5ddb0dcb1b3f2d4f586ef18de75873:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2019-06-12' into staging (2019-06-13 11:58:00 +0100)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190613

for you to fetch changes up to 07e4c7f769120c9a5bd6a26c2dc1421f2f838d80:

  target/arm: Fix short-vector increment behaviour (2019-06-13 12:57:37 +0100)

----------------------------------------------------------------
target-arm queue:
 * convert aarch32 VFP decoder to decodetree
   (includes tightening up decode in a few places)
 * fix minor bugs in VFP short-vector handling
 * hw/core/bus.c: Only the main system bus can have no parent
 * smmuv3: Fix decoding of ID register range
 * Implement NSACR gating of floating point
 * Use tcg_gen_gvec_bitsel
 * Vectorize USHL and SSHL

----------------------------------------------------------------
Peter Maydell (44):
      target/arm: Implement NSACR gating of floating point
      hw/arm/smmuv3: Fix decoding of ID register range
      hw/core/bus.c: Only the main system bus can have no parent
      target/arm: Add stubs for AArch32 VFP decodetree
      target/arm: Factor out VFP access checking code
      target/arm: Fix Cortex-R5F MVFR values
      target/arm: Explicitly enable VFP short-vectors for aarch32 -cpu max
      target/arm: Convert the VSEL instructions to decodetree
      target/arm: Convert VMINNM, VMAXNM to decodetree
      target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM to decodetree
      target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree
      target/arm: Move the VFP trans_* functions to translate-vfp.inc.c
      target/arm: Add helpers for VFP register loads and stores
      target/arm: Convert "double-precision" register moves to decodetree
      target/arm: Convert "single-precision" register moves to decodetree
      target/arm: Convert VFP two-register transfer insns to decodetree
      target/arm: Convert VFP VLDR and VSTR to decodetree
      target/arm: Convert the VFP load/store multiple insns to decodetree
      target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d
      target/arm: Convert VFP VMLA to decodetree
      target/arm: Convert VFP VMLS to decodetree
      target/arm: Convert VFP VNMLS to decodetree
      target/arm: Convert VFP VNMLA to decodetree
      target/arm: Convert VMUL to decodetree
      target/arm: Convert VNMUL to decodetree
      target/arm: Convert VADD to decodetree
      target/arm: Convert VSUB to decodetree
      target/arm: Convert VDIV to decodetree
      target/arm: Convert VFP fused multiply-add insns to decodetree
      target/arm: Convert VMOV (imm) to decodetree
      target/arm: Convert VABS to decodetree
      target/arm: Convert VNEG to decodetree
      target/arm: Convert VSQRT to decodetree
      target/arm: Convert VMOV (register) to decodetree
      target/arm: Convert VFP comparison insns to decodetree
      target/arm: Convert the VCVT-from-f16 insns to decodetree
      target/arm: Convert the VCVT-to-f16 insns to decodetree
      target/arm: Convert VFP round insns to decodetree
      target/arm: Convert double-single precision conversion insns to decodetree
      target/arm: Convert integer-to-float insns to decodetree
      target/arm: Convert VJCVT to decodetree
      target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree
      target/arm: Convert float-to-integer VCVT insns to decodetree
      target/arm: Fix short-vector increment behaviour

Richard Henderson (4):
      target/arm: Vectorize USHL and SSHL
      target/arm: Use tcg_gen_gvec_bitsel
      target/arm: Fix output of PAuth Auth
      decodetree: Fix comparison of Field

 target/arm/Makefile.objs          |   13 +
 tests/tcg/aarch64/Makefile.target |    2 +-
 target/arm/cpu.h                  |   11 +
 target/arm/helper.h               |   11 +-
 target/arm/translate-a64.h        |    2 +
 target/arm/translate.h            |    9 +-
 hw/arm/smmuv3.c                   |    2 +-
 hw/core/bus.c                     |   21 +-
 target/arm/cpu.c                  |    6 +
 target/arm/helper.c               |   75 +-
 target/arm/neon_helper.c          |   33 -
 target/arm/pauth_helper.c         |    4 +-
 target/arm/translate-a64.c        |   33 +-
 target/arm/translate-vfp.inc.c    | 2672 +++++++++++++++++++++++++++++++++++++
 target/arm/translate.c            | 1881 +++++---------------------
 target/arm/vec_helper.c           |   88 ++
 tests/tcg/aarch64/pauth-2.c       |   61 +
 scripts/decodetree.py             |    2 +-
 target/arm/vfp-uncond.decode      |   63 +
 target/arm/vfp.decode             |  242 ++++
 20 files changed, 3593 insertions(+), 1638 deletions(-)
 create mode 100644 target/arm/translate-vfp.inc.c
 create mode 100644 tests/tcg/aarch64/pauth-2.c
 create mode 100644 target/arm/vfp-uncond.decode
 create mode 100644 target/arm/vfp.decode
From: Gavin Shan <gshan@redhat.com>

This introduces virt_set_high_memmap() helper. The logic of high
memory region address assignment is moved to the helper. The intention
is to make the subsequent optimization for high memory region address
assignment easier.

No functional change intended.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-id: 20221029224307.138822-2-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c | 74 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 41 insertions(+), 33 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
     return arm_cpu_mp_affinity(idx, clustersz);
 }
 
+static void virt_set_high_memmap(VirtMachineState *vms,
+                                 hwaddr base, int pa_bits)
+{
+    int i;
+
+    for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
+        hwaddr size = extended_memmap[i].size;
+        bool fits;
+
+        base = ROUND_UP(base, size);
+        vms->memmap[i].base = base;
+        vms->memmap[i].size = size;
+
+        /*
+         * Check each device to see if they fit in the PA space,
+         * moving highest_gpa as we go.
+         *
+         * For each device that doesn't fit, disable it.
+         */
+        fits = (base + size) <= BIT_ULL(pa_bits);
+        if (fits) {
+            vms->highest_gpa = base + size - 1;
+        }
+
+        switch (i) {
+        case VIRT_HIGH_GIC_REDIST2:
+            vms->highmem_redists &= fits;
+            break;
+        case VIRT_HIGH_PCIE_ECAM:
+            vms->highmem_ecam &= fits;
+            break;
+        case VIRT_HIGH_PCIE_MMIO:
+            vms->highmem_mmio &= fits;
+            break;
+        }
+
+        base += size;
+    }
+}
+
 static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
 {
     MachineState *ms = MACHINE(vms);
@@ -XXX,XX +XXX,XX @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
     /* We know for sure that at least the memory fits in the PA space */
     vms->highest_gpa = memtop - 1;
 
-    for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
-        hwaddr size = extended_memmap[i].size;
-        bool fits;
-
-        base = ROUND_UP(base, size);
-        vms->memmap[i].base = base;
-        vms->memmap[i].size = size;
-
-        /*
-         * Check each device to see if they fit in the PA space,
-         * moving highest_gpa as we go.
-         *
-         * For each device that doesn't fit, disable it.
-         */
-        fits = (base + size) <= BIT_ULL(pa_bits);
-        if (fits) {
-            vms->highest_gpa = base + size - 1;
-        }
-
-        switch (i) {
-        case VIRT_HIGH_GIC_REDIST2:
-            vms->highmem_redists &= fits;
-            break;
-        case VIRT_HIGH_PCIE_ECAM:
-            vms->highmem_ecam &= fits;
-            break;
-        case VIRT_HIGH_PCIE_MMIO:
-            vms->highmem_mmio &= fits;
-            break;
-        }
-
-        base += size;
-    }
+    virt_set_high_memmap(vms, base, pa_bits);
 
     if (device_memory_size > 0) {
         ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
--
2.25.1

Convert the VFP VMOV (immediate) instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 129 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  27 +------
 target/arm/vfp.decode          |   5 ++
 3 files changed, 136 insertions(+), 25 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_sp *a)
 
     return true;
 }
+
+static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
+{
+    uint32_t delta_d = 0;
+    uint32_t bank_mask = 0;
+    int veclen = s->vec_len;
+    TCGv_i32 fd;
+    uint32_t n, i, vd;
+
+    vd = a->vd;
+
+    if (!dc_isar_feature(aa32_fpshvec, s) &&
+        (veclen != 0 || s->vec_stride != 0)) {
+        return false;
+    }
+
+    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (veclen > 0) {
+        bank_mask = 0x18;
+        /* Figure out what type of vector operation this is.  */
+        if ((vd & bank_mask) == 0) {
+            /* scalar */
+            veclen = 0;
+        } else {
+            delta_d = s->vec_stride + 1;
+        }
+    }
+
+    n = (a->imm4h << 28) & 0x80000000;
+    i = ((a->imm4h << 4) & 0x70) | a->imm4l;
+    if (i & 0x40) {
+        i |= 0x780;
+    } else {
+        i |= 0x800;
+    }
+    n |= i << 19;
+
+    fd = tcg_temp_new_i32();
+    tcg_gen_movi_i32(fd, n);
+
+    for (;;) {
+        neon_store_reg32(fd, vd);
+
+        if (veclen == 0) {
+            break;
+        }
+
+        /* Set up the operands for the next iteration */
+        veclen--;
+        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+    }
+
+    tcg_temp_free_i32(fd);
+    return true;
+}
+
+static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
+{
+    uint32_t delta_d = 0;
+    uint32_t bank_mask = 0;
+    int veclen = s->vec_len;
+    TCGv_i64 fd;
+    uint32_t n, i, vd;
+
+    vd = a->vd;
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (vd & 0x10)) {
+        return false;
+    }
+
+    if (!dc_isar_feature(aa32_fpshvec, s) &&
+        (veclen != 0 || s->vec_stride != 0)) {
+        return false;
+    }
+
+    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (veclen > 0) {
+        bank_mask = 0xc;
+        /* Figure out what type of vector operation this is.  */
+        if ((vd & bank_mask) == 0) {
+            /* scalar */
+            veclen = 0;
+        } else {
+            delta_d = (s->vec_stride >> 1) + 1;
+        }
+    }
+
+    n = (a->imm4h << 28) & 0x80000000;
+    i = ((a->imm4h << 4) & 0x70) | a->imm4l;
+    if (i & 0x40) {
+        i |= 0x3f80;
+    } else {
+        i |= 0x4000;
+    }
+    n |= i << 16;
+
+    fd = tcg_temp_new_i64();
+    tcg_gen_movi_i64(fd, ((uint64_t)n) << 32);
+
+    for (;;) {
+        neon_store_reg64(fd, vd);
+
+        if (veclen == 0) {
+            break;
+        }
+
+        /* Set up the operands for the next iteration */
+        veclen--;
+        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+    }
+
+    tcg_temp_free_i64(fd);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
  */
 static int disas_vfp_insn(DisasContext *s, uint32_t insn)
 {
-    uint32_t rd, rn, rm, op, i, n, delta_d, delta_m, bank_mask;
+    uint32_t rd, rn, rm, op, delta_d, delta_m, bank_mask;
     int dp, veclen;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 13:
+    case 0 ... 14:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation.  */
         switch (op) {
-        case 14: /* fconst */
-            if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
-                return 1;
-            }
-
-            n = (insn << 12) & 0x80000000;
-            i = ((insn >> 12) & 0x70) | (insn & 0xf);
-            if (dp) {
-                if (i & 0x40)
-                    i |= 0x3f80;
-                else
-                    i |= 0x4000;
-                n |= i << 16;
-                tcg_gen_movi_i64(cpu_F0d, ((uint64_t)n) << 32);
-            } else {
-                if (i & 0x40)
-                    i |= 0x780;
-                else
-                    i |= 0x800;
-                n |= i << 19;
-                tcg_gen_movi_i32(cpu_F0s, n);
-            }
-            break;
         case 15: /* extension space */
             switch (rn) {
             case 0: /* cpy */
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VFM_sp   ---- 1110 1.10 .... .... 1010 . o2:1 . 0 .... \
                      vm=%vm_sp vn=%vn_sp vd=%vd_sp o1=2
 VFM_dp   ---- 1110 1.10 .... .... 1011 . o2:1 . 0 .... \
                      vm=%vm_dp vn=%vn_dp vd=%vd_dp o1=2
+
+VMOV_imm_sp ---- 1110 1.11 imm4h:4 .... 1010 0000 imm4l:4 \
+             vd=%vd_sp
+VMOV_imm_dp ---- 1110 1.11 imm4h:4 .... 1011 0000 imm4l:4 \
+             vd=%vd_dp
--
2.20.1
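The immediate expansion in trans_VMOV_imm_sp() above packs the ARM ARM
"VFPExpandImm" rule into a few bitwise operations. The following
standalone sketch mirrors the same arithmetic (an editor's illustration
for readers; vfp_expand_imm_sp() is a hypothetical name, not QEMU's)
and checks it against the IEEE encoding of 1.0:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /*
     * Same arithmetic as trans_VMOV_imm_sp() above: the 8-bit VFP
     * immediate abcdefgh expands to sign a, exponent NOT(b):bbbbb:cd,
     * and fraction efgh followed by zeros.
     */
    static uint32_t vfp_expand_imm_sp(uint32_t imm4h, uint32_t imm4l)
    {
        uint32_t n = (imm4h << 28) & 0x80000000;    /* sign = bit a */
        uint32_t i = ((imm4h << 4) & 0x70) | imm4l; /* bits bcdefgh */
        if (i & 0x40) {          /* b == 1: exponent becomes 0b011111cd */
            i |= 0x780;
        } else {                 /* b == 0: exponent becomes 0b100000cd */
            i |= 0x800;
        }
        return n | (i << 19);
    }

    int main(void)
    {
        /* imm4h = 0x7, imm4l = 0x0 (imm8 = 0x70) encodes 1.0 */
        uint32_t bits = vfp_expand_imm_sp(0x7, 0x0);
        float f;
        memcpy(&f, &bits, sizeof(f));
        printf("0x%08x = %f\n", bits, f);   /* 0x3f800000 = 1.000000 */
        return 0;
    }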
From: Gavin Shan <gshan@redhat.com>

This renames variable 'size' to 'region_size' in virt_set_high_memmap().
Its counterpart ('region_base') will be introduced in the next patch.

No functional change intended.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-id: 20221029224307.138822-3-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
 static void virt_set_high_memmap(VirtMachineState *vms,
                                  hwaddr base, int pa_bits)
 {
+    hwaddr region_size;
+    bool fits;
     int i;
 
     for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
-        hwaddr size = extended_memmap[i].size;
-        bool fits;
+        region_size = extended_memmap[i].size;
 
-        base = ROUND_UP(base, size);
+        base = ROUND_UP(base, region_size);
         vms->memmap[i].base = base;
-        vms->memmap[i].size = size;
+        vms->memmap[i].size = region_size;
 
         /*
          * Check each device to see if they fit in the PA space,
@@ -XXX,XX +XXX,XX @@ static void virt_set_high_memmap(VirtMachineState *vms,
          *
          * For each device that doesn't fit, disable it.
          */
-        fits = (base + size) <= BIT_ULL(pa_bits);
+        fits = (base + region_size) <= BIT_ULL(pa_bits);
         if (fits) {
-            vms->highest_gpa = base + size - 1;
+            vms->highest_gpa = base + region_size - 1;
         }
 
         switch (i) {
@@ -XXX,XX +XXX,XX @@ static void virt_set_high_memmap(VirtMachineState *vms,
             break;
         }
 
-        base += size;
+        base += region_size;
     }
 }
--
2.25.1

For VFP short vectors, the VFP registers are divided into a
series of banks: for single-precision these are s0-s7, s8-s15,
s16-s23 and s24-s31; for double-precision they are d0-d3,
d4-d7, ... d28-d31. Some banks are "scalar" meaning that
use of a register within them triggers a pure-scalar or
mixed vector-scalar operation rather than a full vector
operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
When using a bank as part of a vector operation, we
iterate through it, increasing the register number by
the specified stride each time, and wrapping around to
the beginning of the bank.

Unfortunately our calculation of the "increment" part of this
was incorrect:
 vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
will only do the intended thing if bank_mask has exactly
one set high bit. For instance for doubles (bank_mask = 0xc),
if we start with vd = 6 and delta_d = 2 then vd is updated
to 12 rather than the intended 4.

This only causes problems in the unlikely case that the
starting register is not the first in its bank: if the
register number doesn't have to wrap around then the
expression happens to give the right answer.

Fix this bug by abstracting out the "check whether register
is in a scalar bank" and "advance register within bank"
operations to utility functions which use the right
bit masking operations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 100 ++++++++++++++++++++-------------
 1 file changed, 60 insertions(+), 40 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ typedef void VFPGen3OpDPFn(TCGv_i64 vd,
 typedef void VFPGen2OpSPFn(TCGv_i32 vd, TCGv_i32 vm);
 typedef void VFPGen2OpDPFn(TCGv_i64 vd, TCGv_i64 vm);
 
+/*
+ * Return true if the specified S reg is in a scalar bank
+ * (ie if it is s0..s7)
+ */
+static inline bool vfp_sreg_is_scalar(int reg)
+{
+    return (reg & 0x18) == 0;
+}
+
+/*
+ * Return true if the specified D reg is in a scalar bank
+ * (ie if it is d0..d3 or d16..d19)
+ */
+static inline bool vfp_dreg_is_scalar(int reg)
+{
+    return (reg & 0xc) == 0;
+}
+
+/*
+ * Advance the S reg number forwards by delta within its bank
+ * (ie increment the low 3 bits but leave the rest the same)
+ */
+static inline int vfp_advance_sreg(int reg, int delta)
+{
+    return ((reg + delta) & 0x7) | (reg & ~0x7);
+}
+
+/*
+ * Advance the D reg number forwards by delta within its bank
+ * (ie increment the low 2 bits but leave the rest the same)
+ */
+static inline int vfp_advance_dreg(int reg, int delta)
+{
+    return ((reg + delta) & 0x3) | (reg & ~0x3);
+}
+
 /*
  * Perform a 3-operand VFP data processing instruction. fn is the
  * callback to do the actual operation; this function deals with the
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn,
 {
     uint32_t delta_m = 0;
     uint32_t delta_d = 0;
-    uint32_t bank_mask = 0;
     int veclen = s->vec_len;
     TCGv_i32 f0, f1, fd;
     TCGv_ptr fpst;
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn,
 
     if (veclen > 0) {
-        bank_mask = 0x18;
-
         /* Figure out what type of vector operation this is.  */
-        if ((vd & bank_mask) == 0) {
+        if (vfp_sreg_is_scalar(vd)) {
             /* scalar */
             veclen = 0;
         } else {
             delta_d = s->vec_stride + 1;
 
-            if ((vm & bank_mask) == 0) {
+            if (vfp_sreg_is_scalar(vm)) {
                 /* mixed scalar/vector */
                 delta_m = 0;
             } else {
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn,
 
         /* Set up the operands for the next iteration */
         veclen--;
-        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
-        vn = ((vn + delta_d) & (bank_mask - 1)) | (vn & bank_mask);
+        vd = vfp_advance_sreg(vd, delta_d);
+        vn = vfp_advance_sreg(vn, delta_d);
         neon_load_reg32(f0, vn);
         if (delta_m) {
-            vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
+            vm = vfp_advance_sreg(vm, delta_m);
             neon_load_reg32(f1, vm);
         }
     }
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
 {
     uint32_t delta_m = 0;
     uint32_t delta_d = 0;
-    uint32_t bank_mask = 0;
     int veclen = s->vec_len;
     TCGv_i64 f0, f1, fd;
     TCGv_ptr fpst;
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
 
     if (veclen > 0) {
-        bank_mask = 0xc;
-
         /* Figure out what type of vector operation this is.  */
-        if ((vd & bank_mask) == 0) {
+        if (vfp_dreg_is_scalar(vd)) {
             /* scalar */
             veclen = 0;
         } else {
             delta_d = (s->vec_stride >> 1) + 1;
 
-            if ((vm & bank_mask) == 0) {
+            if (vfp_dreg_is_scalar(vm)) {
                 /* mixed scalar/vector */
                 delta_m = 0;
             } else {
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
     }
         /* Set up the operands for the next iteration */
         veclen--;
-        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
-        vn = ((vn + delta_d) & (bank_mask - 1)) | (vn & bank_mask);
+        vd = vfp_advance_dreg(vd, delta_d);
+        vn = vfp_advance_dreg(vn, delta_d);
         neon_load_reg64(f0, vn);
         if (delta_m) {
-            vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
+            vm = vfp_advance_dreg(vm, delta_m);
             neon_load_reg64(f1, vm);
         }
     }
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
 {
     uint32_t delta_m = 0;
     uint32_t delta_d = 0;
-    uint32_t bank_mask = 0;
     int veclen = s->vec_len;
     TCGv_i32 f0, fd;
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
 
     if (veclen > 0) {
-        bank_mask = 0x18;
-
         /* Figure out what type of vector operation this is.  */
-        if ((vd & bank_mask) == 0) {
+        if (vfp_sreg_is_scalar(vd)) {
             /* scalar */
             veclen = 0;
         } else {
             delta_d = s->vec_stride + 1;
 
-            if ((vm & bank_mask) == 0) {
+            if (vfp_sreg_is_scalar(vm)) {
                 /* mixed scalar/vector */
                 delta_m = 0;
             } else {
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
         if (delta_m == 0) {
             /* single source one-many */
             while (veclen--) {
-                vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+                vd = vfp_advance_sreg(vd, delta_d);
                 neon_store_reg32(fd, vd);
             }
             break;
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
 
         /* Set up the operands for the next iteration */
         veclen--;
-        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
-        vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
+        vd = vfp_advance_sreg(vd, delta_d);
+        vm = vfp_advance_sreg(vm, delta_m);
         neon_load_reg32(f0, vm);
     }
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
 {
     uint32_t delta_m = 0;
     uint32_t delta_d = 0;
-    uint32_t bank_mask = 0;
     int veclen = s->vec_len;
     TCGv_i64 f0, fd;
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
 
     if (veclen > 0) {
-        bank_mask = 0xc;
-
         /* Figure out what type of vector operation this is.  */
-        if ((vd & bank_mask) == 0) {
+        if (vfp_dreg_is_scalar(vd)) {
             /* scalar */
             veclen = 0;
         } else {
             delta_d = (s->vec_stride >> 1) + 1;
 
-            if ((vm & bank_mask) == 0) {
+            if (vfp_dreg_is_scalar(vm)) {
                 /* mixed scalar/vector */
                 delta_m = 0;
             } else {
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
         if (delta_m == 0) {
             /* single source one-many */
             while (veclen--) {
-                vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+                vd = vfp_advance_dreg(vd, delta_d);
                 neon_store_reg64(fd, vd);
             }
             break;
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
 
         /* Set up the operands for the next iteration */
         veclen--;
-        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
-        vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
+        vd = vfp_advance_dreg(vd, delta_d);
+        vm = vfp_advance_dreg(vm, delta_m);
         neon_load_reg64(f0, vm);
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_sp *a)
 static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
 {
     uint32_t delta_d = 0;
-    uint32_t bank_mask = 0;
     int veclen = s->vec_len;
     TCGv_i32 fd;
     uint32_t n, i, vd;
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
     }
 
     if (veclen > 0) {
-        bank_mask = 0x18;
         /* Figure out what type of vector operation this is.  */
-        if ((vd & bank_mask) == 0) {
+        if (vfp_sreg_is_scalar(vd)) {
             /* scalar */
             veclen = 0;
         } else {
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
 
         /* Set up the operands for the next iteration */
         veclen--;
-        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+        vd = vfp_advance_sreg(vd, delta_d);
     }
 
     tcg_temp_free_i32(fd);
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
 static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
 {
     uint32_t delta_d = 0;
-    uint32_t bank_mask = 0;
     int veclen = s->vec_len;
     TCGv_i64 fd;
     uint32_t n, i, vd;
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
     }
 
     if (veclen > 0) {
-        bank_mask = 0xc;
         /* Figure out what type of vector operation this is.  */
-        if ((vd & bank_mask) == 0) {
+        if (vfp_dreg_is_scalar(vd)) {
             /* scalar */
             veclen = 0;
         } else {
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
 
         /* Set up the operands for the next iteration */
         veclen--;
-        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+        vd = vfp_advance_dreg(vd, delta_d);
     }
 
     tcg_temp_free_i64(fd);
--
2.20.1
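The failure mode described in the commit message above is easy to
reproduce in isolation. A minimal sketch (an editor's illustration;
advance_old() is a hypothetical name for the removed expression)
reworking the d6/stride-2 example from the commit message:

    #include <stdio.h>

    /* The removed expression: only correct when bank_mask has a
     * single set high bit. */
    static int advance_old(int reg, int delta, int bank_mask)
    {
        return ((reg + delta) & (bank_mask - 1)) | (reg & bank_mask);
    }

    /* The new helper from the patch: increment the low two bits of a
     * D-register number and leave the bank bits alone. */
    static int vfp_advance_dreg(int reg, int delta)
    {
        return ((reg + delta) & 0x3) | (reg & ~0x3);
    }

    int main(void)
    {
        /* d6 with stride 2 in bank d4-d7 (bank_mask 0xc) should wrap to d4 */
        printf("old: d%d, new: d%d\n",
               advance_old(6, 2, 0xc), vfp_advance_dreg(6, 2));
        /* prints "old: d12, new: d4" */
        return 0;
    }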
From: Gavin Shan <gshan@redhat.com>

This introduces variable 'region_base' for the base address of the
specific high memory region. It's the preparatory work to optimize
high memory region address assignment.

No functional change intended.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-id: 20221029224307.138822-4-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
 static void virt_set_high_memmap(VirtMachineState *vms,
                                  hwaddr base, int pa_bits)
 {
-    hwaddr region_size;
+    hwaddr region_base, region_size;
     bool fits;
     int i;
 
     for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
+        region_base = ROUND_UP(base, extended_memmap[i].size);
         region_size = extended_memmap[i].size;
 
-        base = ROUND_UP(base, region_size);
-        vms->memmap[i].base = base;
+        vms->memmap[i].base = region_base;
         vms->memmap[i].size = region_size;
 
         /*
@@ -XXX,XX +XXX,XX @@ static void virt_set_high_memmap(VirtMachineState *vms,
          *
          * For each device that doesn't fit, disable it.
          */
-        fits = (base + region_size) <= BIT_ULL(pa_bits);
+        fits = (region_base + region_size) <= BIT_ULL(pa_bits);
         if (fits) {
-            vms->highest_gpa = base + region_size - 1;
+            vms->highest_gpa = region_base + region_size - 1;
         }
 
         switch (i) {
@@ -XXX,XX +XXX,XX @@ static void virt_set_high_memmap(VirtMachineState *vms,
             break;
         }
 
-        base += region_size;
+        base = region_base + region_size;
     }
 }
--
2.25.1

The current VFP code has two different idioms for
loading and storing from the VFP register file:
 1 using the gen_mov_F0_vreg() and similar functions,
   which load and store to a fixed set of TCG globals
   cpu_F0s, CPU_F0d, etc
 2 by direct calls to tcg_gen_ld_f64() and friends

We want to phase out idiom 1 (because the use of the
fixed globals is a relic of a much older version of TCG),
but idiom 2 is quite longwinded:
  tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg))
requires us to specify the 64-bitness twice, once in
the function name and once by passing 'true' to
vfp_reg_offset(). There's no guard against accidentally
passing the wrong flag.

Instead, let's move to a convention of accessing 64-bit
registers via the existing neon_load_reg64() and
neon_store_reg64(), and provide new neon_load_reg32()
and neon_store_reg32() for the 32-bit equivalents.

Implement the new functions and use them in the code in
translate-vfp.inc.c. We will convert the rest of the VFP
code as we do the decodetree conversion in subsequent
commits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 40 +++++++++++++++++-----------------
 target/arm/translate.c         | 10 +++++++++
 2 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         tcg_gen_ext_i32_i64(nf, cpu_NF);
         tcg_gen_ext_i32_i64(vf, cpu_VF);
 
-        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
+        neon_load_reg64(frn, rn);
+        neon_load_reg64(frm, rm);
         switch (a->cc) {
         case 0: /* eq: Z */
             tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
             tcg_temp_free_i64(tmp);
             break;
         }
-        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
+        neon_store_reg64(dest, rd);
         tcg_temp_free_i64(frn);
         tcg_temp_free_i64(frm);
         tcg_temp_free_i64(dest);
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         frn = tcg_temp_new_i32();
         frm = tcg_temp_new_i32();
         dest = tcg_temp_new_i32();
-        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
+        neon_load_reg32(frn, rn);
+        neon_load_reg32(frm, rm);
         switch (a->cc) {
         case 0: /* eq: Z */
             tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
             tcg_temp_free_i32(tmp);
             break;
         }
-        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
+        neon_store_reg32(dest, rd);
         tcg_temp_free_i32(frn);
         tcg_temp_free_i32(frm);
         tcg_temp_free_i32(dest);
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
         frm = tcg_temp_new_i64();
         dest = tcg_temp_new_i64();
 
-        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
+        neon_load_reg64(frn, rn);
+        neon_load_reg64(frm, rm);
         if (vmin) {
             gen_helper_vfp_minnumd(dest, frn, frm, fpst);
         } else {
             gen_helper_vfp_maxnumd(dest, frn, frm, fpst);
         }
-        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
+        neon_store_reg64(dest, rd);
         tcg_temp_free_i64(frn);
         tcg_temp_free_i64(frm);
         tcg_temp_free_i64(dest);
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
         frm = tcg_temp_new_i32();
         dest = tcg_temp_new_i32();
 
-        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
+        neon_load_reg32(frn, rn);
+        neon_load_reg32(frm, rm);
         if (vmin) {
             gen_helper_vfp_minnums(dest, frn, frm, fpst);
         } else {
             gen_helper_vfp_maxnums(dest, frn, frm, fpst);
         }
-        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
+        neon_store_reg32(dest, rd);
         tcg_temp_free_i32(frn);
         tcg_temp_free_i32(frm);
         tcg_temp_free_i32(dest);
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
         TCGv_i64 tcg_res;
         tcg_op = tcg_temp_new_i64();
         tcg_res = tcg_temp_new_i64();
-        tcg_gen_ld_f64(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
+        neon_load_reg64(tcg_op, rm);
         gen_helper_rintd(tcg_res, tcg_op, fpst);
-        tcg_gen_st_f64(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
+        neon_store_reg64(tcg_res, rd);
         tcg_temp_free_i64(tcg_op);
         tcg_temp_free_i64(tcg_res);
     } else {
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
         TCGv_i32 tcg_res;
         tcg_op = tcg_temp_new_i32();
         tcg_res = tcg_temp_new_i32();
-        tcg_gen_ld_f32(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
+        neon_load_reg32(tcg_op, rm);
         gen_helper_rints(tcg_res, tcg_op, fpst);
-        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
+        neon_store_reg32(tcg_res, rd);
         tcg_temp_free_i32(tcg_op);
         tcg_temp_free_i32(tcg_res);
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
         tcg_double = tcg_temp_new_i64();
         tcg_res = tcg_temp_new_i64();
         tcg_tmp = tcg_temp_new_i32();
-        tcg_gen_ld_f64(tcg_double, cpu_env, vfp_reg_offset(1, rm));
+        neon_load_reg64(tcg_double, rm);
         if (is_signed) {
             gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst);
         } else {
             gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst);
         }
         tcg_gen_extrl_i64_i32(tcg_tmp, tcg_res);
-        tcg_gen_st_f32(tcg_tmp, cpu_env, vfp_reg_offset(0, rd));
+        neon_store_reg32(tcg_tmp, rd);
         tcg_temp_free_i32(tcg_tmp);
         tcg_temp_free_i64(tcg_res);
         tcg_temp_free_i64(tcg_double);
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
         TCGv_i32 tcg_single, tcg_res;
         tcg_single = tcg_temp_new_i32();
         tcg_res = tcg_temp_new_i32();
-        tcg_gen_ld_f32(tcg_single, cpu_env, vfp_reg_offset(0, rm));
+        neon_load_reg32(tcg_single, rm);
         if (is_signed) {
             gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst);
         } else {
             gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
         }
-        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(0, rd));
+        neon_store_reg32(tcg_res, rd);
         tcg_temp_free_i32(tcg_res);
         tcg_temp_free_i32(tcg_single);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void neon_store_reg64(TCGv_i64 var, int reg)
     tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg));
 }
 
+static inline void neon_load_reg32(TCGv_i32 var, int reg)
+{
+    tcg_gen_ld_i32(var, cpu_env, vfp_reg_offset(false, reg));
+}
+
+static inline void neon_store_reg32(TCGv_i32 var, int reg)
+{
+    tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg));
+}
+
 static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
 {
     TCGv_ptr ret = tcg_temp_new_ptr();
--
2.20.1
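A minimal standalone sketch of why the typed wrappers help (an editor's
illustration with stand-in types and a hypothetical register layout;
nothing below is QEMU code): once the register size is implied by the
helper name and its argument type, the compiler rejects the size
mismatch that vfp_reg_offset()'s bool flag silently allowed.

    #include <stdio.h>

    /* Stand-ins for the TCG value types; only the distinction matters. */
    typedef struct { int idx; } TCGv_i32;
    typedef struct { int idx; } TCGv_i64;

    /* Hypothetical register-file layout, just for the printout. */
    static long vfp_reg_offset(int dp, int reg)
    {
        return dp ? reg * 8L : (reg / 2) * 8L + (reg & 1) * 4L;
    }

    static void neon_load_reg32(TCGv_i32 var, int reg)
    {
        (void)var;
        printf("ld32 s%d @ offset %ld\n", reg, vfp_reg_offset(0, reg));
    }

    static void neon_load_reg64(TCGv_i64 var, int reg)
    {
        (void)var;
        printf("ld64 d%d @ offset %ld\n", reg, vfp_reg_offset(1, reg));
    }

    int main(void)
    {
        TCGv_i32 s = { 0 };
        TCGv_i64 d = { 1 };
        neon_load_reg32(s, 3);   /* ok */
        neon_load_reg64(d, 3);   /* ok */
        /* neon_load_reg32(d, 3); is now a compile error, whereas
         * tcg_gen_ld_f32(..., vfp_reg_offset(true, 3)) compiled fine. */
        return 0;
    }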
From: Gavin Shan <gshan@redhat.com>

This introduces virt_get_high_memmap_enabled() helper, which returns
the pointer to vms->highmem_{redists, ecam, mmio}. The pointer will
be used in the subsequent patches.

No functional change intended.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-id: 20221029224307.138822-5-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
     return arm_cpu_mp_affinity(idx, clustersz);
 }
 
+static inline bool *virt_get_high_memmap_enabled(VirtMachineState *vms,
+                                                 int index)
+{
+    bool *enabled_array[] = {
+        &vms->highmem_redists,
+        &vms->highmem_ecam,
+        &vms->highmem_mmio,
+    };
+
+    assert(ARRAY_SIZE(extended_memmap) - VIRT_LOWMEMMAP_LAST ==
+           ARRAY_SIZE(enabled_array));
+    assert(index - VIRT_LOWMEMMAP_LAST < ARRAY_SIZE(enabled_array));
+
+    return enabled_array[index - VIRT_LOWMEMMAP_LAST];
+}
+
 static void virt_set_high_memmap(VirtMachineState *vms,
                                  hwaddr base, int pa_bits)
 {
     hwaddr region_base, region_size;
-    bool fits;
+    bool *region_enabled, fits;
     int i;
 
     for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
+        region_enabled = virt_get_high_memmap_enabled(vms, i);
         region_base = ROUND_UP(base, extended_memmap[i].size);
         region_size = extended_memmap[i].size;
 
@@ -XXX,XX +XXX,XX @@ static void virt_set_high_memmap(VirtMachineState *vms,
             vms->highest_gpa = region_base + region_size - 1;
         }
 
-        switch (i) {
-        case VIRT_HIGH_GIC_REDIST2:
-            vms->highmem_redists &= fits;
-            break;
-        case VIRT_HIGH_PCIE_ECAM:
-            vms->highmem_ecam &= fits;
-            break;
-        case VIRT_HIGH_PCIE_MMIO:
-            vms->highmem_mmio &= fits;
-            break;
-        }
-
+        *region_enabled &= fits;
         base = region_base + region_size;
     }
 }
--
2.25.1

Move the trans_*() functions we've just created from translate.c
to translate-vfp.inc.c. This is pure code motion with no textual
changes (this can be checked with 'git show --color-moved').

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 337 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 337 ---------------------------------
 2 files changed, 337 insertions(+), 337 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool vfp_access_check(DisasContext *s)
 {
     return full_vfp_access_check(s, false);
 }
+
+static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
+{
+    uint32_t rd, rn, rm;
+    bool dp = a->dp;
+
+    if (!dc_isar_feature(aa32_vsel, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
+        return false;
+    }
+    rd = a->vd;
+    rn = a->vn;
+    rm = a->vm;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (dp) {
+        TCGv_i64 frn, frm, dest;
+        TCGv_i64 tmp, zero, zf, nf, vf;
+
+        zero = tcg_const_i64(0);
+
+        frn = tcg_temp_new_i64();
+        frm = tcg_temp_new_i64();
+        dest = tcg_temp_new_i64();
+
+        zf = tcg_temp_new_i64();
+        nf = tcg_temp_new_i64();
+        vf = tcg_temp_new_i64();
+
+        tcg_gen_extu_i32_i64(zf, cpu_ZF);
+        tcg_gen_ext_i32_i64(nf, cpu_NF);
+        tcg_gen_ext_i32_i64(vf, cpu_VF);
+
+        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
+        switch (a->cc) {
+        case 0: /* eq: Z */
+            tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
+                                frn, frm);
+            break;
+        case 1: /* vs: V */
+            tcg_gen_movcond_i64(TCG_COND_LT, dest, vf, zero,
+                                frn, frm);
+            break;
+        case 2: /* ge: N == V -> N ^ V == 0 */
+            tmp = tcg_temp_new_i64();
+            tcg_gen_xor_i64(tmp, vf, nf);
+            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
+                                frn, frm);
+            tcg_temp_free_i64(tmp);
+            break;
+        case 3: /* gt: !Z && N == V */
+            tcg_gen_movcond_i64(TCG_COND_NE, dest, zf, zero,
+                                frn, frm);
+            tmp = tcg_temp_new_i64();
+            tcg_gen_xor_i64(tmp, vf, nf);
+            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
+                                dest, frm);
+            tcg_temp_free_i64(tmp);
+            break;
+        }
+        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i64(frn);
+        tcg_temp_free_i64(frm);
+        tcg_temp_free_i64(dest);
+
+        tcg_temp_free_i64(zf);
+        tcg_temp_free_i64(nf);
+        tcg_temp_free_i64(vf);
+
+        tcg_temp_free_i64(zero);
+    } else {
+        TCGv_i32 frn, frm, dest;
+        TCGv_i32 tmp, zero;
+
+        zero = tcg_const_i32(0);
+
+        frn = tcg_temp_new_i32();
+        frm = tcg_temp_new_i32();
+        dest = tcg_temp_new_i32();
+        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
+        switch (a->cc) {
+        case 0: /* eq: Z */
+            tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
+                                frn, frm);
+            break;
+        case 1: /* vs: V */
+            tcg_gen_movcond_i32(TCG_COND_LT, dest, cpu_VF, zero,
+                                frn, frm);
+            break;
+        case 2: /* ge: N == V -> N ^ V == 0 */
+            tmp = tcg_temp_new_i32();
+            tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
+                                frn, frm);
+            tcg_temp_free_i32(tmp);
+            break;
+        case 3: /* gt: !Z && N == V */
+            tcg_gen_movcond_i32(TCG_COND_NE, dest, cpu_ZF, zero,
+                                frn, frm);
+            tmp = tcg_temp_new_i32();
+            tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
+                                dest, frm);
+            tcg_temp_free_i32(tmp);
+            break;
+        }
+        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i32(frn);
+        tcg_temp_free_i32(frm);
+        tcg_temp_free_i32(dest);
+
+        tcg_temp_free_i32(zero);
+    }
+
+    return true;
+}
+
+static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
+{
+    uint32_t rd, rn, rm;
+    bool dp = a->dp;
+    bool vmin = a->op;
+    TCGv_ptr fpst;
+
+    if (!dc_isar_feature(aa32_vminmaxnm, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
+        return false;
+    }
+    rd = a->vd;
+    rn = a->vn;
+    rm = a->vm;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(0);
+
+    if (dp) {
+        TCGv_i64 frn, frm, dest;
+
+        frn = tcg_temp_new_i64();
+        frm = tcg_temp_new_i64();
+        dest = tcg_temp_new_i64();
+
+        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
+        if (vmin) {
+            gen_helper_vfp_minnumd(dest, frn, frm, fpst);
+        } else {
+            gen_helper_vfp_maxnumd(dest, frn, frm, fpst);
+        }
+        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i64(frn);
+        tcg_temp_free_i64(frm);
+        tcg_temp_free_i64(dest);
+    } else {
+        TCGv_i32 frn, frm, dest;
+
+        frn = tcg_temp_new_i32();
+        frm = tcg_temp_new_i32();
+        dest = tcg_temp_new_i32();
+
+        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
+        if (vmin) {
+            gen_helper_vfp_minnums(dest, frn, frm, fpst);
+        } else {
+            gen_helper_vfp_maxnums(dest, frn, frm, fpst);
+        }
+        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i32(frn);
+        tcg_temp_free_i32(frm);
+        tcg_temp_free_i32(dest);
+    }
+
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
+
+/*
+ * Table for converting the most common AArch32 encoding of
+ * rounding mode to arm_fprounding order (which matches the
+ * common AArch64 order); see ARM ARM pseudocode FPDecodeRM().
+ */
+static const uint8_t fp_decode_rm[] = {
+    FPROUNDING_TIEAWAY,
+    FPROUNDING_TIEEVEN,
+    FPROUNDING_POSINF,
+    FPROUNDING_NEGINF,
+};
+
+static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
+{
+    uint32_t rd, rm;
+    bool dp = a->dp;
+    TCGv_ptr fpst;
+    TCGv_i32 tcg_rmode;
+    int rounding = fp_decode_rm[a->rm];
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+        ((a->vm | a->vd) & 0x10)) {
+        return false;
+    }
+    rd = a->vd;
+    rm = a->vm;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(0);
+
+    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+
+    if (dp) {
+        TCGv_i64 tcg_op;
+        TCGv_i64 tcg_res;
+        tcg_op = tcg_temp_new_i64();
+        tcg_res = tcg_temp_new_i64();
+        tcg_gen_ld_f64(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
+        gen_helper_rintd(tcg_res, tcg_op, fpst);
+        tcg_gen_st_f64(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i64(tcg_op);
+        tcg_temp_free_i64(tcg_res);
+    } else {
+        TCGv_i32 tcg_op;
+        TCGv_i32 tcg_res;
+        tcg_op = tcg_temp_new_i32();
+        tcg_res = tcg_temp_new_i32();
+        tcg_gen_ld_f32(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
+        gen_helper_rints(tcg_res, tcg_op, fpst);
+        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i32(tcg_op);
+        tcg_temp_free_i32(tcg_res);
+    }
+
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+    tcg_temp_free_i32(tcg_rmode);
+
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
+
+static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
+{
+    uint32_t rd, rm;
+    bool dp = a->dp;
+    TCGv_ptr fpst;
+    TCGv_i32 tcg_rmode, tcg_shift;
+    int rounding = fp_decode_rm[a->rm];
+    bool is_signed = a->op;
+
+    if (!dc_isar_feature(aa32_vcvt_dr, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+        return false;
+    }
+    rd = a->vd;
+    rm = a->vm;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(0);
+
+    tcg_shift = tcg_const_i32(0);
+
+    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+
+    if (dp) {
+        TCGv_i64 tcg_double, tcg_res;
+        TCGv_i32 tcg_tmp;
+        tcg_double = tcg_temp_new_i64();
+        tcg_res = tcg_temp_new_i64();
+        tcg_tmp = tcg_temp_new_i32();
+        tcg_gen_ld_f64(tcg_double, cpu_env, vfp_reg_offset(1, rm));
+        if (is_signed) {
+            gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst);
+        } else {
+            gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst);
+        }
+        tcg_gen_extrl_i64_i32(tcg_tmp, tcg_res);
+        tcg_gen_st_f32(tcg_tmp, cpu_env, vfp_reg_offset(0, rd));
+        tcg_temp_free_i32(tcg_tmp);
+        tcg_temp_free_i64(tcg_res);
+        tcg_temp_free_i64(tcg_double);
+    } else {
+        TCGv_i32 tcg_single, tcg_res;
+        tcg_single = tcg_temp_new_i32();
+        tcg_res = tcg_temp_new_i32();
+        tcg_gen_ld_f32(tcg_single, cpu_env, vfp_reg_offset(0, rm));
+        if (is_signed) {
+            gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst);
+        } else {
+            gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
+        }
+        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(0, rd));
+        tcg_temp_free_i32(tcg_res);
+        tcg_temp_free_i32(tcg_single);
+    }
+
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+    tcg_temp_free_i32(tcg_rmode);
+
+    tcg_temp_free_i32(tcg_shift);
+
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
     tcg_temp_free_i32(tmp);
 }
 
-static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
-{
-    uint32_t rd, rn, rm;
-    bool dp = a->dp;
-
-    if (!dc_isar_feature(aa32_vsel, s)) {
-        return false;
-    }
-
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
-        ((a->vm | a->vn | a->vd) & 0x10)) {
-        return false;
-    }
-    rd = a->vd;
-    rn = a->vn;
-    rm = a->vm;
-
-    if (!vfp_access_check(s)) {
-        return true;
-    }
-
-    if (dp) {
-        TCGv_i64 frn, frm, dest;
-        TCGv_i64 tmp, zero, zf, nf, vf;
-
-        zero = tcg_const_i64(0);
-
-        frn = tcg_temp_new_i64();
-        frm = tcg_temp_new_i64();
-        dest = tcg_temp_new_i64();
-
-        zf = tcg_temp_new_i64();
-        nf = tcg_temp_new_i64();
-        vf = tcg_temp_new_i64();
-
-        tcg_gen_extu_i32_i64(zf, cpu_ZF);
-        tcg_gen_ext_i32_i64(nf, cpu_NF);
-        tcg_gen_ext_i32_i64(vf, cpu_VF);
-
-        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
-        switch (a->cc) {
-        case 0: /* eq: Z */
-            tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
-                                frn, frm);
-            break;
-        case 1: /* vs: V */
-            tcg_gen_movcond_i64(TCG_COND_LT, dest, vf, zero,
-                                frn, frm);
-            break;
-        case 2: /* ge: N == V -> N ^ V == 0 */
-            tmp = tcg_temp_new_i64();
-            tcg_gen_xor_i64(tmp, vf, nf);
-            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
-                                frn, frm);
-            tcg_temp_free_i64(tmp);
-            break;
-        case 3: /* gt: !Z && N == V */
-            tcg_gen_movcond_i64(TCG_COND_NE, dest, zf, zero,
-                                frn, frm);
-            tmp = tcg_temp_new_i64();
-            tcg_gen_xor_i64(tmp, vf, nf);
-            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
-                                dest, frm);
-            tcg_temp_free_i64(tmp);
-            break;
-        }
-        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
-        tcg_temp_free_i64(frn);
-        tcg_temp_free_i64(frm);
-        tcg_temp_free_i64(dest);
-
-        tcg_temp_free_i64(zf);
-        tcg_temp_free_i64(nf);
-        tcg_temp_free_i64(vf);
-
-        tcg_temp_free_i64(zero);
-    } else {
-        TCGv_i32 frn, frm, dest;
-        TCGv_i32 tmp, zero;
-
-        zero = tcg_const_i32(0);
-
-        frn = tcg_temp_new_i32();
-        frm = tcg_temp_new_i32();
-        dest = tcg_temp_new_i32();
-        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
-        switch (a->cc) {
-        case 0: /* eq: Z */
-            tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
-                                frn, frm);
-            break;
-        case 1: /* vs: V */
-            tcg_gen_movcond_i32(TCG_COND_LT, dest, cpu_VF, zero,
-                                frn, frm);
-            break;
-        case 2: /* ge: N == V -> N ^ V == 0 */
-            tmp = tcg_temp_new_i32();
-            tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
-                                frn, frm);
-            tcg_temp_free_i32(tmp);
-            break;
-        case 3: /* gt: !Z && N == V */
-            tcg_gen_movcond_i32(TCG_COND_NE, dest, cpu_ZF, zero,
-                                frn, frm);
-            tmp = tcg_temp_new_i32();
-            tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
-                                dest, frm);
-            tcg_temp_free_i32(tmp);
-            break;
-        }
-        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
-        tcg_temp_free_i32(frn);
-        tcg_temp_free_i32(frm);
-        tcg_temp_free_i32(dest);
-
-        tcg_temp_free_i32(zero);
-    }
-
-    return true;
-}
-
-static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
-{
-    uint32_t rd, rn, rm;
-    bool dp = a->dp;
-    bool vmin = a->op;
-    TCGv_ptr fpst;
-
-    if (!dc_isar_feature(aa32_vminmaxnm, s)) {
-        return false;
-    }
-
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
-        ((a->vm | a->vn | a->vd) & 0x10)) {
-        return false;
-    }
-    rd = a->vd;
-    rn = a->vn;
-    rm = a->vm;
-
-    if (!vfp_access_check(s)) {
-        return true;
-    }
-
-    fpst = get_fpstatus_ptr(0);
-
-    if (dp) {
-        TCGv_i64 frn, frm, dest;
-
-        frn = tcg_temp_new_i64();
-        frm = tcg_temp_new_i64();
-        dest = tcg_temp_new_i64();
-
-        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
-        if (vmin) {
-            gen_helper_vfp_minnumd(dest, frn, frm, fpst);
-        } else {
-            gen_helper_vfp_maxnumd(dest, frn, frm, fpst);
-        }
-        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
-        tcg_temp_free_i64(frn);
-        tcg_temp_free_i64(frm);
-        tcg_temp_free_i64(dest);
-    } else {
-        TCGv_i32 frn, frm, dest;
-
-        frn = tcg_temp_new_i32();
-        frm = tcg_temp_new_i32();
-        dest = tcg_temp_new_i32();
-
-        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
-        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
-        if (vmin) {
-            gen_helper_vfp_minnums(dest, frn, frm, fpst);
-        } else {
-            gen_helper_vfp_maxnums(dest, frn, frm, fpst);
-        }
-        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
-        tcg_temp_free_i32(frn);
-        tcg_temp_free_i32(frm);
-        tcg_temp_free_i32(dest);
-    }
-
-    tcg_temp_free_ptr(fpst);
-    return true;
-}
-
-/*
- * Table for converting the most common AArch32 encoding of
- * rounding mode to arm_fprounding order (which matches the
- * common AArch64 order); see ARM ARM pseudocode FPDecodeRM().
- */
-static const uint8_t fp_decode_rm[] = {
-    FPROUNDING_TIEAWAY,
-    FPROUNDING_TIEEVEN,
-    FPROUNDING_POSINF,
-    FPROUNDING_NEGINF,
-};
-
-static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
-{
-    uint32_t rd, rm;
-    bool dp = a->dp;
-    TCGv_ptr fpst;
-    TCGv_i32 tcg_rmode;
-    int rounding = fp_decode_rm[a->rm];
-
-    if (!dc_isar_feature(aa32_vrint, s)) {
-        return false;
-    }
-
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
-        ((a->vm | a->vd) & 0x10)) {
-        return false;
-    }
-    rd = a->vd;
-    rm = a->vm;
-
-    if (!vfp_access_check(s)) {
-        return true;
-    }
-
-    fpst = get_fpstatus_ptr(0);
-
-    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
-    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
-
-    if (dp) {
-        TCGv_i64 tcg_op;
-        TCGv_i64 tcg_res;
-        tcg_op = tcg_temp_new_i64();
-        tcg_res = tcg_temp_new_i64();
-        tcg_gen_ld_f64(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
-        gen_helper_rintd(tcg_res, tcg_op, fpst);
-        tcg_gen_st_f64(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
-        tcg_temp_free_i64(tcg_op);
-        tcg_temp_free_i64(tcg_res);
-    } else {
-        TCGv_i32 tcg_op;
-        TCGv_i32 tcg_res;
-        tcg_op = tcg_temp_new_i32();
-        tcg_res = tcg_temp_new_i32();
-        tcg_gen_ld_f32(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
-        gen_helper_rints(tcg_res, tcg_op, fpst);
-        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
-        tcg_temp_free_i32(tcg_op);
-        tcg_temp_free_i32(tcg_res);
-    }
-
-    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
-    tcg_temp_free_i32(tcg_rmode);
-
-    tcg_temp_free_ptr(fpst);
-    return true;
-}
-
-static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
-{
-    uint32_t rd, rm;
-    bool dp = a->dp;
633
- TCGv_ptr fpst;
634
- TCGv_i32 tcg_rmode, tcg_shift;
635
- int rounding = fp_decode_rm[a->rm];
636
- bool is_signed = a->op;
637
-
638
- if (!dc_isar_feature(aa32_vcvt_dr, s)) {
639
- return false;
640
- }
641
-
642
- /* UNDEF accesses to D16-D31 if they don't exist */
643
- if (dp && !dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
644
- return false;
645
- }
646
- rd = a->vd;
647
- rm = a->vm;
648
-
649
- if (!vfp_access_check(s)) {
650
- return true;
651
- }
652
-
653
- fpst = get_fpstatus_ptr(0);
654
-
655
- tcg_shift = tcg_const_i32(0);
656
-
657
- tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
658
- gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
659
-
660
- if (dp) {
661
- TCGv_i64 tcg_double, tcg_res;
662
- TCGv_i32 tcg_tmp;
663
- tcg_double = tcg_temp_new_i64();
664
- tcg_res = tcg_temp_new_i64();
665
- tcg_tmp = tcg_temp_new_i32();
666
- tcg_gen_ld_f64(tcg_double, cpu_env, vfp_reg_offset(1, rm));
667
- if (is_signed) {
668
- gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst);
669
- } else {
670
- gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst);
671
- }
672
- tcg_gen_extrl_i64_i32(tcg_tmp, tcg_res);
673
- tcg_gen_st_f32(tcg_tmp, cpu_env, vfp_reg_offset(0, rd));
674
- tcg_temp_free_i32(tcg_tmp);
675
- tcg_temp_free_i64(tcg_res);
676
- tcg_temp_free_i64(tcg_double);
677
- } else {
678
- TCGv_i32 tcg_single, tcg_res;
679
- tcg_single = tcg_temp_new_i32();
680
- tcg_res = tcg_temp_new_i32();
681
- tcg_gen_ld_f32(tcg_single, cpu_env, vfp_reg_offset(0, rm));
682
- if (is_signed) {
683
- gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst);
684
- } else {
685
- gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
686
- }
687
- tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(0, rd));
688
- tcg_temp_free_i32(tcg_res);
689
- tcg_temp_free_i32(tcg_single);
690
- }
691
-
692
- gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
693
- tcg_temp_free_i32(tcg_rmode);
694
-
695
- tcg_temp_free_i32(tcg_shift);
696
-
697
- tcg_temp_free_ptr(fpst);
698
-
699
- return true;
700
-}
701
-
702
/*
703
* Disassemble a VFP instruction. Returns nonzero if an error occurred
704
* (ie. an undefined instruction).
705
--
77
--
706
2.20.1
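The movcond sequences above are easier to follow with QEMU's flag encoding in mind: cpu_ZF is zero exactly when the Z flag is set, while the N and V flags live in the sign bits of cpu_NF and cpu_VF. A standalone C sketch of the selection logic (a hypothetical helper for illustration, not QEMU code):

  #include <stdio.h>
  #include <stdint.h>

  /* Mirror of the VSEL condition handling emitted via movcond above. */
  static uint64_t vsel(int cc, int32_t zf, int32_t nf, int32_t vf,
                       uint64_t frn, uint64_t frm)
  {
      switch (cc) {
      case 0: /* eq: Z set, i.e. cpu_ZF == 0 */
          return zf == 0 ? frn : frm;
      case 1: /* vs: V set, i.e. sign bit of cpu_VF */
          return vf < 0 ? frn : frm;
      case 2: /* ge: N == V, i.e. (N ^ V) >= 0 */
          return (nf ^ vf) >= 0 ? frn : frm;
      default: /* gt: !Z && N == V */
          return (zf != 0 && (nf ^ vf) >= 0) ? frn : frm;
      }
  }

  int main(void)
  {
      /* Z clear and N == V: "gt" selects the first operand */
      printf("%llx\n",
             (unsigned long long)vsel(3, 1, -1, -1, 0x111, 0x222));
      return 0;
  }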
-        break;
-    case VIRT_HIGH_PCIE_ECAM:
-        vms->highmem_ecam &= fits;
-        break;
-    case VIRT_HIGH_PCIE_MMIO:
-        vms->highmem_mmio &= fits;
-        break;
-    }
-
+        *region_enabled &= fits;
         base = region_base + region_size;
     }
 }
--
2.25.1
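For reference, the fit check in virt_set_high_memmap() boils down to a single comparison against the top of the physical address space. A minimal standalone sketch with made-up values (not the QEMU function itself):

  #include <stdio.h>
  #include <stdbool.h>
  #include <stdint.h>

  #define BIT_ULL(n) (1ULL << (n))

  int main(void)
  {
      unsigned pa_bits = 40;
      uint64_t region_base = 512ULL << 30;   /* 512GB */
      uint64_t region_size = 512ULL << 30;   /* 512GB */

      /* Region is kept enabled only if it ends at or below 2^pa_bits. */
      bool fits = (region_base + region_size) <= BIT_ULL(pa_bits);
      printf("fits=%d\n", fits);  /* 1: exactly reaches 1TB == 2^40 */
      return 0;
  }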
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 10 ++++++++++
 target/arm/translate.c         |  8 +-------
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
     return true;
 }
 
+static bool trans_VMOV_reg_sp(DisasContext *s, arg_VMOV_reg_sp *a)
+{
+    return do_vfp_2op_sp(s, tcg_gen_mov_i32, a->vd, a->vm);
+}
+
+static bool trans_VMOV_reg_dp(DisasContext *s, arg_VMOV_reg_dp *a)
+{
+    return do_vfp_2op_dp(s, tcg_gen_mov_i64, a->vd, a->vm);
+}
+
 static bool trans_VABS_sp(DisasContext *s, arg_VABS_sp *a)
 {
     return do_vfp_2op_sp(s, gen_helper_vfp_abss, a->vd, a->vm);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         return 1;
     case 15:
         switch (rn) {
-        case 1 ... 3:
+        case 0 ... 3:
             /* Already handled by decodetree */
             return 1;
         default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     if (op == 15) {
         /* rn is opcode, encoded as per VFP_SREG_N. */
         switch (rn) {
-        case 0x00: /* vmov */
-            break;
-
         case 0x04: /* vcvtb.f64.f16, vcvtb.f32.f16 */
         case 0x05: /* vcvtt.f64.f16, vcvtt.f32.f16 */
             /*
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     switch (op) {
     case 15: /* extension space */
         switch (rn) {
-        case 0: /* cpy */
-            /* no-op */
-            break;
         case 4: /* vcvtb.f32.f16, vcvtb.f64.f16 */
         {
             TCGv_ptr fpst = get_fpstatus_ptr(false);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VMOV_imm_sp  ---- 1110 1.11 imm4h:4 .... 1010 0000 imm4l:4 \
 VMOV_imm_dp  ---- 1110 1.11 imm4h:4 .... 1011 0000 imm4l:4 \
              vd=%vd_dp
 
+VMOV_reg_sp  ---- 1110 1.11 0000 .... 1010 01.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VMOV_reg_dp  ---- 1110 1.11 0000 .... 1011 01.0 .... \
+             vd=%vd_dp vm=%vm_dp
+
 VABS_sp      ---- 1110 1.11 0000 .... 1010 11.0 .... \
              vd=%vd_sp vm=%vm_sp
 VABS_dp      ---- 1110 1.11 0000 .... 1011 11.0 .... \
--
2.20.1

From: Gavin Shan <gshan@redhat.com>

There are three high memory regions: VIRT_HIGH_GIC_REDIST2,
VIRT_HIGH_PCIE_ECAM and VIRT_HIGH_PCIE_MMIO. Their base addresses
float above the highest RAM address. However, they can be disabled
in several cases.

(1) A specific high memory region can be disabled in code by
    toggling vms->highmem_{redists, ecam, mmio}.

(2) The VIRT_HIGH_PCIE_ECAM region is disabled on machine types
    'virt-2.12' and earlier.

(3) The VIRT_HIGH_PCIE_ECAM region is disabled when firmware is
    loaded on a 32-bit system.

(4) A specific high memory region is disabled when it would break
    the PA space limit.

The current implementation of virt_set_{memmap, high_memmap}() isn't
optimal, because the high memory region's PA space is always reserved
regardless of the actual state of the corresponding
vms->highmem_{redists, ecam, mmio} flag: 'base' and
'vms->highest_gpa' are always increased for cases (1), (2) and (3).
This is unnecessary, since the PA space assigned to a disabled high
memory region is never used afterwards.

Improve the address assignment for these three high memory regions by
skipping the address assignment for any region that has been disabled
in case (1), (2) or (3). The memory layout may change after this
improvement is applied, which leads to potential migration breakage,
so 'vms->highmem_compact' is added to control whether the improvement
is applied. For now, 'vms->highmem_compact' is set to false, meaning
that the memory layout does not change until it becomes configurable
through the 'compact-highmem' property in the next patch.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-id: 20221029224307.138822-6-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/virt.h |  1 +
 hw/arm/virt.c         | 15 ++++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -XXX,XX +XXX,XX @@ struct VirtMachineState {
     PFlashCFI01 *flash[2];
     bool secure;
     bool highmem;
+    bool highmem_compact;
     bool highmem_ecam;
     bool highmem_mmio;
     bool highmem_redists;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void virt_set_high_memmap(VirtMachineState *vms,
         vms->memmap[i].size = region_size;
 
         /*
-         * Check each device to see if they fit in the PA space,
-         * moving highest_gpa as we go.
+         * Check each device to see if it fits in the PA space,
+         * moving highest_gpa as we go. For compatibility, move
+         * highest_gpa for disabled fitting devices as well, if
+         * the compact layout has been disabled.
          *
         * For each device that doesn't fit, disable it.
         */
        fits = (region_base + region_size) <= BIT_ULL(pa_bits);
-        if (fits) {
-            vms->highest_gpa = region_base + region_size - 1;
+        *region_enabled &= fits;
+        if (vms->highmem_compact && !*region_enabled) {
+            continue;
        }
 
-        *region_enabled &= fits;
         base = region_base + region_size;
+        if (fits) {
+            vms->highest_gpa = base - 1;
+        }
     }
 }
 
--
2.25.1
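The effect of the change is easiest to see as a loop over regions where, in the compact layout, a disabled region no longer advances the base address. A standalone sketch under assumed region sizes (hypothetical, not the QEMU code):

  #include <stdio.h>
  #include <stdbool.h>
  #include <stdint.h>

  struct region { const char *name; uint64_t size; bool enabled; };

  int main(void)
  {
      struct region regions[] = {
          { "HIGH_GIC_REDIST2",  64ULL << 20, false },
          { "HIGH_PCIE_ECAM",   256ULL << 20, false },
          { "HIGH_PCIE_MMIO",   512ULL << 30, true  },
      };
      uint64_t base = 512ULL << 30;  /* just above 512GB of RAM */
      bool compact = true;

      for (int i = 0; i < 3; i++) {
          if (compact && !regions[i].enabled) {
              continue;  /* compact layout: reserve no PA space */
          }
          printf("%s at 0x%llx\n", regions[i].name,
                 (unsigned long long)base);
          base += regions[i].size;
      }
      return 0;
  }

With compact=true this places HIGH_PCIE_MMIO directly at 512GB, matching the table in the commit message below.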
From: Gavin Shan <gshan@redhat.com>

Once the improvement to high memory region address assignment is
applied, the memory layout can change, introducing possible migration
breakage. For example, with the following configuration the
VIRT_HIGH_PCIE_MMIO memory region is enabled when the optimization is
applied and disabled when it is not. The configuration is only
achievable by modifying the source code until more properties are
added to allow users to selectively disable those high memory regions.

  pa_bits              = 40;
  vms->highmem_redists = false;
  vms->highmem_ecam    = false;
  vms->highmem_mmio    = true;

  # qemu-system-aarch64 -accel kvm -cpu host \
    -machine virt-7.2,compact-highmem={on, off} \
    -m 4G,maxmem=511G -monitor stdio

  Region            compact-highmem=off        compact-highmem=on
  ----------------------------------------------------------------
  MEM               [1GB         512GB]        [1GB         512GB]
  HIGH_GIC_REDISTS2 [512GB       512GB+64MB]   [disabled]
  HIGH_PCIE_ECAM    [512GB+256MB 512GB+512MB]  [disabled]
  HIGH_PCIE_MMIO    [disabled]                 [512GB       1TB]

In order to keep backwards compatibility, we need to disable the
optimization on machine types virt-7.1 and earlier, which means the
optimization is enabled by default from virt-7.2 onwards. Besides,
the 'compact-highmem' property is added so that the optimization can
be explicitly enabled or disabled on all machine types by users.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-id: 20221029224307.138822-7-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/virt.rst |  4 ++++
 include/hw/arm/virt.h    |  1 +
 hw/arm/virt.c            | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -XXX,XX +XXX,XX @@ highmem
   address space above 32 bits. The default is ``on`` for machine types
   later than ``virt-2.12``.
 
+compact-highmem
+  Set ``on``/``off`` to enable/disable the compact layout for high memory regions.
+  The default is ``on`` for machine types later than ``virt-7.2``.
+
 gic-version
   Specify the version of the Generic Interrupt Controller (GIC) to provide.
   Valid values are:
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -XXX,XX +XXX,XX @@ struct VirtMachineClass {
     bool no_pmu;
     bool claim_edge_triggered_timers;
     bool smbios_old_sys_ver;
+    bool no_highmem_compact;
     bool no_highmem_ecam;
     bool no_ged;   /* Machines < 4.2 have no support for ACPI GED device */
     bool kvm_no_adjvtime;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static const MemMapEntry base_memmap[] = {
 * Note the extended_memmap is sized so that it eventually also includes the
 * base_memmap entries (VIRT_HIGH_GIC_REDIST2 index is greater than the last
 * index of base_memmap).
+ *
+ * The memory map for these Highmem IO Regions can be in legacy or compact
+ * layout, depending on 'compact-highmem' property. With legacy layout, the
+ * PA space for one specific region is always reserved, even if the region
+ * has been disabled or doesn't fit into the PA space. However, the PA space
+ * for the region won't be reserved in these circumstances with compact layout.
 */
 static MemMapEntry extended_memmap[] = {
     /* Additional 64 MB redist region (can contain up to 512 redistributors) */
@@ -XXX,XX +XXX,XX @@ static void virt_set_highmem(Object *obj, bool value, Error **errp)
     vms->highmem = value;
 }
 
+static bool virt_get_compact_highmem(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->highmem_compact;
+}
+
+static void virt_set_compact_highmem(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->highmem_compact = value;
+}
+
 static bool virt_get_its(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -XXX,XX +XXX,XX @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
                                           "Set on/off to enable/disable using "
                                           "physical address space above 32 bits");
 
+    object_class_property_add_bool(oc, "compact-highmem",
+                                   virt_get_compact_highmem,
+                                   virt_set_compact_highmem);
+    object_class_property_set_description(oc, "compact-highmem",
+                                          "Set on/off to enable/disable compact "
+                                          "layout for high memory regions");
+
     object_class_property_add_str(oc, "gic-version", virt_get_gic_version,
                                   virt_set_gic_version);
     object_class_property_set_description(oc, "gic-version",
@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
 
     /* High memory is enabled by default */
     vms->highmem = true;
+    vms->highmem_compact = !vmc->no_highmem_compact;
     vms->gic_version = VIRT_GIC_VERSION_NOSEL;
 
     vms->highmem_ecam = !vmc->no_highmem_ecam;
@@ -XXX,XX +XXX,XX @@ DEFINE_VIRT_MACHINE_AS_LATEST(7, 2)
 
 static void virt_machine_7_1_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
     virt_machine_7_2_options(mc);
     compat_props_add(mc->compat_props, hw_compat_7_1, hw_compat_7_1_len);
+    /* Compact layout for high memory regions was introduced with 7.2 */
+    vmc->no_highmem_compact = true;
 }
 DEFINE_VIRT_MACHINE(7, 1)
 
--
2.25.1

Convert the "double-precision" register moves to decodetree:
this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP.

Note that the conversion process has tightened up a few of the
UNDEF encoding checks: we now correctly forbid:
 * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10
 * VMOV-from-gpr with opc1:opc2 == 0x10
 * VDUP with B:E == 11
 * VDUP with Q == 1 and Vn<0> == 1

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
The accesses of elements < 32 bits could be improved by doing
direct ld/st of the right size rather than 32-bit read-and-shift
or read-modify-write, but we leave this for later cleanup,
since this series is generally trying to stick to fixing
the decode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 147 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  83 +------------------
 target/arm/vfp.decode          |  36 ++++++++
 3 files changed, 185 insertions(+), 81 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
 
     return true;
 }
+
+static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
+{
+    /* VMOV scalar to general purpose register */
+    TCGv_i32 tmp;
+    int pass;
+    uint32_t offset;
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
+        return false;
+    }
+
+    offset = a->index << a->size;
+    pass = extract32(offset, 2, 1);
+    offset = extract32(offset, 0, 2) * 8;
+
+    if (a->size != 2 && !arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = neon_load_reg(a->vn, pass);
+    switch (a->size) {
+    case 0:
+        if (offset) {
+            tcg_gen_shri_i32(tmp, tmp, offset);
+        }
+        if (a->u) {
+            gen_uxtb(tmp);
+        } else {
+            gen_sxtb(tmp);
+        }
+        break;
+    case 1:
+        if (a->u) {
+            if (offset) {
+                tcg_gen_shri_i32(tmp, tmp, 16);
+            } else {
+                gen_uxth(tmp);
+            }
+        } else {
+            if (offset) {
+                tcg_gen_sari_i32(tmp, tmp, 16);
+            } else {
+                gen_sxth(tmp);
+            }
+        }
+        break;
+    case 2:
+        break;
+    }
+    store_reg(s, a->rt, tmp);
+
+    return true;
+}
+
+static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a)
+{
+    /* VMOV general purpose register to scalar */
+    TCGv_i32 tmp, tmp2;
+    int pass;
+    uint32_t offset;
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
+        return false;
+    }
+
+    offset = a->index << a->size;
+    pass = extract32(offset, 2, 1);
+    offset = extract32(offset, 0, 2) * 8;
+
+    if (a->size != 2 && !arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = load_reg(s, a->rt);
+    switch (a->size) {
+    case 0:
+        tmp2 = neon_load_reg(a->vn, pass);
+        tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 8);
+        tcg_temp_free_i32(tmp2);
+        break;
+    case 1:
+        tmp2 = neon_load_reg(a->vn, pass);
+        tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 16);
+        tcg_temp_free_i32(tmp2);
+        break;
+    case 2:
+        break;
+    }
+    neon_store_reg(a->vn, pass, tmp);
+
+    return true;
+}
+
+static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
+{
+    /* VDUP (general purpose register) */
+    TCGv_i32 tmp;
+    int size, vec_size;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
+        return false;
+    }
+
+    if (a->b && a->e) {
+        return false;
+    }
+
+    if (a->q && (a->vn & 1)) {
+        return false;
+    }
+
+    vec_size = a->q ? 16 : 8;
+    if (a->b) {
+        size = 0;
+    } else if (a->e) {
+        size = 1;
+    } else {
+        size = 2;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = load_reg(s, a->rt);
+    tcg_gen_gvec_dup_i32(size, neon_reg_offset(a->vn, 0),
+                         vec_size, vec_size, tmp);
+    tcg_temp_free_i32(tmp);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         /* single register transfer */
         rd = (insn >> 12) & 0xf;
         if (dp) {
-            int size;
-            int pass;
-
-            VFP_DREG_N(rn, insn);
-            if (insn & 0xf)
-                return 1;
-            if (insn & 0x00c00060
-                && !arm_dc_feature(s, ARM_FEATURE_NEON)) {
-                return 1;
-            }
-
-            pass = (insn >> 21) & 1;
-            if (insn & (1 << 22)) {
-                size = 0;
-                offset = ((insn >> 5) & 3) * 8;
-            } else if (insn & (1 << 5)) {
-                size = 1;
-                offset = (insn & (1 << 6)) ? 16 : 0;
-            } else {
-                size = 2;
-                offset = 0;
-            }
-            if (insn & ARM_CP_RW_BIT) {
-                /* vfp->arm */
-                tmp = neon_load_reg(rn, pass);
-                switch (size) {
-                case 0:
-                    if (offset)
-                        tcg_gen_shri_i32(tmp, tmp, offset);
-                    if (insn & (1 << 23))
-                        gen_uxtb(tmp);
-                    else
-                        gen_sxtb(tmp);
-                    break;
-                case 1:
-                    if (insn & (1 << 23)) {
-                        if (offset) {
-                            tcg_gen_shri_i32(tmp, tmp, 16);
-                        } else {
-                            gen_uxth(tmp);
-                        }
-                    } else {
-                        if (offset) {
-                            tcg_gen_sari_i32(tmp, tmp, 16);
-                        } else {
-                            gen_sxth(tmp);
-                        }
-                    }
-                    break;
-                case 2:
-                    break;
-                }
-                store_reg(s, rd, tmp);
-            } else {
-                /* arm->vfp */
-                tmp = load_reg(s, rd);
-                if (insn & (1 << 23)) {
-                    /* VDUP */
-                    int vec_size = pass ? 16 : 8;
-                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rn, 0),
-                                         vec_size, vec_size, tmp);
-                    tcg_temp_free_i32(tmp);
-                } else {
-                    /* VMOV */
-                    switch (size) {
-                    case 0:
-                        tmp2 = neon_load_reg(rn, pass);
-                        tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 8);
-                        tcg_temp_free_i32(tmp2);
-                        break;
-                    case 1:
-                        tmp2 = neon_load_reg(rn, pass);
-                        tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 16);
-                        tcg_temp_free_i32(tmp2);
-                        break;
-                    case 2:
-                        break;
-                    }
-                    neon_store_reg(rn, pass, tmp);
-                }
-            }
+            /* already handled by decodetree */
+            return 1;
         } else { /* !dp */
             bool is_sysreg;
 
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@
 # 1110 1110 .... .... .... 101. .... ....
 # (but those patterns might also cover some Neon instructions,
 # which do not live in this file.)
+
+# VFP registers have an odd encoding with a four-bit field
+# and a one-bit field which are assembled in different orders
+# depending on whether the register is double or single precision.
+# Each individual instruction function must do the checks for
+# "double register selected but CPU does not have double support"
+# and "double register number has bit 4 set but CPU does not
+# support D16-D31" (which should UNDEF).
+%vm_dp  5:1 0:4
+%vm_sp  0:4 5:1
+%vn_dp  7:1 16:4
+%vn_sp  16:4 7:1
+%vd_dp  22:1 12:4
+%vd_sp  12:4 22:1
+
+%vmov_idx_b    21:1 5:2
+%vmov_idx_h    21:1 6:1
+
+# VMOV scalar to general-purpose register; note that this does
+# include some Neon cases.
+VMOV_to_gp   ---- 1110 u:1 1.        1 .... rt:4 1011 ... 1 0000 \
+             vn=%vn_dp size=0 index=%vmov_idx_b
+VMOV_to_gp   ---- 1110 u:1 0.        1 .... rt:4 1011 ..1 1 0000 \
+             vn=%vn_dp size=1 index=%vmov_idx_h
+VMOV_to_gp   ---- 1110 0    0 index:1 1 .... rt:4 1011 .00 1 0000 \
+             vn=%vn_dp size=2 u=0
+
+VMOV_from_gp ---- 1110 0 1.        0 .... rt:4 1011 ... 1 0000 \
+             vn=%vn_dp size=0 index=%vmov_idx_b
+VMOV_from_gp ---- 1110 0 0.        0 .... rt:4 1011 ..1 1 0000 \
+             vn=%vn_dp size=1 index=%vmov_idx_h
+VMOV_from_gp ---- 1110 0 0 index:1 0 .... rt:4 1011 .00 1 0000 \
+             vn=%vn_dp size=2
+
+VDUP         ---- 1110 1 b:1 q:1 0 .... rt:4 1011 . 0 e:1 1 0000 \
+             vn=%vn_dp
--
2.20.1
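The %vd_dp/%vd_sp lines above concatenate their fields most-significant-first. A standalone C sketch of the resulting register-number assembly (illustrative only, not decodetree output):

  #include <stdio.h>
  #include <stdint.h>

  static unsigned vd_dp(uint32_t insn)  /* %vd_dp 22:1 12:4 */
  {
      return ((insn >> 22) & 1) << 4 | ((insn >> 12) & 0xf);
  }

  static unsigned vd_sp(uint32_t insn)  /* %vd_sp 12:4 22:1 */
  {
      return ((insn >> 12) & 0xf) << 1 | ((insn >> 22) & 1);
  }

  int main(void)
  {
      uint32_t insn = (1u << 22) | (0x3u << 12);  /* D=1, Vd=0b0011 */
      /* The same field pair names d19 or s7 depending on precision. */
      printf("dp reg d%u, sp reg s%u\n", vd_dp(insn), vd_sp(insn));
      return 0;
  }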
From: Gavin Shan <gshan@redhat.com>

The three high memory regions are usually enabled by default, but they
may not all be used; for example, VIRT_HIGH_GIC_REDIST2 isn't needed
by GICv2. This wastes PA space.

Add properties ("highmem-redists", "highmem-ecam", "highmem-mmio") to
allow users to selectively disable them if needed. Since the high
memory region for the GICv3 or GICv4 redistributors can now be
disabled by the user, the maximum number of supported CPUs must be
calculated based on 'vms->highmem_redists'. The error message for
exceeding that limit is also improved to indicate whether the high
memory region for the redistributors has been enabled.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20221029224307.138822-8-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/virt.rst | 13 +++++++
 hw/arm/virt.c            | 75 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -XXX,XX +XXX,XX @@ compact-highmem
   Set ``on``/``off`` to enable/disable the compact layout for high memory regions.
   The default is ``on`` for machine types later than ``virt-7.2``.
 
+highmem-redists
+  Set ``on``/``off`` to enable/disable the high memory region for GICv3 or
+  GICv4 redistributor. The default is ``on``. Setting this to ``off`` will
+  limit the maximum number of CPUs when GICv3 or GICv4 is used.
+
+highmem-ecam
+  Set ``on``/``off`` to enable/disable the high memory region for PCI ECAM.
+  The default is ``on`` for machine types later than ``virt-3.0``.
+
+highmem-mmio
+  Set ``on``/``off`` to enable/disable the high memory region for PCI MMIO.
+  The default is ``on``.
+
 gic-version
   Specify the version of the Generic Interrupt Controller (GIC) to provide.
   Valid values are:
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
     if (vms->gic_version == VIRT_GIC_VERSION_2) {
         virt_max_cpus = GIC_NCPU;
     } else {
-        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST) +
-                        virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
+        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
+        if (vms->highmem_redists) {
+            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
+        }
     }
 
     if (max_cpus > virt_max_cpus) {
         error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
                      "supported by machine 'mach-virt' (%d)",
                      max_cpus, virt_max_cpus);
+        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
+            error_printf("Try 'highmem-redists=on' for more CPUs\n");
+        }
+
         exit(1);
     }
 
@@ -XXX,XX +XXX,XX @@ static void virt_set_compact_highmem(Object *obj, bool value, Error **errp)
     vms->highmem_compact = value;
 }
 
+static bool virt_get_highmem_redists(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->highmem_redists;
+}
+
+static void virt_set_highmem_redists(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->highmem_redists = value;
+}
+
+static bool virt_get_highmem_ecam(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->highmem_ecam;
+}
+
+static void virt_set_highmem_ecam(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->highmem_ecam = value;
+}
+
+static bool virt_get_highmem_mmio(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->highmem_mmio;
+}
+
+static void virt_set_highmem_mmio(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->highmem_mmio = value;
+}
+
 static bool virt_get_its(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -XXX,XX +XXX,XX @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
                                           "Set on/off to enable/disable compact "
                                           "layout for high memory regions");
 
+    object_class_property_add_bool(oc, "highmem-redists",
+                                   virt_get_highmem_redists,
+                                   virt_set_highmem_redists);
+    object_class_property_set_description(oc, "highmem-redists",
+                                          "Set on/off to enable/disable high "
+                                          "memory region for GICv3 or GICv4 "
+                                          "redistributor");
+
+    object_class_property_add_bool(oc, "highmem-ecam",
+                                   virt_get_highmem_ecam,
+                                   virt_set_highmem_ecam);
+    object_class_property_set_description(oc, "highmem-ecam",
+                                          "Set on/off to enable/disable high "
+                                          "memory region for PCI ECAM");
+
+    object_class_property_add_bool(oc, "highmem-mmio",
+                                   virt_get_highmem_mmio,
+                                   virt_set_highmem_mmio);
+    object_class_property_set_description(oc, "highmem-mmio",
+                                          "Set on/off to enable/disable high "
+                                          "memory region for PCI MMIO");
+
     object_class_property_add_str(oc, "gic-version", virt_get_gic_version,
                                   virt_set_gic_version);
     object_class_property_set_description(oc, "gic-version",
--
2.25.1

Convert the VFP VMLA instruction to decodetree.

This is the first of the VFP 3-operand data processing instructions,
so we include in this patch the code which loops over the elements
for an old-style VFP vector operation. The existing code to do this
looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since
we are going to be converting instructions one at a time anyway
we can take the opportunity to make the new loop use TCG temporaries,
which means we can do that conversion one operation at a time
rather than needing to do it all in one go.

We include an UNDEF check which was missing in the old code:
short-vector operations (with stride or length non-zero) were
deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec
field does not indicate that support for short vectors is present
we UNDEF the operations that would use them. (This is a change
of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which
previously were all incorrectly allowing short-vector operations.)

Note that the conversion fixes a bug in the old code for the
case of VFP short-vector "mixed scalar/vector operations". These
happen where the destination register is in a vector bank but
the second operand is in a scalar bank. For example
    vmla.f64 d10, d1, d16 with length 2 stride 2
is equivalent to the pair of scalar operations
    vmla.f64 d10, d1, d16
    vmla.f64 d8, d3, d16
where the destination and first input register cycle through
their vector but the second input is scalar (d16). In the
old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d}
as a temporary output for the multiply, which trashes the
second input operand. For the fully-scalar case (where we
never do a second iteration) and the fully-vector case
(where the loop loads the new second input operand) this
doesn't matter, but for the mixed scalar/vector case we
will end up using the wrong value for later loop iterations.
In the new code we use TCG temporaries and so avoid the bug.
This bug is present for all the multiply-accumulate insns
that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS.

Note 2: the expression used to calculate the next register
number in the vector bank is not in fact correct; we leave
this behaviour unchanged from the old decoder and will
fix this bug later in the series.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h               |   5 +
 target/arm/translate-vfp.inc.c | 205 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  14 ++-
 target/arm/vfp.decode          |   6 +
 4 files changed, 224 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp_d32(const ARMISARegisters *id)
     return FIELD_EX64(id->mvfr0, MVFR0, SIMDREG) >= 2;
 }
 
+static inline bool isar_feature_aa32_fpshvec(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->mvfr0, MVFR0, FPSHVEC) > 0;
+}
+
 /*
  * We always set the FP and SIMD FP16 fields to indicate identical
  * levels of support (assuming SIMD is implemented at all), so
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
 
     return true;
 }
+
+/*
+ * Types for callbacks for do_vfp_3op_sp() and do_vfp_3op_dp().
+ * The callback should emit code to write a value to vd. If
+ * do_vfp_3op_{sp,dp}() was passed reads_vd then the TCGv vd
+ * will contain the old value of the relevant VFP register;
+ * otherwise it must be written to only.
+ */
+typedef void VFPGen3OpSPFn(TCGv_i32 vd,
+                           TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst);
+typedef void VFPGen3OpDPFn(TCGv_i64 vd,
+                           TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst);
+
+/*
+ * Perform a 3-operand VFP data processing instruction. fn is the
+ * callback to do the actual operation; this function deals with the
+ * code to handle looping around for VFP vector processing.
+ */
+static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn,
+                          int vd, int vn, int vm, bool reads_vd)
+{
+    uint32_t delta_m = 0;
+    uint32_t delta_d = 0;
+    uint32_t bank_mask = 0;
+    int veclen = s->vec_len;
+    TCGv_i32 f0, f1, fd;
+    TCGv_ptr fpst;
+
+    if (!dc_isar_feature(aa32_fpshvec, s) &&
+        (veclen != 0 || s->vec_stride != 0)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (veclen > 0) {
+        bank_mask = 0x18;
+
+        /* Figure out what type of vector operation this is.  */
+        if ((vd & bank_mask) == 0) {
+            /* scalar */
+            veclen = 0;
+        } else {
+            delta_d = s->vec_stride + 1;
+
+            if ((vm & bank_mask) == 0) {
+                /* mixed scalar/vector */
+                delta_m = 0;
+            } else {
+                /* vector */
+                delta_m = delta_d;
+            }
+        }
+    }
+
+    f0 = tcg_temp_new_i32();
+    f1 = tcg_temp_new_i32();
+    fd = tcg_temp_new_i32();
+    fpst = get_fpstatus_ptr(0);
+
+    neon_load_reg32(f0, vn);
+    neon_load_reg32(f1, vm);
+
+    for (;;) {
+        if (reads_vd) {
+            neon_load_reg32(fd, vd);
+        }
+        fn(fd, f0, f1, fpst);
+        neon_store_reg32(fd, vd);
+
+        if (veclen == 0) {
+            break;
+        }
+
+        /* Set up the operands for the next iteration */
+        veclen--;
+        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+        vn = ((vn + delta_d) & (bank_mask - 1)) | (vn & bank_mask);
+        neon_load_reg32(f0, vn);
+        if (delta_m) {
+            vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
+            neon_load_reg32(f1, vm);
+        }
+    }
+
+    tcg_temp_free_i32(f0);
+    tcg_temp_free_i32(f1);
+    tcg_temp_free_i32(fd);
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
+
+static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
+                          int vd, int vn, int vm, bool reads_vd)
+{
+    uint32_t delta_m = 0;
+    uint32_t delta_d = 0;
+    uint32_t bank_mask = 0;
+    int veclen = s->vec_len;
+    TCGv_i64 f0, f1, fd;
+    TCGv_ptr fpst;
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && ((vd | vn | vm) & 0x10)) {
+        return false;
+    }
+
+    if (!dc_isar_feature(aa32_fpshvec, s) &&
+        (veclen != 0 || s->vec_stride != 0)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (veclen > 0) {
+        bank_mask = 0xc;
+
+        /* Figure out what type of vector operation this is.  */
+        if ((vd & bank_mask) == 0) {
+            /* scalar */
+            veclen = 0;
+        } else {
+            delta_d = (s->vec_stride >> 1) + 1;
+
+            if ((vm & bank_mask) == 0) {
+                /* mixed scalar/vector */
+                delta_m = 0;
+            } else {
+                /* vector */
+                delta_m = delta_d;
+            }
+        }
+    }
+
+    f0 = tcg_temp_new_i64();
+    f1 = tcg_temp_new_i64();
+    fd = tcg_temp_new_i64();
+    fpst = get_fpstatus_ptr(0);
+
+    neon_load_reg64(f0, vn);
+    neon_load_reg64(f1, vm);
+
+    for (;;) {
+        if (reads_vd) {
+            neon_load_reg64(fd, vd);
+        }
+        fn(fd, f0, f1, fpst);
+        neon_store_reg64(fd, vd);
+
+        if (veclen == 0) {
+            break;
+        }
+        /* Set up the operands for the next iteration */
+        veclen--;
+        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
+        vn = ((vn + delta_d) & (bank_mask - 1)) | (vn & bank_mask);
+        neon_load_reg64(f0, vn);
+        if (delta_m) {
+            vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
+            neon_load_reg64(f1, vm);
+        }
+    }
+
+    tcg_temp_free_i64(f0);
+    tcg_temp_free_i64(f1);
+    tcg_temp_free_i64(fd);
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
+
+static void gen_VMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
+{
+    /* Note that order of inputs to the add matters for NaNs */
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    gen_helper_vfp_muls(tmp, vn, vm, fpst);
+    gen_helper_vfp_adds(vd, vd, tmp, fpst);
+    tcg_temp_free_i32(tmp);
+}
+
+static bool trans_VMLA_sp(DisasContext *s, arg_VMLA_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_VMLA_sp, a->vd, a->vn, a->vm, true);
+}
+
+static void gen_VMLA_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
+{
+    /* Note that order of inputs to the add matters for NaNs */
+    TCGv_i64 tmp = tcg_temp_new_i64();
+
+    gen_helper_vfp_muld(tmp, vn, vm, fpst);
+    gen_helper_vfp_addd(vd, vd, tmp, fpst);
+    tcg_temp_free_i64(tmp);
+}
+
+static bool trans_VMLA_dp(DisasContext *s, arg_VMLA_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_VMLA_dp, a->vd, a->vn, a->vm, true);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     op = ((insn >> 20) & 8) | ((insn >> 19) & 6) | ((insn >> 6) & 1);
     rn = VFP_SREG_N(insn);
 
+    switch (op) {
+    case 0:
+        /* Already handled by decodetree */
+        return 1;
+    default:
+        break;
+    }
+
     if (op == 15) {
         /* rn is opcode, encoded as per VFP_SREG_N. */
         switch (rn) {
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         for (;;) {
             /* Perform the calculation. */
             switch (op) {
-            case 0: /* VMLA: fd + (fn * fm) */
-                /* Note that order of inputs to the add matters for NaNs */
-                gen_vfp_F1_mul(dp);
-                gen_mov_F0_vreg(dp, rd);
-                gen_vfp_add(dp);
-                break;
             case 1: /* VMLS: fd + -(fn * fm) */
                 gen_vfp_mul(dp);
                 gen_vfp_F1_neg(dp);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VLDM_VSTM_sp ---- 1101 0.1 l:1 rn:4 .... 1010 imm:8 \
              vd=%vd_sp p=1 u=0 w=1
 VLDM_VSTM_dp ---- 1101 0.1 l:1 rn:4 .... 1011 imm:8 \
              vd=%vd_dp p=1 u=0 w=1
+
+# 3-register VFP data-processing; bits [23,21:20,6] identify the operation.
+VMLA_sp      ---- 1110 0.00 .... .... 1010 .0.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VMLA_dp      ---- 1110 0.00 .... .... 1011 .0.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
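The register-stepping expression in do_vfp_3op_sp() keeps the bank bits fixed and wraps the register offset within an 8-register bank. A standalone sketch with made-up values (note the commit message above says this stepping is knowingly carried over from the old decoder and fixed later in the series):

  #include <stdio.h>

  int main(void)
  {
      unsigned bank_mask = 0x18;  /* single precision: banks of 8 regs */
      unsigned delta = 2;         /* vec_stride + 1 */
      unsigned vd = 25;           /* s25, in the bank starting at s24 */

      for (int i = 0; i < 4; i++) {
          printf("s%u\n", vd);    /* prints s25, s27, s29, s31 */
          /* advance within the bank; the bank bits never change */
          vd = ((vd + delta) & (bank_mask - 1)) | (vd & bank_mask);
      }
      /* a fifth step would wrap back around to s25 */
      return 0;
  }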
From: Mihai Carabas <mihai.carabas@oracle.com>

Use the base_memmap to build the SMBIOS 19 table which provides the address
mapping for a Physical Memory Array (from spec [1] chapter 7.20).

This was present on i386 from commit c97294ec1b9e36887e119589d456557d72ab37b5
("SMBIOS: Build aggregate smbios tables and entry point").

[1] https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.5.0.pdf

The absence of this table is a breach of the specs and is
detected by the FirmwareTestSuite (FWTS), but it doesn't
cause any known problems for guest OSes.

Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Message-id: 1668789029-5432-1-git-send-email-mihai.carabas@oracle.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
 static void virt_build_smbios(VirtMachineState *vms)
 {
     MachineClass *mc = MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
     uint8_t *smbios_tables, *smbios_anchor;
     size_t smbios_tables_len, smbios_anchor_len;
+    struct smbios_phys_mem_area mem_array;
     const char *product = "QEMU Virtual Machine";
 
     if (kvm_enabled()) {
@@ -XXX,XX +XXX,XX @@ static void virt_build_smbios(VirtMachineState *vms)
                       vmc->smbios_old_sys_ver ? "1.0" : mc->name, false,
                       true, SMBIOS_ENTRY_POINT_TYPE_64);
 
-    smbios_get_tables(MACHINE(vms), NULL, 0,
+    /* build the array of physical mem area from base_memmap */
+    mem_array.address = vms->memmap[VIRT_MEM].base;
+    mem_array.length = ms->ram_size;
+
+    smbios_get_tables(ms, &mem_array, 1,
                       &smbios_tables, &smbios_tables_len,
                       &smbios_anchor, &smbios_anchor_len,
                       &error_fatal);
--
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

The ARM pseudocode installs the error_code into the original
pointer, not the encrypted pointer. The difference applies
within the 7 bits of pac data; the result should be the sign
extension of bit 55.

Add a testcase to that effect.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/tcg/aarch64/Makefile.target |  2 +-
 target/arm/pauth_helper.c         |  4 +-
 tests/tcg/aarch64/pauth-2.c       | 61 +++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 3 deletions(-)
 create mode 100644 tests/tcg/aarch64/pauth-2.c

diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -XXX,XX +XXX,XX @@ run-fcvt: fcvt
 	$(call run-test,$<,$(QEMU) $<, "$< on $(TARGET_NAME)")
 	$(call diff-out,$<,$(AARCH64_SRC)/fcvt.ref)
 
-AARCH64_TESTS += pauth-1
+AARCH64_TESTS += pauth-1 pauth-2
 run-pauth-%: QEMU += -cpu max
 
 TESTS:=$(AARCH64_TESTS)
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
     if (unlikely(extract64(test, bot_bit, top_bit - bot_bit))) {
         int error_code = (keynumber << 1) | (keynumber ^ 1);
         if (param.tbi) {
-            return deposit64(ptr, 53, 2, error_code);
+            return deposit64(orig_ptr, 53, 2, error_code);
         } else {
-            return deposit64(ptr, 61, 2, error_code);
+            return deposit64(orig_ptr, 61, 2, error_code);
         }
     }
     return orig_ptr;
diff --git a/tests/tcg/aarch64/pauth-2.c b/tests/tcg/aarch64/pauth-2.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tests/tcg/aarch64/pauth-2.c
@@ -XXX,XX +XXX,XX @@
+#include <stdint.h>
+#include <assert.h>
+
+asm(".arch armv8.4-a");
+
+void do_test(uint64_t value)
+{
+    uint64_t salt1, salt2;
+    uint64_t encode, decode;
+
+    /*
+     * With TBI enabled and a 48-bit VA, there are 7 bits of auth,
+     * and so a 1/128 chance of encode = pac(value,key,salt) producing
+     * an auth which leaves the value unchanged.
+     * Iterate until we find a salt for which encode != value.
+     */
+    for (salt1 = 1; ; salt1++) {
+        asm volatile("pacda %0, %2" : "=r"(encode) : "0"(value), "r"(salt1));
+        if (encode != value) {
+            break;
+        }
+    }
+
+    /* A valid salt must produce a valid authorization.  */
+    asm volatile("autda %0, %2" : "=r"(decode) : "0"(encode), "r"(salt1));
+    assert(decode == value);
+
+    /*
+     * An invalid salt usually fails authorization, but again there
+     * is a chance of choosing another salt that works.
+     * Iterate until we find another salt which does fail.
+     */
+    for (salt2 = salt1 + 1; ; salt2++) {
+        asm volatile("autda %0, %2" : "=r"(decode) : "0"(encode), "r"(salt2));
+        if (decode != value) {
+            break;
+        }
+    }
+
+    /* The VA bits, bit 55, and the TBI bits, should be unchanged.  */
+    assert(((decode ^ value) & 0xff80ffffffffffffull) == 0);
+
+    /*
+     * Bits [54:53] are an error indicator based on the key used;
+     * the DA key above is keynumber 0, so error == 0b01. Otherwise
+     * bit 55 of the original is sign-extended into the rest of the auth.
+     */
+    if ((value >> 55) & 1) {
+        assert(((decode >> 48) & 0xff) == 0b10111111);
+    } else {
+        assert(((decode >> 48) & 0xff) == 0b00100000);
+    }
+}
+
+int main()
+{
+    do_test(0);
+    do_test(-1);
+    do_test(0xda004acedeadbeefull);
+    return 0;
+}
--
2.20.1
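The corrected behaviour is easy to check by hand: for the DA key (keynumber 0) the error code is 0b01, deposited into bits [54:53] of the original pointer when TBI is enabled. A standalone sketch using a local deposit64 helper (not QEMU's):

  #include <stdio.h>
  #include <stdint.h>

  static uint64_t deposit64(uint64_t x, int pos, int len, uint64_t val)
  {
      uint64_t mask = ((1ULL << len) - 1) << pos;
      return (x & ~mask) | ((val << pos) & mask);
  }

  int main(void)
  {
      int keynumber = 0;                                   /* DA key */
      int error_code = (keynumber << 1) | (keynumber ^ 1); /* == 0b01 */
      uint64_t orig_ptr = 0xda004acedeadbeefull;

      /* sets bit 53, clears bit 54: 0xda204acedeadbeef */
      printf("0x%016llx\n",
             (unsigned long long)deposit64(orig_ptr, 53, 2, error_code));
      return 0;
  }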
From: Timofey Kutergin <tkutergin@gmail.com>

The Cortex-A55 is one of the newer armv8.2+ CPUs; in particular
it supports the Privileged Access Never (PAN) feature. Add
a model of this CPU, so you can use a CPU type on the virt
board that models a specific real hardware CPU, rather than
having to use the QEMU-specific "max" CPU type.

Signed-off-by: Timofey Kutergin <tkutergin@gmail.com>
Message-id: 20221121150819.2782817-1-tkutergin@gmail.com
[PMM: tweaked commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/virt.rst |  1 +
 hw/arm/virt.c            |  1 +
 target/arm/cpu64.c       | 69 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 71 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -XXX,XX +XXX,XX @@ Supported guest CPU types:
 - ``cortex-a15`` (32-bit; the default)
 - ``cortex-a35`` (64-bit)
 - ``cortex-a53`` (64-bit)
+- ``cortex-a55`` (64-bit)
 - ``cortex-a57`` (64-bit)
 - ``cortex-a72`` (64-bit)
 - ``cortex-a76`` (64-bit)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a15"),
     ARM_CPU_TYPE_NAME("cortex-a35"),
     ARM_CPU_TYPE_NAME("cortex-a53"),
+    ARM_CPU_TYPE_NAME("cortex-a55"),
     ARM_CPU_TYPE_NAME("cortex-a57"),
     ARM_CPU_TYPE_NAME("cortex-a72"),
     ARM_CPU_TYPE_NAME("cortex-a76"),
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     define_cortex_a72_a57_a53_cp_reginfo(cpu);
 }
 
+static void aarch64_a55_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,cortex-a55";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by B2.4 AArch64 registers by functional group */
+    cpu->clidr = 0x82000023;
+    cpu->ctr = 0x84448004; /* L1Ip = VIPT */
+    cpu->dcz_blocksize = 4; /* 64 bytes */
+    cpu->isar.id_aa64dfr0  = 0x0000000010305408ull;
+    cpu->isar.id_aa64isar0 = 0x0000100010211120ull;
+    cpu->isar.id_aa64isar1 = 0x0000000000100001ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000000000101122ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x0000000000001011ull;
+    cpu->isar.id_aa64pfr0  = 0x0000000010112222ull;
+    cpu->isar.id_aa64pfr1  = 0x0000000000000010ull;
+    cpu->id_afr0       = 0x00000000;
+    cpu->isar.id_dfr0  = 0x04010088;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00011142;
+    cpu->isar.id_isar5 = 0x01011121;
+    cpu->isar.id_isar6 = 0x00000010;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_mmfr4 = 0x00021110;
+    cpu->isar.id_pfr0  = 0x10010131;
+    cpu->isar.id_pfr1  = 0x00011011;
+    cpu->isar.id_pfr2  = 0x00000011;
+    cpu->midr = 0x412FD050; /* r2p0 */
+    cpu->revidr = 0;
+
+    /* From B2.23 CCSIDR_EL1 */
+    cpu->ccsidr[0] = 0x700fe01a; /* 32KB L1 dcache */
+    cpu->ccsidr[1] = 0x200fe01a; /* 32KB L1 icache */
+    cpu->ccsidr[2] = 0x703fe07a; /* 512KB L2 cache */
+
+    /* From B2.96 SCTLR_EL3 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    /* From B4.45 ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+    cpu->gic_pribits = 5;
+
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+
+    /* From D5.4 AArch64 PMU register summary */
+    cpu->isar.reset_pmcr_el0 = 0x410b3000;
+}
+
 static void aarch64_a72_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo aarch64_cpus[] = {
     { .name = "cortex-a35",         .initfn = aarch64_a35_initfn },
     { .name = "cortex-a57",         .initfn = aarch64_a57_initfn },
     { .name = "cortex-a53",         .initfn = aarch64_a53_initfn },
+    { .name = "cortex-a55",         .initfn = aarch64_a55_initfn },
     { .name = "cortex-a72",         .initfn = aarch64_a72_initfn },
     { .name = "cortex-a76",         .initfn = aarch64_a76_initfn },
     { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },

From: Richard Henderson <richard.henderson@linaro.org>

This replaces 3 target-specific implementations for BIT, BIF, and BSL.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190518191934.21887-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.h |  2 +
 target/arm/translate.h     |  3 --
 target/arm/translate-a64.c | 15 ++++++--
 target/arm/translate.c     | 78 +++-----------------------------------
 4 files changed, 20 insertions(+), 78 deletions(-)

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -XXX,XX +XXX,XX @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
                          uint32_t, uint32_t);
 typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
+typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
+                        uint32_t, uint32_t, uint32_t);
 
 #endif /* TARGET_ARM_TRANSLATE_A64_H */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ static inline void gen_ss_advance(DisasContext *s)
 }
 
 /* Vector operations shared between ARM and AArch64.  */
-extern const GVecGen3 bsl_op;
-extern const GVecGen3 bit_op;
-extern const GVecGen3 bif_op;
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn3(DisasContext *s, bool is_q, int rd, int rn, int rm,
             vec_full_reg_offset(s, rm), is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
+/* Expand a 4-operand AdvSIMD vector operation using an expander function. */
+static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
+                         int rx, GVecGen4Fn *gvec_fn, int vece)
+{
+    gvec_fn(vece, vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
+            vec_full_reg_offset(s, rm), vec_full_reg_offset(s, rx),
+            is_q ? 16 : 8, vec_full_reg_size(s));
+}
+
 /* Expand a 2-operand + immediate AdvSIMD vector operation using
  * an op descriptor.
  */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
         return;
 
     case 5: /* BSL bitwise select */
-        gen_gvec_op3(s, is_q, rd, rn, rm, &bsl_op);
+        gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
         return;
     case 6: /* BIT, bitwise insert if true */
-        gen_gvec_op3(s, is_q, rd, rn, rm, &bit_op);
+        gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
         return;
     case 7: /* BIF, bitwise insert if false */
-        gen_gvec_op3(s, is_q, rd, rn, rm, &bif_op);
+        gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
         return;
 
     default:
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
     return 1;
 }
 
-/*
- * Expanders for VBitOps_VBIF, VBIT, VBSL.
- */
-static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rm);
-    tcg_gen_and_i64(rn, rn, rd);
-    tcg_gen_xor_i64(rd, rm, rn);
-}
-
-static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rd);
-    tcg_gen_and_i64(rn, rn, rm);
-    tcg_gen_xor_i64(rd, rd, rn);
-}
-
-static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rd);
-    tcg_gen_andc_i64(rn, rn, rm);
-    tcg_gen_xor_i64(rd, rd, rn);
-}
-
-static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rm);
-    tcg_gen_and_vec(vece, rn, rn, rd);
-    tcg_gen_xor_vec(vece, rd, rm, rn);
-}
-
-static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rd);
-    tcg_gen_and_vec(vece, rn, rn, rm);
-    tcg_gen_xor_vec(vece, rd, rd, rn);
-}
-
-static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rd);
-    tcg_gen_andc_vec(vece, rn, rn, rm);
-    tcg_gen_xor_vec(vece, rd, rd, rn);
-}
-
-const GVecGen3 bsl_op = {
-    .fni8 = gen_bsl_i64,
-    .fniv = gen_bsl_vec,
-    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-    .load_dest = true
-};
-
-const GVecGen3 bit_op = {
-    .fni8 = gen_bit_i64,
-    .fniv = gen_bit_vec,
-    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-    .load_dest = true
-};
-
-const GVecGen3 bif_op = {
-    .fni8 = gen_bif_i64,
-    .fniv = gen_bif_vec,
-    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-    .load_dest = true
-};
-
 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
     tcg_gen_vec_sar8i_i64(a, a, shift);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                            vec_size, vec_size);
             break;
         case 5: /* VBSL */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                           vec_size, vec_size, &bsl_op);
+            tcg_gen_gvec_bitsel(MO_8, rd_ofs, rd_ofs, rn_ofs, rm_ofs,
+                                vec_size, vec_size);
             break;
         case 6: /* VBIT */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                           vec_size, vec_size, &bit_op);
+            tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rn_ofs, rd_ofs,
+                                vec_size, vec_size);
             break;
         case 7: /* VBIF */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                           vec_size, vec_size, &bif_op);
+            tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rd_ofs, rn_ofs,
175
+ vec_size, vec_size);
176
break;
177
}
178
return 0;
179
--
131
--
180
2.20.1
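
For reference, a minimal standalone C model (not QEMU code) of the bitwise
select that all three instructions reduce to, and of the operand orders the
patch above passes to tcg_gen_gvec_bitsel for BSL, BIT and BIF:

#include <stdint.h>
#include <stdio.h>
#include <assert.h>

/* bitsel(sel, a, b): for each bit, take a where sel is 1, else b. */
static uint64_t bitsel(uint64_t sel, uint64_t a, uint64_t b)
{
    return (sel & a) | (~sel & b);
}

int main(void)
{
    uint64_t rd = 0xF0F0F0F0F0F0F0F0ull;
    uint64_t rn = 0xAAAAAAAAAAAAAAAAull;
    uint64_t rm = 0x0123456789ABCDEFull;

    /* BSL: the destination rd is the selector. */
    uint64_t bsl = bitsel(rd, rn, rm);
    /* BIT: insert rn bits into rd where rm is true. */
    uint64_t bit = bitsel(rm, rn, rd);
    /* BIF: insert rn bits into rd where rm is false. */
    uint64_t bif = bitsel(rm, rd, rn);

    /* Cross-check against the xor/and/xor form the removed expanders
     * used, e.g. BSL computed rd = rm ^ ((rn ^ rm) & rd). */
    assert(bsl == (rm ^ ((rn ^ rm) & rd)));

    printf("BSL %016llx BIT %016llx BIF %016llx\n",
           (unsigned long long)bsl, (unsigned long long)bit,
           (unsigned long long)bif);
    return 0;
}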
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,cortex-a55";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by B2.4 AArch64 registers by functional group */
+    cpu->clidr = 0x82000023;
+    cpu->ctr = 0x84448004; /* L1Ip = VIPT */
+    cpu->dcz_blocksize = 4; /* 64 bytes */
+    cpu->isar.id_aa64dfr0 = 0x0000000010305408ull;
+    cpu->isar.id_aa64isar0 = 0x0000100010211120ull;
+    cpu->isar.id_aa64isar1 = 0x0000000000100001ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000000000101122ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x0000000000001011ull;
+    cpu->isar.id_aa64pfr0 = 0x0000000010112222ull;
+    cpu->isar.id_aa64pfr1 = 0x0000000000000010ull;
+    cpu->id_afr0 = 0x00000000;
+    cpu->isar.id_dfr0 = 0x04010088;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00011142;
+    cpu->isar.id_isar5 = 0x01011121;
+    cpu->isar.id_isar6 = 0x00000010;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_mmfr4 = 0x00021110;
+    cpu->isar.id_pfr0 = 0x10010131;
+    cpu->isar.id_pfr1 = 0x00011011;
+    cpu->isar.id_pfr2 = 0x00000011;
+    cpu->midr = 0x412FD050; /* r2p0 */
+    cpu->revidr = 0;
+
+    /* From B2.23 CCSIDR_EL1 */
+    cpu->ccsidr[0] = 0x700fe01a; /* 32KB L1 dcache */
+    cpu->ccsidr[1] = 0x200fe01a; /* 32KB L1 icache */
+    cpu->ccsidr[2] = 0x703fe07a; /* 512KB L2 cache */
+
+    /* From B2.96 SCTLR_EL3 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    /* From B4.45 ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+    cpu->gic_pribits = 5;
+
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+
+    /* From D5.4 AArch64 PMU register summary */
+    cpu->isar.reset_pmcr_el0 = 0x410b3000;
+}
+
 static void aarch64_a72_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo aarch64_cpus[] = {
     { .name = "cortex-a35",         .initfn = aarch64_a35_initfn },
     { .name = "cortex-a57",         .initfn = aarch64_a57_initfn },
     { .name = "cortex-a53",         .initfn = aarch64_a53_initfn },
+    { .name = "cortex-a55",         .initfn = aarch64_a55_initfn },
     { .name = "cortex-a72",         .initfn = aarch64_a72_initfn },
     { .name = "cortex-a76",         .initfn = aarch64_a76_initfn },
     { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },
--
2.25.1
From: Richard Henderson <richard.henderson@linaro.org>

These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift. This
requires several masks and selects in addition to the actual
shifts to form the complete answer.

That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190603232209.20704-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  11 +-
 target/arm/translate.h     |   6 +
 target/arm/neon_helper.c   |  33 ----
 target/arm/translate-a64.c |  18 +--
 target/arm/translate.c     | 300 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  88 +++++++++++
 6 files changed, 390 insertions(+), 66 deletions(-)
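
For reference, a minimal standalone C model of the per-element semantics the
vectorized expansion has to reproduce (this mirrors the byte-element helpers
added below; it is an illustration, not QEMU code):

#include <stdint.h>
#include <stdio.h>

/* USHL on one byte element: the low 8 bits of the shift operand are a
 * signed count; negative counts shift right, and a count of 8 or more
 * in either direction produces zero. */
static uint8_t ushl_b(uint8_t nn, int8_t mm)
{
    if (mm >= 0) {
        return mm < 8 ? (uint8_t)(nn << mm) : 0;
    }
    return mm > -8 ? nn >> -mm : 0;
}

/* SSHL is the same except right shifts are arithmetic, so a large
 * negative count collapses to a copy of the sign bit. */
static int8_t sshl_b(int8_t nn, int8_t mm)
{
    if (mm >= 0) {
        return mm < 8 ? (int8_t)((uint8_t)nn << mm) : 0;
    }
    return nn >> (mm > -8 ? -mm : 7);
}

int main(void)
{
    printf("%d %d\n", ushl_b(0x80, -3), sshl_b(-128, -3)); /* 16 -16 */
    printf("%d %d\n", ushl_b(0x80, -9), sshl_b(-128, -9)); /* 0 -1 */
    return 0;
}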
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
 DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
 DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
 
-DEF_HELPER_2(neon_shl_u8, i32, i32, i32)
-DEF_HELPER_2(neon_shl_s8, i32, i32, i32)
 DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
 DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
-DEF_HELPER_2(neon_shl_u32, i32, i32, i32)
-DEF_HELPER_2(neon_shl_s32, i32, i32, i32)
-DEF_HELPER_2(neon_shl_u64, i64, i64, i64)
-DEF_HELPER_2(neon_shl_s64, i64, i64, i64)
 DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_s8, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_u16, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr)
 DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr)
 
+DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 bif_op;
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
+extern const GVecGen3 sshl_op[4];
+extern const GVecGen3 ushl_op[4];
 extern const GVecGen2i ssra_op[4];
 extern const GVecGen2i usra_op[4];
 extern const GVecGen2i sri_op[4];
@@ -XXX,XX +XXX,XX @@ extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
 extern const GVecGen4 sqsub_op[4];
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
+void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
+void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_VOP(abd_u32, neon_u32, 1)
     } else { \
         dest = src1 << tmp; \
     }} while (0)
-NEON_VOP(shl_u8, neon_u8, 4)
 NEON_VOP(shl_u16, neon_u16, 2)
-NEON_VOP(shl_u32, neon_u32, 1)
 #undef NEON_FN
 
-uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop)
-{
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 64 || shift <= -64) {
-        val = 0;
-    } else if (shift < 0) {
-        val >>= -shift;
-    } else {
-        val <<= shift;
-    }
-    return val;
-}
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop)
     } else { \
         dest = src1 << tmp; \
     }} while (0)
-NEON_VOP(shl_s8, neon_s8, 4)
 NEON_VOP(shl_s16, neon_s16, 2)
-NEON_VOP(shl_s32, neon_s32, 1)
 #undef NEON_FN
 
-uint64_t HELPER(neon_shl_s64)(uint64_t valop, uint64_t shiftop)
-{
-    int8_t shift = (int8_t)shiftop;
-    int64_t val = valop;
-    if (shift >= 64) {
-        val = 0;
-    } else if (shift <= -64) {
-        val >>= 63;
-    } else if (shift < 0) {
-        val >>= -shift;
-    } else {
-        val <<= shift;
-    }
-    return val;
-}
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
         break;
     case 0x8: /* SSHL, USHL */
         if (u) {
-            gen_helper_neon_shl_u64(tcg_rd, tcg_rn, tcg_rm);
+            gen_ushl_i64(tcg_rd, tcg_rn, tcg_rm);
         } else {
-            gen_helper_neon_shl_s64(tcg_rd, tcg_rn, tcg_rm);
+            gen_sshl_i64(tcg_rd, tcg_rn, tcg_rm);
         }
         break;
     case 0x9: /* SQSHL, UQSHL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                          is_q ? 16 : 8, vec_full_reg_size(s),
                          (u ? uqsub_op : sqsub_op) + size);
         return;
+    case 0x08: /* SSHL, USHL */
+        gen_gvec_op3(s, is_q, rd, rn, rm,
+                     u ? &ushl_op[size] : &sshl_op[size]);
+        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             genfn = fns[size][u];
             break;
         }
-    case 0x8: /* SSHL, USHL */
-    {
-        static NeonGenTwoOpFn * const fns[3][2] = {
-            { gen_helper_neon_shl_s8, gen_helper_neon_shl_u8 },
-            { gen_helper_neon_shl_s16, gen_helper_neon_shl_u16 },
-            { gen_helper_neon_shl_s32, gen_helper_neon_shl_u32 },
-        };
-        genfn = fns[size][u];
-        break;
-    }
     case 0x9: /* SQSHL, UQSHL */
     {
         static NeonGenTwoOpEnvFn * const fns[3][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_shift_narrow(int size, TCGv_i32 var, TCGv_i32 shift,
         if (u) {
             switch (size) {
             case 1: gen_helper_neon_shl_u16(var, var, shift); break;
-            case 2: gen_helper_neon_shl_u32(var, var, shift); break;
+            case 2: gen_ushl_i32(var, var, shift); break;
             default: abort();
             }
         } else {
             switch (size) {
             case 1: gen_helper_neon_shl_s16(var, var, shift); break;
-            case 2: gen_helper_neon_shl_s32(var, var, shift); break;
+            case 2: gen_sshl_i32(var, var, shift); break;
             default: abort();
             }
         }
@@ -XXX,XX +XXX,XX @@ const GVecGen3 cmtst_op[4] = {
       .vece = MO_64 },
 };
 
+void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 lval = tcg_temp_new_i32();
+    TCGv_i32 rval = tcg_temp_new_i32();
+    TCGv_i32 lsh = tcg_temp_new_i32();
+    TCGv_i32 rsh = tcg_temp_new_i32();
+    TCGv_i32 zero = tcg_const_i32(0);
+    TCGv_i32 max = tcg_const_i32(32);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i32(lsh, b);
+    tcg_gen_neg_i32(rsh, lsh);
+    tcg_gen_shl_i32(lval, a, lsh);
+    tcg_gen_shr_i32(rval, a, rsh);
+    tcg_gen_movcond_i32(TCG_COND_LTU, d, lsh, max, lval, zero);
+    tcg_gen_movcond_i32(TCG_COND_LTU, d, rsh, max, rval, d);
+
+    tcg_temp_free_i32(lval);
+    tcg_temp_free_i32(rval);
+    tcg_temp_free_i32(lsh);
+    tcg_temp_free_i32(rsh);
+    tcg_temp_free_i32(zero);
+    tcg_temp_free_i32(max);
+}
+
+void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 lval = tcg_temp_new_i64();
+    TCGv_i64 rval = tcg_temp_new_i64();
+    TCGv_i64 lsh = tcg_temp_new_i64();
+    TCGv_i64 rsh = tcg_temp_new_i64();
+    TCGv_i64 zero = tcg_const_i64(0);
+    TCGv_i64 max = tcg_const_i64(64);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i64(lsh, b);
+    tcg_gen_neg_i64(rsh, lsh);
+    tcg_gen_shl_i64(lval, a, lsh);
+    tcg_gen_shr_i64(rval, a, rsh);
+    tcg_gen_movcond_i64(TCG_COND_LTU, d, lsh, max, lval, zero);
+    tcg_gen_movcond_i64(TCG_COND_LTU, d, rsh, max, rval, d);
+
+    tcg_temp_free_i64(lval);
+    tcg_temp_free_i64(rval);
+    tcg_temp_free_i64(lsh);
+    tcg_temp_free_i64(rsh);
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(max);
+}
+
+static void gen_ushl_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec lval = tcg_temp_new_vec_matching(d);
+    TCGv_vec rval = tcg_temp_new_vec_matching(d);
+    TCGv_vec lsh = tcg_temp_new_vec_matching(d);
+    TCGv_vec rsh = tcg_temp_new_vec_matching(d);
+    TCGv_vec msk, max;
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_neg_vec(vece, rsh, b);
+    if (vece == MO_8) {
+        tcg_gen_mov_vec(lsh, b);
+    } else {
+        msk = tcg_temp_new_vec_matching(d);
+        tcg_gen_dupi_vec(vece, msk, 0xff);
+        tcg_gen_and_vec(vece, lsh, b, msk);
+        tcg_gen_and_vec(vece, rsh, rsh, msk);
+        tcg_temp_free_vec(msk);
+    }
+
+    /*
+     * Perform possibly out of range shifts, trusting that the operation
+     * does not trap. Discard unused results after the fact.
+     */
+    tcg_gen_shlv_vec(vece, lval, a, lsh);
+    tcg_gen_shrv_vec(vece, rval, a, rsh);
+
+    max = tcg_temp_new_vec_matching(d);
+    tcg_gen_dupi_vec(vece, max, 8 << vece);
+
+    /*
+     * The choice of LT (signed) and GEU (unsigned) are biased toward
+     * the instructions of the x86_64 host. For MO_8, the whole byte
+     * is significant so we must use an unsigned compare; otherwise we
+     * have already masked to a byte and so a signed compare works.
+     * Other tcg hosts have a full set of comparisons and do not care.
+     */
+    if (vece == MO_8) {
+        tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
+        tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
+        tcg_gen_andc_vec(vece, lval, lval, lsh);
+        tcg_gen_andc_vec(vece, rval, rval, rsh);
+    } else {
+        tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
+        tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
+        tcg_gen_and_vec(vece, lval, lval, lsh);
+        tcg_gen_and_vec(vece, rval, rval, rsh);
+    }
+    tcg_gen_or_vec(vece, d, lval, rval);
+
+    tcg_temp_free_vec(max);
+    tcg_temp_free_vec(lval);
+    tcg_temp_free_vec(rval);
+    tcg_temp_free_vec(lsh);
+    tcg_temp_free_vec(rsh);
+}
+
+static const TCGOpcode ushl_list[] = {
+    INDEX_op_neg_vec, INDEX_op_shlv_vec,
+    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+};
+
+const GVecGen3 ushl_op[4] = {
+    { .fniv = gen_ushl_vec,
+      .fno = gen_helper_gvec_ushl_b,
+      .opt_opc = ushl_list,
+      .vece = MO_8 },
+    { .fniv = gen_ushl_vec,
+      .fno = gen_helper_gvec_ushl_h,
+      .opt_opc = ushl_list,
+      .vece = MO_16 },
+    { .fni4 = gen_ushl_i32,
+      .fniv = gen_ushl_vec,
+      .opt_opc = ushl_list,
+      .vece = MO_32 },
+    { .fni8 = gen_ushl_i64,
+      .fniv = gen_ushl_vec,
+      .opt_opc = ushl_list,
+      .vece = MO_64 },
+};
+
+void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 lval = tcg_temp_new_i32();
+    TCGv_i32 rval = tcg_temp_new_i32();
+    TCGv_i32 lsh = tcg_temp_new_i32();
+    TCGv_i32 rsh = tcg_temp_new_i32();
+    TCGv_i32 zero = tcg_const_i32(0);
+    TCGv_i32 max = tcg_const_i32(31);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i32(lsh, b);
+    tcg_gen_neg_i32(rsh, lsh);
+    tcg_gen_shl_i32(lval, a, lsh);
+    tcg_gen_umin_i32(rsh, rsh, max);
+    tcg_gen_sar_i32(rval, a, rsh);
+    tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
+    tcg_gen_movcond_i32(TCG_COND_LT, d, lsh, zero, rval, lval);
+
+    tcg_temp_free_i32(lval);
+    tcg_temp_free_i32(rval);
+    tcg_temp_free_i32(lsh);
+    tcg_temp_free_i32(rsh);
+    tcg_temp_free_i32(zero);
+    tcg_temp_free_i32(max);
+}
+
+void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 lval = tcg_temp_new_i64();
+    TCGv_i64 rval = tcg_temp_new_i64();
+    TCGv_i64 lsh = tcg_temp_new_i64();
+    TCGv_i64 rsh = tcg_temp_new_i64();
+    TCGv_i64 zero = tcg_const_i64(0);
+    TCGv_i64 max = tcg_const_i64(63);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i64(lsh, b);
+    tcg_gen_neg_i64(rsh, lsh);
+    tcg_gen_shl_i64(lval, a, lsh);
+    tcg_gen_umin_i64(rsh, rsh, max);
+    tcg_gen_sar_i64(rval, a, rsh);
+    tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
+    tcg_gen_movcond_i64(TCG_COND_LT, d, lsh, zero, rval, lval);
+
+    tcg_temp_free_i64(lval);
+    tcg_temp_free_i64(rval);
+    tcg_temp_free_i64(lsh);
+    tcg_temp_free_i64(rsh);
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(max);
+}
+
+static void gen_sshl_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec lval = tcg_temp_new_vec_matching(d);
+    TCGv_vec rval = tcg_temp_new_vec_matching(d);
+    TCGv_vec lsh = tcg_temp_new_vec_matching(d);
+    TCGv_vec rsh = tcg_temp_new_vec_matching(d);
+    TCGv_vec tmp = tcg_temp_new_vec_matching(d);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_neg_vec(vece, rsh, b);
+    if (vece == MO_8) {
+        tcg_gen_mov_vec(lsh, b);
+    } else {
+        tcg_gen_dupi_vec(vece, tmp, 0xff);
+        tcg_gen_and_vec(vece, lsh, b, tmp);
+        tcg_gen_and_vec(vece, rsh, rsh, tmp);
+    }
+
+    /* Bound rsh so out of bound right shift gets -1. */
+    tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
+    tcg_gen_umin_vec(vece, rsh, rsh, tmp);
+    tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
+
+    tcg_gen_shlv_vec(vece, lval, a, lsh);
+    tcg_gen_sarv_vec(vece, rval, a, rsh);
+
+    /* Select in-bound left shift. */
+    tcg_gen_andc_vec(vece, lval, lval, tmp);
+
+    /* Select between left and right shift. */
+    if (vece == MO_8) {
+        tcg_gen_dupi_vec(vece, tmp, 0);
+        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, d, lsh, tmp, rval, lval);
+    } else {
+        tcg_gen_dupi_vec(vece, tmp, 0x80);
+        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, d, lsh, tmp, lval, rval);
+    }
+
+    tcg_temp_free_vec(lval);
+    tcg_temp_free_vec(rval);
+    tcg_temp_free_vec(lsh);
+    tcg_temp_free_vec(rsh);
+    tcg_temp_free_vec(tmp);
+}
+
+static const TCGOpcode sshl_list[] = {
+    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+};
+
+const GVecGen3 sshl_op[4] = {
+    { .fniv = gen_sshl_vec,
+      .fno = gen_helper_gvec_sshl_b,
+      .opt_opc = sshl_list,
+      .vece = MO_8 },
+    { .fniv = gen_sshl_vec,
+      .fno = gen_helper_gvec_sshl_h,
+      .opt_opc = sshl_list,
+      .vece = MO_16 },
+    { .fni4 = gen_sshl_i32,
+      .fniv = gen_sshl_vec,
+      .opt_opc = sshl_list,
+      .vece = MO_32 },
+    { .fni8 = gen_sshl_i64,
+      .fniv = gen_sshl_vec,
+      .opt_opc = sshl_list,
+      .vece = MO_64 },
+};
+
 static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              vec_size, vec_size);
         }
         return 0;
+
+    case NEON_3R_VSHL:
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
+                       u ? &ushl_op[size] : &sshl_op[size]);
+        return 0;
     }
 
     if (size == 3) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             neon_load_reg64(cpu_V0, rn + pass);
             neon_load_reg64(cpu_V1, rm + pass);
             switch (op) {
-            case NEON_3R_VSHL:
-                if (u) {
-                    gen_helper_neon_shl_u64(cpu_V0, cpu_V1, cpu_V0);
-                } else {
-                    gen_helper_neon_shl_s64(cpu_V0, cpu_V1, cpu_V0);
-                }
-                break;
             case NEON_3R_VQSHL:
                 if (u) {
                     gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     }
     pairwise = 0;
     switch (op) {
-    case NEON_3R_VSHL:
     case NEON_3R_VQSHL:
     case NEON_3R_VRSHL:
     case NEON_3R_VQRSHL:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VHSUB:
             GEN_NEON_INTEGER_OP(hsub);
             break;
-        case NEON_3R_VSHL:
-            GEN_NEON_INTEGER_OP(shl);
-            break;
         case NEON_3R_VQSHL:
             GEN_NEON_INTEGER_OP_ENV(qshl);
             break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
         } else {
             if (input_unsigned) {
-                gen_helper_neon_shl_u64(cpu_V0, in, tmp64);
+                gen_ushl_i64(cpu_V0, in, tmp64);
             } else {
-                gen_helper_neon_shl_s64(cpu_V0, in, tmp64);
+                gen_sshl_i64(cpu_V0, in, tmp64);
             }
         }
         tmp = tcg_temp_new_i32();
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm,
     do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc,
                  get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
 }
+
+void HELPER(gvec_sshl_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int8_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz; ++i) {
+        int8_t mm = m[i];
+        int8_t nn = n[i];
+        int8_t res = 0;
+        if (mm >= 0) {
+            if (mm < 8) {
+                res = nn << mm;
+            }
+        } else {
+            res = nn >> (mm > -8 ? -mm : 7);
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_sshl_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        int8_t mm = m[i];   /* only 8 bits of shift are significant */
+        int16_t nn = n[i];
+        int16_t res = 0;
+        if (mm >= 0) {
+            if (mm < 16) {
+                res = nn << mm;
+            }
+        } else {
+            res = nn >> (mm > -16 ? -mm : 15);
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_ushl_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint8_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz; ++i) {
+        int8_t mm = m[i];
+        uint8_t nn = n[i];
+        uint8_t res = 0;
+        if (mm >= 0) {
+            if (mm < 8) {
+                res = nn << mm;
+            }
+        } else {
+            if (mm > -8) {
+                res = nn >> -mm;
+            }
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_ushl_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        int8_t mm = m[i];   /* only 8 bits of shift are significant */
+        uint16_t nn = n[i];
+        uint16_t res = 0;
+        if (mm >= 0) {
+            if (mm < 16) {
+                res = nn << mm;
+            }
+        } else {
+            if (mm > -16) {
+                res = nn >> -mm;
+            }
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
--
2.20.1

From: Luke Starrett <lukes@xsightlabs.com>

The ARM GICv3 TRM describes that the ITLinesNumber field of the
GICD_TYPER register:

  "indicates the maximum SPI INTID that the GIC implementation supports"

As SPI #0 is absolute IRQ #32, the max SPI INTID should have accounted
for the internal 16x SGIs and 16x PPIs. However, the original GICv3
model subtracted off the SGI/PPI. Cosmetically this can be seen at OS
boot (Linux) showing 32 shy of what should be there, i.e.:

  [    0.000000] GICv3: 224 SPIs implemented

Though in hw/arm/virt.c, the machine is configured for 256 SPIs. The
ARM virt machine likely doesn't have a problem with this because the
upper 32 IRQs don't actually have anything meaningful wired. But this
does become a functional issue on a custom use case which wants to make
use of these IRQs. Additionally, boot code (i.e. TF-A) will only init
up to the number (blocks of 32) that it believes to actually be there.

Signed-off-by: Luke Starrett <lukes@xsightlabs.com>
Message-id: AM9P193MB168473D99B761E204E032095D40D9@AM9P193MB1684.EURP193.PROD.OUTLOOK.COM
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gicv3_dist.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_dist.c
+++ b/hw/intc/arm_gicv3_dist.c
@@ -XXX,XX +XXX,XX @@ static bool gicd_readl(GICv3State *s, hwaddr offset,
          * MBIS == 0 (message-based SPIs not supported)
          * SecurityExtn == 1 if security extns supported
          * CPUNumber == 0 since for us ARE is always 1
-         * ITLinesNumber == (num external irqs / 32) - 1
+         * ITLinesNumber == (((max SPI IntID + 1) / 32) - 1)
          */
-        int itlinesnumber = ((s->num_irq - GIC_INTERNAL) / 32) - 1;
+        int itlinesnumber = (s->num_irq / 32) - 1;
         /*
          * SecurityExtn must be RAZ if GICD_CTLR.DS == 1, and
          * "security extensions not supported" always implies DS == 1,
--
2.25.1
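
To see the off-by-32 concretely, here is a small standalone C sketch of the
old and new ITLinesNumber computations for the virt machine configuration
described in the GICD_TYPER fix above (illustrative only; the Linux SPI-count
formula is paraphrased from its GICv3 driver):

#include <stdio.h>

#define GIC_INTERNAL 32   /* 16 SGIs + 16 PPIs */

int main(void)
{
    /* virt in this scenario: 256 SPIs plus the 32 internal interrupts */
    int num_irq = 256 + GIC_INTERNAL;

    int old_val = ((num_irq - GIC_INTERNAL) / 32) - 1;  /* 7 */
    int new_val = (num_irq / 32) - 1;                   /* 8 */

    /* Linux derives the SPI count as 32 * (ITLinesNumber + 1) - 32,
     * so the old value reports 224 SPIs and the new one 256. */
    printf("old: ITLinesNumber=%d -> %d SPIs\n",
           old_val, 32 * (old_val + 1) - 32);
    printf("new: ITLinesNumber=%d -> %d SPIs\n",
           new_val, 32 * (new_val + 1) - 32);
    return 0;
}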
Convert the VSEL instructions to decodetree.
We leave trans_VSEL() in translate.c for now as this allows
the patch to show just the changes from the old handle_vsel().

In the old code the check for "do D16-D31 exist" was hidden in
the VFP_DREG macro, and assumed that VFPv3 always implied that
D16-D31 exist. In the new code we do the correct ID register test.
This gives identical behaviour for most of our CPUs, and fixes
previously incorrect handling for Cortex-R5F, Cortex-M4 and
Cortex-M33, which all implement VFPv3 or better with only 16
double-precision registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h               |  6 ++++++
 target/arm/translate-vfp.inc.c |  9 +++++++++
 target/arm/translate.c         | 35 ++++++++++++++++++++++++----------
 target/arm/vfp-uncond.decode   | 19 ++++++++++++++++++
 4 files changed, 59 insertions(+), 10 deletions(-)
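
For reference, a standalone C sketch of the split register-number encoding
that the %vm_dp/%vm_sp field definitions in the decode file below describe
(the helper names here are invented for illustration; this is not the QEMU
implementation):

#include <stdio.h>

static int vfp_dreg(unsigned four, unsigned one)  /* %vm_dp 5:1 0:4 */
{
    return (one << 4) | four;   /* D0..D31: the single bit is the MSB */
}

static int vfp_sreg(unsigned four, unsigned one)  /* %vm_sp 0:4 5:1 */
{
    return (four << 1) | one;   /* S0..S31: the single bit is the LSB */
}

int main(void)
{
    /* insn bits [3:0] = 0x9, insn bit [5] = 1 */
    int d = vfp_dreg(0x9, 1);
    int s = vfp_sreg(0x9, 1);

    printf("D%d S%d\n", d, s);   /* D25 S19 */
    /* The "UNDEF if D16-D31 are absent" test is then simply: */
    printf("needs D16-D31: %s\n", (d & 0x10) ? "yes" : "no");
    return 0;
}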
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
 }
 
+static inline bool isar_feature_aa32_fp_d32(const ARMISARegisters *id)
+{
+    /* Return true if D16-D31 are implemented */
+    return FIELD_EX64(id->mvfr0, MVFR0, SIMDREG) >= 2;
+}
+
 /*
  * We always set the FP and SIMD FP16 fields to indicate identical
  * levels of support (assuming SIMD is implemented at all), so
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
 
     return true;
 }
+
+/*
+ * The most usual kind of VFP access check, for everything except
+ * FMXR/FMRX to the always-available special registers.
+ */
+static bool vfp_access_check(DisasContext *s)
+{
+    return full_vfp_access_check(s, false);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
     tcg_temp_free_i32(tmp);
 }
 
-static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
-                       uint32_t dp)
+static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
 {
-    uint32_t cc = extract32(insn, 20, 2);
+    uint32_t rd, rn, rm;
+    bool dp = a->dp;
+
+    if (!dc_isar_feature(aa32_vsel, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
+        return false;
+    }
+    rd = a->vd;
+    rn = a->vn;
+    rm = a->vm;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
 
     if (dp) {
         TCGv_i64 frn, frm, dest;
@@ -XXX,XX +XXX,XX @@ static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
 
         tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
         tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
-        switch (cc) {
+        switch (a->cc) {
         case 0: /* eq: Z */
             tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
                                 frn, frm);
@@ -XXX,XX +XXX,XX @@ static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
         dest = tcg_temp_new_i32();
         tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
         tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
-        switch (cc) {
+        switch (a->cc) {
         case 0: /* eq: Z */
             tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
                                 frn, frm);
@@ -XXX,XX +XXX,XX @@ static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
         tcg_temp_free_i32(zero);
     }
 
-    return 0;
+    return true;
 }
 
 static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn,
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
         rm = VFP_SREG_M(insn);
     }
 
-    if ((insn & 0x0f800e50) == 0x0e000a00 && dc_isar_feature(aa32_vsel, s)) {
-        return handle_vsel(insn, rd, rn, rm, dp);
-    } else if ((insn & 0x0fb00e10) == 0x0e800a00 &&
-               dc_isar_feature(aa32_vminmaxnm, s)) {
+    if ((insn & 0x0fb00e10) == 0x0e800a00 &&
+        dc_isar_feature(aa32_vminmaxnm, s)) {
         return handle_vminmaxnm(insn, rd, rn, rm, dp);
     } else if ((insn & 0x0fbc0ed0) == 0x0eb80a40 &&
                dc_isar_feature(aa32_vrint, s)) {
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp-uncond.decode
+++ b/target/arm/vfp-uncond.decode
@@ -XXX,XX +XXX,XX @@
 # 1111 1110 .... .... .... 101. .... ....
 # (but those patterns might also cover some Neon instructions,
 # which do not live in this file.)
+
+# VFP registers have an odd encoding with a four-bit field
+# and a one-bit field which are assembled in different orders
+# depending on whether the register is double or single precision.
+# Each individual instruction function must do the checks for
+# "double register selected but CPU does not have double support"
+# and "double register number has bit 4 set but CPU does not
+# support D16-D31" (which should UNDEF).
+%vm_dp  5:1 0:4
+%vm_sp  0:4 5:1
+%vn_dp  7:1 16:4
+%vn_sp  16:4 7:1
+%vd_dp  22:1 12:4
+%vd_sp  12:4 22:1
+
+VSEL    1111 1110 0. cc:2 .... .... 1010 .0.0 ....    \
+        vm=%vm_sp vn=%vn_sp vd=%vd_sp dp=0
+VSEL    1111 1110 0. cc:2 .... .... 1011 .0.0 ....    \
+        vm=%vm_dp vn=%vn_dp vd=%vd_dp dp=1
--
2.20.1

FEAT_EVT adds five new bits to the HCR_EL2 register: TTLBIS, TTLBOS,
TICAB, TOCU and TID4. These allow the guest to enable trapping of
various EL1 instructions to EL2. In this commit, add the necessary
code to allow the guest to set these bits if the feature is present;
because the bit is always zero when the feature isn't present we
won't need to use explicit feature checks in the "trap on condition"
tests in the following commits.

Note that although full implementation of the feature (mandatory from
Armv8.5 onward) requires all five trap bits, the ID registers permit
a value indicating that only TICAB, TOCU and TID4 are implemented,
which might be the case for CPUs between Armv8.2 and Armv8.5.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h    | 30 ++++++++++++++++++++++++++++++
 target/arm/helper.c |  6 ++++++
 2 files changed, 36 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
     return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
 }
 
+static inline bool isar_feature_aa32_half_evt(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, EVT) >= 1;
+}
+
+static inline bool isar_feature_aa32_evt(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, EVT) >= 2;
+}
+
 static inline bool isar_feature_aa32_dit(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_pfr0, ID_PFR0, DIT) != 0;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ids(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, IDS) != 0;
 }
 
+static inline bool isar_feature_aa64_half_evt(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, EVT) >= 1;
+}
+
+static inline bool isar_feature_aa64_evt(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, EVT) >= 2;
+}
+
 static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_ras(const ARMISARegisters *id)
     return isar_feature_aa64_ras(id) || isar_feature_aa32_ras(id);
 }
 
+static inline bool isar_feature_any_half_evt(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_half_evt(id) || isar_feature_aa32_half_evt(id);
+}
+
+static inline bool isar_feature_any_evt(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_evt(id) || isar_feature_aa32_evt(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
         }
     }
 
+    if (cpu_isar_feature(any_evt, cpu)) {
+        valid_mask |= HCR_TTLBIS | HCR_TTLBOS | HCR_TICAB | HCR_TOCU | HCR_TID4;
+    } else if (cpu_isar_feature(any_half_evt, cpu)) {
+        valid_mask |= HCR_TICAB | HCR_TOCU | HCR_TID4;
+    }
+
     /* Clear RES0 bits. */
     value &= valid_mask;
 
--
2.25.1
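
As a standalone illustration of the valid_mask gating added in helper.c
above (a toy model, not QEMU code; only the FEAT_EVT bits are modelled,
with their architectural HCR_EL2 bit positions):

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define HCR_TID4   (1ULL << 49)
#define HCR_TICAB  (1ULL << 50)
#define HCR_TOCU   (1ULL << 52)
#define HCR_TTLBIS (1ULL << 54)
#define HCR_TTLBOS (1ULL << 55)

/* Bits outside valid_mask are RES0 and never stick. */
static uint64_t hcr_write(uint64_t value, bool evt, bool half_evt)
{
    uint64_t valid_mask = 0;   /* all other architectural bits elided */

    if (evt) {
        valid_mask |= HCR_TTLBIS | HCR_TTLBOS | HCR_TICAB | HCR_TOCU |
                      HCR_TID4;
    } else if (half_evt) {
        valid_mask |= HCR_TICAB | HCR_TOCU | HCR_TID4;
    }
    return value & valid_mask;
}

int main(void)
{
    uint64_t v = HCR_TTLBIS | HCR_TICAB;

    /* EVT == 2: both bits stick; EVT == 1: only TICAB survives;
     * EVT == 0: everything reads back as zero, which is why the later
     * "trap on condition" tests need no explicit feature check. */
    printf("%016llx %016llx %016llx\n",
           (unsigned long long)hcr_write(v, true, false),
           (unsigned long long)hcr_write(v, false, true),
           (unsigned long long)hcr_write(v, false, false));
    return 0;
}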
Convert the VCVT (between floating-point and fixed-point) instructions
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 124 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  57 +--------
 target/arm/vfp.decode          |  10 +++
 3 files changed, 136 insertions(+), 55 deletions(-)
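
For reference, a small standalone C sketch of the fraction-bits arithmetic
used by the trans_VCVT_fix functions below (illustrative only; the encoded
immediate has the same format as an Sreg number, and the sx bit selects a
16-bit or 32-bit fixed-point format):

#include <stdint.h>
#include <stdio.h>
#include <math.h>

static int frac_bits(int opc, int imm)
{
    /* low bit of the assembled opc field is the sx bit */
    return (opc & 1) ? (32 - imm) : (16 - imm);
}

int main(void)
{
    /* e.g. a 32-bit signed fixed-point input with imm 20 */
    int fb = frac_bits(1, 20);           /* 12 fraction bits */
    int32_t fixed = 0x12345;

    /* shtos/sltos-style conversion: value = fixed / 2^frac_bits */
    double val = ldexp((double)fixed, -fb);
    printf("frac_bits=%d value=%f\n", fb, val);
    return 0;
}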
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
     tcg_temp_free_i32(vd);
     return true;
 }
+
+static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a)
+{
+    TCGv_i32 vd, shift;
+    TCGv_ptr fpst;
+    int frac_bits;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm);
+
+    vd = tcg_temp_new_i32();
+    neon_load_reg32(vd, a->vd);
+
+    fpst = get_fpstatus_ptr(false);
+    shift = tcg_const_i32(frac_bits);
+
+    /* Switch on op:U:sx bits */
+    switch (a->opc) {
+    case 0:
+        gen_helper_vfp_shtos(vd, vd, shift, fpst);
+        break;
+    case 1:
+        gen_helper_vfp_sltos(vd, vd, shift, fpst);
+        break;
+    case 2:
+        gen_helper_vfp_uhtos(vd, vd, shift, fpst);
+        break;
+    case 3:
+        gen_helper_vfp_ultos(vd, vd, shift, fpst);
+        break;
+    case 4:
+        gen_helper_vfp_toshs_round_to_zero(vd, vd, shift, fpst);
+        break;
+    case 5:
+        gen_helper_vfp_tosls_round_to_zero(vd, vd, shift, fpst);
+        break;
+    case 6:
+        gen_helper_vfp_touhs_round_to_zero(vd, vd, shift, fpst);
+        break;
+    case 7:
+        gen_helper_vfp_touls_round_to_zero(vd, vd, shift, fpst);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    neon_store_reg32(vd, a->vd);
+    tcg_temp_free_i32(vd);
+    tcg_temp_free_i32(shift);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
+
+static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
+{
+    TCGv_i64 vd;
+    TCGv_i32 shift;
+    TCGv_ptr fpst;
+    int frac_bits;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm);
+
+    vd = tcg_temp_new_i64();
+    neon_load_reg64(vd, a->vd);
+
+    fpst = get_fpstatus_ptr(false);
+    shift = tcg_const_i32(frac_bits);
+
+    /* Switch on op:U:sx bits */
+    switch (a->opc) {
+    case 0:
+        gen_helper_vfp_shtod(vd, vd, shift, fpst);
+        break;
+    case 1:
+        gen_helper_vfp_sltod(vd, vd, shift, fpst);
+        break;
+    case 2:
+        gen_helper_vfp_uhtod(vd, vd, shift, fpst);
+        break;
+    case 3:
+        gen_helper_vfp_ultod(vd, vd, shift, fpst);
+        break;
+    case 4:
+        gen_helper_vfp_toshd_round_to_zero(vd, vd, shift, fpst);
+        break;
+    case 5:
+        gen_helper_vfp_tosld_round_to_zero(vd, vd, shift, fpst);
+        break;
+    case 6:
+        gen_helper_vfp_touhd_round_to_zero(vd, vd, shift, fpst);
+        break;
+    case 7:
+        gen_helper_vfp_tould_round_to_zero(vd, vd, shift, fpst);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    neon_store_reg64(vd, a->vd);
+    tcg_temp_free_i64(vd);
+    tcg_temp_free_i32(shift);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_##name(int dp, int shift, int neon) \
     tcg_temp_free_i32(tmp_shift); \
     tcg_temp_free_ptr(statusptr); \
 }
-VFP_GEN_FIX(tosh, _round_to_zero)
 VFP_GEN_FIX(tosl, _round_to_zero)
-VFP_GEN_FIX(touh, _round_to_zero)
 VFP_GEN_FIX(toul, _round_to_zero)
-VFP_GEN_FIX(shto, )
 VFP_GEN_FIX(slto, )
-VFP_GEN_FIX(uhto, )
 VFP_GEN_FIX(ulto, )
 #undef VFP_GEN_FIX
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             return 1;
         case 15:
             switch (rn) {
-            case 0 ... 19:
+            case 0 ... 23:
+            case 28 ... 31:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 rd_is_dp = false;
                 break;
 
-            case 0x14: /* vcvt fp <-> fixed */
-            case 0x15:
-            case 0x16:
-            case 0x17:
-            case 0x1c:
-            case 0x1d:
-            case 0x1e:
-            case 0x1f:
-                if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
-                    return 1;
-                }
-                /* Immediate frac_bits has same format as SREG_M. */
-                rm_is_dp = false;
-                break;
-
             default:
                 return 1;
             }
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         /* Load the initial operands. */
         if (op == 15) {
             switch (rn) {
-            case 0x14: /* vcvt fp <-> fixed */
-            case 0x15:
-            case 0x16:
-            case 0x17:
-            case 0x1c:
-            case 0x1d:
-            case 0x1e:
-            case 0x1f:
-                /* Source and destination the same. */
-                gen_mov_F0_vreg(dp, rd);
-                break;
             default:
                 /* One source operand. */
                 gen_mov_F0_vreg(rm_is_dp, rm);
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         switch (op) {
         case 15: /* extension space */
             switch (rn) {
-            case 20: /* fshto */
-                gen_vfp_shto(dp, 16 - rm, 0);
-                break;
-            case 21: /* fslto */
-                gen_vfp_slto(dp, 32 - rm, 0);
-                break;
-            case 22: /* fuhto */
-                gen_vfp_uhto(dp, 16 - rm, 0);
-                break;
-            case 23: /* fulto */
-                gen_vfp_ulto(dp, 32 - rm, 0);
-                break;
             case 24: /* ftoui */
                 gen_vfp_toui(dp, 0);
                 break;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             case 27: /* ftosiz */
                 gen_vfp_tosiz(dp, 0);
                 break;
-            case 28: /* ftosh */
-                gen_vfp_tosh(dp, 16 - rm, 0);
-                break;
-            case 29: /* ftosl */
-                gen_vfp_tosl(dp, 32 - rm, 0);
-                break;
-            case 30: /* ftouh */
-                gen_vfp_touh(dp, 16 - rm, 0);
-                break;
-            case 31: /* ftoul */
-                gen_vfp_toul(dp, 32 - rm, 0);
-                break;
             default: /* undefined */
                 g_assert_not_reached();
             }
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_int_dp ---- 1110 1.11 1000 .... 1011 s:1 1.0 .... \
 # VJCVT is always dp to sp
 VJCVT        ---- 1110 1.11 1001 .... 1011 11.0 .... \
              vd=%vd_sp vm=%vm_dp
+
+# VCVT between floating-point and fixed-point. The immediate value
+# is in the same format as a Vm single-precision register number.
+# We assemble bits 18 (op), 16 (u) and 7 (sx) into a single opc field
+# for the convenience of the trans_VCVT_fix functions.
+%vcvt_fix_op 18:1 16:1 7:1
+VCVT_fix_sp  ---- 1110 1.11 1.1. .... 1010 .1.0 .... \
+             vd=%vd_sp imm=%vm_sp opc=%vcvt_fix_op
+VCVT_fix_dp  ---- 1110 1.11 1.1. .... 1011 .1.0 .... \
+             vd=%vd_dp imm=%vm_sp opc=%vcvt_fix_op
--
2.20.1

For FEAT_EVT, the HCR_EL2.TTLBIS bit allows trapping on EL1 use of
TLB maintenance instructions that operate on the inner shareable
domain:

AArch64:
 TLBI VMALLE1IS, TLBI VAE1IS, TLBI ASIDE1IS, TLBI VAAE1IS,
 TLBI VALE1IS, TLBI VAALE1IS, TLBI RVAE1IS, TLBI RVAAE1IS,
 TLBI RVALE1IS, and TLBI RVAALE1IS.

AArch32:
 TLBIALLIS, TLBIMVAIS, TLBIASIDIS, TLBIMVAAIS, TLBIMVALIS,
 and TLBIMVAALIS.

Add the trapping support.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 43 +++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_ttlb(CPUARMState *env, const ARMCPRegInfo *ri,
     return CP_ACCESS_OK;
 }
 
+/* Check for traps from EL1 due to HCR_EL2.TTLB or TTLBIS. */
+static CPAccessResult access_ttlbis(CPUARMState *env, const ARMCPRegInfo *ri,
+                                    bool isread)
+{
+    if (arm_current_el(env) == 1 &&
+        (arm_hcr_el2_eff(env) & (HCR_TTLB | HCR_TTLBIS))) {
+        return CP_ACCESS_TRAP_EL2;
+    }
+    return CP_ACCESS_OK;
+}
+
 static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
 static const ARMCPRegInfo v7mp_cp_reginfo[] = {
     /* 32 bit TLB invalidates, Inner Shareable */
     { .name = "TLBIALLIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlbis,
       .writefn = tlbiall_is_write },
     { .name = "TLBIMVAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlbis,
       .writefn = tlbimva_is_write },
     { .name = "TLBIASIDIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlbis,
       .writefn = tlbiasid_is_write },
     { .name = "TLBIMVAAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlbis,
       .writefn = tlbimvaa_is_write },
 };
 
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     /* TLBI operations */
     { .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1is_write },
     { .name = "TLBI_VAE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_ASIDE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1is_write },
     { .name = "TLBI_VAAE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VALE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VAALE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VMALLE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
 #endif
     /* TLB invalidate last level of translation table walk */
     { .name = "TLBIMVALIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlbis,
       .writefn = tlbimva_is_write },
     { .name = "TLBIMVAALIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlbis,
       .writefn = tlbimvaa_is_write },
     { .name = "TLBIMVAL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pauth_reginfo[] = {
 static const ARMCPRegInfo tlbirange_reginfo[] = {
     { .name = "TLBI_RVAE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 1,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAAE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 3,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVALE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 5,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAALE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 7,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 1,
--
2.25.1
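
The trap decision itself is a one-liner; as a standalone sketch (illustrative
flag values, the real HCR_TTLB/HCR_TTLBIS definitions live in
target/arm/cpu.h):

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define HCR_TTLB   (1ULL << 25)
#define HCR_TTLBIS (1ULL << 54)

/* Shape of the access_ttlbis() check above: an inner-shareable TLBI
 * issued at EL1 traps to EL2 if either the blanket TTLB bit or the
 * new IS-specific bit is set. */
static bool tlbi_is_traps(int current_el, uint64_t hcr)
{
    return current_el == 1 && (hcr & (HCR_TTLB | HCR_TTLBIS));
}

int main(void)
{
    printf("%d\n", tlbi_is_traps(1, HCR_TTLBIS));  /* 1: trap */
    printf("%d\n", tlbi_is_traps(1, HCR_TTLB));    /* 1: trap */
    printf("%d\n", tlbi_is_traps(2, HCR_TTLBIS));  /* 0: EL2 itself */
    return 0;
}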
Convert the VJCVT instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 28 ++++++++++++++++++++++++++++
 target/arm/translate.c         | 12 +-----------
 target/arm/vfp.decode          |  4 ++++
 3 files changed, 33 insertions(+), 11 deletions(-)
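
For reference, a rough standalone C model of what the gen_helper_vjcvt call
below computes (an approximation for illustration, not the QEMU helper):
convert a double to a signed 32-bit integer, rounding toward zero, with
out-of-range values wrapping modulo 2^32 rather than saturating, which is
the JavaScript ToInt32() behaviour; NaN and infinity become 0. The real
helper additionally sets the Z flag to report an exact in-range conversion,
which is omitted here.

#include <stdint.h>
#include <stdio.h>
#include <math.h>

static int32_t vjcvt(double d)
{
    if (isnan(d) || isinf(d)) {
        return 0;
    }
    double t = trunc(d);
    double wrapped = fmod(t, 4294967296.0);   /* wrap modulo 2^32 */
    if (wrapped >= 2147483648.0) {
        wrapped -= 4294967296.0;
    } else if (wrapped < -2147483648.0) {
        wrapped += 4294967296.0;
    }
    return (int32_t)wrapped;
}

int main(void)
{
    printf("%d\n", vjcvt(3.99));          /* 3 */
    printf("%d\n", vjcvt(-3.99));         /* -3 */
    printf("%d\n", vjcvt(4294967298.0));  /* 2: wraps, not saturates */
    return 0;
}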
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
     tcg_temp_free_ptr(fpst);
     return true;
 }
+
+static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
+{
+    TCGv_i32 vd;
+    TCGv_i64 vm;
+
+    if (!dc_isar_feature(aa32_jscvt, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vm = tcg_temp_new_i64();
+    vd = tcg_temp_new_i32();
+    neon_load_reg64(vm, a->vm);
+    gen_helper_vjcvt(vd, vm, cpu_env);
+    neon_store_reg32(vd, a->vd);
+    tcg_temp_free_i64(vm);
+    tcg_temp_free_i32(vd);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             return 1;
         case 15:
             switch (rn) {
-            case 0 ... 17:
+            case 0 ... 19:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 rm_is_dp = false;
                 break;
 
-            case 0x13: /* vjcvt */
-                if (!dp || !dc_isar_feature(aa32_jscvt, s)) {
-                    return 1;
-                }
-                rd_is_dp = false;
-                break;
-
             default:
                 return 1;
             }
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         switch (op) {
         case 15: /* extension space */
             switch (rn) {
-            case 19: /* vjcvt */
-                gen_helper_vjcvt(cpu_F0s, cpu_F0d, cpu_env);
-                break;
             case 20: /* fshto */
                 gen_vfp_shto(dp, 16 - rm, 0);
                 break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_int_sp ---- 1110 1.11 1000 .... 1010 s:1 1.0 .... \
     vd=%vd_sp vm=%vm_sp
 VCVT_int_dp ---- 1110 1.11 1000 .... 1011 s:1 1.0 .... \
     vd=%vd_dp vm=%vm_sp
+
+# VJCVT is always dp to sp
+VJCVT        ---- 1110 1.11 1001 .... 1011 11.0 .... \
+             vd=%vd_sp vm=%vm_dp
--
2.20.1

For FEAT_EVT, the HCR_EL2.TTLBOS bit allows trapping on EL1
use of TLB maintenance instructions that operate on the
outer shareable domain:

TLBI VMALLE1OS, TLBI VAE1OS, TLBI ASIDE1OS, TLBI VAAE1OS,
TLBI VALE1OS, TLBI VAALE1OS, TLBI RVAE1OS, TLBI RVAAE1OS,
TLBI RVALE1OS, and TLBI RVAALE1OS.

(There are no AArch32 outer-shareable TLB maintenance ops.)

Implement the trapping.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_ttlbis(CPUARMState *env, const ARMCPRegInfo *ri,
     return CP_ACCESS_OK;
 }
 
+#ifdef TARGET_AARCH64
+/* Check for traps from EL1 due to HCR_EL2.TTLB or TTLBOS. */
+static CPAccessResult access_ttlbos(CPUARMState *env, const ARMCPRegInfo *ri,
+                                    bool isread)
+{
+    if (arm_current_el(env) == 1 &&
+        (arm_hcr_el2_eff(env) & (HCR_TTLB | HCR_TTLBOS))) {
+        return CP_ACCESS_TRAP_EL2;
+    }
+    return CP_ACCESS_OK;
+}
+#endif
+
 static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbirange_reginfo[] = {
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 1,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAAE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 3,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVALE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 5,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAALE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 7,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae1is_write },
     { .name = "TLBI_RVAE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 1,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbirange_reginfo[] = {
 static const ARMCPRegInfo tlbios_reginfo[] = {
     { .name = "TLBI_VMALLE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 0,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1is_write },
     { .name = "TLBI_VAE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 1,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_ASIDE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 2,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1is_write },
     { .name = "TLBI_VAAE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 3,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VALE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 5,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VAALE1OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 7,
-      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_ALLE2OS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 0,
--
2.25.1
diff view generated by jsdifflib
1
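A note on the vfp.decode syntax used in the patch above: field definitions
like %vd_sp and %vm_dp assemble a register index from the bits the pattern
leaves unnamed. The sketch below is illustrative standalone C, not QEMU
code; it assumes the standard VFP encoding layout (D at bit 22, Vd at bits
[15:12], M at bit 5, Vm at bits [3:0]).

    #include <stdint.h>
    #include <stdio.h>

    /* %vd_sp 12:4 22:1 -- single-precision index is Vd:D */
    static int vd_sp(uint32_t insn)
    {
        return (((insn >> 12) & 0xf) << 1) | ((insn >> 22) & 1);
    }

    /* %vm_dp 5:1 0:4 -- double-precision index is M:Vm */
    static int vm_dp(uint32_t insn)
    {
        return (((insn >> 5) & 1) << 4) | (insn & 0xf);
    }

    int main(void)
    {
        /* toy encoding with D=1, Vd=0b0010, M=0, Vm=0b0011 -> s5, d3 */
        uint32_t insn = (1u << 22) | (0x2u << 12) | 0x3u;
        printf("vd=s%d vm=d%d\n", vd_sp(insn), vm_dp(insn));
        return 0;
    }

This is why the VJCVT pattern only needs vd=%vd_sp vm=%vm_dp: the generated
decoder fills in the composed register numbers before calling trans_VJCVT().
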
Convert the VFP round-to-integer instructions VRINTR, VRINTZ and
VRINTX to decodetree.

These instructions were only introduced as part of the "VFP misc"
additions in v8A, so we check for that feature. The old decoder's
implementation was incorrectly providing them even for v7A CPUs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 163 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  45 +--------
 target/arm/vfp.decode          |  15 +++
 3 files changed, 179 insertions(+), 44 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
     tcg_temp_free_i32(tmp);
     return true;
 }
+
+static bool trans_VRINTR_sp(DisasContext *s, arg_VRINTR_sp *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 tmp;
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i32();
+    neon_load_reg32(tmp, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    gen_helper_rints(tmp, tmp, fpst);
+    neon_store_reg32(tmp, a->vd);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tmp);
+    return true;
+}
+
+static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_sp *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i64 tmp;
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i64();
+    neon_load_reg64(tmp, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    gen_helper_rintd(tmp, tmp, fpst);
+    neon_store_reg64(tmp, a->vd);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i64(tmp);
+    return true;
+}
+
+static bool trans_VRINTZ_sp(DisasContext *s, arg_VRINTZ_sp *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 tmp;
+    TCGv_i32 tcg_rmode;
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i32();
+    neon_load_reg32(tmp, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    tcg_rmode = tcg_const_i32(float_round_to_zero);
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+    gen_helper_rints(tmp, tmp, fpst);
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+    neon_store_reg32(tmp, a->vd);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_rmode);
+    tcg_temp_free_i32(tmp);
+    return true;
+}
+
+static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_sp *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i64 tmp;
+    TCGv_i32 tcg_rmode;
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i64();
+    neon_load_reg64(tmp, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    tcg_rmode = tcg_const_i32(float_round_to_zero);
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+    gen_helper_rintd(tmp, tmp, fpst);
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
+    neon_store_reg64(tmp, a->vd);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i64(tmp);
+    tcg_temp_free_i32(tcg_rmode);
+    return true;
+}
+
+static bool trans_VRINTX_sp(DisasContext *s, arg_VRINTX_sp *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 tmp;
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i32();
+    neon_load_reg32(tmp, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    gen_helper_rints_exact(tmp, tmp, fpst);
+    neon_store_reg32(tmp, a->vd);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tmp);
+    return true;
+}
+
+static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i64 tmp;
+
+    if (!dc_isar_feature(aa32_vrint, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i64();
+    neon_load_reg64(tmp, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    gen_helper_rintd_exact(tmp, tmp, fpst);
+    neon_store_reg64(tmp, a->vd);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i64(tmp);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             return 1;
         case 15:
             switch (rn) {
-            case 0 ... 11:
+            case 0 ... 14:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         if (op == 15) {
             /* rn is opcode, encoded as per VFP_SREG_N. */
             switch (rn) {
-            case 0x0c: /* vrintr */
-            case 0x0d: /* vrintz */
-            case 0x0e: /* vrintx */
-                break;
-
             case 0x0f: /* vcvt double<->single */
                 rd_is_dp = !dp;
                 break;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 switch (op) {
                 case 15: /* extension space */
                     switch (rn) {
-                    case 12: /* vrintr */
-                    {
-                        TCGv_ptr fpst = get_fpstatus_ptr(0);
-                        if (dp) {
-                            gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
-                        } else {
-                            gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
-                        }
-                        tcg_temp_free_ptr(fpst);
-                        break;
-                    }
-                    case 13: /* vrintz */
-                    {
-                        TCGv_ptr fpst = get_fpstatus_ptr(0);
-                        TCGv_i32 tcg_rmode;
-                        tcg_rmode = tcg_const_i32(float_round_to_zero);
-                        gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
-                        if (dp) {
-                            gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
-                        } else {
-                            gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
-                        }
-                        gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
-                        tcg_temp_free_i32(tcg_rmode);
-                        tcg_temp_free_ptr(fpst);
-                        break;
-                    }
-                    case 14: /* vrintx */
-                    {
-                        TCGv_ptr fpst = get_fpstatus_ptr(0);
-                        if (dp) {
-                            gen_helper_rintd_exact(cpu_F0d, cpu_F0d, fpst);
-                        } else {
-                            gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpst);
-                        }
-                        tcg_temp_free_ptr(fpst);
-                        break;
-                    }
                     case 15: /* single<->double conversion */
                         if (dp) {
                             gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_f16_f32 ---- 1110 1.11 0011 .... 1010 t:1 1.0 .... \
              vd=%vd_sp vm=%vm_sp
 VCVT_f16_f64 ---- 1110 1.11 0011 .... 1011 t:1 1.0 .... \
              vd=%vd_sp vm=%vm_dp
+
+VRINTR_sp    ---- 1110 1.11 0110 .... 1010 01.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VRINTR_dp    ---- 1110 1.11 0110 .... 1011 01.0 .... \
+             vd=%vd_dp vm=%vm_dp
+
+VRINTZ_sp    ---- 1110 1.11 0110 .... 1010 11.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VRINTZ_dp    ---- 1110 1.11 0110 .... 1011 11.0 .... \
+             vd=%vd_dp vm=%vm_dp
+
+VRINTX_sp    ---- 1110 1.11 0111 .... 1010 01.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VRINTX_dp    ---- 1110 1.11 0111 .... 1011 01.0 .... \
+             vd=%vd_dp vm=%vm_dp
--
2.20.1


For FEAT_EVT, the HCR_EL2.TICAB bit allows trapping of the ICIALLUIS
and IC IALLUIS cache maintenance instructions.

The HCR_EL2.TOCU bit traps all the other cache maintenance
instructions that operate to the point of unification:
  AArch64 IC IVAU, IC IALLU, DC CVAU
  AArch32 ICIMVAU, ICIALLU, DCCMVAU

The two trap bits between them cover all of the cache maintenance
instructions which must also check the HCR_TPU flag. Turn the old
aa64_cacheop_pou_access() function into a helper function which takes
the set of HCR_EL2 flags to check as an argument, and call it from
new access_ticab() and access_tocu() functions as appropriate for
each cache op.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
     return CP_ACCESS_OK;
 }

-static CPAccessResult aa64_cacheop_pou_access(CPUARMState *env,
-                                              const ARMCPRegInfo *ri,
-                                              bool isread)
+static CPAccessResult do_cacheop_pou_access(CPUARMState *env, uint64_t hcrflags)
 {
     /* Cache invalidate/clean to Point of Unification... */
     switch (arm_current_el(env)) {
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_pou_access(CPUARMState *env,
         }
         /* fall through */
     case 1:
-        /* ... EL1 must trap to EL2 if HCR_EL2.TPU is set.  */
-        if (arm_hcr_el2_eff(env) & HCR_TPU) {
+        /* ... EL1 must trap to EL2 if relevant HCR_EL2 flags are set.  */
+        if (arm_hcr_el2_eff(env) & hcrflags) {
            return CP_ACCESS_TRAP_EL2;
         }
         break;
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_pou_access(CPUARMState *env,
     return CP_ACCESS_OK;
 }

+static CPAccessResult access_ticab(CPUARMState *env, const ARMCPRegInfo *ri,
+                                   bool isread)
+{
+    return do_cacheop_pou_access(env, HCR_TICAB | HCR_TPU);
+}
+
+static CPAccessResult access_tocu(CPUARMState *env, const ARMCPRegInfo *ri,
+                                  bool isread)
+{
+    return do_cacheop_pou_access(env, HCR_TOCU | HCR_TPU);
+}
+
 /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
  * Page D4-1736 (DDI0487A.b)
  */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "IC_IALLUIS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
       .access = PL1_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_pou_access },
+      .accessfn = access_ticab },
     { .name = "IC_IALLU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
       .access = PL1_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_pou_access },
+      .accessfn = access_tocu },
     { .name = "IC_IVAU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 5, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_pou_access },
+      .accessfn = access_tocu },
     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
       .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "DC_CVAU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 11, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_pou_access },
+      .accessfn = access_tocu },
     { .name = "DC_CIVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbiipas2is_hyp_write },
     /* 32 bit cache operations */
     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
-      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_ticab },
     { .name = "BPIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 6,
       .type = ARM_CP_NOP, .access = PL1_W },
     { .name = "ICIALLU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
-      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tocu },
     { .name = "ICIMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tocu },
     { .name = "BPIALL", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 6,
       .type = ARM_CP_NOP, .access = PL1_W },
     { .name = "BPIMVA", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 7,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "DCCSW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
     { .name = "DCCMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 11, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tocu },
     { .name = "DCCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 1,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
     { .name = "DCCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
--
2.25.1

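The refactoring pattern in the helper.c change above -- several
near-identical access functions collapsed into one helper parameterised by
the trap bits to test -- can be sketched in isolation like this. This is a
toy model with stub types, not the real QEMU definitions, and the HCR bit
positions are assumptions to be checked against the ARM ARM.

    #include <stdint.h>

    /* Illustrative bit positions -- verify against the ARM ARM */
    #define HCR_TPU   (1ULL << 24)
    #define HCR_TICAB (1ULL << 50)
    #define HCR_TOCU  (1ULL << 54)

    typedef enum { CP_ACCESS_OK, CP_ACCESS_TRAP_EL2 } CPAccessResult;

    static uint64_t effective_hcr; /* stand-in for arm_hcr_el2_eff(env) */

    /* One helper, parameterised by the set of trap flags to test */
    static CPAccessResult do_cacheop_pou_access(uint64_t hcrflags)
    {
        return (effective_hcr & hcrflags) ? CP_ACCESS_TRAP_EL2 : CP_ACCESS_OK;
    }

    static CPAccessResult access_ticab(void)
    {
        return do_cacheop_pou_access(HCR_TICAB | HCR_TPU);
    }

    static CPAccessResult access_tocu(void)
    {
        return do_cacheop_pou_access(HCR_TOCU | HCR_TPU);
    }

    int main(void)
    {
        effective_hcr = HCR_TPU; /* legacy TPU still traps both groups */
        return (access_ticab() == CP_ACCESS_TRAP_EL2 &&
                access_tocu() == CP_ACCESS_TRAP_EL2) ? 0 : 1;
    }

The design point is that each register entry names a thin wrapper, so the
table stays declarative while the trap logic lives in exactly one place.
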
Factor out the VFP access checking code so that we can use it in the
leaf functions of the decodetree decoder.

We call the function full_vfp_access_check() so we can keep
the more natural vfp_access_check() for a version which doesn't
have the 'ignore_vfp_enabled' flag -- that way almost all VFP
insns will be able to use vfp_access_check(s) and only the
special-register access function will have to use
full_vfp_access_check(s, ignore_vfp_enabled).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 100 ++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 101 +++++----------------------------
 2 files changed, 113 insertions(+), 88 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@
 /* Include the generated VFP decoder */
 #include "decode-vfp.inc.c"
 #include "decode-vfp-uncond.inc.c"
+
+/*
+ * Check that VFP access is enabled. If it is, do the necessary
+ * M-profile lazy-FP handling and then return true.
+ * If not, emit code to generate an appropriate exception and
+ * return false.
+ * The ignore_vfp_enabled argument specifies that we should ignore
+ * whether VFP is enabled via FPEXC[EN]: this should be true for FMXR/FMRX
+ * accesses to FPSID, FPEXC, MVFR0, MVFR1, MVFR2, and false for all other insns.
+ */
+static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
+{
+    if (s->fp_excp_el) {
+        if (arm_dc_feature(s, ARM_FEATURE_M)) {
+            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+                               s->fp_excp_el);
+        } else {
+            gen_exception_insn(s, 4, EXCP_UDEF,
+                               syn_fp_access_trap(1, 0xe, false),
+                               s->fp_excp_el);
+        }
+        return false;
+    }
+
+    if (!s->vfp_enabled && !ignore_vfp_enabled) {
+        assert(!arm_dc_feature(s, ARM_FEATURE_M));
+        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+                           default_exception_el(s));
+        return false;
+    }
+
+    if (arm_dc_feature(s, ARM_FEATURE_M)) {
+        /* Handle M-profile lazy FP state mechanics */
+
+        /* Trigger lazy-state preservation if necessary */
+        if (s->v7m_lspact) {
+            /*
+             * Lazy state saving affects external memory and also the NVIC,
+             * so we must mark it as an IO operation for icount.
+             */
+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+                gen_io_start();
+            }
+            gen_helper_v7m_preserve_fp_state(cpu_env);
+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+                gen_io_end();
+            }
+            /*
+             * If the preserve_fp_state helper doesn't throw an exception
+             * then it will clear LSPACT; we don't need to repeat this for
+             * any further FP insns in this TB.
+             */
+            s->v7m_lspact = false;
+        }
+
+        /* Update ownership of FP context: set FPCCR.S to match current state */
+        if (s->v8m_fpccr_s_wrong) {
+            TCGv_i32 tmp;
+
+            tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
+            if (s->v8m_secure) {
+                tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
+            } else {
+                tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
+            }
+            store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
+            /* Don't need to do this for any further FP insns in this TB */
+            s->v8m_fpccr_s_wrong = false;
+        }
+
+        if (s->v7m_new_fp_ctxt_needed) {
+            /*
+             * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA
+             * and the FPSCR.
+             */
+            TCGv_i32 control, fpscr;
+            uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
+
+            fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
+            gen_helper_vfp_set_fpscr(cpu_env, fpscr);
+            tcg_temp_free_i32(fpscr);
+            /*
+             * We don't need to arrange to end the TB, because the only
+             * parts of FPSCR which we cache in the TB flags are the VECLEN
+             * and VECSTRIDE, and those don't exist for M-profile.
+             */
+
+            if (s->v8m_secure) {
+                bits |= R_V7M_CONTROL_SFPA_MASK;
+            }
+            control = load_cpu_field(v7m.control[M_REG_S]);
+            tcg_gen_ori_i32(control, control, bits);
+            store_cpu_field(control, v7m.control[M_REG_S]);
+            /* Don't need to do this for any further FP insns in this TB */
+            s->v7m_new_fp_ctxt_needed = false;
+        }
+    }
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
     return 1;
 }

-/* Disassemble a VFP instruction.  Returns nonzero if an error occurred
-   (ie. an undefined instruction).  */
+/*
+ * Disassemble a VFP instruction.  Returns nonzero if an error occurred
+ * (ie. an undefined instruction).
+ */
 static int disas_vfp_insn(DisasContext *s, uint32_t insn)
 {
     uint32_t rd, rn, rm, op, i, n, offset, delta_d, delta_m, bank_mask;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     TCGv_i32 addr;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
+    bool ignore_vfp_enabled = false;

     if (!arm_dc_feature(s, ARM_FEATURE_VFP)) {
         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         }
     }

-    /* FIXME: this access check should not take precedence over UNDEF
+    /*
+     * FIXME: this access check should not take precedence over UNDEF
      * for invalid encodings; we will generate incorrect syndrome information
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
      */
-    if (s->fp_excp_el) {
-        if (arm_dc_feature(s, ARM_FEATURE_M)) {
-            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
-                               s->fp_excp_el);
-        } else {
-            gen_exception_insn(s, 4, EXCP_UDEF,
-                               syn_fp_access_trap(1, 0xe, false),
-                               s->fp_excp_el);
-        }
-        return 0;
-    }
-
-    if (!s->vfp_enabled) {
-        /* VFP disabled.  Only allow fmxr/fmrx to/from some control regs.  */
-        if ((insn & 0x0fe00fff) != 0x0ee00a10)
-            return 1;
+    if ((insn & 0x0fe00fff) == 0x0ee00a10) {
         rn = (insn >> 16) & 0xf;
-        if (rn != ARM_VFP_FPSID && rn != ARM_VFP_FPEXC && rn != ARM_VFP_MVFR2
-            && rn != ARM_VFP_MVFR1 && rn != ARM_VFP_MVFR0) {
-            return 1;
+        if (rn == ARM_VFP_FPSID || rn == ARM_VFP_FPEXC || rn == ARM_VFP_MVFR2
+            || rn == ARM_VFP_MVFR1 || rn == ARM_VFP_MVFR0) {
+            ignore_vfp_enabled = true;
         }
     }
-
-    if (arm_dc_feature(s, ARM_FEATURE_M)) {
-        /* Handle M-profile lazy FP state mechanics */
-
-        /* Trigger lazy-state preservation if necessary */
-        if (s->v7m_lspact) {
-            /*
-             * Lazy state saving affects external memory and also the NVIC,
-             * so we must mark it as an IO operation for icount.
-             */
-            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-                gen_io_start();
-            }
-            gen_helper_v7m_preserve_fp_state(cpu_env);
-            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-                gen_io_end();
-            }
-            /*
-             * If the preserve_fp_state helper doesn't throw an exception
-             * then it will clear LSPACT; we don't need to repeat this for
-             * any further FP insns in this TB.
-             */
-            s->v7m_lspact = false;
-        }
-
-        /* Update ownership of FP context: set FPCCR.S to match current state */
-        if (s->v8m_fpccr_s_wrong) {
-            TCGv_i32 tmp;
-
-            tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
-            if (s->v8m_secure) {
-                tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
-            } else {
-                tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
-            }
-            store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
-            /* Don't need to do this for any further FP insns in this TB */
-            s->v8m_fpccr_s_wrong = false;
-        }
-
-        if (s->v7m_new_fp_ctxt_needed) {
-            /*
-             * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA
-             * and the FPSCR.
-             */
-            TCGv_i32 control, fpscr;
-            uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
-
-            fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
-            gen_helper_vfp_set_fpscr(cpu_env, fpscr);
-            tcg_temp_free_i32(fpscr);
-            /*
-             * We don't need to arrange to end the TB, because the only
-             * parts of FPSCR which we cache in the TB flags are the VECLEN
-             * and VECSTRIDE, and those don't exist for M-profile.
-             */
-
-            if (s->v8m_secure) {
-                bits |= R_V7M_CONTROL_SFPA_MASK;
-            }
-            control = load_cpu_field(v7m.control[M_REG_S]);
-            tcg_gen_ori_i32(control, control, bits);
-            store_cpu_field(control, v7m.control[M_REG_S]);
-            /* Don't need to do this for any further FP insns in this TB */
-            s->v7m_new_fp_ctxt_needed = false;
-        }
+    if (!full_vfp_access_check(s, ignore_vfp_enabled)) {
+        return 0;
     }

     if (extract32(insn, 28, 4) == 0xf) {
--
2.20.1


For FEAT_EVT, the HCR_EL2.TID4 trap allows trapping of the cache ID
registers CCSIDR_EL1, CCSIDR2_EL1, CLIDR_EL1 and CSSELR_EL1 (and
their AArch32 equivalents). This is a subset of the registers
trapped by HCR_EL2.TID2, which includes all of these and also the
CTR_EL0 register.

Our implementation already uses a separate access function for
CTR_EL0 (ctr_el0_access()), so all of the registers currently using
access_aa64_tid2() should also be checking TID4. Make that function
check both TID2 and TID4, and rename it appropriately.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void scr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
     scr_write(env, ri, 0);
 }

-static CPAccessResult access_aa64_tid2(CPUARMState *env,
-                                       const ARMCPRegInfo *ri,
-                                       bool isread)
+static CPAccessResult access_tid4(CPUARMState *env,
+                                  const ARMCPRegInfo *ri,
+                                  bool isread)
 {
-    if (arm_current_el(env) == 1 && (arm_hcr_el2_eff(env) & HCR_TID2)) {
+    if (arm_current_el(env) == 1 &&
+        (arm_hcr_el2_eff(env) & (HCR_TID2 | HCR_TID4))) {
         return CP_ACCESS_TRAP_EL2;
     }

@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
     { .name = "CCSIDR", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 0,
       .access = PL1_R,
-      .accessfn = access_aa64_tid2,
+      .accessfn = access_tid4,
       .readfn = ccsidr_read, .type = ARM_CP_NO_RAW },
     { .name = "CSSELR", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 2, .opc2 = 0,
       .access = PL1_RW,
-      .accessfn = access_aa64_tid2,
+      .accessfn = access_tid4,
       .writefn = csselr_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.csselr_s),
                              offsetof(CPUARMState, cp15.csselr_ns) } },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ccsidr2_reginfo[] = {
     { .name = "CCSIDR2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 1, .crn = 0, .crm = 0, .opc2 = 2,
       .access = PL1_R,
-      .accessfn = access_aa64_tid2,
+      .accessfn = access_tid4,
       .readfn = ccsidr2_read, .type = ARM_CP_NO_RAW },
 };

@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             .name = "CLIDR", .state = ARM_CP_STATE_BOTH,
             .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 1,
             .access = PL1_R, .type = ARM_CP_CONST,
-            .accessfn = access_aa64_tid2,
+            .accessfn = access_tid4,
             .resetvalue = cpu->clidr
         };
         define_one_arm_cp_reg(cpu, &clidr);
--
2.25.1

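For readers new to the decodetree conversion work above, the trans_*
functions follow a two-value convention: returning false means "this
pattern does not decode here, treat the insn as UNDEF", while returning
true means "handled", even when the only action taken was emitting the
FP-access exception. A minimal sketch with stub types (hypothetical names,
not the real QEMU structures):

    #include <stdbool.h>
    #include <stdio.h>

    /* Stub context -- the real DisasContext carries much more state */
    typedef struct {
        bool have_feature;   /* stands in for dc_isar_feature(...) */
        bool fp_trap_active; /* stands in for s->fp_excp_el != 0 */
    } DisasContext;

    static bool vfp_access_check(DisasContext *s)
    {
        /* the real version emits the exception-raising code here */
        return !s->fp_trap_active;
    }

    static bool trans_template(DisasContext *s)
    {
        if (!s->have_feature) {
            return false;  /* false: decode fails, insn UNDEFs */
        }
        if (!vfp_access_check(s)) {
            return true;   /* true: handled -- exception already emitted */
        }
        /* ... emit the TCG ops for the instruction ... */
        return true;
    }

    int main(void)
    {
        DisasContext s = { .have_feature = true, .fp_trap_active = false };
        printf("handled=%d\n", trans_template(&s));
        return 0;
    }

This ordering (feature check before access check) is what fixes the
v7A-vs-v8A bug called out in the VRINT commit message.
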
Convert the VCVT integer-to-float instructions to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 58 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 12 +------
 target/arm/vfp.decode          |  6 ++++
 3 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
     tcg_temp_free_i64(vm);
     return true;
 }
+
+static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a)
+{
+    TCGv_i32 vm;
+    TCGv_ptr fpst;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vm = tcg_temp_new_i32();
+    neon_load_reg32(vm, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    if (a->s) {
+        /* i32 -> f32 */
+        gen_helper_vfp_sitos(vm, vm, fpst);
+    } else {
+        /* u32 -> f32 */
+        gen_helper_vfp_uitos(vm, vm, fpst);
+    }
+    neon_store_reg32(vm, a->vd);
+    tcg_temp_free_i32(vm);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
+
+static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
+{
+    TCGv_i32 vm;
+    TCGv_i64 vd;
+    TCGv_ptr fpst;
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vm = tcg_temp_new_i32();
+    vd = tcg_temp_new_i64();
+    neon_load_reg32(vm, a->vm);
+    fpst = get_fpstatus_ptr(false);
+    if (a->s) {
+        /* i32 -> f64 */
+        gen_helper_vfp_sitod(vd, vm, fpst);
+    } else {
+        /* u32 -> f64 */
+        gen_helper_vfp_uitod(vd, vm, fpst);
+    }
+    neon_store_reg64(vd, a->vd);
+    tcg_temp_free_i32(vm);
+    tcg_temp_free_i64(vd);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             return 1;
         case 15:
             switch (rn) {
-            case 0 ... 15:
+            case 0 ... 17:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         if (op == 15) {
             /* rn is opcode, encoded as per VFP_SREG_N. */
             switch (rn) {
-            case 0x10: /* vcvt.fxx.u32 */
-            case 0x11: /* vcvt.fxx.s32 */
-                rm_is_dp = false;
-                break;
             case 0x18: /* vcvtr.u32.fxx */
             case 0x19: /* vcvtz.u32.fxx */
             case 0x1a: /* vcvtr.s32.fxx */
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 switch (op) {
                 case 15: /* extension space */
                     switch (rn) {
-                    case 16: /* fuito */
-                        gen_vfp_uito(dp, 0);
-                        break;
-                    case 17: /* fsito */
-                        gen_vfp_sito(dp, 0);
-                        break;
                     case 19: /* vjcvt */
                         gen_helper_vjcvt(cpu_F0s, cpu_F0d, cpu_env);
                         break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_sp ---- 1110 1.11 0111 .... 1010 11.0 .... \
              vd=%vd_dp vm=%vm_sp
 VCVT_dp      ---- 1110 1.11 0111 .... 1011 11.0 .... \
              vd=%vd_sp vm=%vm_dp
+
+# VCVT from integer to floating point: Vm always single; Vd depends on size
+VCVT_int_sp  ---- 1110 1.11 1000 .... 1010 s:1 1.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VCVT_int_dp  ---- 1110 1.11 1000 .... 1011 s:1 1.0 .... \
+             vd=%vd_dp vm=%vm_sp
--
2.20.1


Update the ID registers for TCG's '-cpu max' to report the
FEAT_EVT Enhanced Virtualization Traps support.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 1 +
 target/arm/cpu_tcg.c          | 1 +
 3 files changed, 3 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
 - FEAT_DoubleFault (Double Fault Extension)
 - FEAT_E0PD (Preventing EL0 access to halves of address maps)
 - FEAT_ETS (Enhanced Translation Synchronization)
+- FEAT_EVT (Enhanced Virtualization Traps)
 - FEAT_FCMA (Floating-point complex number instructions)
 - FEAT_FHM (Floating-point half-precision multiplication instructions)
 - FEAT_FP16 (Half-precision floating-point data processing)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64MMFR2, FWB, 1);      /* FEAT_S2FWB */
     t = FIELD_DP64(t, ID_AA64MMFR2, TTL, 1);      /* FEAT_TTL */
     t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
+    t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
     t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
     cpu->isar.id_aa64mmfr2 = t;

diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@ void aa32_max_features(ARMCPU *cpu)
     t = FIELD_DP32(t, ID_MMFR4, AC2, 1);          /* ACTLR2, HACTLR2 */
     t = FIELD_DP32(t, ID_MMFR4, CNP, 1);          /* FEAT_TTCNP */
     t = FIELD_DP32(t, ID_MMFR4, XNX, 1);          /* FEAT_XNX */
+    t = FIELD_DP32(t, ID_MMFR4, EVT, 2);          /* FEAT_EVT */
     cpu->isar.id_mmfr4 = t;

     t = cpu->isar.id_mmfr5;
--
2.25.1

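The FIELD_DP64/FIELD_DP32 macros used above are deposit operations: they
overwrite one named bitfield of an ID register while leaving the rest
intact. A minimal standalone stand-in (the EVT field position is an
assumption here -- check the ARM ARM before relying on it):

    #include <stdint.h>
    #include <stdio.h>

    /* Minimal stand-in for the deposit underlying QEMU's FIELD_DP64 */
    static uint64_t deposit64(uint64_t value, int start, int length,
                              uint64_t field)
    {
        uint64_t mask = (~0ULL >> (64 - length)) << start;
        return (value & ~mask) | ((field << start) & mask);
    }

    int main(void)
    {
        uint64_t id_aa64mmfr2 = 0;
        /* assumed field position: ID_AA64MMFR2.EVT at bits [59:56] */
        id_aa64mmfr2 = deposit64(id_aa64mmfr2, 56, 4, 2); /* FEAT_EVT lvl 2 */
        printf("ID_AA64MMFR2 = 0x%016llx\n",
               (unsigned long long)id_aa64mmfr2);
        return 0;
    }

Writing the value 2 rather than 1 is how the patch advertises the
"level 2" variant of the feature in the usual ID-register scheme.
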
In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
some qbus buses not being connected to qdev devices -- if the
bus has no parent object then we register a reset function which
resets the bus on system reset (and unregister it when the
bus is unparented).

Nearly a decade later, we now have no buses in the tree (other
than the main system bus) which are created with a NULL parent,
so we can remove the workaround and instead just assert that if
the bus has a NULL parent then it is the main system bus.

(The absence of other parentless buses was confirmed by
code inspection of all the callsites of qbus_create() and
qbus_create_inplace() and cross-checked by 'make check'.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190523150543.22676-1-peter.maydell@linaro.org
---
 hw/core/bus.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/hw/core/bus.c b/hw/core/bus.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/bus.c
+++ b/hw/core/bus.c
@@ -XXX,XX +XXX,XX @@ static void qbus_realize(BusState *bus, DeviceState *parent, const char *name)
         bus->parent->num_child_bus++;
         object_property_add_child(OBJECT(bus->parent), bus->name, OBJECT(bus), NULL);
         object_unref(OBJECT(bus));
-    } else if (bus != sysbus_get_default()) {
-        /* TODO: once all bus devices are qdevified,
-           only reset handler for main_system_bus should be registered here. */
-        qemu_register_reset(qbus_reset_all_fn, bus);
+    } else {
+        /* The only bus without a parent is the main system bus */
+        assert(bus == sysbus_get_default());
     }
 }

@@ -XXX,XX +XXX,XX @@ static void bus_unparent(Object *obj)
     BusState *bus = BUS(obj);
     BusChild *kid;

+    /* Only the main system bus has no parent, and that bus is never freed */
+    assert(bus->parent);
+
     while ((kid = QTAILQ_FIRST(&bus->children)) != NULL) {
         DeviceState *dev = kid->child;
         object_unparent(OBJECT(dev));
     }
-    if (bus->parent) {
-        QLIST_REMOVE(bus, sibling);
-        bus->parent->num_child_bus--;
-        bus->parent = NULL;
-    } else {
-        assert(bus != sysbus_get_default()); /* main_system_bus is never freed */
-        qemu_unregister_reset(qbus_reset_all_fn, bus);
-    }
+    QLIST_REMOVE(bus, sibling);
+    bus->parent->num_child_bus--;
+    bus->parent = NULL;
 }

 void qbus_create_inplace(void *bus, size_t size, const char *typename,
--
2.20.1


Convert the TYPE_ARM_SMMU device to 3-phase reset. The legacy method
doesn't do anything that's invalid in the hold phase, so the
conversion is simple and not a behaviour change.

Note that we must convert this base class before we can convert the
TYPE_ARM_SMMUV3 subclass -- transitional support in Resettable
handles "chain to parent class reset" when the base class is 3-phase
and the subclass is still using legacy reset, but not the other way
around.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20221109161444.3397405-2-peter.maydell@linaro.org
---
 hw/arm/smmu-common.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -XXX,XX +XXX,XX @@ static void smmu_base_realize(DeviceState *dev, Error **errp)
     }
 }

-static void smmu_base_reset(DeviceState *dev)
+static void smmu_base_reset_hold(Object *obj)
 {
-    SMMUState *s = ARM_SMMU(dev);
+    SMMUState *s = ARM_SMMU(obj);

     g_hash_table_remove_all(s->configs);
     g_hash_table_remove_all(s->iotlb);
@@ -XXX,XX +XXX,XX @@ static Property smmu_dev_properties[] = {
 static void smmu_base_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
     SMMUBaseClass *sbc = ARM_SMMU_CLASS(klass);

     device_class_set_props(dc, smmu_dev_properties);
     device_class_set_parent_realize(dc, smmu_base_realize,
                                     &sbc->parent_realize);
-    dc->reset = smmu_base_reset;
+    rc->phases.hold = smmu_base_reset_hold;
 }

 static const TypeInfo smmu_base_info = {
--
2.25.1

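The "save the parent's methods, override, then chain" shape used by these
3-phase reset conversions can be modelled in plain C, outside QOM. This is
a deliberately tiny toy, not the Resettable API:

    #include <stdio.h>

    typedef struct ResettablePhases {
        void (*enter)(void *obj);
        void (*hold)(void *obj);
        void (*exit)(void *obj);
    } ResettablePhases;

    typedef struct {
        ResettablePhases phases;        /* this class's methods */
        ResettablePhases parent_phases; /* saved copy of the parent's */
    } Class;

    static void parent_hold(void *obj) { puts("parent hold"); }

    static Class base = { .phases = { .hold = parent_hold } };
    static Class child;

    static void child_hold(void *obj)
    {
        if (child.parent_phases.hold) {
            child.parent_phases.hold(obj);  /* chain to parent first */
        }
        puts("child hold");
    }

    int main(void)
    {
        child = base;                       /* inherit */
        child.parent_phases = base.phases;  /* save parent's methods */
        child.phases.hold = child_hold;     /* override */
        child.phases.hold(NULL);
        return 0;
    }

The NULL-check before chaining is the same guard the converted
smmu_reset_hold() and kvm_arm_gic_reset_hold() use: a parent class that is
still on legacy reset has no hold method to call.
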
The SMMUv3 ID registers cover an area 0x30 bytes in size
(12 registers, 4 bytes each). We were incorrectly decoding
only the first 0x20 bytes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20190524124829.2589-1-peter.maydell@linaro.org
---
 hw/arm/smmuv3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -XXX,XX +XXX,XX @@ static MemTxResult smmu_readl(SMMUv3State *s, hwaddr offset,
                               uint64_t *data, MemTxAttrs attrs)
 {
     switch (offset) {
-    case A_IDREGS ... A_IDREGS + 0x1f:
+    case A_IDREGS ... A_IDREGS + 0x2f:
         *data = smmuv3_idreg(offset - A_IDREGS);
         return MEMTX_OK;
     case A_IDR0 ... A_IDR5:
--
2.20.1


Convert the TYPE_ARM_SMMUV3 device to 3-phase reset. The legacy
reset method doesn't do anything that's invalid in the hold phase, so
the conversion only requires changing it to a hold phase method, and
using the 3-phase versions of the "save the parent reset method and
chain to it" code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20221109161444.3397405-3-peter.maydell@linaro.org
---
 include/hw/arm/smmuv3.h |  2 +-
 hw/arm/smmuv3.c         | 12 ++++++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -XXX,XX +XXX,XX @@ struct SMMUv3Class {
     /*< public >*/

     DeviceRealize parent_realize;
-    DeviceReset   parent_reset;
+    ResettablePhases parent_phases;
 };

 #define TYPE_ARM_SMMUV3   "arm-smmuv3"
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -XXX,XX +XXX,XX @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev)
     }
 }

-static void smmu_reset(DeviceState *dev)
+static void smmu_reset_hold(Object *obj)
 {
-    SMMUv3State *s = ARM_SMMUV3(dev);
+    SMMUv3State *s = ARM_SMMUV3(obj);
     SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);

-    c->parent_reset(dev);
+    if (c->parent_phases.hold) {
+        c->parent_phases.hold(obj);
+    }

     smmuv3_init_regs(s);
 }
@@ -XXX,XX +XXX,XX @@ static void smmuv3_instance_init(Object *obj)
 static void smmuv3_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
     SMMUv3Class *c = ARM_SMMUV3_CLASS(klass);

     dc->vmsd = &vmstate_smmuv3;
-    device_class_set_parent_reset(dc, smmu_reset, &c->parent_reset);
+    resettable_class_set_parent_phases(rc, NULL, smmu_reset_hold, NULL,
+                                       &c->parent_phases);
     c->parent_realize = dc->realize;
     dc->realize = smmu_realize;
 }
--
2.25.1

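The off-by-0x10 fixed in the SMMUv3 patch above is easy to check with
arithmetic: 12 registers of 4 bytes each is 0x30 bytes, so the last valid
byte offset is base + 0x2f. A standalone sketch (the A_IDREGS value below
is an assumption for illustration, not copied from the QEMU headers):

    #include <assert.h>
    #include <stdbool.h>

    #define A_IDREGS 0xfd0  /* assumed offset of the first ID register */

    /* 12 ID registers x 4 bytes each = 0x30 bytes of address space */
    static bool offset_is_idreg(unsigned offset)
    {
        return offset >= A_IDREGS && offset <= A_IDREGS + 0x2f;
    }

    int main(void)
    {
        assert(offset_is_idreg(A_IDREGS));
        assert(offset_is_idreg(A_IDREGS + 0x2c)); /* start of last reg */
        assert(!offset_is_idreg(A_IDREGS + 0x30)); /* one past the end */
        return 0;
    }

The GCC case-range `case A_IDREGS ... A_IDREGS + 0x2f:` in the patch is
exactly this inclusive-bounds check written as a switch label.
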
Convert the VCVT double/single precision conversion insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 48 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 13 +--------
 target/arm/vfp.decode          |  6 +++++
 3 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
     tcg_temp_free_i64(tmp);
     return true;
 }
+
+static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
+{
+    TCGv_i64 vd;
+    TCGv_i32 vm;
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vm = tcg_temp_new_i32();
+    vd = tcg_temp_new_i64();
+    neon_load_reg32(vm, a->vm);
+    gen_helper_vfp_fcvtds(vd, vm, cpu_env);
+    neon_store_reg64(vd, a->vd);
+    tcg_temp_free_i32(vm);
+    tcg_temp_free_i64(vd);
+    return true;
+}
+
+static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
+{
+    TCGv_i64 vm;
+    TCGv_i32 vd;
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vd = tcg_temp_new_i32();
+    vm = tcg_temp_new_i64();
+    neon_load_reg64(vm, a->vm);
+    gen_helper_vfp_fcvtsd(vd, vm, cpu_env);
+    neon_store_reg32(vd, a->vd);
+    tcg_temp_free_i32(vd);
+    tcg_temp_free_i64(vm);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             return 1;
         case 15:
             switch (rn) {
-            case 0 ... 14:
+            case 0 ... 15:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         if (op == 15) {
             /* rn is opcode, encoded as per VFP_SREG_N. */
             switch (rn) {
-            case 0x0f: /* vcvt double<->single */
-                rd_is_dp = !dp;
-                break;
-
             case 0x10: /* vcvt.fxx.u32 */
             case 0x11: /* vcvt.fxx.s32 */
                 rm_is_dp = false;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 switch (op) {
                 case 15: /* extension space */
                     switch (rn) {
-                    case 15: /* single<->double conversion */
-                        if (dp) {
-                            gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
-                        } else {
-                            gen_helper_vfp_fcvtds(cpu_F0d, cpu_F0s, cpu_env);
-                        }
-                        break;
                     case 16: /* fuito */
                         gen_vfp_uito(dp, 0);
                         break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VRINTX_sp ---- 1110 1.11 0111 .... 1010 01.0 .... \
              vd=%vd_sp vm=%vm_sp
 VRINTX_dp    ---- 1110 1.11 0111 .... 1011 01.0 .... \
              vd=%vd_dp vm=%vm_dp
+
+# VCVT between single and double: Vm precision depends on size; Vd is its reverse
+VCVT_sp      ---- 1110 1.11 0111 .... 1010 11.0 .... \
+             vd=%vd_dp vm=%vm_sp
+VCVT_dp      ---- 1110 1.11 0111 .... 1011 11.0 .... \
+             vd=%vd_sp vm=%vm_dp
--
2.20.1


Convert the TYPE_ARM_GIC_COMMON device to 3-phase reset. This is a
simple no-behaviour-change conversion.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20221109161444.3397405-4-peter.maydell@linaro.org
---
 hw/intc/arm_gic_common.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gic_common.c
+++ b/hw/intc/arm_gic_common.c
@@ -XXX,XX +XXX,XX @@ static inline void arm_gic_common_reset_irq_state(GICState *s, int first_cpu,
     }
 }

-static void arm_gic_common_reset(DeviceState *dev)
+static void arm_gic_common_reset_hold(Object *obj)
 {
-    GICState *s = ARM_GIC_COMMON(dev);
+    GICState *s = ARM_GIC_COMMON(obj);
     int i, j;
     int resetprio;

@@ -XXX,XX +XXX,XX @@ static Property arm_gic_common_properties[] = {
 static void arm_gic_common_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
     ARMLinuxBootIfClass *albifc = ARM_LINUX_BOOT_IF_CLASS(klass);

-    dc->reset = arm_gic_common_reset;
+    rc->phases.hold = arm_gic_common_reset_hold;
     dc->realize = arm_gic_common_realize;
     device_class_set_props(dc, arm_gic_common_properties);
     dc->vmsd = &vmstate_gic;
--
2.25.1

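The "Vd is its reverse" comment in the decode change above reflects the
asymmetry of the conversion itself: narrowing f64 to f32 can round, while
widening f32 to f64 is always exact. A quick illustrative demo using host C
arithmetic (IEEE semantics, not QEMU's softfloat):

    #include <stdio.h>

    int main(void)
    {
        double d = 0.1;           /* not exactly representable in binary */
        float f = (float)d;       /* narrowing, as VCVT.F32.F64 (fcvtsd) */
        double back = (double)f;  /* widening, as VCVT.F64.F32 (fcvtds) */

        printf("f64:  %.17g\n", d);
        printf("f32:  %.9g\n", (double)f);
        printf("back: %.17g\n", back);
        return 0;
    }

Running this shows the narrowed value differs from the original in the low
bits, while the widened value reproduces the f32 exactly.
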
Convert the float-to-integer VCVT instructions to decodetree.
1
Now we have converted TYPE_ARM_GIC_COMMON, we can convert the
2
Since these are the last unconverted instructions, we can
2
TYPE_ARM_GIC_KVM subclass to 3-phase reset.
3
delete the old decoder structure entirely now.
4
3
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Message-id: 20221109161444.3397405-5-peter.maydell@linaro.org
7
---
8
---
8
target/arm/translate-vfp.inc.c | 72 ++++++++++
9
hw/intc/arm_gic_kvm.c | 14 +++++++++-----
9
target/arm/translate.c | 241 +--------------------------------
10
1 file changed, 9 insertions(+), 5 deletions(-)
10
target/arm/vfp.decode | 6 +
11
3 files changed, 80 insertions(+), 239 deletions(-)
12
11
13
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
12
diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate-vfp.inc.c
14
--- a/hw/intc/arm_gic_kvm.c
16
+++ b/target/arm/translate-vfp.inc.c
15
+++ b/hw/intc/arm_gic_kvm.c
17
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
16
@@ -XXX,XX +XXX,XX @@ DECLARE_OBJ_CHECKERS(GICState, KVMARMGICClass,
18
tcg_temp_free_ptr(fpst);
17
struct KVMARMGICClass {
19
return true;
18
ARMGICCommonClass parent_class;
19
DeviceRealize parent_realize;
20
- void (*parent_reset)(DeviceState *dev);
21
+ ResettablePhases parent_phases;
22
};
23
24
void kvm_arm_gic_set_irq(uint32_t num_irq, int irq, int level)
25
@@ -XXX,XX +XXX,XX @@ static void kvm_arm_gic_get(GICState *s)
26
}
20
}
27
}
21
+
28
22
+static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a)
29
-static void kvm_arm_gic_reset(DeviceState *dev)
23
+{
30
+static void kvm_arm_gic_reset_hold(Object *obj)
24
+ TCGv_i32 vm;
31
{
25
+ TCGv_ptr fpst;
32
- GICState *s = ARM_GIC_COMMON(dev);
26
+
33
+ GICState *s = ARM_GIC_COMMON(obj);
27
+ if (!vfp_access_check(s)) {
34
KVMARMGICClass *kgc = KVM_ARM_GIC_GET_CLASS(s);
28
+ return true;
35
36
- kgc->parent_reset(dev);
37
+ if (kgc->parent_phases.hold) {
38
+ kgc->parent_phases.hold(obj);
29
+ }
39
+ }
30
+
40
31
+ fpst = get_fpstatus_ptr(false);
41
if (kvm_arm_gic_can_save_restore(s)) {
32
+ vm = tcg_temp_new_i32();
42
kvm_arm_gic_put(s);
33
+ neon_load_reg32(vm, a->vm);
43
@@ -XXX,XX +XXX,XX @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
34
+
44
static void kvm_arm_gic_class_init(ObjectClass *klass, void *data)
35
+ if (a->s) {
45
{
36
+ if (a->rz) {
46
DeviceClass *dc = DEVICE_CLASS(klass);
37
+ gen_helper_vfp_tosizs(vm, vm, fpst);
47
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
38
+ } else {
48
ARMGICCommonClass *agcc = ARM_GIC_COMMON_CLASS(klass);
39
+ gen_helper_vfp_tosis(vm, vm, fpst);
49
KVMARMGICClass *kgc = KVM_ARM_GIC_CLASS(klass);
40
+ }
50
41
+ } else {
51
@@ -XXX,XX +XXX,XX @@ static void kvm_arm_gic_class_init(ObjectClass *klass, void *data)
42
+ if (a->rz) {
52
agcc->post_load = kvm_arm_gic_put;
43
+ gen_helper_vfp_touizs(vm, vm, fpst);
53
device_class_set_parent_realize(dc, kvm_arm_gic_realize,
44
+ } else {
54
&kgc->parent_realize);
45
+ gen_helper_vfp_touis(vm, vm, fpst);
55
- device_class_set_parent_reset(dc, kvm_arm_gic_reset, &kgc->parent_reset);
46
+ }
56
+ resettable_class_set_parent_phases(rc, NULL, kvm_arm_gic_reset_hold, NULL,
47
+ }
57
+ &kgc->parent_phases);
48
+ neon_store_reg32(vm, a->vd);
49
+ tcg_temp_free_i32(vm);
50
+ tcg_temp_free_ptr(fpst);
51
+ return true;
52
+}
53
+
54
+static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
55
+{
56
+ TCGv_i32 vd;
57
+ TCGv_i64 vm;
58
+ TCGv_ptr fpst;
59
+
60
+ /* UNDEF accesses to D16-D31 if they don't exist. */
61
+ if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
62
+ return false;
63
+ }
64
+
65
+ if (!vfp_access_check(s)) {
66
+ return true;
67
+ }
68
+
69
+ fpst = get_fpstatus_ptr(false);
70
+ vm = tcg_temp_new_i64();
71
+ vd = tcg_temp_new_i32();
72
+ neon_load_reg64(vm, a->vm);
73
+
74
+ if (a->s) {
75
+ if (a->rz) {
76
+ gen_helper_vfp_tosizd(vd, vm, fpst);
77
+ } else {
78
+ gen_helper_vfp_tosid(vd, vm, fpst);
79
+ }
80
+ } else {
81
+ if (a->rz) {
82
+ gen_helper_vfp_touizd(vd, vm, fpst);
83
+ } else {
84
+ gen_helper_vfp_touid(vd, vm, fpst);
85
+ }
86
+ }
87
+ neon_store_reg32(vd, a->vd);
88
+ tcg_temp_free_i32(vd);
89
+ tcg_temp_free_i64(vm);
90
+ tcg_temp_free_ptr(fpst);
91
+ return true;
92
+}
93
diff --git a/target/arm/translate.c b/target/arm/translate.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/translate.c
96
+++ b/target/arm/translate.c
97
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_##name(int dp, int neon) \
98
tcg_temp_free_ptr(statusptr); \
99
}
58
}
100
59
101
-VFP_GEN_FTOI(toui)
60
static const TypeInfo kvm_arm_gic_info = {
102
VFP_GEN_FTOI(touiz)
103
-VFP_GEN_FTOI(tosi)
104
VFP_GEN_FTOI(tosiz)
105
#undef VFP_GEN_FTOI
106
107
@@ -XXX,XX +XXX,XX @@ static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
108
}
109
110
#define tcg_gen_ld_f32 tcg_gen_ld_i32
111
-#define tcg_gen_ld_f64 tcg_gen_ld_i64
112
#define tcg_gen_st_f32 tcg_gen_st_i32
113
-#define tcg_gen_st_f64 tcg_gen_st_i64
114
-
115
-static inline void gen_mov_F0_vreg(int dp, int reg)
116
-{
117
- if (dp)
118
- tcg_gen_ld_f64(cpu_F0d, cpu_env, vfp_reg_offset(dp, reg));
119
- else
120
- tcg_gen_ld_f32(cpu_F0s, cpu_env, vfp_reg_offset(dp, reg));
121
-}
122
-
123
-static inline void gen_mov_F1_vreg(int dp, int reg)
124
-{
125
- if (dp)
126
- tcg_gen_ld_f64(cpu_F1d, cpu_env, vfp_reg_offset(dp, reg));
127
- else
128
- tcg_gen_ld_f32(cpu_F1s, cpu_env, vfp_reg_offset(dp, reg));
129
-}
130
-
131
-static inline void gen_mov_vreg_F0(int dp, int reg)
132
-{
133
- if (dp)
134
- tcg_gen_st_f64(cpu_F0d, cpu_env, vfp_reg_offset(dp, reg));
135
- else
136
- tcg_gen_st_f32(cpu_F0s, cpu_env, vfp_reg_offset(dp, reg));
137
-}
138
139
#define ARM_CP_RW_BIT (1 << 20)
140
141
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
142
*/
143
static int disas_vfp_insn(DisasContext *s, uint32_t insn)
144
{
145
- uint32_t rd, rn, rm, op, delta_d, delta_m, bank_mask;
146
- int dp, veclen;
147
-
148
if (!arm_dc_feature(s, ARM_FEATURE_VFP)) {
149
return 1;
150
}
151
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
152
return 0;
153
}
154
}
155
-
156
- if (extract32(insn, 28, 4) == 0xf) {
157
- /*
158
- * Encodings with T=1 (Thumb) or unconditional (ARM): these
159
- * were all handled by the decodetree decoder, so any insn
160
- * patterns which get here must be UNDEF.
161
- */
162
- return 1;
163
- }
164
-
165
- /*
166
- * FIXME: this access check should not take precedence over UNDEF
167
-     * for invalid encodings; we will generate incorrect syndrome information
-     * for attempts to execute invalid vfp/neon encodings with FP disabled.
-     */
-    if (!vfp_access_check(s)) {
-        return 0;
-    }
-
-    dp = ((insn & 0xf00) == 0xb00);
-    switch ((insn >> 24) & 0xf) {
-    case 0xe:
-        if (insn & (1 << 4)) {
-            /* already handled by decodetree */
-            return 1;
-        } else {
-            /* data processing */
-            bool rd_is_dp = dp;
-            bool rm_is_dp = dp;
-            bool no_output = false;
-
-            /* The opcode is in bits 23, 21, 20 and 6. */
-            op = ((insn >> 20) & 8) | ((insn >> 19) & 6) | ((insn >> 6) & 1);
-            rn = VFP_SREG_N(insn);
-
-            switch (op) {
-            case 0 ... 14:
-                /* Already handled by decodetree */
-                return 1;
-            case 15:
-                switch (rn) {
-                case 0 ... 23:
-                case 28 ... 31:
-                    /* Already handled by decodetree */
-                    return 1;
-                default:
-                    break;
-                }
-            default:
-                break;
-            }
-
-            if (op == 15) {
-                /* rn is opcode, encoded as per VFP_SREG_N. */
-                switch (rn) {
-                case 0x18: /* vcvtr.u32.fxx */
-                case 0x19: /* vcvtz.u32.fxx */
-                case 0x1a: /* vcvtr.s32.fxx */
-                case 0x1b: /* vcvtz.s32.fxx */
-                    rd_is_dp = false;
-                    break;
-
-                default:
-                    return 1;
-                }
-            } else if (dp) {
-                /* rn is register number */
-                VFP_DREG_N(rn, insn);
-            }
-
-            if (rd_is_dp) {
-                VFP_DREG_D(rd, insn);
-            } else {
-                rd = VFP_SREG_D(insn);
-            }
-            if (rm_is_dp) {
-                VFP_DREG_M(rm, insn);
-            } else {
-                rm = VFP_SREG_M(insn);
-            }
-
-            veclen = s->vec_len;
-            if (op == 15 && rn > 3) {
-                veclen = 0;
-            }
-
-            /* Shut up compiler warnings. */
-            delta_m = 0;
-            delta_d = 0;
-            bank_mask = 0;
-
-            if (veclen > 0) {
-                if (dp)
-                    bank_mask = 0xc;
-                else
-                    bank_mask = 0x18;
-
-                /* Figure out what type of vector operation this is. */
-                if ((rd & bank_mask) == 0) {
-                    /* scalar */
-                    veclen = 0;
-                } else {
-                    if (dp)
-                        delta_d = (s->vec_stride >> 1) + 1;
-                    else
-                        delta_d = s->vec_stride + 1;
-
-                    if ((rm & bank_mask) == 0) {
-                        /* mixed scalar/vector */
-                        delta_m = 0;
-                    } else {
-                        /* vector */
-                        delta_m = delta_d;
-                    }
-                }
-            }
-
-            /* Load the initial operands. */
-            if (op == 15) {
-                switch (rn) {
-                default:
-                    /* One source operand. */
-                    gen_mov_F0_vreg(rm_is_dp, rm);
-                    break;
-                }
-            } else {
-                /* Two source operands. */
-                gen_mov_F0_vreg(dp, rn);
-                gen_mov_F1_vreg(dp, rm);
-            }
-
-            for (;;) {
-                /* Perform the calculation. */
-                switch (op) {
-                case 15: /* extension space */
-                    switch (rn) {
-                    case 24: /* ftoui */
-                        gen_vfp_toui(dp, 0);
-                        break;
-                    case 25: /* ftouiz */
-                        gen_vfp_touiz(dp, 0);
-                        break;
-                    case 26: /* ftosi */
-                        gen_vfp_tosi(dp, 0);
-                        break;
-                    case 27: /* ftosiz */
-                        gen_vfp_tosiz(dp, 0);
-                        break;
-                    default: /* undefined */
-                        g_assert_not_reached();
-                    }
-                    break;
-                default: /* undefined */
-                    return 1;
-                }
-
-                /* Write back the result, if any. */
-                if (!no_output) {
-                    gen_mov_vreg_F0(rd_is_dp, rd);
-                }
-
-                /* break out of the loop if we have finished */
-                if (veclen == 0) {
-                    break;
-                }
-
-                if (op == 15 && delta_m == 0) {
-                    /* single source one-many */
-                    while (veclen--) {
-                        rd = ((rd + delta_d) & (bank_mask - 1))
-                             | (rd & bank_mask);
-                        gen_mov_vreg_F0(dp, rd);
-                    }
-                    break;
-                }
-                /* Setup the next operands. */
-                veclen--;
-                rd = ((rd + delta_d) & (bank_mask - 1))
-                     | (rd & bank_mask);
-
-                if (op == 15) {
-                    /* One source operand. */
-                    rm = ((rm + delta_m) & (bank_mask - 1))
-                         | (rm & bank_mask);
-                    gen_mov_F0_vreg(dp, rm);
-                } else {
-                    /* Two source operands. */
-                    rn = ((rn + delta_d) & (bank_mask - 1))
-                         | (rn & bank_mask);
-                    gen_mov_F0_vreg(dp, rn);
-                    if (delta_m) {
-                        rm = ((rm + delta_m) & (bank_mask - 1))
-                             | (rm & bank_mask);
-                        gen_mov_F1_vreg(dp, rm);
-                    }
-                }
-            }
-        }
-        break;
-    case 0xc:
-    case 0xd:
-        /* Already handled by decodetree */
-        return 1;
-    default:
-        /* Should never happen. */
-        return 1;
-    }
-    return 0;
+    /* If the decodetree decoder didn't handle this insn, it must be UNDEF */
+    return 1;
 }

 static inline bool use_goto_tb(DisasContext *s, target_ulong dest)
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_fix_sp ---- 1110 1.11 1.1. .... 1010 .1.0 .... \
     vd=%vd_sp imm=%vm_sp opc=%vcvt_fix_op
 VCVT_fix_dp ---- 1110 1.11 1.1. .... 1011 .1.0 .... \
     vd=%vd_dp imm=%vm_sp opc=%vcvt_fix_op
+
+# VCVT float to integer (VCVT and VCVTR): Vd always single; Vm depends on size
+VCVT_sp_int ---- 1110 1.11 110 s:1 .... 1010 rz:1 1.0 .... \
+    vd=%vd_sp vm=%vm_sp
+VCVT_dp_int ---- 1110 1.11 110 s:1 .... 1011 rz:1 1.0 .... \
+    vd=%vd_sp vm=%vm_dp
--
2.20.1

--
2.25.1
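A note on the conversion pattern in the preceding patch: each decodetree pattern now lands in a trans_<INSN> function, and the legacy decoder keeps only the final fall-through. A minimal sketch of the shape such a function takes; the guard helpers are the ones used throughout this series, while trans_VFOO/arg_VFOO are placeholder names, not a real insn:

/*
 * Sketch only. The return-value convention is the one this series
 * relies on: false means "pattern not valid here, raise UNDEF";
 * true means "handled", which includes the case where FP is
 * disabled and vfp_access_check() has already generated the
 * appropriate exception.
 */
static bool trans_VFOO(DisasContext *s, arg_VFOO *a)
{
    if (!dc_isar_feature(aa32_fpshvec, s)) {
        return false;   /* feature absent: decode falls through to UNDEF */
    }
    if (!vfp_access_check(s)) {
        return true;    /* FP disabled: exception already emitted */
    }
    /* ... emit TCG ops implementing the instruction ... */
    return true;
}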
Convert the VCVTT and VCVTB instructions which convert from
f32 and f64 to f16 to decodetree.

Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d we can perform a direct 16 bit
store of the right half of the input single-precision register
rather than doing a load/modify/store sequence on the full
32 bits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 62 ++++++++++++++++++++++++++
 target/arm/translate.c         | 79 +---------------------------------
 target/arm/vfp.decode          |  6 +++
 3 files changed, 69 insertions(+), 78 deletions(-)

Convert the TYPE_ARM_GICV3_COMMON parent class to 3-phase reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20221109161444.3397405-6-peter.maydell@linaro.org
---
 hw/intc/arm_gicv3_common.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
11
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
19
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate-vfp.inc.c
13
--- a/hw/intc/arm_gicv3_common.c
21
+++ b/target/arm/translate-vfp.inc.c
14
+++ b/hw/intc/arm_gicv3_common.c
22
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
15
@@ -XXX,XX +XXX,XX @@ static void arm_gicv3_finalize(Object *obj)
23
tcg_temp_free_i64(vd);
16
g_free(s->redist_region_count);
24
return true;
25
}
17
}
26
+
18
27
+static bool trans_VCVT_f16_f32(DisasContext *s, arg_VCVT_f16_f32 *a)
19
-static void arm_gicv3_common_reset(DeviceState *dev)
28
+{
20
+static void arm_gicv3_common_reset_hold(Object *obj)
29
+ TCGv_ptr fpst;
30
+ TCGv_i32 ahp_mode;
31
+ TCGv_i32 tmp;
32
+
33
+ if (!dc_isar_feature(aa32_fp16_spconv, s)) {
34
+ return false;
35
+ }
36
+
37
+ if (!vfp_access_check(s)) {
38
+ return true;
39
+ }
40
+
41
+ fpst = get_fpstatus_ptr(false);
42
+ ahp_mode = get_ahp_flag();
43
+ tmp = tcg_temp_new_i32();
44
+
45
+ neon_load_reg32(tmp, a->vm);
46
+ gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp_mode);
47
+ tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t));
48
+ tcg_temp_free_i32(ahp_mode);
49
+ tcg_temp_free_ptr(fpst);
50
+ tcg_temp_free_i32(tmp);
51
+ return true;
52
+}
53
+
54
+static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
55
+{
56
+ TCGv_ptr fpst;
57
+ TCGv_i32 ahp_mode;
58
+ TCGv_i32 tmp;
59
+ TCGv_i64 vm;
60
+
61
+ if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
62
+ return false;
63
+ }
64
+
65
+ /* UNDEF accesses to D16-D31 if they don't exist. */
66
+ if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
67
+ return false;
68
+ }
69
+
70
+ if (!vfp_access_check(s)) {
71
+ return true;
72
+ }
73
+
74
+ fpst = get_fpstatus_ptr(false);
75
+ ahp_mode = get_ahp_flag();
76
+ tmp = tcg_temp_new_i32();
77
+ vm = tcg_temp_new_i64();
78
+
79
+ neon_load_reg64(vm, a->vm);
80
+ gen_helper_vfp_fcvt_f64_to_f16(tmp, vm, fpst, ahp_mode);
81
+ tcg_temp_free_i64(vm);
82
+ tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t));
83
+ tcg_temp_free_i32(ahp_mode);
84
+ tcg_temp_free_ptr(fpst);
85
+ tcg_temp_free_i32(tmp);
86
+ return true;
87
+}
88
diff --git a/target/arm/translate.c b/target/arm/translate.c
89
index XXXXXXX..XXXXXXX 100644
90
--- a/target/arm/translate.c
91
+++ b/target/arm/translate.c
92
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
93
#define VFP_SREG_M(insn) VFP_SREG(insn, 0, 5)
94
#define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn, 0, 5)
95
96
-/* Move between integer and VFP cores. */
97
-static TCGv_i32 gen_vfp_mrs(void)
98
-{
99
- TCGv_i32 tmp = tcg_temp_new_i32();
100
- tcg_gen_mov_i32(tmp, cpu_F0s);
101
- return tmp;
102
-}
103
-
104
-static void gen_vfp_msr(TCGv_i32 tmp)
105
-{
106
- tcg_gen_mov_i32(cpu_F0s, tmp);
107
- tcg_temp_free_i32(tmp);
108
-}
109
-
110
static void gen_neon_dup_low16(TCGv_i32 var)
111
{
21
{
112
TCGv_i32 tmp = tcg_temp_new_i32();
22
- GICv3State *s = ARM_GICV3_COMMON(dev);
113
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
23
+ GICv3State *s = ARM_GICV3_COMMON(obj);
24
int i;
25
26
for (i = 0; i < s->num_cpu; i++) {
27
@@ -XXX,XX +XXX,XX @@ static Property arm_gicv3_common_properties[] = {
28
static void arm_gicv3_common_class_init(ObjectClass *klass, void *data)
114
{
29
{
115
uint32_t rd, rn, rm, op, delta_d, delta_m, bank_mask;
30
DeviceClass *dc = DEVICE_CLASS(klass);
116
int dp, veclen;
31
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
117
- TCGv_i32 tmp;
32
ARMLinuxBootIfClass *albifc = ARM_LINUX_BOOT_IF_CLASS(klass);
118
- TCGv_i32 tmp2;
33
119
34
- dc->reset = arm_gicv3_common_reset;
120
if (!arm_dc_feature(s, ARM_FEATURE_VFP)) {
35
+ rc->phases.hold = arm_gicv3_common_reset_hold;
121
return 1;
36
dc->realize = arm_gicv3_common_realize;
122
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
37
device_class_set_props(dc, arm_gicv3_common_properties);
123
return 1;
38
dc->vmsd = &vmstate_gicv3;
124
case 15:
125
switch (rn) {
126
- case 0 ... 5:
127
- case 8 ... 11:
128
+ case 0 ... 11:
129
/* Already handled by decodetree */
130
return 1;
131
default:
132
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
133
if (op == 15) {
134
/* rn is opcode, encoded as per VFP_SREG_N. */
135
switch (rn) {
136
- case 0x06: /* vcvtb.f16.f32, vcvtb.f16.f64 */
137
- case 0x07: /* vcvtt.f16.f32, vcvtt.f16.f64 */
138
- if (dp) {
139
- if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
140
- return 1;
141
- }
142
- } else {
143
- if (!dc_isar_feature(aa32_fp16_spconv, s)) {
144
- return 1;
145
- }
146
- }
147
- rd_is_dp = false;
148
- break;
149
-
150
case 0x0c: /* vrintr */
151
case 0x0d: /* vrintz */
152
case 0x0e: /* vrintx */
153
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
154
switch (op) {
155
case 15: /* extension space */
156
switch (rn) {
157
- case 6: /* vcvtb.f16.f32, vcvtb.f16.f64 */
158
- {
159
- TCGv_ptr fpst = get_fpstatus_ptr(false);
160
- TCGv_i32 ahp = get_ahp_flag();
161
- tmp = tcg_temp_new_i32();
162
-
163
- if (dp) {
164
- gen_helper_vfp_fcvt_f64_to_f16(tmp, cpu_F0d,
165
- fpst, ahp);
166
- } else {
167
- gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s,
168
- fpst, ahp);
169
- }
170
- tcg_temp_free_i32(ahp);
171
- tcg_temp_free_ptr(fpst);
172
- gen_mov_F0_vreg(0, rd);
173
- tmp2 = gen_vfp_mrs();
174
- tcg_gen_andi_i32(tmp2, tmp2, 0xffff0000);
175
- tcg_gen_or_i32(tmp, tmp, tmp2);
176
- tcg_temp_free_i32(tmp2);
177
- gen_vfp_msr(tmp);
178
- break;
179
- }
180
- case 7: /* vcvtt.f16.f32, vcvtt.f16.f64 */
181
- {
182
- TCGv_ptr fpst = get_fpstatus_ptr(false);
183
- TCGv_i32 ahp = get_ahp_flag();
184
- tmp = tcg_temp_new_i32();
185
- if (dp) {
186
- gen_helper_vfp_fcvt_f64_to_f16(tmp, cpu_F0d,
187
- fpst, ahp);
188
- } else {
189
- gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s,
190
- fpst, ahp);
191
- }
192
- tcg_temp_free_i32(ahp);
193
- tcg_temp_free_ptr(fpst);
194
- tcg_gen_shli_i32(tmp, tmp, 16);
195
- gen_mov_F0_vreg(0, rd);
196
- tmp2 = gen_vfp_mrs();
197
- tcg_gen_ext16u_i32(tmp2, tmp2);
198
- tcg_gen_or_i32(tmp, tmp, tmp2);
199
- tcg_temp_free_i32(tmp2);
200
- gen_vfp_msr(tmp);
201
- break;
202
- }
203
case 12: /* vrintr */
204
{
205
TCGv_ptr fpst = get_fpstatus_ptr(0);
206
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
207
index XXXXXXX..XXXXXXX 100644
208
--- a/target/arm/vfp.decode
209
+++ b/target/arm/vfp.decode
210
@@ -XXX,XX +XXX,XX @@ VCVT_f32_f16 ---- 1110 1.11 0010 .... 1010 t:1 1.0 .... \
211
vd=%vd_sp vm=%vm_sp
212
VCVT_f64_f16 ---- 1110 1.11 0010 .... 1011 t:1 1.0 .... \
213
vd=%vd_dp vm=%vm_sp
214
+
215
+# VCVTB and VCVTT to f16: Vd format is always vd_sp; Vm format depends on size bit
216
+VCVT_f16_f32 ---- 1110 1.11 0011 .... 1010 t:1 1.0 .... \
217
+ vd=%vd_sp vm=%vm_sp
218
+VCVT_f16_f64 ---- 1110 1.11 0011 .... 1011 t:1 1.0 .... \
219
+ vd=%vd_sp vm=%vm_dp
220
--
2.20.1

--
2.25.1
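For the 3-phase reset conversions threaded through this queue (as in the GICv3 patch above), the mechanical recipe is small enough to show once. A sketch with placeholder names — MyDevState/MYDEV are illustrative, only ResettableClass and rc->phases.hold are the real hooks used above:

/* Before: a DeviceClass::reset handler.  After: the Resettable
 * "hold" phase.  This is the leaf-class case, with no parent
 * reset behaviour to preserve.
 */
static void mydev_reset_hold(Object *obj)
{
    MyDevState *s = MYDEV(obj);

    s->ctlr = 0;   /* ... return registers to their reset values ... */
}

static void mydev_class_init(ObjectClass *klass, void *data)
{
    ResettableClass *rc = RESETTABLE_CLASS(klass);

    rc->phases.hold = mydev_reset_hold;   /* was: dc->reset = ... */
}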
Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree.
trans_VCVT() is temporarily left in translate.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c       | 72 +++++++++++++++++-------------
 target/arm/vfp-uncond.decode |  6 +++
 2 files changed, 39 insertions(+), 39 deletions(-)

Convert the TYPE_KVM_ARM_GICV3 device to 3-phase reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20221109161444.3397405-7-peter.maydell@linaro.org
---
 hw/intc/arm_gicv3_kvm.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/target/arm/translate.c b/target/arm/translate.c
11
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
12
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate.c
13
--- a/hw/intc/arm_gicv3_kvm.c
14
+++ b/target/arm/translate.c
14
+++ b/hw/intc/arm_gicv3_kvm.c
15
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
15
@@ -XXX,XX +XXX,XX @@ DECLARE_OBJ_CHECKERS(GICv3State, KVMARMGICv3Class,
16
return true;
16
struct KVMARMGICv3Class {
17
ARMGICv3CommonClass parent_class;
18
DeviceRealize parent_realize;
19
- void (*parent_reset)(DeviceState *dev);
20
+ ResettablePhases parent_phases;
21
};
22
23
static void kvm_arm_gicv3_set_irq(void *opaque, int irq, int level)
24
@@ -XXX,XX +XXX,XX @@ static void arm_gicv3_icc_reset(CPUARMState *env, const ARMCPRegInfo *ri)
25
c->icc_ctlr_el1[GICV3_S] = c->icc_ctlr_el1[GICV3_NS];
17
}
26
}
18
27
19
-static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
28
-static void kvm_arm_gicv3_reset(DeviceState *dev)
20
- int rounding)
29
+static void kvm_arm_gicv3_reset_hold(Object *obj)
21
+static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
22
{
30
{
23
- bool is_signed = extract32(insn, 7, 1);
31
- GICv3State *s = ARM_GICV3_COMMON(dev);
24
- TCGv_ptr fpst = get_fpstatus_ptr(0);
32
+ GICv3State *s = ARM_GICV3_COMMON(obj);
25
+ uint32_t rd, rm;
33
KVMARMGICv3Class *kgc = KVM_ARM_GICV3_GET_CLASS(s);
26
+ bool dp = a->dp;
34
27
+ TCGv_ptr fpst;
35
DPRINTF("Reset\n");
28
TCGv_i32 tcg_rmode, tcg_shift;
36
29
+ int rounding = fp_decode_rm[a->rm];
37
- kgc->parent_reset(dev);
30
+ bool is_signed = a->op;
38
+ if (kgc->parent_phases.hold) {
31
+
39
+ kgc->parent_phases.hold(obj);
32
+ if (!dc_isar_feature(aa32_vcvt_dr, s)) {
33
+ return false;
34
+ }
40
+ }
35
+
41
36
+ /* UNDEF accesses to D16-D31 if they don't exist */
42
if (s->migration_blocker) {
37
+ if (dp && !dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
43
DPRINTF("Cannot put kernel gic state, no kernel interface\n");
38
+ return false;
44
@@ -XXX,XX +XXX,XX @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
39
+ }
45
static void kvm_arm_gicv3_class_init(ObjectClass *klass, void *data)
40
+ rd = a->vd;
46
{
41
+ rm = a->vm;
47
DeviceClass *dc = DEVICE_CLASS(klass);
42
+
48
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
43
+ if (!vfp_access_check(s)) {
49
ARMGICv3CommonClass *agcc = ARM_GICV3_COMMON_CLASS(klass);
44
+ return true;
50
KVMARMGICv3Class *kgc = KVM_ARM_GICV3_CLASS(klass);
45
+ }
51
46
+
52
@@ -XXX,XX +XXX,XX @@ static void kvm_arm_gicv3_class_init(ObjectClass *klass, void *data)
47
+ fpst = get_fpstatus_ptr(0);
53
agcc->post_load = kvm_arm_gicv3_put;
48
54
device_class_set_parent_realize(dc, kvm_arm_gicv3_realize,
49
tcg_shift = tcg_const_i32(0);
55
&kgc->parent_realize);
50
56
- device_class_set_parent_reset(dc, kvm_arm_gicv3_reset, &kgc->parent_reset);
51
@@ -XXX,XX +XXX,XX @@ static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
57
+ resettable_class_set_parent_phases(rc, NULL, kvm_arm_gicv3_reset_hold, NULL,
52
if (dp) {
58
+ &kgc->parent_phases);
53
TCGv_i64 tcg_double, tcg_res;
54
TCGv_i32 tcg_tmp;
55
- /* Rd is encoded as a single precision register even when the source
56
- * is double precision.
57
- */
58
- rd = ((rd << 1) & 0x1e) | ((rd >> 4) & 0x1);
59
tcg_double = tcg_temp_new_i64();
60
tcg_res = tcg_temp_new_i64();
61
tcg_tmp = tcg_temp_new_i32();
62
@@ -XXX,XX +XXX,XX @@ static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
63
64
tcg_temp_free_ptr(fpst);
65
66
- return 0;
67
-}
68
-
69
-static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
70
-{
71
- uint32_t rd, rm, dp = extract32(insn, 8, 1);
72
-
73
- if (dp) {
74
- VFP_DREG_D(rd, insn);
75
- VFP_DREG_M(rm, insn);
76
- } else {
77
- rd = VFP_SREG_D(insn);
78
- rm = VFP_SREG_M(insn);
79
- }
80
-
81
- if ((insn & 0x0fbc0e50) == 0x0ebc0a40 &&
82
- dc_isar_feature(aa32_vcvt_dr, s)) {
83
- /* VCVTA, VCVTN, VCVTP, VCVTM */
84
- int rounding = fp_decode_rm[extract32(insn, 16, 2)];
85
- return handle_vcvt(insn, rd, rm, dp, rounding);
86
- }
87
- return 1;
88
+ return true;
89
}
59
}
90
60
91
/*
61
static const TypeInfo kvm_arm_gicv3_info = {
92
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
93
}
94
}
95
96
+ if (extract32(insn, 28, 4) == 0xf) {
97
+ /*
98
+ * Encodings with T=1 (Thumb) or unconditional (ARM): these
99
+ * were all handled by the decodetree decoder, so any insn
100
+ * patterns which get here must be UNDEF.
101
+ */
102
+ return 1;
103
+ }
104
+
105
/*
106
* FIXME: this access check should not take precedence over UNDEF
107
* for invalid encodings; we will generate incorrect syndrome information
108
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
109
return 0;
110
}
111
112
- if (extract32(insn, 28, 4) == 0xf) {
113
- /*
114
- * Encodings with T=1 (Thumb) or unconditional (ARM):
115
- * only used for the "miscellaneous VFP features" added in v8A
116
- * and v7M (and gated on the MVFR2.FPMisc field).
117
- */
118
- return disas_vfp_misc_insn(s, insn);
119
- }
120
-
121
dp = ((insn & 0xf00) == 0xb00);
122
switch ((insn >> 24) & 0xf) {
123
case 0xe:
124
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
125
index XXXXXXX..XXXXXXX 100644
126
--- a/target/arm/vfp-uncond.decode
127
+++ b/target/arm/vfp-uncond.decode
128
@@ -XXX,XX +XXX,XX @@ VRINT 1111 1110 1.11 10 rm:2 .... 1010 01.0 .... \
129
vm=%vm_sp vd=%vd_sp dp=0
130
VRINT 1111 1110 1.11 10 rm:2 .... 1011 01.0 .... \
131
vm=%vm_dp vd=%vd_dp dp=1
132
+
133
+# VCVT float to int with specified rounding mode; Vd is always single-precision
134
+VCVT 1111 1110 1.11 11 rm:2 .... 1010 op:1 1.0 .... \
135
+ vm=%vm_sp vd=%vd_sp dp=0
136
+VCVT 1111 1110 1.11 11 rm:2 .... 1011 op:1 1.0 .... \
137
+ vm=%vm_dp vd=%vd_sp dp=1
138
--
2.20.1

--
2.25.1
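The KVM GICv3 patch above also shows the subclass side of the conversion: where device_class_set_parent_reset() used to save the parent's reset handler, the 3-phase equivalent saves the parent's whole phase struct and chains to it. The generic shape, with MyDev names as placeholders:

struct MyDevClass {
    SomeParentClass parent_class;
    ResettablePhases parent_phases;  /* was: void (*parent_reset)(DeviceState *) */
};

static void mydev_reset_hold(Object *obj)
{
    MyDevClass *c = MYDEV_GET_CLASS(obj);

    if (c->parent_phases.hold) {
        c->parent_phases.hold(obj);   /* run the parent's hold phase first */
    }
    /* ... subclass-specific reset work ... */
}

static void mydev_class_init(ObjectClass *klass, void *data)
{
    ResettableClass *rc = RESETTABLE_CLASS(klass);
    MyDevClass *c = MYDEV_CLASS(klass);

    resettable_class_set_parent_phases(rc, NULL, mydev_reset_hold, NULL,
                                       &c->parent_phases);
}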
Convert the VRINTA/VRINTN/VRINTP/VRINTM instructions to decodetree.
Again, trans_VRINT() is temporarily left in translate.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c       | 60 +++++++++++++++++++++++-------------
 target/arm/vfp-uncond.decode |  5 +++
 2 files changed, 43 insertions(+), 22 deletions(-)

Convert the TYPE_ARM_GICV3_ITS_COMMON parent class to 3-phase reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20221109161444.3397405-8-peter.maydell@linaro.org
---
 hw/intc/arm_gicv3_its_common.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/target/arm/translate.c b/target/arm/translate.c
11
diff --git a/hw/intc/arm_gicv3_its_common.c b/hw/intc/arm_gicv3_its_common.c
11
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
12
--- a/target/arm/translate.c
13
--- a/hw/intc/arm_gicv3_its_common.c
13
+++ b/target/arm/translate.c
14
+++ b/hw/intc/arm_gicv3_its_common.c
14
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
15
@@ -XXX,XX +XXX,XX @@ void gicv3_its_init_mmio(GICv3ITSState *s, const MemoryRegionOps *ops,
15
return true;
16
msi_nonbroken = true;
16
}
17
}
17
18
18
-static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
19
-static void gicv3_its_common_reset(DeviceState *dev)
19
- int rounding)
20
+static void gicv3_its_common_reset_hold(Object *obj)
20
+/*
21
+ * Table for converting the most common AArch32 encoding of
22
+ * rounding mode to arm_fprounding order (which matches the
23
+ * common AArch64 order); see ARM ARM pseudocode FPDecodeRM().
24
+ */
25
+static const uint8_t fp_decode_rm[] = {
26
+ FPROUNDING_TIEAWAY,
27
+ FPROUNDING_TIEEVEN,
28
+ FPROUNDING_POSINF,
29
+ FPROUNDING_NEGINF,
30
+};
31
+
32
+static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
33
{
21
{
34
- TCGv_ptr fpst = get_fpstatus_ptr(0);
22
- GICv3ITSState *s = ARM_GICV3_ITS_COMMON(dev);
35
+ uint32_t rd, rm;
23
+ GICv3ITSState *s = ARM_GICV3_ITS_COMMON(obj);
36
+ bool dp = a->dp;
24
37
+ TCGv_ptr fpst;
25
s->ctlr = 0;
38
TCGv_i32 tcg_rmode;
26
s->cbaser = 0;
39
+ int rounding = fp_decode_rm[a->rm];
27
@@ -XXX,XX +XXX,XX @@ static void gicv3_its_common_reset(DeviceState *dev)
40
+
28
static void gicv3_its_common_class_init(ObjectClass *klass, void *data)
41
+ if (!dc_isar_feature(aa32_vrint, s)) {
29
{
42
+ return false;
30
DeviceClass *dc = DEVICE_CLASS(klass);
43
+ }
31
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
44
+
32
45
+ /* UNDEF accesses to D16-D31 if they don't exist */
33
- dc->reset = gicv3_its_common_reset;
46
+ if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
34
+ rc->phases.hold = gicv3_its_common_reset_hold;
47
+ ((a->vm | a->vd) & 0x10)) {
35
dc->vmsd = &vmstate_its;
48
+ return false;
49
+ }
50
+ rd = a->vd;
51
+ rm = a->vm;
52
+
53
+ if (!vfp_access_check(s)) {
54
+ return true;
55
+ }
56
+
57
+ fpst = get_fpstatus_ptr(0);
58
59
tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
60
gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
61
@@ -XXX,XX +XXX,XX @@ static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
62
tcg_temp_free_i32(tcg_rmode);
63
64
tcg_temp_free_ptr(fpst);
65
- return 0;
66
+ return true;
67
}
36
}
68
37
69
static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
70
@@ -XXX,XX +XXX,XX @@ static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
71
return 0;
72
}
73
74
-/* Table for converting the most common AArch32 encoding of
75
- * rounding mode to arm_fprounding order (which matches the
76
- * common AArch64 order); see ARM ARM pseudocode FPDecodeRM().
77
- */
78
-static const uint8_t fp_decode_rm[] = {
79
- FPROUNDING_TIEAWAY,
80
- FPROUNDING_TIEEVEN,
81
- FPROUNDING_POSINF,
82
- FPROUNDING_NEGINF,
83
-};
84
-
85
static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
86
{
87
uint32_t rd, rm, dp = extract32(insn, 8, 1);
88
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
89
rm = VFP_SREG_M(insn);
90
}
91
92
- if ((insn & 0x0fbc0ed0) == 0x0eb80a40 &&
93
- dc_isar_feature(aa32_vrint, s)) {
94
- /* VRINTA, VRINTN, VRINTP, VRINTM */
95
- int rounding = fp_decode_rm[extract32(insn, 16, 2)];
96
- return handle_vrint(insn, rd, rm, dp, rounding);
97
- } else if ((insn & 0x0fbc0e50) == 0x0ebc0a40 &&
98
- dc_isar_feature(aa32_vcvt_dr, s)) {
99
+ if ((insn & 0x0fbc0e50) == 0x0ebc0a40 &&
100
+ dc_isar_feature(aa32_vcvt_dr, s)) {
101
/* VCVTA, VCVTN, VCVTP, VCVTM */
102
int rounding = fp_decode_rm[extract32(insn, 16, 2)];
103
return handle_vcvt(insn, rd, rm, dp, rounding);
104
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
105
index XXXXXXX..XXXXXXX 100644
106
--- a/target/arm/vfp-uncond.decode
107
+++ b/target/arm/vfp-uncond.decode
108
@@ -XXX,XX +XXX,XX @@ VMINMAXNM 1111 1110 1.00 .... .... 1010 . op:1 .0 .... \
109
vm=%vm_sp vn=%vn_sp vd=%vd_sp dp=0
110
VMINMAXNM 1111 1110 1.00 .... .... 1011 . op:1 .0 .... \
111
vm=%vm_dp vn=%vn_dp vd=%vd_dp dp=1
112
+
113
+VRINT 1111 1110 1.11 10 rm:2 .... 1010 01.0 .... \
114
+ vm=%vm_sp vd=%vd_sp dp=0
115
+VRINT 1111 1110 1.11 10 rm:2 .... 1011 01.0 .... \
116
+ vm=%vm_dp vd=%vd_dp dp=1
117
--
2.20.1

--
2.25.1
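One detail worth pulling out of the VRINT patch above: the two-bit rm field captured by the decode pattern indexes the FPDecodeRM() mapping table, and the resulting arm_fprounding value is swapped into the fpstatus around the operation. Condensed from the hunks above — the surrounding trans_VRINT() body is elided, and the comments on the restore idiom are my gloss:

static const uint8_t fp_decode_rm[] = {
    FPROUNDING_TIEAWAY,  /* rm == 0b00: VRINTA / VCVTA */
    FPROUNDING_TIEEVEN,  /* rm == 0b01: VRINTN / VCVTN */
    FPROUNDING_POSINF,   /* rm == 0b10: VRINTP / VCVTP */
    FPROUNDING_NEGINF,   /* rm == 0b11: VRINTM / VCVTM */
};

int rounding = fp_decode_rm[a->rm];
TCGv_i32 tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);  /* install new mode */
/* ... emit the rounding/conversion operation ... */
gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);  /* helper returned the
                                                    * old mode, so the same
                                                    * call restores it */
tcg_temp_free_i32(tcg_rmode);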
Convert the VMINNM and VMAXNM instructions to decodetree.
As with VSEL, we leave the trans_VMINMAXNM() function
in translate.c for the moment.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c       | 41 ++++++++++++++++++++++++------------
 target/arm/vfp-uncond.decode |  5 +++++
 2 files changed, 33 insertions(+), 13 deletions(-)

Convert the TYPE_ARM_GICV3_ITS device to 3-phase reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20221109161444.3397405-9-peter.maydell@linaro.org
---
 hw/intc/arm_gicv3_its.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/target/arm/translate.c b/target/arm/translate.c
11
diff --git a/hw/intc/arm_gicv3_its.c b/hw/intc/arm_gicv3_its.c
13
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate.c
13
--- a/hw/intc/arm_gicv3_its.c
15
+++ b/target/arm/translate.c
14
+++ b/hw/intc/arm_gicv3_its.c
16
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
15
@@ -XXX,XX +XXX,XX @@ DECLARE_OBJ_CHECKERS(GICv3ITSState, GICv3ITSClass,
17
return true;
16
17
struct GICv3ITSClass {
18
GICv3ITSCommonClass parent_class;
19
- void (*parent_reset)(DeviceState *dev);
20
+ ResettablePhases parent_phases;
21
};
22
23
/*
24
@@ -XXX,XX +XXX,XX @@ static void gicv3_arm_its_realize(DeviceState *dev, Error **errp)
25
}
18
}
26
}
19
27
20
-static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn,
28
-static void gicv3_its_reset(DeviceState *dev)
21
- uint32_t rm, uint32_t dp)
29
+static void gicv3_its_reset_hold(Object *obj)
22
+static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
23
{
30
{
24
- uint32_t vmin = extract32(insn, 6, 1);
31
- GICv3ITSState *s = ARM_GICV3_ITS_COMMON(dev);
25
- TCGv_ptr fpst = get_fpstatus_ptr(0);
32
+ GICv3ITSState *s = ARM_GICV3_ITS_COMMON(obj);
26
+ uint32_t rd, rn, rm;
33
GICv3ITSClass *c = ARM_GICV3_ITS_GET_CLASS(s);
27
+ bool dp = a->dp;
34
28
+ bool vmin = a->op;
35
- c->parent_reset(dev);
29
+ TCGv_ptr fpst;
36
+ if (c->parent_phases.hold) {
30
+
37
+ c->parent_phases.hold(obj);
31
+ if (!dc_isar_feature(aa32_vminmaxnm, s)) {
32
+ return false;
33
+ }
38
+ }
34
+
39
35
+ /* UNDEF accesses to D16-D31 if they don't exist */
40
/* Quiescent bit reset to 1 */
36
+ if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
41
s->ctlr = FIELD_DP32(s->ctlr, GITS_CTLR, QUIESCENT, 1);
37
+ ((a->vm | a->vn | a->vd) & 0x10)) {
42
@@ -XXX,XX +XXX,XX @@ static Property gicv3_its_props[] = {
38
+ return false;
43
static void gicv3_its_class_init(ObjectClass *klass, void *data)
39
+ }
44
{
40
+ rd = a->vd;
45
DeviceClass *dc = DEVICE_CLASS(klass);
41
+ rn = a->vn;
46
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
42
+ rm = a->vm;
47
GICv3ITSClass *ic = ARM_GICV3_ITS_CLASS(klass);
43
+
48
GICv3ITSCommonClass *icc = ARM_GICV3_ITS_COMMON_CLASS(klass);
44
+ if (!vfp_access_check(s)) {
49
45
+ return true;
50
dc->realize = gicv3_arm_its_realize;
46
+ }
51
device_class_set_props(dc, gicv3_its_props);
47
+
52
- device_class_set_parent_reset(dc, gicv3_its_reset, &ic->parent_reset);
48
+ fpst = get_fpstatus_ptr(0);
53
+ resettable_class_set_parent_phases(rc, NULL, gicv3_its_reset_hold, NULL,
49
54
+ &ic->parent_phases);
50
if (dp) {
55
icc->post_load = gicv3_its_post_load;
51
TCGv_i64 frn, frm, dest;
52
@@ -XXX,XX +XXX,XX @@ static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn,
53
}
54
55
tcg_temp_free_ptr(fpst);
56
- return 0;
57
+ return true;
58
}
56
}
59
57
60
static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
61
@@ -XXX,XX +XXX,XX @@ static const uint8_t fp_decode_rm[] = {
62
63
static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
64
{
65
- uint32_t rd, rn, rm, dp = extract32(insn, 8, 1);
66
+ uint32_t rd, rm, dp = extract32(insn, 8, 1);
67
68
if (dp) {
69
VFP_DREG_D(rd, insn);
70
- VFP_DREG_N(rn, insn);
71
VFP_DREG_M(rm, insn);
72
} else {
73
rd = VFP_SREG_D(insn);
74
- rn = VFP_SREG_N(insn);
75
rm = VFP_SREG_M(insn);
76
}
77
78
- if ((insn & 0x0fb00e10) == 0x0e800a00 &&
79
- dc_isar_feature(aa32_vminmaxnm, s)) {
80
- return handle_vminmaxnm(insn, rd, rn, rm, dp);
81
- } else if ((insn & 0x0fbc0ed0) == 0x0eb80a40 &&
82
- dc_isar_feature(aa32_vrint, s)) {
83
+ if ((insn & 0x0fbc0ed0) == 0x0eb80a40 &&
84
+ dc_isar_feature(aa32_vrint, s)) {
85
/* VRINTA, VRINTN, VRINTP, VRINTM */
86
int rounding = fp_decode_rm[extract32(insn, 16, 2)];
87
return handle_vrint(insn, rd, rm, dp, rounding);
88
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
89
index XXXXXXX..XXXXXXX 100644
90
--- a/target/arm/vfp-uncond.decode
91
+++ b/target/arm/vfp-uncond.decode
92
@@ -XXX,XX +XXX,XX @@ VSEL 1111 1110 0. cc:2 .... .... 1010 .0.0 .... \
93
vm=%vm_sp vn=%vn_sp vd=%vd_sp dp=0
94
VSEL 1111 1110 0. cc:2 .... .... 1011 .0.0 .... \
95
vm=%vm_dp vn=%vn_dp vd=%vd_dp dp=1
96
+
97
+VMINMAXNM 1111 1110 1.00 .... .... 1010 . op:1 .0 .... \
98
+ vm=%vm_sp vn=%vn_sp vd=%vd_sp dp=0
99
+VMINMAXNM 1111 1110 1.00 .... .... 1011 . op:1 .0 .... \
100
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp dp=1
101
--
2.20.1

--
2.25.1
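A recurring guard in these trans_ functions (trans_VMINMAXNM() above, and the VRINT/VCVT patches before it) is the D16-D31 existence check: double-precision register numbers with bit 4 set only exist when the aa32_fp_d32 feature is present. The shape, copied from the patch above for reference:

/* UNDEF accesses to D16-D31 if they don't exist: bit 4 of a register
 * number selects the upper bank, so OR-ing all operands together and
 * testing 0x10 catches any use of D16..D31 in one comparison.
 */
if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
    ((a->vm | a->vn | a->vd) & 0x10)) {
    return false;
}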
Convert the VFP VABS instruction to decodetree.

Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or
VFPGen2OpDPFn because none of the operations which use this format
and support short vectors will need it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 167 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  12 ++-
 target/arm/vfp.decode          |   5 +
 3 files changed, 180 insertions(+), 4 deletions(-)

Convert the TYPE_KVM_ARM_ITS device to 3-phase reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20221109161444.3397405-10-peter.maydell@linaro.org
---
 hw/intc/arm_gicv3_its_kvm.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
11
diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c
16
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-vfp.inc.c
13
--- a/hw/intc/arm_gicv3_its_kvm.c
18
+++ b/target/arm/translate-vfp.inc.c
14
+++ b/hw/intc/arm_gicv3_its_kvm.c
19
@@ -XXX,XX +XXX,XX @@ typedef void VFPGen3OpSPFn(TCGv_i32 vd,
15
@@ -XXX,XX +XXX,XX @@ DECLARE_OBJ_CHECKERS(GICv3ITSState, KVMARMITSClass,
20
typedef void VFPGen3OpDPFn(TCGv_i64 vd,
16
21
TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst);
17
struct KVMARMITSClass {
22
18
GICv3ITSCommonClass parent_class;
23
+/*
19
- void (*parent_reset)(DeviceState *dev);
24
+ * Types for callbacks for do_vfp_2op_sp() and do_vfp_2op_dp().
20
+ ResettablePhases parent_phases;
25
+ * The callback should emit code to write a value to vd (which
21
};
26
+ * should be written to only).
22
27
+ */
23
28
+typedef void VFPGen2OpSPFn(TCGv_i32 vd, TCGv_i32 vm);
24
@@ -XXX,XX +XXX,XX @@ static void kvm_arm_its_post_load(GICv3ITSState *s)
29
+typedef void VFPGen2OpDPFn(TCGv_i64 vd, TCGv_i64 vm);
25
GITS_CTLR, &s->ctlr, true, &error_abort);
30
+
31
/*
32
* Perform a 3-operand VFP data processing instruction. fn is the
33
* callback to do the actual operation; this function deals with the
34
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
35
return true;
36
}
26
}
37
27
38
+static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
28
-static void kvm_arm_its_reset(DeviceState *dev)
39
+{
29
+static void kvm_arm_its_reset_hold(Object *obj)
40
+ uint32_t delta_m = 0;
30
{
41
+ uint32_t delta_d = 0;
31
- GICv3ITSState *s = ARM_GICV3_ITS_COMMON(dev);
42
+ uint32_t bank_mask = 0;
32
+ GICv3ITSState *s = ARM_GICV3_ITS_COMMON(obj);
43
+ int veclen = s->vec_len;
33
KVMARMITSClass *c = KVM_ARM_ITS_GET_CLASS(s);
44
+ TCGv_i32 f0, fd;
34
int i;
45
+
35
46
+ if (!dc_isar_feature(aa32_fpshvec, s) &&
36
- c->parent_reset(dev);
47
+ (veclen != 0 || s->vec_stride != 0)) {
37
+ if (c->parent_phases.hold) {
48
+ return false;
38
+ c->parent_phases.hold(obj);
49
+ }
39
+ }
50
+
40
51
+ if (!vfp_access_check(s)) {
41
if (kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
52
+ return true;
42
KVM_DEV_ARM_ITS_CTRL_RESET)) {
53
+ }
43
@@ -XXX,XX +XXX,XX @@ static Property kvm_arm_its_props[] = {
54
+
44
static void kvm_arm_its_class_init(ObjectClass *klass, void *data)
55
+ if (veclen > 0) {
56
+ bank_mask = 0x18;
57
+
58
+ /* Figure out what type of vector operation this is. */
59
+ if ((vd & bank_mask) == 0) {
60
+ /* scalar */
61
+ veclen = 0;
62
+ } else {
63
+ delta_d = s->vec_stride + 1;
64
+
65
+ if ((vm & bank_mask) == 0) {
66
+ /* mixed scalar/vector */
67
+ delta_m = 0;
68
+ } else {
69
+ /* vector */
70
+ delta_m = delta_d;
71
+ }
72
+ }
73
+ }
74
+
75
+ f0 = tcg_temp_new_i32();
76
+ fd = tcg_temp_new_i32();
77
+
78
+ neon_load_reg32(f0, vm);
79
+
80
+ for (;;) {
81
+ fn(fd, f0);
82
+ neon_store_reg32(fd, vd);
83
+
84
+ if (veclen == 0) {
85
+ break;
86
+ }
87
+
88
+ if (delta_m == 0) {
89
+ /* single source one-many */
90
+ while (veclen--) {
91
+ vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
92
+ neon_store_reg32(fd, vd);
93
+ }
94
+ break;
95
+ }
96
+
97
+ /* Set up the operands for the next iteration */
98
+ veclen--;
99
+ vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
100
+ vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
101
+ neon_load_reg32(f0, vm);
102
+ }
103
+
104
+ tcg_temp_free_i32(f0);
105
+ tcg_temp_free_i32(fd);
106
+
107
+ return true;
108
+}
109
+
110
+static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
111
+{
112
+ uint32_t delta_m = 0;
113
+ uint32_t delta_d = 0;
114
+ uint32_t bank_mask = 0;
115
+ int veclen = s->vec_len;
116
+ TCGv_i64 f0, fd;
117
+
118
+ /* UNDEF accesses to D16-D31 if they don't exist */
119
+ if (!dc_isar_feature(aa32_fp_d32, s) && ((vd | vm) & 0x10)) {
120
+ return false;
121
+ }
122
+
123
+ if (!dc_isar_feature(aa32_fpshvec, s) &&
124
+ (veclen != 0 || s->vec_stride != 0)) {
125
+ return false;
126
+ }
127
+
128
+ if (!vfp_access_check(s)) {
129
+ return true;
130
+ }
131
+
132
+ if (veclen > 0) {
133
+ bank_mask = 0xc;
134
+
135
+ /* Figure out what type of vector operation this is. */
136
+ if ((vd & bank_mask) == 0) {
137
+ /* scalar */
138
+ veclen = 0;
139
+ } else {
140
+ delta_d = (s->vec_stride >> 1) + 1;
141
+
142
+ if ((vm & bank_mask) == 0) {
143
+ /* mixed scalar/vector */
144
+ delta_m = 0;
145
+ } else {
146
+ /* vector */
147
+ delta_m = delta_d;
148
+ }
149
+ }
150
+ }
151
+
152
+ f0 = tcg_temp_new_i64();
153
+ fd = tcg_temp_new_i64();
154
+
155
+ neon_load_reg64(f0, vm);
156
+
157
+ for (;;) {
158
+ fn(fd, f0);
159
+ neon_store_reg64(fd, vd);
160
+
161
+ if (veclen == 0) {
162
+ break;
163
+ }
164
+
165
+ if (delta_m == 0) {
166
+ /* single source one-many */
167
+ while (veclen--) {
168
+ vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
169
+ neon_store_reg64(fd, vd);
170
+ }
171
+ break;
172
+ }
173
+
174
+ /* Set up the operands for the next iteration */
175
+ veclen--;
176
+ vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
177
+ vm = ((vm + delta_m) & (bank_mask - 1)) | (vm & bank_mask);
178
+ neon_load_reg64(f0, vm);
179
+ }
180
+
181
+ tcg_temp_free_i64(f0);
182
+ tcg_temp_free_i64(fd);
183
+
184
+ return true;
185
+}
186
+
187
static void gen_VMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
188
{
45
{
189
/* Note that order of inputs to the add matters for NaNs */
46
DeviceClass *dc = DEVICE_CLASS(klass);
190
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
47
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
191
tcg_temp_free_i64(fd);
48
GICv3ITSCommonClass *icc = ARM_GICV3_ITS_COMMON_CLASS(klass);
192
return true;
49
KVMARMITSClass *ic = KVM_ARM_ITS_CLASS(klass);
193
}
50
194
+
51
dc->realize = kvm_arm_its_realize;
195
+static bool trans_VABS_sp(DisasContext *s, arg_VABS_sp *a)
52
device_class_set_props(dc, kvm_arm_its_props);
196
+{
53
- device_class_set_parent_reset(dc, kvm_arm_its_reset, &ic->parent_reset);
197
+ return do_vfp_2op_sp(s, gen_helper_vfp_abss, a->vd, a->vm);
54
+ resettable_class_set_parent_phases(rc, NULL, kvm_arm_its_reset_hold, NULL,
198
+}
55
+ &ic->parent_phases);
199
+
56
icc->send_msi = kvm_its_send_msi;
200
+static bool trans_VABS_dp(DisasContext *s, arg_VABS_dp *a)
57
icc->pre_save = kvm_arm_its_pre_save;
201
+{
58
icc->post_load = kvm_arm_its_post_load;
202
+ return do_vfp_2op_dp(s, gen_helper_vfp_absd, a->vd, a->vm);
203
+}
204
diff --git a/target/arm/translate.c b/target/arm/translate.c
205
index XXXXXXX..XXXXXXX 100644
206
--- a/target/arm/translate.c
207
+++ b/target/arm/translate.c
208
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
209
case 0 ... 14:
210
/* Already handled by decodetree */
211
return 1;
212
+ case 15:
213
+ switch (rn) {
214
+ case 1:
215
+ /* Already handled by decodetree */
216
+ return 1;
217
+ default:
218
+ break;
219
+ }
220
default:
221
break;
222
}
223
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
224
/* rn is opcode, encoded as per VFP_SREG_N. */
225
switch (rn) {
226
case 0x00: /* vmov */
227
- case 0x01: /* vabs */
228
case 0x02: /* vneg */
229
case 0x03: /* vsqrt */
230
break;
231
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
232
case 0: /* cpy */
233
/* no-op */
234
break;
235
- case 1: /* abs */
236
- gen_vfp_abs(dp);
237
- break;
238
case 2: /* neg */
239
gen_vfp_neg(dp);
240
break;
241
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
242
index XXXXXXX..XXXXXXX 100644
243
--- a/target/arm/vfp.decode
244
+++ b/target/arm/vfp.decode
245
@@ -XXX,XX +XXX,XX @@ VMOV_imm_sp ---- 1110 1.11 imm4h:4 .... 1010 0000 imm4l:4 \
246
vd=%vd_sp
247
VMOV_imm_dp ---- 1110 1.11 imm4h:4 .... 1011 0000 imm4l:4 \
248
vd=%vd_dp
249
+
250
+VABS_sp ---- 1110 1.11 0000 .... 1010 11.0 .... \
251
+ vd=%vd_sp vm=%vm_sp
252
+VABS_dp ---- 1110 1.11 0000 .... 1011 11.0 .... \
253
+ vd=%vd_dp vm=%vm_dp
254
--
2.20.1

--
2.25.1
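The do_vfp_2op_sp() loop added in the VABS patch above keeps the old short-vector semantics: registers advance by vec_stride + 1 and wrap within their bank of eight single-precision registers. A standalone demo of the stepping formula (not QEMU code), which can be compiled and run to see a sequence:

#include <stdio.h>

/* Demo of the short-vector register stepping used above: vd advances
 * by delta_d and the masking keeps it inside its bank (here s16..s23).
 */
int main(void)
{
    const unsigned bank_mask = 0x18;   /* single precision: banks of 8 */
    const unsigned delta_d = 3;        /* vec_stride = 2 */
    unsigned vd = 20;                  /* s20, bank s16..s23 */

    for (int i = 0; i < 4; i++) {
        printf("iteration %d writes s%u\n", i, vd);
        vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
    }
    return 0;   /* prints s20, s23, s18, s21: stays within the bank */
}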
Add the infrastructure for building and invoking a decodetree decoder
for the AArch32 VFP encodings. At the moment the new decoder covers
nothing, so we always fall back to the existing hand-written decode.

We need to have one decoder for the unconditional insns and one for
the conditional insns, as otherwise the patterns for conditional
insns would incorrectly match against the unconditional ones too.

Since translate.c is over 14,000 lines long and we're going to be
touching pretty much every line of the VFP code as part of the
decodetree conversion, we create a new translate-vfp.inc.c to hold
the code which deals with VFP in the new scheme. It should be
possible to convert this into a standalone translation unit
eventually, but the conversion process will be much simpler if we
simply #include it midway through translate.c to start with.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/Makefile.objs       | 13 +++++++++++++
 target/arm/translate-vfp.inc.c | 31 +++++++++++++++++++++++++++++++
 target/arm/translate.c         | 19 +++++++++++++++++++
 target/arm/vfp-uncond.decode   | 28 ++++++++++++++++++++++++++++
 target/arm/vfp.decode          | 28 ++++++++++++++++++++++++++++
 5 files changed, 119 insertions(+)
 create mode 100644 target/arm/translate-vfp.inc.c
 create mode 100644 target/arm/vfp-uncond.decode
 create mode 100644 target/arm/vfp.decode

From: Schspa Shi <schspa@gmail.com>

We use a 32-bit value for linux,initrd-[start/end]; when loader_start
is above 4GB this passes a wrong initrd_start to the kernel, and the
kernel reports the following warning.

[    0.000000] ------------[ cut here ]------------
[    0.000000] initrd not fully accessible via the linear mapping -- please check your bootloader ...
[    0.000000] WARNING: CPU: 0 PID: 0 at arch/arm64/mm/init.c:355 arm64_memblock_init+0x158/0x244
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Tainted: G W 6.1.0-rc3-13250-g30a0b95b1335-dirty #28
[    0.000000] Hardware name: Horizon Sigi Virtual development board (DT)
[    0.000000] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.000000] pc : arm64_memblock_init+0x158/0x244
[    0.000000] lr : arm64_memblock_init+0x158/0x244
[    0.000000] sp : ffff800009273df0
[    0.000000] x29: ffff800009273df0 x28: 0000001000cc0010 x27: 0000800000000000
[    0.000000] x26: 000000000050a3e2 x25: ffff800008b46000 x24: ffff800008b46000
[    0.000000] x23: ffff800008a53000 x22: ffff800009420000 x21: ffff800008a53000
[    0.000000] x20: 0000000004000000 x19: 0000000004000000 x18: 00000000ffff1020
[    0.000000] x17: 6568632065736165 x16: 6c70202d2d20676e x15: 697070616d207261
[    0.000000] x14: 656e696c20656874 x13: 0a2e2e2e20726564 x12: 0000000000000000
[    0.000000] x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000
[    0.000000] x8 : 0000000000000000 x7 : 796c6c756620746f x6 : 6e20647274696e69
[    0.000000] x5 : ffff8000093c7c47 x4 : ffff800008a2102f x3 : ffff800009273a88
[    0.000000] x2 : 80000000fffff038 x1 : 00000000000000c0 x0 : 0000000000000056
[    0.000000] Call trace:
[    0.000000]  arm64_memblock_init+0x158/0x244
[    0.000000]  setup_arch+0x164/0x1cc
[    0.000000]  start_kernel+0x94/0x4ac
[    0.000000]  __primary_switched+0xb4/0xbc
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000001000000000-0x0000001007ffffff]

This doesn't affect any machine types we currently support, because
for all of our machine types the RAM starts well below the 4GB
mark, but it does demonstrate that we're not currently writing
the device-tree properties quite as intended.

To fix it, we can change it to write these values to the dtb using a
type width matching #address-cells. This is the intended size for
these dtb properties, and is how u-boot, for instance, writes them,
although in practice the Linux kernel will cope with them being any
width as long as they're big enough to fit the value.

Signed-off-by: Schspa Shi <schspa@gmail.com>
Message-id: 20221129160724.75667-1-schspa@gmail.com
[PMM: tweaked commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/boot.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
56
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
31
index XXXXXXX..XXXXXXX 100644
57
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/Makefile.objs
58
--- a/hw/arm/boot.c
33
+++ b/target/arm/Makefile.objs
59
+++ b/hw/arm/boot.c
34
@@ -XXX,XX +XXX,XX @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
60
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
35
     $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
36
     "GEN", $(TARGET_DIR)$@)
37
38
+target/arm/decode-vfp.inc.c: $(SRC_PATH)/target/arm/vfp.decode $(DECODETREE)
39
+    $(call quiet-command,\
40
+     $(PYTHON) $(DECODETREE) --static-decode disas_vfp -o $@ $<,\
41
+     "GEN", $(TARGET_DIR)$@)
42
+
43
+target/arm/decode-vfp-uncond.inc.c: $(SRC_PATH)/target/arm/vfp-uncond.decode $(DECODETREE)
44
+    $(call quiet-command,\
45
+     $(PYTHON) $(DECODETREE) --static-decode disas_vfp_uncond -o $@ $<,\
46
+     "GEN", $(TARGET_DIR)$@)
47
+
48
target/arm/translate-sve.o: target/arm/decode-sve.inc.c
49
+target/arm/translate.o: target/arm/decode-vfp.inc.c
50
+target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
51
+
52
obj-$(TARGET_AARCH64) += translate-sve.o sve_helper.o
53
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
54
new file mode 100644
55
index XXXXXXX..XXXXXXX
56
--- /dev/null
57
+++ b/target/arm/translate-vfp.inc.c
58
@@ -XXX,XX +XXX,XX @@
59
+/*
60
+ * ARM translation: AArch32 VFP instructions
61
+ *
62
+ * Copyright (c) 2003 Fabrice Bellard
63
+ * Copyright (c) 2005-2007 CodeSourcery
64
+ * Copyright (c) 2007 OpenedHand, Ltd.
65
+ * Copyright (c) 2019 Linaro, Ltd.
66
+ *
67
+ * This library is free software; you can redistribute it and/or
68
+ * modify it under the terms of the GNU Lesser General Public
69
+ * License as published by the Free Software Foundation; either
70
+ * version 2 of the License, or (at your option) any later version.
71
+ *
72
+ * This library is distributed in the hope that it will be useful,
73
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
74
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
75
+ * Lesser General Public License for more details.
76
+ *
77
+ * You should have received a copy of the GNU Lesser General Public
78
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
79
+ */
80
+
81
+/*
82
+ * This file is intended to be included from translate.c; it uses
83
+ * some macros and definitions provided by that file.
84
+ * It might be possible to convert it to a standalone .c file eventually.
85
+ */
86
+
87
+/* Include the generated VFP decoder */
88
+#include "decode-vfp.inc.c"
89
+#include "decode-vfp-uncond.inc.c"
90
diff --git a/target/arm/translate.c b/target/arm/translate.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/translate.c
93
+++ b/target/arm/translate.c
94
@@ -XXX,XX +XXX,XX @@ static inline void gen_mov_vreg_F0(int dp, int reg)
95
96
#define ARM_CP_RW_BIT (1 << 20)
97
98
+/* Include the VFP decoder */
99
+#include "translate-vfp.inc.c"
100
+
101
static inline void iwmmxt_load_reg(TCGv_i64 var, int reg)
102
{
103
tcg_gen_ld_i64(var, cpu_env, offsetof(CPUARMState, iwmmxt.regs[reg]));
104
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
105
return 1;
106
}
61
}
107
62
108
+ /*
63
if (binfo->initrd_size) {
109
+ * If the decodetree decoder handles this insn it will always
64
- rc = qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-start",
110
+ * emit code to either execute the insn or generate an appropriate
65
- binfo->initrd_start);
111
+ * exception; so we don't need to ever return non-zero to tell
66
+ rc = qemu_fdt_setprop_sized_cells(fdt, "/chosen", "linux,initrd-start",
112
+ * the calling code to emit an UNDEF exception.
67
+ acells, binfo->initrd_start);
113
+ */
68
if (rc < 0) {
114
+ if (extract32(insn, 28, 4) == 0xf) {
69
fprintf(stderr, "couldn't set /chosen/linux,initrd-start\n");
115
+ if (disas_vfp_uncond(s, insn)) {
70
goto fail;
116
+ return 0;
71
}
117
+ }
72
118
+ } else {
73
- rc = qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end",
119
+ if (disas_vfp(s, insn)) {
74
- binfo->initrd_start + binfo->initrd_size);
120
+ return 0;
75
+ rc = qemu_fdt_setprop_sized_cells(fdt, "/chosen", "linux,initrd-end",
121
+ }
76
+ acells,
122
+ }
77
+ binfo->initrd_start +
123
+
78
+ binfo->initrd_size);
124
/* FIXME: this access check should not take precedence over UNDEF
79
if (rc < 0) {
125
* for invalid encodings; we will generate incorrect syndrome information
80
fprintf(stderr, "couldn't set /chosen/linux,initrd-end\n");
126
* for attempts to execute invalid vfp/neon encodings with FP disabled.
81
goto fail;
127
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
128
new file mode 100644
129
index XXXXXXX..XXXXXXX
130
--- /dev/null
131
+++ b/target/arm/vfp-uncond.decode
132
@@ -XXX,XX +XXX,XX @@
133
+# AArch32 VFP instruction descriptions (unconditional insns)
134
+#
135
+# Copyright (c) 2019 Linaro, Ltd
136
+#
137
+# This library is free software; you can redistribute it and/or
138
+# modify it under the terms of the GNU Lesser General Public
139
+# License as published by the Free Software Foundation; either
140
+# version 2 of the License, or (at your option) any later version.
141
+#
142
+# This library is distributed in the hope that it will be useful,
143
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
144
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
145
+# Lesser General Public License for more details.
146
+#
147
+# You should have received a copy of the GNU Lesser General Public
148
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
149
+
150
+#
151
+# This file is processed by scripts/decodetree.py
152
+#
153
+# Encodings for the unconditional VFP instructions are here:
154
+# generally anything matching A32
155
+# 1111 1110 .... .... .... 101. ...0 ....
156
+# and T32
157
+# 1111 110. .... .... .... 101. .... ....
158
+# 1111 1110 .... .... .... 101. .... ....
159
+# (but those patterns might also cover some Neon instructions,
160
+# which do not live in this file.)
161
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
162
new file mode 100644
163
index XXXXXXX..XXXXXXX
164
--- /dev/null
165
+++ b/target/arm/vfp.decode
166
@@ -XXX,XX +XXX,XX @@
167
+# AArch32 VFP instruction descriptions (conditional insns)
168
+#
169
+# Copyright (c) 2019 Linaro, Ltd
170
+#
171
+# This library is free software; you can redistribute it and/or
172
+# modify it under the terms of the GNU Lesser General Public
173
+# License as published by the Free Software Foundation; either
174
+# version 2 of the License, or (at your option) any later version.
175
+#
176
+# This library is distributed in the hope that it will be useful,
177
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
178
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
179
+# Lesser General Public License for more details.
180
+#
181
+# You should have received a copy of the GNU Lesser General Public
182
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
183
+
184
+#
185
+# This file is processed by scripts/decodetree.py
186
+#
187
+# Encodings for the conditional VFP instructions are here:
188
+# generally anything matching A32
189
+# cccc 11.. .... .... .... 101. .... ....
190
+# and T32
191
+# 1110 110. .... .... .... 101. .... ....
192
+# 1110 1110 .... .... .... 101. .... ....
193
+# (but those patterns might also cover some Neon instructions,
194
+# which do not live in this file.)
195
--
2.20.1

--
2.25.1
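On the boot.c fix above: a dtb property sized by #address-cells is simply that many 32-bit big-endian cells, most significant first, which is why a 64-bit initrd address needs two cells when acells == 2. An illustrative encoder (not QEMU code; qemu_fdt_setprop_sized_cells() used in the patch produces the equivalent layout internally):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl(): dtb cells are big-endian */

/* Encode 'value' as acells consecutive 32-bit big-endian cells. */
static void write_cells(uint8_t *out, int acells, uint64_t value)
{
    for (int i = acells - 1; i >= 0; i--) {
        uint32_t cell = htonl((uint32_t)(value >> (32 * i)));
        memcpy(out, &cell, 4);
        out += 4;
    }
}

int main(void)
{
    uint8_t buf[8];

    write_cells(buf, 2, 0x1000000000ULL);   /* an initrd above 4GB */
    for (int i = 0; i < 8; i++) {
        printf("%02x ", buf[i]);            /* 00 00 00 10 00 00 00 00 */
    }
    printf("\n");
    return 0;
}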
The NSACR register allows secure code to configure the FPU
to be inaccessible to non-secure code. If the NSACR.CP10
bit is set then:
 * NS accesses to the FPU trap as UNDEF (ie to NS EL1 or EL2)
 * CPACR.{CP10,CP11} behave as if RAZ/WI
 * HCPTR.{TCP11,TCP10} behave as if RAO/WI

Note that we do not implement the NSACR.NSASEDIS bit which
gates only access to Advanced SIMD, in the same way that
we don't implement the equivalent CPACR.ASEDIS and HCPTR.TASE.

From: Zhuojia Shen <chaosdefinition@hotmail.com>

In CPUID registers exposed to userspace, some registers were missing
and some fields were not exposed. This patch aligns exposed ID
registers and their fields with what the upstream kernel currently
exposes.
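The RAZ/WI behaviour described in the NSACR paragraph above comes down to masking in the CPACR accessors; distilled from the cpacr_write()/cpacr_read() changes this patch makes (write side shown, using helpers that appear in the hunks):

/* When an AArch32 EL3 clears NSACR.CP10, non-secure writes to
 * CPACR.{CP10,CP11} (bits [23:20]) are ignored: strip them from the
 * incoming value and keep whatever the secure world last stored.
 */
if (arm_feature(env, ARM_FEATURE_EL3) && !arm_el_is_aa64(env, 3) &&
    !arm_is_secure(env) && !extract32(env->cp15.nsacr, 10, 1)) {
    value &= ~(0xf << 20);                        /* drop CP10/CP11    */
    value |= env->cp15.cpacr_el1 & (0xf << 20);   /* keep stored bits  */
}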
11
7
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Specifically, the following new ID registers/fields are exposed to
9
userspace:
10
11
ID_AA64PFR1_EL1.BT: bits 3-0
12
ID_AA64PFR1_EL1.MTE: bits 11-8
13
ID_AA64PFR1_EL1.SME: bits 27-24
14
15
ID_AA64ZFR0_EL1.SVEver: bits 3-0
16
ID_AA64ZFR0_EL1.AES: bits 7-4
17
ID_AA64ZFR0_EL1.BitPerm: bits 19-16
18
ID_AA64ZFR0_EL1.BF16: bits 23-20
19
ID_AA64ZFR0_EL1.SHA3: bits 35-32
20
ID_AA64ZFR0_EL1.SM4: bits 43-40
21
ID_AA64ZFR0_EL1.I8MM: bits 47-44
22
ID_AA64ZFR0_EL1.F32MM: bits 55-52
23
ID_AA64ZFR0_EL1.F64MM: bits 59-56
24
25
ID_AA64SMFR0_EL1.F32F32: bit 32
26
ID_AA64SMFR0_EL1.B16F32: bit 34
27
ID_AA64SMFR0_EL1.F16F32: bit 35
28
ID_AA64SMFR0_EL1.I8I32: bits 39-36
29
ID_AA64SMFR0_EL1.F64F64: bit 48
30
ID_AA64SMFR0_EL1.I16I64: bits 55-52
31
ID_AA64SMFR0_EL1.FA64: bit 63
32
33
ID_AA64MMFR0_EL1.ECV: bits 63-60
34
35
ID_AA64MMFR1_EL1.AFP: bits 47-44
36
37
ID_AA64MMFR2_EL1.AT: bits 35-32
38
39
ID_AA64ISAR0_EL1.RNDR: bits 63-60
40
41
ID_AA64ISAR1_EL1.FRINTTS: bits 35-32
42
ID_AA64ISAR1_EL1.BF16: bits 47-44
43
ID_AA64ISAR1_EL1.DGH: bits 51-48
44
ID_AA64ISAR1_EL1.I8MM: bits 55-52
45
46
ID_AA64ISAR2_EL1.WFxT: bits 3-0
47
ID_AA64ISAR2_EL1.RPRES: bits 7-4
48
ID_AA64ISAR2_EL1.GPA3: bits 11-8
49
ID_AA64ISAR2_EL1.APA3: bits 15-12
50
51
The code is also refactored to use symbolic names for ID register fields
52
for better readability and maintainability.
53
54
Signed-off-by: Zhuojia Shen <chaosdefinition@hotmail.com>
55
Message-id: DS7PR12MB6309BC9133877BCC6FC419FEAC0D9@DS7PR12MB6309.namprd12.prod.outlook.com
56
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
57
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 20190510110357.18825-1-peter.maydell@linaro.org
15
---
58
---
16
target/arm/helper.c | 75 +++++++++++++++++++++++++++++++++++++++++++--
59
target/arm/helper.c | 96 +++++++++++++++++++++++++++++++++++++--------
17
1 file changed, 73 insertions(+), 2 deletions(-)
60
1 file changed, 79 insertions(+), 17 deletions(-)
18
61
19
diff --git a/target/arm/helper.c b/target/arm/helper.c
62
diff --git a/target/arm/helper.c b/target/arm/helper.c
20
index XXXXXXX..XXXXXXX 100644
63
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.c
64
--- a/target/arm/helper.c
22
+++ b/target/arm/helper.c
65
+++ b/target/arm/helper.c
23
@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
66
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
24
}
67
#ifdef CONFIG_USER_ONLY
25
value &= mask;
68
static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
26
}
69
{ .name = "ID_AA64PFR0_EL1",
27
+
70
- .exported_bits = 0x000f000f00ff0000,
28
+ /*
71
- .fixed_bits = 0x0000000000000011 },
29
+ * For A-profile AArch32 EL3 (but not M-profile secure mode), if NSACR.CP10
72
+ .exported_bits = R_ID_AA64PFR0_FP_MASK |
30
+ * is 0 then CPACR.{CP11,CP10} ignore writes and read as 0b00.
73
+ R_ID_AA64PFR0_ADVSIMD_MASK |
31
+ */
74
+ R_ID_AA64PFR0_SVE_MASK |
32
+ if (arm_feature(env, ARM_FEATURE_EL3) && !arm_el_is_aa64(env, 3) &&
75
+ R_ID_AA64PFR0_DIT_MASK,
33
+ !arm_is_secure(env) && !extract32(env->cp15.nsacr, 10, 1)) {
76
+        value &= ~(0xf << 20);
+        value |= env->cp15.cpacr_el1 & (0xf << 20);
+    }
+
     env->cp15.cpacr_el1 = value;
 }
 
+static uint64_t cpacr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    /*
+     * For A-profile AArch32 EL3 (but not M-profile secure mode), if NSACR.CP10
+     * is 0 then CPACR.{CP11,CP10} ignore writes and read as 0b00.
+     */
+    uint64_t value = env->cp15.cpacr_el1;
+
+    if (arm_feature(env, ARM_FEATURE_EL3) && !arm_el_is_aa64(env, 3) &&
+        !arm_is_secure(env) && !extract32(env->cp15.nsacr, 10, 1)) {
+        value &= ~(0xf << 20);
+    }
+    return value;
+}
+
+
 static void cpacr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     /* Call cpacr_write() so that we reset with the correct RAO bits set
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
     { .name = "CPACR", .state = ARM_CP_STATE_BOTH, .opc0 = 3,
       .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
-      .resetfn = cpacr_reset, .writefn = cpacr_write },
+      .resetfn = cpacr_reset, .writefn = cpacr_write, .readfn = cpacr_read },
     REGINFO_SENTINEL
 };
 
@@ -XXX,XX +XXX,XX @@ uint64_t arm_hcr_el2_eff(CPUARMState *env)
     return ret;
 }
 
+static void cptr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                           uint64_t value)
+{
+    /*
+     * For A-profile AArch32 EL3, if NSACR.CP10
+     * is 0 then HCPTR.{TCP11,TCP10} ignore writes and read as 1.
+     */
+    if (arm_feature(env, ARM_FEATURE_EL3) && !arm_el_is_aa64(env, 3) &&
+        !arm_is_secure(env) && !extract32(env->cp15.nsacr, 10, 1)) {
+        value &= ~(0x3 << 10);
+        value |= env->cp15.cptr_el[2] & (0x3 << 10);
+    }
+    env->cp15.cptr_el[2] = value;
+}
+
+static uint64_t cptr_el2_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    /*
+     * For A-profile AArch32 EL3, if NSACR.CP10
+     * is 0 then HCPTR.{TCP11,TCP10} ignore writes and read as 1.
+     */
+    uint64_t value = env->cp15.cptr_el[2];
+
+    if (arm_feature(env, ARM_FEATURE_EL3) && !arm_el_is_aa64(env, 3) &&
+        !arm_is_secure(env) && !extract32(env->cp15.nsacr, 10, 1)) {
+        value |= 0x3 << 10;
+    }
+    return value;
+}
+
 static const ARMCPRegInfo el2_cp_reginfo[] = {
     { .name = "HCR_EL2", .state = ARM_CP_STATE_AA64,
       .type = ARM_CP_IO,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
     { .name = "CPTR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 2,
       .access = PL2_RW, .accessfn = cptr_access, .resetvalue = 0,
-      .fieldoffset = offsetof(CPUARMState, cp15.cptr_el[2]) },
+      .fieldoffset = offsetof(CPUARMState, cp15.cptr_el[2]),
+      .readfn = cptr_el2_read, .writefn = cptr_el2_write },
     { .name = "MAIR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 10, .crm = 2, .opc2 = 0,
       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.mair_el[2]),
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
         break;
     }
 
+    /*
+     * The NSACR allows A-profile AArch32 EL3 and M-profile secure mode
+     * to control non-secure access to the FPU. It doesn't have any
+     * effect if EL3 is AArch64 or if EL3 doesn't exist at all.
+     */
+    if ((arm_feature(env, ARM_FEATURE_EL3) && !arm_el_is_aa64(env, 3) &&
+         cur_el <= 2 && !arm_is_secure_below_el3(env))) {
+        if (!extract32(env->cp15.nsacr, 10, 1)) {
+            /* FP insns act as UNDEF */
+            return cur_el == 2 ? 2 : 1;
+        }
+    }
+
     /* For the CPTR registers we don't need to guard with an ARM_FEATURE
      * check because zero bits in the registers mean "don't trap".
      */
--
2.20.1

+      .fixed_bits = (0x1 << R_ID_AA64PFR0_EL0_SHIFT) |
+                    (0x1 << R_ID_AA64PFR0_EL1_SHIFT) },
     { .name = "ID_AA64PFR1_EL1",
-      .exported_bits = 0x00000000000000f0 },
+      .exported_bits = R_ID_AA64PFR1_BT_MASK |
+                       R_ID_AA64PFR1_SSBS_MASK |
+                       R_ID_AA64PFR1_MTE_MASK |
+                       R_ID_AA64PFR1_SME_MASK },
     { .name = "ID_AA64PFR*_EL1_RESERVED",
-      .is_glob = true },
-    { .name = "ID_AA64ZFR0_EL1" },
+      .is_glob = true },
+    { .name = "ID_AA64ZFR0_EL1",
+      .exported_bits = R_ID_AA64ZFR0_SVEVER_MASK |
+                       R_ID_AA64ZFR0_AES_MASK |
+                       R_ID_AA64ZFR0_BITPERM_MASK |
+                       R_ID_AA64ZFR0_BFLOAT16_MASK |
+                       R_ID_AA64ZFR0_SHA3_MASK |
+                       R_ID_AA64ZFR0_SM4_MASK |
+                       R_ID_AA64ZFR0_I8MM_MASK |
+                       R_ID_AA64ZFR0_F32MM_MASK |
+                       R_ID_AA64ZFR0_F64MM_MASK },
+    { .name = "ID_AA64SMFR0_EL1",
+      .exported_bits = R_ID_AA64SMFR0_F32F32_MASK |
+                       R_ID_AA64SMFR0_B16F32_MASK |
+                       R_ID_AA64SMFR0_F16F32_MASK |
+                       R_ID_AA64SMFR0_I8I32_MASK |
+                       R_ID_AA64SMFR0_F64F64_MASK |
+                       R_ID_AA64SMFR0_I16I64_MASK |
+                       R_ID_AA64SMFR0_FA64_MASK },
     { .name = "ID_AA64MMFR0_EL1",
-      .fixed_bits = 0x00000000ff000000 },
-    { .name = "ID_AA64MMFR1_EL1" },
+      .exported_bits = R_ID_AA64MMFR0_ECV_MASK,
+      .fixed_bits = (0xf << R_ID_AA64MMFR0_TGRAN64_SHIFT) |
+                    (0xf << R_ID_AA64MMFR0_TGRAN4_SHIFT) },
+    { .name = "ID_AA64MMFR1_EL1",
+      .exported_bits = R_ID_AA64MMFR1_AFP_MASK },
+    { .name = "ID_AA64MMFR2_EL1",
+      .exported_bits = R_ID_AA64MMFR2_AT_MASK },
     { .name = "ID_AA64MMFR*_EL1_RESERVED",
-      .is_glob = true },
+      .is_glob = true },
     { .name = "ID_AA64DFR0_EL1",
-      .fixed_bits = 0x0000000000000006 },
-    { .name = "ID_AA64DFR1_EL1" },
+      .fixed_bits = (0x6 << R_ID_AA64DFR0_DEBUGVER_SHIFT) },
+    { .name = "ID_AA64DFR1_EL1" },
     { .name = "ID_AA64DFR*_EL1_RESERVED",
-      .is_glob = true },
+      .is_glob = true },
     { .name = "ID_AA64AFR*",
-      .is_glob = true },
+      .is_glob = true },
     { .name = "ID_AA64ISAR0_EL1",
-      .exported_bits = 0x00fffffff0fffff0 },
+      .exported_bits = R_ID_AA64ISAR0_AES_MASK |
+                       R_ID_AA64ISAR0_SHA1_MASK |
+                       R_ID_AA64ISAR0_SHA2_MASK |
+                       R_ID_AA64ISAR0_CRC32_MASK |
+                       R_ID_AA64ISAR0_ATOMIC_MASK |
+                       R_ID_AA64ISAR0_RDM_MASK |
+                       R_ID_AA64ISAR0_SHA3_MASK |
+                       R_ID_AA64ISAR0_SM3_MASK |
+                       R_ID_AA64ISAR0_SM4_MASK |
+                       R_ID_AA64ISAR0_DP_MASK |
+                       R_ID_AA64ISAR0_FHM_MASK |
+                       R_ID_AA64ISAR0_TS_MASK |
+                       R_ID_AA64ISAR0_RNDR_MASK },
     { .name = "ID_AA64ISAR1_EL1",
-      .exported_bits = 0x000000f0ffffffff },
+      .exported_bits = R_ID_AA64ISAR1_DPB_MASK |
+                       R_ID_AA64ISAR1_APA_MASK |
+                       R_ID_AA64ISAR1_API_MASK |
+                       R_ID_AA64ISAR1_JSCVT_MASK |
+                       R_ID_AA64ISAR1_FCMA_MASK |
+                       R_ID_AA64ISAR1_LRCPC_MASK |
+                       R_ID_AA64ISAR1_GPA_MASK |
+                       R_ID_AA64ISAR1_GPI_MASK |
+                       R_ID_AA64ISAR1_FRINTTS_MASK |
+                       R_ID_AA64ISAR1_SB_MASK |
+                       R_ID_AA64ISAR1_BF16_MASK |
+                       R_ID_AA64ISAR1_DGH_MASK |
+                       R_ID_AA64ISAR1_I8MM_MASK },
+    { .name = "ID_AA64ISAR2_EL1",
+      .exported_bits = R_ID_AA64ISAR2_WFXT_MASK |
+                       R_ID_AA64ISAR2_RPRES_MASK |
+                       R_ID_AA64ISAR2_GPA3_MASK |
+                       R_ID_AA64ISAR2_APA3_MASK },
     { .name = "ID_AA64ISAR*_EL1_RESERVED",
-      .is_glob = true },
+      .is_glob = true },
 };
 modify_arm_cp_regs(v8_idregs, v8_user_idregs);
 #endif
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
 #ifdef CONFIG_USER_ONLY
     static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
         { .name = "MIDR_EL1",
-          .exported_bits = 0x00000000ffffffff },
-        { .name = "REVIDR_EL1" },
+          .exported_bits = R_MIDR_EL1_REVISION_MASK |
+                           R_MIDR_EL1_PARTNUM_MASK |
+                           R_MIDR_EL1_ARCHITECTURE_MASK |
+                           R_MIDR_EL1_VARIANT_MASK |
+                           R_MIDR_EL1_IMPLEMENTER_MASK },
+        { .name = "REVIDR_EL1" },
     };
     modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
 #endif
--
2.25.1
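The switch above from opaque hex constants to R_*_MASK names relies on
QEMU's registerfields machinery. As a rough sketch of what those names
expand to and how the exported/fixed bits combine — the FIELD() line is
illustrative (the real definitions live in target/arm/cpu.h), and
filter_id_reg() is a hypothetical helper, not the actual
modify_arm_cp_regs() implementation:

    #include "hw/registerfields.h"

    /* generates R_ID_AA64PFR1_BT_SHIFT, _LENGTH and _MASK */
    FIELD(ID_AA64PFR1, BT, 0, 4)

    static uint64_t filter_id_reg(uint64_t value,
                                  uint64_t exported_bits, uint64_t fixed_bits)
    {
        /* bits outside exported_bits read as zero; fixed_bits always read as set */
        return (value & exported_bits) | fixed_bits;
    }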
From: Richard Henderson <richard.henderson@linaro.org>

Fix a typo: the comparison checked the sign of the field twice, instead
of comparing the sign and the mask of the field (which itself encodes
both position and length).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190604154225.26992-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 scripts/decodetree.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index XXXXXXX..XXXXXXX 100755
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -XXX,XX +XXX,XX @@ class Field:
         return '{0}(insn, {1}, {2})'.format(extr, self.pos, self.len)
 
     def __eq__(self, other):
-        return self.sign == other.sign and self.sign == other.sign
+        return self.sign == other.sign and self.mask == other.mask
 
     def __ne__(self, other):
         return not self.__eq__(other)
--
2.20.1

From: Thomas Huth <thuth@redhat.com>

The header target/arm/kvm-consts.h checks CONFIG_KVM which is marked as
poisoned in common code, so the files that include this header have to
be added to specific_ss and recompiled for each target, qemu-system-arm
and qemu-system-aarch64. However, since the kvm headers are only
optionally used in kvm-consts.h for some sanity checks, we can
additionally check the NEED_CPU_H macro first to avoid the poisoned
CONFIG_KVM macro, so kvm-consts.h can also be used from "common" files
(without the sanity checks - which should be OK since they are still
done from other target-specific files instead). This way, and by
adjusting some other include statements in the related files here and
there, we can move some files from specific_ss into softmmu_ss, so that
they only need to be compiled once during the build process.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20221202154023.293614-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/misc/xlnx-zynqmp-apu-ctrl.h |  2 +-
 target/arm/kvm-consts.h                |  8 ++++----
 hw/misc/imx6_src.c                     |  2 +-
 hw/misc/iotkit-sysctl.c                |  1 -
 hw/misc/meson.build                    | 11 +++++------
 5 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/hw/misc/xlnx-zynqmp-apu-ctrl.h b/include/hw/misc/xlnx-zynqmp-apu-ctrl.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/xlnx-zynqmp-apu-ctrl.h
+++ b/include/hw/misc/xlnx-zynqmp-apu-ctrl.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/sysbus.h"
 #include "hw/register.h"
-#include "target/arm/cpu.h"
+#include "target/arm/cpu-qom.h"
 
 #define TYPE_XLNX_ZYNQMP_APU_CTRL "xlnx.apu-ctrl"
 OBJECT_DECLARE_SIMPLE_TYPE(XlnxZynqMPAPUCtrl, XLNX_ZYNQMP_APU_CTRL)
diff --git a/target/arm/kvm-consts.h b/target/arm/kvm-consts.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -XXX,XX +XXX,XX @@
 #ifndef ARM_KVM_CONSTS_H
 #define ARM_KVM_CONSTS_H
 
+#ifdef NEED_CPU_H
 #ifdef CONFIG_KVM
 #include <linux/kvm.h>
 #include <linux/psci.h>
-
 #define MISMATCH_CHECK(X, Y) QEMU_BUILD_BUG_ON(X != Y)
+#endif
+#endif
 
-#else
-
+#ifndef MISMATCH_CHECK
 #define MISMATCH_CHECK(X, Y) QEMU_BUILD_BUG_ON(0)
-
 #endif
 
 #define CP_REG_SIZE_SHIFT 52
diff --git a/hw/misc/imx6_src.c b/hw/misc/imx6_src.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/imx6_src.c
+++ b/hw/misc/imx6_src.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
-#include "arm-powerctl.h"
+#include "target/arm/arm-powerctl.h"
 #include "hw/core/cpu.h"
 
 #ifndef DEBUG_IMX6_SRC
diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-sysctl.c
+++ b/hw/misc/iotkit-sysctl.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/qdev-properties.h"
 #include "hw/arm/armsse-version.h"
 #include "target/arm/arm-powerctl.h"
-#include "target/arm/cpu.h"
 
 REG32(SECDBGSTAT, 0x0)
 REG32(SECDBGSET, 0x4)
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_IMX', if_true: files(
   'imx25_ccm.c',
   'imx31_ccm.c',
   'imx6_ccm.c',
+  'imx6_src.c',
   'imx6ul_ccm.c',
   'imx7_ccm.c',
   'imx7_gpr.c',
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_RASPI', if_true: files(
 ))
 softmmu_ss.add(when: 'CONFIG_SLAVIO', if_true: files('slavio_misc.c'))
 softmmu_ss.add(when: 'CONFIG_ZYNQ', if_true: files('zynq_slcr.c'))
-specific_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: files('xlnx-zynqmp-crf.c'))
-specific_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: files('xlnx-zynqmp-apu-ctrl.c'))
+softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: files('xlnx-zynqmp-crf.c'))
+softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: files('xlnx-zynqmp-apu-ctrl.c'))
 specific_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files('xlnx-versal-crl.c'))
 softmmu_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files(
   'xlnx-versal-xramc.c',
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_TZ_MPC', if_true: files('tz-mpc.c'))
 softmmu_ss.add(when: 'CONFIG_TZ_MSC', if_true: files('tz-msc.c'))
 softmmu_ss.add(when: 'CONFIG_TZ_PPC', if_true: files('tz-ppc.c'))
 softmmu_ss.add(when: 'CONFIG_IOTKIT_SECCTL', if_true: files('iotkit-secctl.c'))
+softmmu_ss.add(when: 'CONFIG_IOTKIT_SYSCTL', if_true: files('iotkit-sysctl.c'))
 softmmu_ss.add(when: 'CONFIG_IOTKIT_SYSINFO', if_true: files('iotkit-sysinfo.c'))
 softmmu_ss.add(when: 'CONFIG_ARMSSE_CPU_PWRCTRL', if_true: files('armsse-cpu-pwrctrl.c'))
 softmmu_ss.add(when: 'CONFIG_ARMSSE_CPUID', if_true: files('armsse-cpuid.c'))
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_GRLIB', if_true: files('grlib_ahb_apb_pnp.c'))
 
 specific_ss.add(when: 'CONFIG_AVR_POWER', if_true: files('avr_power.c'))
 
-specific_ss.add(when: 'CONFIG_IMX', if_true: files('imx6_src.c'))
-specific_ss.add(when: 'CONFIG_IOTKIT_SYSCTL', if_true: files('iotkit-sysctl.c'))
-
 specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
 
 specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
 specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
 
-specific_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
+softmmu_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
 
 # HPPA devices
 softmmu_ss.add(when: 'CONFIG_LASI', if_true: files('lasi.c'))
--
2.25.1
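To see why the one-character decodetree fix above matters, here is a
small C model of the buggy and fixed comparisons (a hypothetical struct
mirroring the Python Field class, not code from either patch): with the
old compare, two fields that differ only in position and length still
compare equal, because the mask is never examined.

    #include <stdbool.h>
    #include <stdint.h>

    struct Field {
        bool sign;
        uint32_t mask;   /* encodes both position and length */
    };

    static bool field_eq_buggy(struct Field a, struct Field b)
    {
        return a.sign == b.sign && a.sign == b.sign;   /* mask never checked */
    }

    static bool field_eq_fixed(struct Field a, struct Field b)
    {
        return a.sign == b.sign && a.mask == b.mask;
    }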
The Cortex-R5F initfn was not correctly setting up the MVFR
ID register values. Fill these in, since some subsequent patches
will use ID register checks rather than CPU feature bit checks.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cortex_r5f_initfn(Object *obj)
 
     cortex_r5_initfn(obj);
     set_feature(&cpu->env, ARM_FEATURE_VFP3);
+    cpu->isar.mvfr0 = 0x10110221;
+    cpu->isar.mvfr1 = 0x00000011;
 }
 
 static const ARMCPRegInfo cortexa8_cp_reginfo[] = {
--
2.20.1

From: Philippe Mathieu-Daudé <philmd@linaro.org>

When building with --disable-tcg on Darwin we get:

  target/arm/cpu.c:725:16: error: incomplete definition of type 'struct TCGCPUOps'
      cc->tcg_ops->do_interrupt(cs);
      ~~~~~~~~~~~^

Commit 083afd18a9 ("target/arm: Restrict cpu_exec_interrupt()
handler to sysemu") limited this block to system emulation,
but neglected to also limit it to TCG.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20221209110823.59495-1-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
     arm_rebuild_hflags(env);
 }
 
-#ifndef CONFIG_USER_ONLY
+#if defined(CONFIG_TCG) && !defined(CONFIG_USER_ONLY)
 
 static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
                                      unsigned int target_el,
@@ -XXX,XX +XXX,XX @@ static bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
     cc->tcg_ops->do_interrupt(cs);
     return true;
 }
-#endif /* !CONFIG_USER_ONLY */
+
+#endif /* CONFIG_TCG && !CONFIG_USER_ONLY */
 
 void arm_cpu_update_virq(ARMCPU *cpu)
 {
--
2.25.1
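For reference, a sketch of how the mvfr0 value the Cortex-R5F patch
sets decodes; the 4-bit field positions follow the Arm ARM MVFR0
layout, and both helpers here are hypothetical, written only to
illustrate the encoding:

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t extr(uint32_t v, int pos, int len)
    {
        return (v >> pos) & ((1u << len) - 1);
    }

    static void dump_mvfr0(uint32_t mvfr0)   /* 0x10110221 for Cortex-R5F */
    {
        printf("SIMDReg=%u FPSP=%u FPDP=%u FPShVec=%u FPRound=%u\n",
               extr(mvfr0, 0, 4),    /* 1: sixteen 64-bit FP registers */
               extr(mvfr0, 4, 4),    /* 2: single precision supported */
               extr(mvfr0, 8, 4),    /* 2: double precision supported */
               extr(mvfr0, 24, 4),   /* 0: no short-vector support */
               extr(mvfr0, 28, 4));  /* 1: rounding modes supported */
    }

The FPShVec field being 0 here is what the following short-vector
patch keys off: the R5F genuinely has no short-vector support.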
Deleted patch
At the moment our -cpu max for AArch32 supports VFP short-vectors
because we always implement them, even for CPUs which should
not have them. The following commits are going to switch to
using the correct ID-register-check to enable or disable short
vector support, so we need to turn it on explicitly for -cpu max,
because Cortex-A15 doesn't implement it.

We don't enable this for the AArch64 -cpu max, because the v8A
architecture never supports short-vectors.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
         kvm_arm_set_cpu_features_from_host(cpu);
     } else {
         cortex_a15_initfn(obj);
+
+        /* old-style VFP short-vector support */
+        cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+
 #ifdef CONFIG_USER_ONLY
         /* We don't set these in system emulation mode for the moment,
          * since we don't correctly set (all of) the ID registers to
--
2.20.1
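FIELD_DP32() in the hunk above deposits a value into a named register
field. A minimal stand-in without the registerfields machinery — dp32()
is a hypothetical helper, assuming FPSHVEC is the 4-bit field at
bit 24:

    #include <stdint.h>

    static uint32_t dp32(uint32_t reg, int shift, int len, uint32_t val)
    {
        uint32_t mask = ((1u << len) - 1) << shift;
        return (reg & ~mask) | ((val << shift) & mask);
    }

    /* FIELD_DP32(mvfr0, MVFR0, FPSHVEC, 1) is then roughly: */
    /* mvfr0 = dp32(mvfr0, 24, 4, 1); */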
Deleted patch
Convert the "single-precision" register moves to decodetree:
 * VMSR
 * VMRS
 * VMOV between general purpose register and single precision

Note that the VMSR/VMRS conversions make our handling of
the "should this UNDEF?" checks consistent between the two
instructions:
 * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0
   (previously was a nop)
 * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better
   (previously was a nop)
 * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better
   (previously would write to the register, which had no
   guest-visible effect because we always UNDEF reads)

We also tighten up the decode: we were previously underdecoding
some SBZ or SBO bits.

The conversion of VMOV_single includes the expansion out of the
gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr()
sequences into the simpler direct load/store of the TCG temp via
neon_{load,store}_reg32(): we know in the new function that we're
always single-precision, we don't need to use the old-and-deprecated
cpu_F0* TCG globals, and we don't happen to have the declaration of
gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the
new function is.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 161 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 148 +-----------------------------
 target/arm/vfp.decode          |   4 +
 3 files changed, 168 insertions(+), 145 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
 
     return true;
 }
+
+static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
+{
+    TCGv_i32 tmp;
+    bool ignore_vfp_enabled = false;
+
+    if (arm_dc_feature(s, ARM_FEATURE_M)) {
+        /*
+         * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
+         * Writes to R15 are UNPREDICTABLE; we choose to undef.
+         */
+        if (a->rt == 15 || a->reg != ARM_VFP_FPSCR) {
+            return false;
+        }
+    }
+
+    switch (a->reg) {
+    case ARM_VFP_FPSID:
+        /*
+         * VFPv2 allows access to FPSID from userspace; VFPv3 restricts
+         * all ID registers to privileged access only.
+         */
+        if (IS_USER(s) && arm_dc_feature(s, ARM_FEATURE_VFP3)) {
+            return false;
+        }
+        ignore_vfp_enabled = true;
+        break;
+    case ARM_VFP_MVFR0:
+    case ARM_VFP_MVFR1:
+        if (IS_USER(s) || !arm_dc_feature(s, ARM_FEATURE_MVFR)) {
+            return false;
+        }
+        ignore_vfp_enabled = true;
+        break;
+    case ARM_VFP_MVFR2:
+        if (IS_USER(s) || !arm_dc_feature(s, ARM_FEATURE_V8)) {
+            return false;
+        }
+        ignore_vfp_enabled = true;
+        break;
+    case ARM_VFP_FPSCR:
+        break;
+    case ARM_VFP_FPEXC:
+        if (IS_USER(s)) {
+            return false;
+        }
+        ignore_vfp_enabled = true;
+        break;
+    case ARM_VFP_FPINST:
+    case ARM_VFP_FPINST2:
+        /* Not present in VFPv3 */
+        if (IS_USER(s) || arm_dc_feature(s, ARM_FEATURE_VFP3)) {
+            return false;
+        }
+        break;
+    default:
+        return false;
+    }
+
+    if (!full_vfp_access_check(s, ignore_vfp_enabled)) {
+        return true;
+    }
+
+    if (a->l) {
+        /* VMRS, move VFP special register to gp register */
+        switch (a->reg) {
+        case ARM_VFP_FPSID:
+        case ARM_VFP_FPEXC:
+        case ARM_VFP_FPINST:
+        case ARM_VFP_FPINST2:
+        case ARM_VFP_MVFR0:
+        case ARM_VFP_MVFR1:
+        case ARM_VFP_MVFR2:
+            tmp = load_cpu_field(vfp.xregs[a->reg]);
+            break;
+        case ARM_VFP_FPSCR:
+            if (a->rt == 15) {
+                tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
+                tcg_gen_andi_i32(tmp, tmp, 0xf0000000);
+            } else {
+                tmp = tcg_temp_new_i32();
+                gen_helper_vfp_get_fpscr(tmp, cpu_env);
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+
+        if (a->rt == 15) {
+            /* Set the 4 flag bits in the CPSR. */
+            gen_set_nzcv(tmp);
+            tcg_temp_free_i32(tmp);
+        } else {
+            store_reg(s, a->rt, tmp);
+        }
+    } else {
+        /* VMSR, move gp register to VFP special register */
+        switch (a->reg) {
+        case ARM_VFP_FPSID:
+        case ARM_VFP_MVFR0:
+        case ARM_VFP_MVFR1:
+        case ARM_VFP_MVFR2:
+            /* Writes are ignored. */
+            break;
+        case ARM_VFP_FPSCR:
+            tmp = load_reg(s, a->rt);
+            gen_helper_vfp_set_fpscr(cpu_env, tmp);
+            tcg_temp_free_i32(tmp);
+            gen_lookup_tb(s);
+            break;
+        case ARM_VFP_FPEXC:
+            /*
+             * TODO: VFP subarchitecture support.
+             * For now, keep the EN bit only
+             */
+            tmp = load_reg(s, a->rt);
+            tcg_gen_andi_i32(tmp, tmp, 1 << 30);
+            store_cpu_field(tmp, vfp.xregs[a->reg]);
+            gen_lookup_tb(s);
+            break;
+        case ARM_VFP_FPINST:
+        case ARM_VFP_FPINST2:
+            tmp = load_reg(s, a->rt);
+            store_cpu_field(tmp, vfp.xregs[a->reg]);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
+
+    return true;
+}
+
+static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a)
+{
+    TCGv_i32 tmp;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (a->l) {
+        /* VFP to general purpose register */
+        tmp = tcg_temp_new_i32();
+        neon_load_reg32(tmp, a->vn);
+        if (a->rt == 15) {
+            /* Set the 4 flag bits in the CPSR. */
+            gen_set_nzcv(tmp);
+            tcg_temp_free_i32(tmp);
+        } else {
+            store_reg(s, a->rt, tmp);
+        }
+    } else {
+        /* general purpose register to VFP */
+        tmp = load_reg(s, a->rt);
+        neon_store_reg32(tmp, a->vn);
+        tcg_temp_free_i32(tmp);
+    }
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     TCGv_i32 addr;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
-    bool ignore_vfp_enabled = false;
 
     if (!arm_dc_feature(s, ARM_FEATURE_VFP)) {
         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
      * for invalid encodings; we will generate incorrect syndrome information
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
      */
-    if ((insn & 0x0fe00fff) == 0x0ee00a10) {
-        rn = (insn >> 16) & 0xf;
-        if (rn == ARM_VFP_FPSID || rn == ARM_VFP_FPEXC || rn == ARM_VFP_MVFR2
-            || rn == ARM_VFP_MVFR1 || rn == ARM_VFP_MVFR0) {
-            ignore_vfp_enabled = true;
-        }
-    }
-    if (!full_vfp_access_check(s, ignore_vfp_enabled)) {
+    if (!vfp_access_check(s)) {
         return 0;
     }
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     switch ((insn >> 24) & 0xf) {
     case 0xe:
         if (insn & (1 << 4)) {
-            /* single register transfer */
-            rd = (insn >> 12) & 0xf;
-            if (dp) {
-                /* already handled by decodetree */
-                return 1;
-            } else { /* !dp */
-                bool is_sysreg;
-
-                if ((insn & 0x6f) != 0x00)
-                    return 1;
-                rn = VFP_SREG_N(insn);
-
-                is_sysreg = extract32(insn, 21, 1);
-
-                if (arm_dc_feature(s, ARM_FEATURE_M)) {
-                    /*
-                     * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
-                     * Writes to R15 are UNPREDICTABLE; we choose to undef.
-                     */
-                    if (is_sysreg && (rd == 15 || (rn >> 1) != ARM_VFP_FPSCR)) {
-                        return 1;
-                    }
-                }
-
-                if (insn & ARM_CP_RW_BIT) {
-                    /* vfp->arm */
-                    if (is_sysreg) {
-                        /* system register */
-                        rn >>= 1;
-
-                        switch (rn) {
-                        case ARM_VFP_FPSID:
-                            /* VFP2 allows access to FSID from userspace.
-                               VFP3 restricts all id registers to privileged
-                               accesses.  */
-                            if (IS_USER(s)
-                                && arm_dc_feature(s, ARM_FEATURE_VFP3)) {
-                                return 1;
-                            }
-                            tmp = load_cpu_field(vfp.xregs[rn]);
-                            break;
-                        case ARM_VFP_FPEXC:
-                            if (IS_USER(s))
-                                return 1;
-                            tmp = load_cpu_field(vfp.xregs[rn]);
-                            break;
-                        case ARM_VFP_FPINST:
-                        case ARM_VFP_FPINST2:
-                            /* Not present in VFP3.  */
-                            if (IS_USER(s)
-                                || arm_dc_feature(s, ARM_FEATURE_VFP3)) {
-                                return 1;
-                            }
-                            tmp = load_cpu_field(vfp.xregs[rn]);
-                            break;
-                        case ARM_VFP_FPSCR:
-                            if (rd == 15) {
-                                tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
-                                tcg_gen_andi_i32(tmp, tmp, 0xf0000000);
-                            } else {
-                                tmp = tcg_temp_new_i32();
-                                gen_helper_vfp_get_fpscr(tmp, cpu_env);
-                            }
-                            break;
-                        case ARM_VFP_MVFR2:
-                            if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
-                                return 1;
-                            }
-                            /* fall through */
-                        case ARM_VFP_MVFR0:
-                        case ARM_VFP_MVFR1:
-                            if (IS_USER(s)
-                                || !arm_dc_feature(s, ARM_FEATURE_MVFR)) {
-                                return 1;
-                            }
-                            tmp = load_cpu_field(vfp.xregs[rn]);
-                            break;
-                        default:
-                            return 1;
-                        }
-                    } else {
-                        gen_mov_F0_vreg(0, rn);
-                        tmp = gen_vfp_mrs();
-                    }
-                    if (rd == 15) {
-                        /* Set the 4 flag bits in the CPSR.  */
-                        gen_set_nzcv(tmp);
-                        tcg_temp_free_i32(tmp);
-                    } else {
-                        store_reg(s, rd, tmp);
-                    }
-                } else {
-                    /* arm->vfp */
-                    if (is_sysreg) {
-                        rn >>= 1;
-                        /* system register */
-                        switch (rn) {
-                        case ARM_VFP_FPSID:
-                        case ARM_VFP_MVFR0:
-                        case ARM_VFP_MVFR1:
-                            /* Writes are ignored.  */
-                            break;
-                        case ARM_VFP_FPSCR:
-                            tmp = load_reg(s, rd);
-                            gen_helper_vfp_set_fpscr(cpu_env, tmp);
-                            tcg_temp_free_i32(tmp);
-                            gen_lookup_tb(s);
-                            break;
-                        case ARM_VFP_FPEXC:
-                            if (IS_USER(s))
-                                return 1;
-                            /* TODO: VFP subarchitecture support.
-                             * For now, keep the EN bit only */
-                            tmp = load_reg(s, rd);
-                            tcg_gen_andi_i32(tmp, tmp, 1 << 30);
-                            store_cpu_field(tmp, vfp.xregs[rn]);
-                            gen_lookup_tb(s);
-                            break;
-                        case ARM_VFP_FPINST:
-                        case ARM_VFP_FPINST2:
-                            if (IS_USER(s)) {
-                                return 1;
-                            }
-                            tmp = load_reg(s, rd);
-                            store_cpu_field(tmp, vfp.xregs[rn]);
-                            break;
-                        default:
-                            return 1;
-                        }
-                    } else {
-                        tmp = load_reg(s, rd);
-                        gen_vfp_msr(tmp);
-                        gen_mov_vreg_F0(0, rn);
-                    }
-                }
-            }
+            /* already handled by decodetree */
+            return 1;
         } else {
             /* data processing */
             bool rd_is_dp = dp;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VMOV_from_gp ---- 1110 0 0 index:1 0 .... rt:4 1011 .00 1 0000 \
 
 VDUP         ---- 1110 1 b:1 q:1 0 .... rt:4 1011 . 0 e:1 1 0000 \
              vn=%vn_dp
+
+VMSR_VMRS    ---- 1110 111 l:1 reg:4 rt:4 1010 0001 0000
+VMOV_single  ---- 1110 000 l:1 .... rt:4 1010 . 001 0000 \
+             vn=%vn_sp
--
2.20.1
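To make the decodetree pattern line concrete, here is roughly what the
generated decoder does for the VMSR_VMRS pattern added above — a
hand-written sketch, not the actual generated code. The fixed-bits test
matches the 0x0fe00fff/0x0ee00a10 check that the old hand decoder in
translate.c used for the same encoding:

    static bool decode_vmsr_vmrs(DisasContext *s, uint32_t insn)
    {
        arg_VMSR_VMRS a;

        if ((insn & 0x0fe00fff) != 0x0ee00a10) {
            return false;                 /* fixed bits don't match */
        }
        a.l   = extract32(insn, 20, 1);   /* l:1   */
        a.reg = extract32(insn, 16, 4);   /* reg:4 */
        a.rt  = extract32(insn, 12, 4);   /* rt:4  */
        return trans_VMSR_VMRS(s, &a);
    }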
Deleted patch
Convert the VFP two-register transfer instructions to decodetree
(in the v8 Arm ARM these are the "Advanced SIMD and floating-point
64-bit move" encoding group).

Again, we expand out the sequences involving gen_vfp_msr() and
gen_vfp_mrs().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 70 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 46 +---------------------
 target/arm/vfp.decode          |  5 +++
 3 files changed, 77 insertions(+), 44 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a)
 
     return true;
 }
+
+static bool trans_VMOV_64_sp(DisasContext *s, arg_VMOV_64_sp *a)
+{
+    TCGv_i32 tmp;
+
+    /*
+     * VMOV between two general-purpose registers and two single precision
+     * floating point registers
+     */
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (a->op) {
+        /* fpreg to gpreg */
+        tmp = tcg_temp_new_i32();
+        neon_load_reg32(tmp, a->vm);
+        store_reg(s, a->rt, tmp);
+        tmp = tcg_temp_new_i32();
+        neon_load_reg32(tmp, a->vm + 1);
+        store_reg(s, a->rt2, tmp);
+    } else {
+        /* gpreg to fpreg */
+        tmp = load_reg(s, a->rt);
+        neon_store_reg32(tmp, a->vm);
+        tmp = load_reg(s, a->rt2);
+        neon_store_reg32(tmp, a->vm + 1);
+    }
+
+    return true;
+}
+
+static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_sp *a)
+{
+    TCGv_i32 tmp;
+
+    /*
+     * VMOV between two general-purpose registers and one double precision
+     * floating point register
+     */
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (a->op) {
+        /* fpreg to gpreg */
+        tmp = tcg_temp_new_i32();
+        neon_load_reg32(tmp, a->vm * 2);
+        store_reg(s, a->rt, tmp);
+        tmp = tcg_temp_new_i32();
+        neon_load_reg32(tmp, a->vm * 2 + 1);
+        store_reg(s, a->rt2, tmp);
+    } else {
+        /* gpreg to fpreg */
+        tmp = load_reg(s, a->rt);
+        neon_store_reg32(tmp, a->vm * 2);
+        tcg_temp_free_i32(tmp);
+        tmp = load_reg(s, a->rt2);
+        neon_store_reg32(tmp, a->vm * 2 + 1);
+        tcg_temp_free_i32(tmp);
+    }
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     case 0xc:
     case 0xd:
         if ((insn & 0x03e00000) == 0x00400000) {
-            /* two-register transfer */
-            rn = (insn >> 16) & 0xf;
-            rd = (insn >> 12) & 0xf;
-            if (dp) {
-                VFP_DREG_M(rm, insn);
-            } else {
-                rm = VFP_SREG_M(insn);
-            }
-
-            if (insn & ARM_CP_RW_BIT) {
-                /* vfp->arm */
-                if (dp) {
-                    gen_mov_F0_vreg(0, rm * 2);
-                    tmp = gen_vfp_mrs();
-                    store_reg(s, rd, tmp);
-                    gen_mov_F0_vreg(0, rm * 2 + 1);
-                    tmp = gen_vfp_mrs();
-                    store_reg(s, rn, tmp);
-                } else {
-                    gen_mov_F0_vreg(0, rm);
-                    tmp = gen_vfp_mrs();
-                    store_reg(s, rd, tmp);
-                    gen_mov_F0_vreg(0, rm + 1);
-                    tmp = gen_vfp_mrs();
-                    store_reg(s, rn, tmp);
-                }
-            } else {
-                /* arm->vfp */
-                if (dp) {
-                    tmp = load_reg(s, rd);
-                    gen_vfp_msr(tmp);
-                    gen_mov_vreg_F0(0, rm * 2);
-                    tmp = load_reg(s, rn);
-                    gen_vfp_msr(tmp);
-                    gen_mov_vreg_F0(0, rm * 2 + 1);
-                } else {
-                    tmp = load_reg(s, rd);
-                    gen_vfp_msr(tmp);
-                    gen_mov_vreg_F0(0, rm);
-                    tmp = load_reg(s, rn);
-                    gen_vfp_msr(tmp);
-                    gen_mov_vreg_F0(0, rm + 1);
-                }
-            }
+            /* Already handled by decodetree */
+            return 1;
         } else {
             /* Load/store */
             rn = (insn >> 16) & 0xf;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VDUP         ---- 1110 1 b:1 q:1 0 .... rt:4 1011 . 0 e:1 1 0000 \
 VMSR_VMRS    ---- 1110 111 l:1 reg:4 rt:4 1010 0001 0000
 VMOV_single  ---- 1110 000 l:1 .... rt:4 1010 . 001 0000 \
              vn=%vn_sp
+
+VMOV_64_sp   ---- 1100 010 op:1 rt2:4 rt:4 1010 00.1 .... \
+             vm=%vm_sp
+VMOV_64_dp   ---- 1100 010 op:1 rt2:4 rt:4 1011 00.1 .... \
+             vm=%vm_dp
--
2.20.1
Deleted patch
Convert the VFP single load/store insns VLDR and VSTR to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 73 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 22 +---------
 target/arm/vfp.decode          |  7 ++++
 3 files changed, 82 insertions(+), 20 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_sp *a)
 
     return true;
 }
+
+static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
+{
+    uint32_t offset;
+    TCGv_i32 addr;
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    offset = a->imm << 2;
+    if (!a->u) {
+        offset = -offset;
+    }
+
+    if (s->thumb && a->rn == 15) {
+        /* This is actually UNPREDICTABLE */
+        addr = tcg_temp_new_i32();
+        tcg_gen_movi_i32(addr, s->pc & ~2);
+    } else {
+        addr = load_reg(s, a->rn);
+    }
+    tcg_gen_addi_i32(addr, addr, offset);
+    if (a->l) {
+        gen_vfp_ld(s, false, addr);
+        gen_mov_vreg_F0(false, a->vd);
+    } else {
+        gen_mov_F0_vreg(false, a->vd);
+        gen_vfp_st(s, false, addr);
+    }
+    tcg_temp_free_i32(addr);
+
+    return true;
+}
+
+static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_sp *a)
+{
+    uint32_t offset;
+    TCGv_i32 addr;
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    offset = a->imm << 2;
+    if (!a->u) {
+        offset = -offset;
+    }
+
+    if (s->thumb && a->rn == 15) {
+        /* This is actually UNPREDICTABLE */
+        addr = tcg_temp_new_i32();
+        tcg_gen_movi_i32(addr, s->pc & ~2);
+    } else {
+        addr = load_reg(s, a->rn);
+    }
+    tcg_gen_addi_i32(addr, addr, offset);
+    if (a->l) {
+        gen_vfp_ld(s, true, addr);
+        gen_mov_vreg_F0(true, a->vd);
+    } else {
+        gen_mov_F0_vreg(true, a->vd);
+        gen_vfp_st(s, true, addr);
+    }
+    tcg_temp_free_i32(addr);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             else
                 rd = VFP_SREG_D(insn);
             if ((insn & 0x01200000) == 0x01000000) {
-                /* Single load/store */
-                offset = (insn & 0xff) << 2;
-                if ((insn & (1 << 23)) == 0)
-                    offset = -offset;
-                if (s->thumb && rn == 15) {
-                    /* This is actually UNPREDICTABLE */
-                    addr = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(addr, s->pc & ~2);
-                } else {
-                    addr = load_reg(s, rn);
-                }
-                tcg_gen_addi_i32(addr, addr, offset);
-                if (insn & (1 << 20)) {
-                    gen_vfp_ld(s, dp, addr);
-                    gen_mov_vreg_F0(dp, rd);
-                } else {
-                    gen_mov_F0_vreg(dp, rd);
-                    gen_vfp_st(s, dp, addr);
-                }
-                tcg_temp_free_i32(addr);
+                /* Already handled by decodetree */
+                return 1;
             } else {
                 /* load/store multiple */
                 int w = insn & (1 << 21);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VMOV_64_sp   ---- 1100 010 op:1 rt2:4 rt:4 1010 00.1 .... \
              vm=%vm_sp
 VMOV_64_dp   ---- 1100 010 op:1 rt2:4 rt:4 1011 00.1 .... \
              vm=%vm_dp
+
+# Note that the half-precision variants of VLDR and VSTR are
+# not part of this decodetree at all because they have bits [9:8] == 0b01
+VLDR_VSTR_sp ---- 1101 u:1 .0 l:1 rn:4 .... 1010 imm:8 \
+             vd=%vd_sp
+VLDR_VSTR_dp ---- 1101 u:1 .0 l:1 rn:4 .... 1011 imm:8 \
+             vd=%vd_dp
--
2.20.1
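The addressing arithmetic both trans functions above share, pulled out
as a sketch (a hypothetical helper, not code from the patch): the 8-bit
immediate is a word offset, and the U bit selects add versus subtract.

    static int vldr_vstr_offset(int imm8, bool u)
    {
        int offset = imm8 << 2;      /* imm:8 scaled to bytes */
        return u ? offset : -offset; /* u:1 == 0 means subtract */
    }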
Deleted patch
Convert the VFP load/store multiple insns to decodetree.
This includes tightening up the UNDEF checking for pre-VFPv3
CPUs which only have D0-D15: they now UNDEF for any access
to D16-D31, not merely when the smallest register in the
transfer list is in D16-D31.

This conversion does not try to share code between the single
precision and the double precision versions; this looks a bit
duplicative of code, but it leaves the door open for a future
refactoring which gets rid of the use of the "F0" registers
by inlining the various functions like gen_vfp_ld() and
gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }"
conditionalisation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 162 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  97 +-------------------
 target/arm/vfp.decode          |  18 ++++
 3 files changed, 183 insertions(+), 94 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_sp *a)
 
     return true;
 }
+
+static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a)
+{
+    uint32_t offset;
+    TCGv_i32 addr;
+    int i, n;
+
+    n = a->imm;
+
+    if (n == 0 || (a->vd + n) > 32) {
+        /*
+         * UNPREDICTABLE cases for bad immediates: we choose to
+         * UNDEF to avoid generating huge numbers of TCG ops
+         */
+        return false;
+    }
+    if (a->rn == 15 && a->w) {
+        /* writeback to PC is UNPREDICTABLE, we choose to UNDEF */
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (s->thumb && a->rn == 15) {
+        /* This is actually UNPREDICTABLE */
+        addr = tcg_temp_new_i32();
+        tcg_gen_movi_i32(addr, s->pc & ~2);
+    } else {
+        addr = load_reg(s, a->rn);
+    }
+    if (a->p) {
+        /* pre-decrement */
+        tcg_gen_addi_i32(addr, addr, -(a->imm << 2));
+    }
+
+    if (s->v8m_stackcheck && a->rn == 13 && a->w) {
+        /*
+         * Here 'addr' is the lowest address we will store to,
+         * and is either the old SP (if post-increment) or
+         * the new SP (if pre-decrement). For post-increment
+         * where the old value is below the limit and the new
+         * value is above, it is UNKNOWN whether the limit check
+         * triggers; we choose to trigger.
+         */
+        gen_helper_v8m_stackcheck(cpu_env, addr);
+    }
+
+    offset = 4;
+    for (i = 0; i < n; i++) {
+        if (a->l) {
+            /* load */
+            gen_vfp_ld(s, false, addr);
+            gen_mov_vreg_F0(false, a->vd + i);
+        } else {
+            /* store */
+            gen_mov_F0_vreg(false, a->vd + i);
+            gen_vfp_st(s, false, addr);
+        }
+        tcg_gen_addi_i32(addr, addr, offset);
+    }
+    if (a->w) {
+        /* writeback */
+        if (a->p) {
+            offset = -offset * n;
+            tcg_gen_addi_i32(addr, addr, offset);
+        }
+        store_reg(s, a->rn, addr);
+    } else {
+        tcg_temp_free_i32(addr);
+    }
+
+    return true;
+}
+
+static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
+{
+    uint32_t offset;
+    TCGv_i32 addr;
+    int i, n;
+
+    n = a->imm >> 1;
+
+    if (n == 0 || (a->vd + n) > 32 || n > 16) {
+        /*
+         * UNPREDICTABLE cases for bad immediates: we choose to
+         * UNDEF to avoid generating huge numbers of TCG ops
+         */
+        return false;
+    }
+    if (a->rn == 15 && a->w) {
+        /* writeback to PC is UNPREDICTABLE, we choose to UNDEF */
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd + n) > 16) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (s->thumb && a->rn == 15) {
+        /* This is actually UNPREDICTABLE */
+        addr = tcg_temp_new_i32();
+        tcg_gen_movi_i32(addr, s->pc & ~2);
+    } else {
+        addr = load_reg(s, a->rn);
+    }
+    if (a->p) {
+        /* pre-decrement */
+        tcg_gen_addi_i32(addr, addr, -(a->imm << 2));
+    }
+
+    if (s->v8m_stackcheck && a->rn == 13 && a->w) {
+        /*
+         * Here 'addr' is the lowest address we will store to,
+         * and is either the old SP (if post-increment) or
+         * the new SP (if pre-decrement). For post-increment
+         * where the old value is below the limit and the new
+         * value is above, it is UNKNOWN whether the limit check
+         * triggers; we choose to trigger.
+         */
+        gen_helper_v8m_stackcheck(cpu_env, addr);
+    }
+
+    offset = 8;
+    for (i = 0; i < n; i++) {
+        if (a->l) {
+            /* load */
+            gen_vfp_ld(s, true, addr);
+            gen_mov_vreg_F0(true, a->vd + i);
+        } else {
+            /* store */
+            gen_mov_F0_vreg(true, a->vd + i);
+            gen_vfp_st(s, true, addr);
+        }
+        tcg_gen_addi_i32(addr, addr, offset);
+    }
+    if (a->w) {
+        /* writeback */
+        if (a->p) {
+            offset = -offset * n;
+        } else if (a->imm & 1) {
+            offset = 4;
+        } else {
+            offset = 0;
+        }
+
+        if (offset != 0) {
+            tcg_gen_addi_i32(addr, addr, offset);
+        }
+        store_reg(s, a->rn, addr);
+    } else {
+        tcg_temp_free_i32(addr);
+    }
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
  */
 static int disas_vfp_insn(DisasContext *s, uint32_t insn)
 {
-    uint32_t rd, rn, rm, op, i, n, offset, delta_d, delta_m, bank_mask;
+    uint32_t rd, rn, rm, op, i, n, delta_d, delta_m, bank_mask;
     int dp, veclen;
-    TCGv_i32 addr;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         break;
     case 0xc:
     case 0xd:
-        if ((insn & 0x03e00000) == 0x00400000) {
-            /* Already handled by decodetree */
-            return 1;
-        } else {
-            /* Load/store */
-            rn = (insn >> 16) & 0xf;
-            if (dp)
-                VFP_DREG_D(rd, insn);
-            else
-                rd = VFP_SREG_D(insn);
-            if ((insn & 0x01200000) == 0x01000000) {
-                /* Already handled by decodetree */
-                return 1;
-            } else {
-                /* load/store multiple */
-                int w = insn & (1 << 21);
-                if (dp)
-                    n = (insn >> 1) & 0x7f;
-                else
-                    n = insn & 0xff;
-
-                if (w && !(((insn >> 23) ^ (insn >> 24)) & 1)) {
-                    /* P == U , W == 1  => UNDEF */
-                    return 1;
-                }
-                if (n == 0 || (rd + n) > 32 || (dp && n > 16)) {
-                    /* UNPREDICTABLE cases for bad immediates: we choose to
-                     * UNDEF to avoid generating huge numbers of TCG ops
-                     */
-                    return 1;
-                }
-                if (rn == 15 && w) {
-                    /* writeback to PC is UNPREDICTABLE, we choose to UNDEF */
-                    return 1;
-                }
-
-                if (s->thumb && rn == 15) {
-                    /* This is actually UNPREDICTABLE */
-                    addr = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(addr, s->pc & ~2);
-                } else {
-                    addr = load_reg(s, rn);
-                }
-                if (insn & (1 << 24)) /* pre-decrement */
-                    tcg_gen_addi_i32(addr, addr, -((insn & 0xff) << 2));
-
-                if (s->v8m_stackcheck && rn == 13 && w) {
-                    /*
-                     * Here 'addr' is the lowest address we will store to,
-                     * and is either the old SP (if post-increment) or
-                     * the new SP (if pre-decrement). For post-increment
-                     * where the old value is below the limit and the new
-                     * value is above, it is UNKNOWN whether the limit check
-                     * triggers; we choose to trigger.
-                     */
-                    gen_helper_v8m_stackcheck(cpu_env, addr);
-                }
-
-                if (dp)
-                    offset = 8;
-                else
-                    offset = 4;
-                for (i = 0; i < n; i++) {
-                    if (insn & ARM_CP_RW_BIT) {
-                        /* load */
-                        gen_vfp_ld(s, dp, addr);
-                        gen_mov_vreg_F0(dp, rd + i);
-                    } else {
-                        /* store */
-                        gen_mov_F0_vreg(dp, rd + i);
-                        gen_vfp_st(s, dp, addr);
-                    }
-                    tcg_gen_addi_i32(addr, addr, offset);
-                }
-                if (w) {
-                    /* writeback */
-                    if (insn & (1 << 24))
-                        offset = -offset * n;
-                    else if (dp && (insn & 1))
-                        offset = 4;
-                    else
-                        offset = 0;
-
-                    if (offset != 0)
-                        tcg_gen_addi_i32(addr, addr, offset);
-                    store_reg(s, rn, addr);
-                } else {
-                    tcg_temp_free_i32(addr);
-                }
-            }
-        }
-        break;
+        /* Already handled by decodetree */
+        return 1;
     default:
         /* Should never happen.  */
         return 1;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR_sp ---- 1101 u:1 .0 l:1 rn:4 .... 1010 imm:8 \
              vd=%vd_sp
 VLDR_VSTR_dp ---- 1101 u:1 .0 l:1 rn:4 .... 1011 imm:8 \
              vd=%vd_dp
+
+# We split the load/store multiple up into two patterns to avoid
+# overlap with other insns in the "Advanced SIMD load/store and 64-bit move"
+# grouping:
+#   P=0 U=0 W=0 is 64-bit VMOV
+#   P=1 W=0 is VLDR/VSTR
+#   P=U W=1 is UNDEF
+# leaving P=0 U=1 W=x and P=1 U=0 W=1 for load/store multiple.
+# These include FSTM/FLDM.
+VLDM_VSTM_sp ---- 1100 1 . w:1 l:1 rn:4 .... 1010 imm:8 \
+             vd=%vd_sp p=0 u=1
+VLDM_VSTM_dp ---- 1100 1 . w:1 l:1 rn:4 .... 1011 imm:8 \
+             vd=%vd_dp p=0 u=1
+
+VLDM_VSTM_sp ---- 1101 0.1 l:1 rn:4 .... 1010 imm:8 \
+             vd=%vd_sp p=1 u=0 w=1
+VLDM_VSTM_dp ---- 1101 0.1 l:1 rn:4 .... 1011 imm:8 \
+             vd=%vd_dp p=1 u=0 w=1
--
2.20.1
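The double-precision writeback cases above in one place, as a sketch of
a hypothetical helper (not code from the patch): after the transfer
loop 'addr' has advanced by 8 * n, so pre-decrement winds it back, and
an odd imm (the FLDMX/FSTMX forms) accounts for one extra trailing
word.

    static int vldm_dp_writeback_adjust(bool p, int imm, int n)
    {
        if (p) {
            return -8 * n;   /* pre-decrement: writeback is base - (imm << 2) */
        } else if (imm & 1) {
            return 4;        /* FLDMX/FSTMX: one extra word */
        }
        return 0;            /* plain post-increment: addr is already right */
    }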
Deleted patch
Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans
functions which perform the memory accesses by going via the TCG
globals cpu_F0s and cpu_F0d, to use local TCG temps instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 46 +++++++++++++++++++++-------------
 target/arm/translate.c         | 18 -------------
 2 files changed, 28 insertions(+), 36 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_sp *a)
 static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
 {
     uint32_t offset;
-    TCGv_i32 addr;
+    TCGv_i32 addr, tmp;
 
     if (!vfp_access_check(s)) {
         return true;
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
         addr = load_reg(s, a->rn);
     }
     tcg_gen_addi_i32(addr, addr, offset);
+    tmp = tcg_temp_new_i32();
     if (a->l) {
-        gen_vfp_ld(s, false, addr);
-        gen_mov_vreg_F0(false, a->vd);
+        gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
+        neon_store_reg32(tmp, a->vd);
     } else {
-        gen_mov_F0_vreg(false, a->vd);
-        gen_vfp_st(s, false, addr);
+        neon_load_reg32(tmp, a->vd);
+        gen_aa32_st32(s, tmp, addr, get_mem_index(s));
     }
+    tcg_temp_free_i32(tmp);
     tcg_temp_free_i32(addr);
 
     return true;
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_sp *a)
 {
     uint32_t offset;
     TCGv_i32 addr;
+    TCGv_i64 tmp;
 
     /* UNDEF accesses to D16-D31 if they don't exist */
     if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_sp *a)
         addr = load_reg(s, a->rn);
     }
     tcg_gen_addi_i32(addr, addr, offset);
+    tmp = tcg_temp_new_i64();
    if (a->l) {
-        gen_vfp_ld(s, true, addr);
-        gen_mov_vreg_F0(true, a->vd);
+        gen_aa32_ld64(s, tmp, addr, get_mem_index(s));
+        neon_store_reg64(tmp, a->vd);
     } else {
-        gen_mov_F0_vreg(true, a->vd);
-        gen_vfp_st(s, true, addr);
+        neon_load_reg64(tmp, a->vd);
+        gen_aa32_st64(s, tmp, addr, get_mem_index(s));
     }
+    tcg_temp_free_i64(tmp);
     tcg_temp_free_i32(addr);
 
     return true;
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_sp *a)
 static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a)
 {
     uint32_t offset;
-    TCGv_i32 addr;
+    TCGv_i32 addr, tmp;
     int i, n;
 
     n = a->imm;
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a)
     }
 
     offset = 4;
+    tmp = tcg_temp_new_i32();
     for (i = 0; i < n; i++) {
         if (a->l) {
             /* load */
-            gen_vfp_ld(s, false, addr);
-            gen_mov_vreg_F0(false, a->vd + i);
+            gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
+            neon_store_reg32(tmp, a->vd + i);
         } else {
             /* store */
-            gen_mov_F0_vreg(false, a->vd + i);
-            gen_vfp_st(s, false, addr);
+            neon_load_reg32(tmp, a->vd + i);
+            gen_aa32_st32(s, tmp, addr, get_mem_index(s));
         }
         tcg_gen_addi_i32(addr, addr, offset);
     }
+    tcg_temp_free_i32(tmp);
     if (a->w) {
         /* writeback */
         if (a->p) {
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
 {
     uint32_t offset;
     TCGv_i32 addr;
+    TCGv_i64 tmp;
     int i, n;
 
     n = a->imm >> 1;
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
     }
 
     offset = 8;
+    tmp = tcg_temp_new_i64();
     for (i = 0; i < n; i++) {
         if (a->l) {
             /* load */
-            gen_vfp_ld(s, true, addr);
-            gen_mov_vreg_F0(true, a->vd + i);
+            gen_aa32_ld64(s, tmp, addr, get_mem_index(s));
+            neon_store_reg64(tmp, a->vd + i);
         } else {
             /* store */
-            gen_mov_F0_vreg(true, a->vd + i);
-            gen_vfp_st(s, true, addr);
+            neon_load_reg64(tmp, a->vd + i);
+            gen_aa32_st64(s, tmp, addr, get_mem_index(s));
         }
         tcg_gen_addi_i32(addr, addr, offset);
     }
+    tcg_temp_free_i64(tmp);
     if (a->w) {
         /* writeback */
         if (a->p) {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ VFP_GEN_FIX(uhto, )
 VFP_GEN_FIX(ulto, )
 #undef VFP_GEN_FIX
 
-static inline void gen_vfp_ld(DisasContext *s, int dp, TCGv_i32 addr)
-{
-    if (dp) {
-        gen_aa32_ld64(s, cpu_F0d, addr, get_mem_index(s));
-    } else {
-        gen_aa32_ld32u(s, cpu_F0s, addr, get_mem_index(s));
-    }
-}
-
-static inline void gen_vfp_st(DisasContext *s, int dp, TCGv_i32 addr)
-{
-    if (dp) {
-        gen_aa32_st64(s, cpu_F0d, addr, get_mem_index(s));
-    } else {
-        gen_aa32_st32(s, cpu_F0s, addr, get_mem_index(s));
-    }
-}
-
 static inline long vfp_reg_offset(bool dp, unsigned reg)
 {
     if (dp) {
--
2.20.1
Deleted patch
Convert the VFP VMLS instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 38 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  8 +------
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLA_dp(DisasContext *s, arg_VMLA_sp *a)
 {
     return do_vfp_3op_dp(s, gen_VMLA_dp, a->vd, a->vn, a->vm, true);
 }
+
+static void gen_VMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
+{
+    /*
+     * VMLS: vd = vd + -(vn * vm)
+     * Note that order of inputs to the add matters for NaNs.
+     */
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    gen_helper_vfp_muls(tmp, vn, vm, fpst);
+    gen_helper_vfp_negs(tmp, tmp);
+    gen_helper_vfp_adds(vd, vd, tmp, fpst);
+    tcg_temp_free_i32(tmp);
+}
+
+static bool trans_VMLS_sp(DisasContext *s, arg_VMLS_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_VMLS_sp, a->vd, a->vn, a->vm, true);
+}
+
+static void gen_VMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
+{
+    /*
+     * VMLS: vd = vd + -(vn * vm)
+     * Note that order of inputs to the add matters for NaNs.
+     */
+    TCGv_i64 tmp = tcg_temp_new_i64();
+
+    gen_helper_vfp_muld(tmp, vn, vm, fpst);
+    gen_helper_vfp_negd(tmp, tmp);
+    gen_helper_vfp_addd(vd, vd, tmp, fpst);
+    tcg_temp_free_i64(tmp);
+}
+
+static bool trans_VMLS_dp(DisasContext *s, arg_VMLS_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_VMLS_dp, a->vd, a->vn, a->vm, true);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             rn = VFP_SREG_N(insn);
 
             switch (op) {
-            case 0:
+            case 0 ... 1:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         for (;;) {
             /* Perform the calculation. */
             switch (op) {
-            case 1: /* VMLS: fd + -(fn * fm) */
-                gen_vfp_mul(dp);
-                gen_vfp_F1_neg(dp);
-                gen_mov_F0_vreg(dp, rd);
-                gen_vfp_add(dp);
-                break;
             case 2: /* VNMLS: -fd + (fn * fm) */
                 /* Note that it isn't valid to replace (-A + B) with (B - A)
                  * or similar plausible looking simplifications
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VMLA_sp      ---- 1110 0.00 .... .... 1010 .0.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VMLA_dp      ---- 1110 0.00 .... .... 1011 .0.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VMLS_sp      ---- 1110 0.00 .... .... 1010 .1.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VMLS_dp      ---- 1110 0.00 .... .... 1011 .1.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
Deleted patch
Convert the VFP VNMLS instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 42 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 24 +------------------
 target/arm/vfp.decode          |  5 ++++
 3 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLS_dp(DisasContext *s, arg_VMLS_sp *a)
 {
     return do_vfp_3op_dp(s, gen_VMLS_dp, a->vd, a->vn, a->vm, true);
 }
+
+static void gen_VNMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
+{
+    /*
+     * VNMLS: -fd + (fn * fm)
+     * Note that it isn't valid to replace (-A + B) with (B - A) or similar
+     * plausible looking simplifications because this will give wrong results
+     * for NaNs.
+     */
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    gen_helper_vfp_muls(tmp, vn, vm, fpst);
+    gen_helper_vfp_negs(vd, vd);
+    gen_helper_vfp_adds(vd, vd, tmp, fpst);
+    tcg_temp_free_i32(tmp);
+}
+
+static bool trans_VNMLS_sp(DisasContext *s, arg_VNMLS_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_VNMLS_sp, a->vd, a->vn, a->vm, true);
+}
+
+static void gen_VNMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
+{
+    /*
+     * VNMLS: -fd + (fn * fm)
+     * Note that it isn't valid to replace (-A + B) with (B - A) or similar
+     * plausible looking simplifications because this will give wrong results
+     * for NaNs.
+     */
+    TCGv_i64 tmp = tcg_temp_new_i64();
+
+    gen_helper_vfp_muld(tmp, vn, vm, fpst);
+    gen_helper_vfp_negd(vd, vd);
+    gen_helper_vfp_addd(vd, vd, tmp, fpst);
+    tcg_temp_free_i64(tmp);
+}
+
+static bool trans_VNMLS_dp(DisasContext *s, arg_VNMLS_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_VNMLS_dp, a->vd, a->vn, a->vm, true);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ VFP_OP2(div)
 
 #undef VFP_OP2
 
-static inline void gen_vfp_F1_mul(int dp)
-{
-    /* Like gen_vfp_mul() but put result in F1 */
-    TCGv_ptr fpst = get_fpstatus_ptr(0);
-    if (dp) {
-        gen_helper_vfp_muld(cpu_F1d, cpu_F0d, cpu_F1d, fpst);
-    } else {
-        gen_helper_vfp_muls(cpu_F1s, cpu_F0s, cpu_F1s, fpst);
-    }
-    tcg_temp_free_ptr(fpst);
-}
-
 static inline void gen_vfp_F1_neg(int dp)
 {
     /* Like gen_vfp_neg() but put result in F1 */
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             rn = VFP_SREG_N(insn);
 
             switch (op) {
-            case 0 ... 1:
+            case 0 ... 2:
                 /* Already handled by decodetree */
                 return 1;
             default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         for (;;) {
             /* Perform the calculation. */
             switch (op) {
-            case 2: /* VNMLS: -fd + (fn * fm) */
-                /* Note that it isn't valid to replace (-A + B) with (B - A)
-                 * or similar plausible looking simplifications
-                 * because this will give wrong results for NaNs.
-                 */
-                gen_vfp_F1_mul(dp);
-                gen_mov_F0_vreg(dp, rd);
-                gen_vfp_neg(dp);
-                gen_vfp_add(dp);
-                break;
             case 3: /* VNMLA: -fd + -(fn * fm) */
                 gen_vfp_mul(dp);
                 gen_vfp_F1_neg(dp);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VMLS_sp      ---- 1110 0.00 .... .... 1010 .1.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VMLS_dp      ---- 1110 0.00 .... .... 1011 .1.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VNMLS_sp     ---- 1110 0.01 .... .... 1010 .0.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VNMLS_dp     ---- 1110 0.01 .... .... 1011 .0.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
diff view generated by jsdifflib
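The NaN caveat in the patch above deserves a concrete illustration. A
minimal host-side sketch in plain C (not QEMU code; it assumes an FPU
that propagates an input NaN with its sign bit intact, as the Arm
FPProcessNaN rules do -- common host FPUs behave this way, but it is
not guaranteed everywhere):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        float b = 1.0f;
        float a = nanf("");      /* quiet NaN with sign bit clear */

        float r1 = -a + b;       /* negation flips the NaN's sign bit */
        float r2 = b - a;        /* subtraction propagates the NaN as-is */

        /* On NaN-propagating FPUs the two results differ in their sign
         * bit, so rewriting (-A + B) as (B - A) is observable. */
        printf("-a + b: signbit=%d\n", signbit(r1) != 0);
        printf("b - a : signbit=%d\n", signbit(r2) != 0);
        return 0;
    }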
Convert the VFP VNMLA instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 34 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 19 +------------------
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VNMLS_dp(DisasContext *s, arg_VNMLS_sp *a)
 {
     return do_vfp_3op_dp(s, gen_VNMLS_dp, a->vd, a->vn, a->vm, true);
 }
+
+static void gen_VNMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
+{
+    /* VNMLA: -fd + -(fn * fm) */
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    gen_helper_vfp_muls(tmp, vn, vm, fpst);
+    gen_helper_vfp_negs(tmp, tmp);
+    gen_helper_vfp_negs(vd, vd);
+    gen_helper_vfp_adds(vd, vd, tmp, fpst);
+    tcg_temp_free_i32(tmp);
+}
+
+static bool trans_VNMLA_sp(DisasContext *s, arg_VNMLA_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_VNMLA_sp, a->vd, a->vn, a->vm, true);
+}
+
+static void gen_VNMLA_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
+{
+    /* VNMLA: -fd + -(fn * fm) */
+    TCGv_i64 tmp = tcg_temp_new_i64();
+
+    gen_helper_vfp_muld(tmp, vn, vm, fpst);
+    gen_helper_vfp_negd(tmp, tmp);
+    gen_helper_vfp_negd(vd, vd);
+    gen_helper_vfp_addd(vd, vd, tmp, fpst);
+    tcg_temp_free_i64(tmp);
+}
+
+static bool trans_VNMLA_dp(DisasContext *s, arg_VNMLA_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_VNMLA_dp, a->vd, a->vn, a->vm, true);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ VFP_OP2(div)
 
 #undef VFP_OP2
 
-static inline void gen_vfp_F1_neg(int dp)
-{
-    /* Like gen_vfp_neg() but put result in F1 */
-    if (dp) {
-        gen_helper_vfp_negd(cpu_F1d, cpu_F0d);
-    } else {
-        gen_helper_vfp_negs(cpu_F1s, cpu_F0s);
-    }
-}
-
 static inline void gen_vfp_abs(int dp)
 {
     if (dp)
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 2:
+    case 0 ... 3:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 3: /* VNMLA: -fd + -(fn * fm) */
-            gen_vfp_mul(dp);
-            gen_vfp_F1_neg(dp);
-            gen_mov_F0_vreg(dp, rd);
-            gen_vfp_neg(dp);
-            gen_vfp_add(dp);
-            break;
         case 4: /* mul: fn * fm */
             gen_vfp_mul(dp);
             break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VNMLS_sp     ---- 1110 0.01 .... .... 1010 .0.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VNMLS_dp     ---- 1110 0.01 .... .... 1011 .0.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VNMLA_sp     ---- 1110 0.01 .... .... 1010 .1.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VNMLA_dp     ---- 1110 0.01 .... .... 1011 .1.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
Convert the VMUL instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 10 ++++++++++
 target/arm/translate.c         |  5 +----
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VNMLA_dp(DisasContext *s, arg_VNMLA_sp *a)
 {
     return do_vfp_3op_dp(s, gen_VNMLA_dp, a->vd, a->vn, a->vm, true);
 }
+
+static bool trans_VMUL_sp(DisasContext *s, arg_VMUL_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_helper_vfp_muls, a->vd, a->vn, a->vm, false);
+}
+
+static bool trans_VMUL_dp(DisasContext *s, arg_VMUL_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_helper_vfp_muld, a->vd, a->vn, a->vm, false);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 3:
+    case 0 ... 4:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 4: /* mul: fn * fm */
-            gen_vfp_mul(dp);
-            break;
         case 5: /* nmul: -(fn * fm) */
             gen_vfp_mul(dp);
             gen_vfp_neg(dp);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VNMLA_sp     ---- 1110 0.01 .... .... 1010 .1.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VNMLA_dp     ---- 1110 0.01 .... .... 1011 .1.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VMUL_sp      ---- 1110 0.10 .... .... 1010 .0.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VMUL_dp      ---- 1110 0.10 .... .... 1011 .0.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
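A note on the pattern here: when an existing helper already has the
(vd, vn, vm, fpst) shape, as gen_helper_vfp_muls does, it can be passed
straight to do_vfp_3op_sp() with no wrapper, whereas VNMLS and VNMLA
above needed local gen_* adapters. A sketch of the assumed callback
shape (the typedef itself is outside this excerpt, and the meaning of
the final flag is inferred from usage -- it appears to say whether the
operation also reads the old Vd):

    /* Assumed shape of the per-element callback that do_vfp_3op_sp()
     * invokes, inferred from the gen_* functions in the patches above. */
    typedef void VFPGen3OpSPFn(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
                               TCGv_ptr fpst);

VNMLS passes true for that flag because it reads fd; VMUL passes false
because it only writes it.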
Convert the VNMUL instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 24 ++++++++++++++++++++++++
 target/arm/translate.c         |  7 +------
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_dp(DisasContext *s, arg_VMUL_sp *a)
 {
     return do_vfp_3op_dp(s, gen_helper_vfp_muld, a->vd, a->vn, a->vm, false);
 }
+
+static void gen_VNMUL_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
+{
+    /* VNMUL: -(fn * fm) */
+    gen_helper_vfp_muls(vd, vn, vm, fpst);
+    gen_helper_vfp_negs(vd, vd);
+}
+
+static bool trans_VNMUL_sp(DisasContext *s, arg_VNMUL_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_VNMUL_sp, a->vd, a->vn, a->vm, false);
+}
+
+static void gen_VNMUL_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
+{
+    /* VNMUL: -(fn * fm) */
+    gen_helper_vfp_muld(vd, vn, vm, fpst);
+    gen_helper_vfp_negd(vd, vd);
+}
+
+static bool trans_VNMUL_dp(DisasContext *s, arg_VNMUL_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_VNMUL_dp, a->vd, a->vn, a->vm, false);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_##name(int dp) \
 
 VFP_OP2(add)
 VFP_OP2(sub)
-VFP_OP2(mul)
 VFP_OP2(div)
 
 #undef VFP_OP2
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 4:
+    case 0 ... 5:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 5: /* nmul: -(fn * fm) */
-            gen_vfp_mul(dp);
-            gen_vfp_neg(dp);
-            break;
         case 6: /* add: fn + fm */
             gen_vfp_add(dp);
             break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VMUL_sp      ---- 1110 0.10 .... .... 1010 .0.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VMUL_dp      ---- 1110 0.10 .... .... 1011 .0.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VNMUL_sp     ---- 1110 0.10 .... .... 1010 .1.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VNMUL_dp     ---- 1110 0.10 .... .... 1011 .1.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
Convert the VADD instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 10 ++++++++++
 target/arm/translate.c         |  6 +-----
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VNMUL_dp(DisasContext *s, arg_VNMUL_sp *a)
 {
     return do_vfp_3op_dp(s, gen_VNMUL_dp, a->vd, a->vn, a->vm, false);
 }
+
+static bool trans_VADD_sp(DisasContext *s, arg_VADD_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_helper_vfp_adds, a->vd, a->vn, a->vm, false);
+}
+
+static bool trans_VADD_dp(DisasContext *s, arg_VADD_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_helper_vfp_addd, a->vd, a->vn, a->vm, false);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_##name(int dp) \
     tcg_temp_free_ptr(fpst); \
 }
 
-VFP_OP2(add)
 VFP_OP2(sub)
 VFP_OP2(div)
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 5:
+    case 0 ... 6:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 6: /* add: fn + fm */
-            gen_vfp_add(dp);
-            break;
         case 7: /* sub: fn - fm */
             gen_vfp_sub(dp);
             break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VNMUL_sp     ---- 1110 0.10 .... .... 1010 .1.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VNMUL_dp     ---- 1110 0.10 .... .... 1011 .1.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VADD_sp      ---- 1110 0.11 .... .... 1010 .0.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VADD_dp      ---- 1110 0.11 .... .... 1011 .0.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
Convert the VSUB instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 10 ++++++++++
 target/arm/translate.c         |  6 +-----
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VADD_dp(DisasContext *s, arg_VADD_sp *a)
 {
     return do_vfp_3op_dp(s, gen_helper_vfp_addd, a->vd, a->vn, a->vm, false);
 }
+
+static bool trans_VSUB_sp(DisasContext *s, arg_VSUB_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_helper_vfp_subs, a->vd, a->vn, a->vm, false);
+}
+
+static bool trans_VSUB_dp(DisasContext *s, arg_VSUB_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_helper_vfp_subd, a->vd, a->vn, a->vm, false);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_##name(int dp) \
     tcg_temp_free_ptr(fpst); \
 }
 
-VFP_OP2(sub)
 VFP_OP2(div)
 
 #undef VFP_OP2
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 6:
+    case 0 ... 7:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 7: /* sub: fn - fm */
-            gen_vfp_sub(dp);
-            break;
         case 8: /* div: fn / fm */
             gen_vfp_div(dp);
             break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VADD_sp      ---- 1110 0.11 .... .... 1010 .0.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VADD_dp      ---- 1110 0.11 .... .... 1011 .0.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VSUB_sp      ---- 1110 0.11 .... .... 1010 .1.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VSUB_dp      ---- 1110 0.11 .... .... 1011 .1.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
Convert the VDIV instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 10 ++++++++++
 target/arm/translate.c         | 21 +--------------------
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VSUB_dp(DisasContext *s, arg_VSUB_sp *a)
 {
     return do_vfp_3op_dp(s, gen_helper_vfp_subd, a->vd, a->vn, a->vm, false);
 }
+
+static bool trans_VDIV_sp(DisasContext *s, arg_VDIV_sp *a)
+{
+    return do_vfp_3op_sp(s, gen_helper_vfp_divs, a->vd, a->vn, a->vm, false);
+}
+
+static bool trans_VDIV_dp(DisasContext *s, arg_VDIV_sp *a)
+{
+    return do_vfp_3op_dp(s, gen_helper_vfp_divd, a->vd, a->vn, a->vm, false);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static TCGv_ptr get_fpstatus_ptr(int neon)
     return statusptr;
 }
 
-#define VFP_OP2(name) \
-static inline void gen_vfp_##name(int dp) \
-{ \
-    TCGv_ptr fpst = get_fpstatus_ptr(0); \
-    if (dp) { \
-        gen_helper_vfp_##name##d(cpu_F0d, cpu_F0d, cpu_F1d, fpst); \
-    } else { \
-        gen_helper_vfp_##name##s(cpu_F0s, cpu_F0s, cpu_F1s, fpst); \
-    } \
-    tcg_temp_free_ptr(fpst); \
-}
-
-VFP_OP2(div)
-
-#undef VFP_OP2
-
 static inline void gen_vfp_abs(int dp)
 {
     if (dp)
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 7:
+    case 0 ... 8:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 8: /* div: fn / fm */
-            gen_vfp_div(dp);
-            break;
         case 10: /* VFNMA : fd = muladd(-fd, fn, fm) */
         case 11: /* VFNMS : fd = muladd(-fd, -fn, fm) */
         case 12: /* VFMA  : fd = muladd( fd, fn, fm) */
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VSUB_sp      ---- 1110 0.11 .... .... 1010 .1.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VSUB_dp      ---- 1110 0.11 .... .... 1011 .1.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VDIV_sp      ---- 1110 1.00 .... .... 1010 .0.0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp
+VDIV_dp      ---- 1110 1.00 .... .... 1011 .0.0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp
--
2.20.1
Convert the VFP fused multiply-add instructions (VFNMA, VFNMS,
VFMA, VFMS) to decodetree.

Note that in the old decode structure we were implementing
these to honour the VFP vector stride/length. These instructions
were introduced in VFPv4, and in the v7A architecture they
are UNPREDICTABLE if the vector stride or length are non-zero.
In v8A they must UNDEF if stride or length are non-zero, like
all VFP instructions; we choose to UNDEF always.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 121 +++++++++++++++++++++++++++++++++
 target/arm/translate.c         |  53 +--------------
 target/arm/vfp.decode          |   9 +++
 3 files changed, 131 insertions(+), 52 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDIV_dp(DisasContext *s, arg_VDIV_sp *a)
 {
     return do_vfp_3op_dp(s, gen_helper_vfp_divd, a->vd, a->vn, a->vm, false);
 }
+
+static bool trans_VFM_sp(DisasContext *s, arg_VFM_sp *a)
+{
+    /*
+     * VFNMA : fd = muladd(-fd,  fn, fm)
+     * VFNMS : fd = muladd(-fd, -fn, fm)
+     * VFMA  : fd = muladd( fd,  fn, fm)
+     * VFMS  : fd = muladd( fd, -fn, fm)
+     *
+     * These are fused multiply-add, and must be done as one floating
+     * point operation with no rounding between the multiplication and
+     * addition steps. NB that doing the negations here as separate
+     * steps is correct : an input NaN should come out with its sign
+     * bit flipped if it is a negated-input.
+     */
+    TCGv_ptr fpst;
+    TCGv_i32 vn, vm, vd;
+
+    /*
+     * Present in VFPv4 only.
+     * In v7A, UNPREDICTABLE with non-zero vector length/stride; from
+     * v8A, must UNDEF. We choose to UNDEF for both v7A and v8A.
+     */
+    if (!arm_dc_feature(s, ARM_FEATURE_VFP4) ||
+        (s->vec_len != 0 || s->vec_stride != 0)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vn = tcg_temp_new_i32();
+    vm = tcg_temp_new_i32();
+    vd = tcg_temp_new_i32();
+
+    neon_load_reg32(vn, a->vn);
+    neon_load_reg32(vm, a->vm);
+    if (a->o2) {
+        /* VFNMS, VFMS */
+        gen_helper_vfp_negs(vn, vn);
+    }
+    neon_load_reg32(vd, a->vd);
+    if (a->o1 & 1) {
+        /* VFNMA, VFNMS */
+        gen_helper_vfp_negs(vd, vd);
+    }
+    fpst = get_fpstatus_ptr(0);
+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpst);
+    neon_store_reg32(vd, a->vd);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(vn);
+    tcg_temp_free_i32(vm);
+    tcg_temp_free_i32(vd);
+
+    return true;
+}
+
+static bool trans_VFM_dp(DisasContext *s, arg_VFM_sp *a)
+{
+    /*
+     * VFNMA : fd = muladd(-fd,  fn, fm)
+     * VFNMS : fd = muladd(-fd, -fn, fm)
+     * VFMA  : fd = muladd( fd,  fn, fm)
+     * VFMS  : fd = muladd( fd, -fn, fm)
+     *
+     * These are fused multiply-add, and must be done as one floating
+     * point operation with no rounding between the multiplication and
+     * addition steps. NB that doing the negations here as separate
+     * steps is correct : an input NaN should come out with its sign
+     * bit flipped if it is a negated-input.
+     */
+    TCGv_ptr fpst;
+    TCGv_i64 vn, vm, vd;
+
+    /*
+     * Present in VFPv4 only.
+     * In v7A, UNPREDICTABLE with non-zero vector length/stride; from
+     * v8A, must UNDEF. We choose to UNDEF for both v7A and v8A.
+     */
+    if (!arm_dc_feature(s, ARM_FEATURE_VFP4) ||
+        (s->vec_len != 0 || s->vec_stride != 0)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vn = tcg_temp_new_i64();
+    vm = tcg_temp_new_i64();
+    vd = tcg_temp_new_i64();
+
+    neon_load_reg64(vn, a->vn);
+    neon_load_reg64(vm, a->vm);
+    if (a->o2) {
+        /* VFNMS, VFMS */
+        gen_helper_vfp_negd(vn, vn);
+    }
+    neon_load_reg64(vd, a->vd);
+    if (a->o1 & 1) {
+        /* VFNMA, VFNMS */
+        gen_helper_vfp_negd(vd, vd);
+    }
+    fpst = get_fpstatus_ptr(0);
+    gen_helper_vfp_muladdd(vd, vn, vm, vd, fpst);
+    neon_store_reg64(vd, a->vd);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i64(vn);
+    tcg_temp_free_i64(vm);
+    tcg_temp_free_i64(vd);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     rn = VFP_SREG_N(insn);
 
     switch (op) {
-    case 0 ... 8:
+    case 0 ... 13:
         /* Already handled by decodetree */
         return 1;
     default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     for (;;) {
         /* Perform the calculation. */
         switch (op) {
-        case 10: /* VFNMA : fd = muladd(-fd, fn, fm) */
-        case 11: /* VFNMS : fd = muladd(-fd, -fn, fm) */
-        case 12: /* VFMA  : fd = muladd( fd, fn, fm) */
-        case 13: /* VFMS  : fd = muladd( fd, -fn, fm) */
-            /* These are fused multiply-add, and must be done as one
-             * floating point operation with no rounding between the
-             * multiplication and addition steps.
-             * NB that doing the negations here as separate steps is
-             * correct : an input NaN should come out with its sign bit
-             * flipped if it is a negated-input.
-             */
-            if (!arm_dc_feature(s, ARM_FEATURE_VFP4)) {
-                return 1;
-            }
-            if (dp) {
-                TCGv_ptr fpst;
-                TCGv_i64 frd;
-                if (op & 1) {
-                    /* VFNMS, VFMS */
-                    gen_helper_vfp_negd(cpu_F0d, cpu_F0d);
-                }
-                frd = tcg_temp_new_i64();
-                tcg_gen_ld_f64(frd, cpu_env, vfp_reg_offset(dp, rd));
-                if (op & 2) {
-                    /* VFNMA, VFNMS */
-                    gen_helper_vfp_negd(frd, frd);
-                }
-                fpst = get_fpstatus_ptr(0);
-                gen_helper_vfp_muladdd(cpu_F0d, cpu_F0d,
-                                       cpu_F1d, frd, fpst);
-                tcg_temp_free_ptr(fpst);
-                tcg_temp_free_i64(frd);
-            } else {
-                TCGv_ptr fpst;
-                TCGv_i32 frd;
-                if (op & 1) {
-                    /* VFNMS, VFMS */
-                    gen_helper_vfp_negs(cpu_F0s, cpu_F0s);
-                }
-                frd = tcg_temp_new_i32();
-                tcg_gen_ld_f32(frd, cpu_env, vfp_reg_offset(dp, rd));
-                if (op & 2) {
-                    gen_helper_vfp_negs(frd, frd);
-                }
-                fpst = get_fpstatus_ptr(0);
-                gen_helper_vfp_muladds(cpu_F0s, cpu_F0s,
-                                       cpu_F1s, frd, fpst);
-                tcg_temp_free_ptr(fpst);
-                tcg_temp_free_i32(frd);
-            }
-            break;
         case 14: /* fconst */
             if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
                 return 1;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VDIV_sp      ---- 1110 1.00 .... .... 1010 .0.0 .... \
              vm=%vm_sp vn=%vn_sp vd=%vd_sp
 VDIV_dp      ---- 1110 1.00 .... .... 1011 .0.0 .... \
              vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VFM_sp       ---- 1110 1.01 .... .... 1010 . o2:1 . 0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp o1=1
+VFM_dp       ---- 1110 1.01 .... .... 1011 . o2:1 . 0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp o1=1
+VFM_sp       ---- 1110 1.10 .... .... 1010 . o2:1 . 0 .... \
+             vm=%vm_sp vn=%vn_sp vd=%vd_sp o1=2
+VFM_dp       ---- 1110 1.10 .... .... 1011 . o2:1 . 0 .... \
+             vm=%vm_dp vn=%vn_dp vd=%vd_dp o1=2
--
2.20.1
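The "no rounding between the multiplication and addition steps" property
is easy to demonstrate on the host. A minimal sketch in plain C (not
QEMU code; compile with contraction disabled, e.g.
gcc -ffp-contract=off demo.c -lm, so the compiler does not itself fuse
the separate expression):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* a*a = 1 + 2^-12 + 2^-12 + 2^-24 = 1 + 2^-11 + 2^-24 needs 25
         * significand bits, so the separate multiply rounds it to
         * 1 + 2^-11 before the add ever happens. */
        float a = 1.0f + 0x1p-12f;
        float c = -(1.0f + 0x1p-11f);

        float separate = a * a + c;   /* rounds twice: prints 0x0p+0  */
        float fused = fmaf(a, a, c);  /* rounds once:  prints 0x1p-24 */

        printf("separate: %a\nfused:    %a\n", separate, fused);
        return 0;
    }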
Convert the VNEG instruction to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 10 ++++++++++
 target/arm/translate.c         |  6 +-----
 target/arm/vfp.decode          |  5 +++++
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VABS_dp(DisasContext *s, arg_VABS_dp *a)
 {
     return do_vfp_2op_dp(s, gen_helper_vfp_absd, a->vd, a->vm);
 }
+
+static bool trans_VNEG_sp(DisasContext *s, arg_VNEG_sp *a)
+{
+    return do_vfp_2op_sp(s, gen_helper_vfp_negs, a->vd, a->vm);
+}
+
+static bool trans_VNEG_dp(DisasContext *s, arg_VNEG_dp *a)
+{
+    return do_vfp_2op_dp(s, gen_helper_vfp_negd, a->vd, a->vm);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         return 1;
     case 15:
         switch (rn) {
-        case 1:
+        case 1 ... 2:
             /* Already handled by decodetree */
             return 1;
         default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         /* rn is opcode, encoded as per VFP_SREG_N. */
         switch (rn) {
         case 0x00: /* vmov */
-        case 0x02: /* vneg */
        case 0x03: /* vsqrt */
             break;
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         case 0: /* cpy */
             /* no-op */
             break;
-        case 2: /* neg */
-            gen_vfp_neg(dp);
-            break;
         case 3: /* sqrt */
             gen_vfp_sqrt(dp);
             break;
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VABS_sp      ---- 1110 1.11 0000 .... 1010 11.0 .... \
              vd=%vd_sp vm=%vm_sp
 VABS_dp      ---- 1110 1.11 0000 .... 1011 11.0 .... \
              vd=%vd_dp vm=%vm_dp
+
+VNEG_sp      ---- 1110 1.11 0001 .... 1010 01.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VNEG_dp      ---- 1110 1.11 0001 .... 1011 01.0 .... \
+             vd=%vd_dp vm=%vm_dp
--
2.20.1
diff view generated by jsdifflib
Deleted patch
1
Convert the VSQRT instruction to decodetree.
2
1
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
---
6
target/arm/translate-vfp.inc.c | 20 ++++++++++++++++++++
7
target/arm/translate.c | 14 +-------------
8
target/arm/vfp.decode | 5 +++++
9
3 files changed, 26 insertions(+), 13 deletions(-)
10
11
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-vfp.inc.c
14
+++ b/target/arm/translate-vfp.inc.c
15
@@ -XXX,XX +XXX,XX @@ static bool trans_VNEG_dp(DisasContext *s, arg_VNEG_dp *a)
16
{
17
return do_vfp_2op_dp(s, gen_helper_vfp_negd, a->vd, a->vm);
18
}
19
+
20
+static void gen_VSQRT_sp(TCGv_i32 vd, TCGv_i32 vm)
21
+{
22
+ gen_helper_vfp_sqrts(vd, vm, cpu_env);
23
+}
24
+
25
+static bool trans_VSQRT_sp(DisasContext *s, arg_VSQRT_sp *a)
26
+{
27
+ return do_vfp_2op_sp(s, gen_VSQRT_sp, a->vd, a->vm);
28
+}
29
+
30
+static void gen_VSQRT_dp(TCGv_i64 vd, TCGv_i64 vm)
31
+{
32
+ gen_helper_vfp_sqrtd(vd, vm, cpu_env);
33
+}
34
+
35
+static bool trans_VSQRT_dp(DisasContext *s, arg_VSQRT_dp *a)
36
+{
37
+ return do_vfp_2op_dp(s, gen_VSQRT_dp, a->vd, a->vm);
38
+}
39
diff --git a/target/arm/translate.c b/target/arm/translate.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/translate.c
42
+++ b/target/arm/translate.c
43
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_neg(int dp)
44
gen_helper_vfp_negs(cpu_F0s, cpu_F0s);
45
}
46
47
-static inline void gen_vfp_sqrt(int dp)
48
-{
49
- if (dp)
50
- gen_helper_vfp_sqrtd(cpu_F0d, cpu_F0d, cpu_env);
51
- else
52
- gen_helper_vfp_sqrts(cpu_F0s, cpu_F0s, cpu_env);
53
-}
54
-
55
static inline void gen_vfp_cmp(int dp)
56
{
57
if (dp)
58
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
59
return 1;
60
case 15:
61
switch (rn) {
62
- case 1 ... 2:
63
+ case 1 ... 3:
64
/* Already handled by decodetree */
65
return 1;
66
default:
67
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
68
/* rn is opcode, encoded as per VFP_SREG_N. */
69
switch (rn) {
70
case 0x00: /* vmov */
71
- case 0x03: /* vsqrt */
72
break;
73
74
case 0x04: /* vcvtb.f64.f16, vcvtb.f32.f16 */
75
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
76
case 0: /* cpy */
77
/* no-op */
78
break;
79
- case 3: /* sqrt */
80
- gen_vfp_sqrt(dp);
81
- break;
82
case 4: /* vcvtb.f32.f16, vcvtb.f64.f16 */
83
{
84
TCGv_ptr fpst = get_fpstatus_ptr(false);
85
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
86
index XXXXXXX..XXXXXXX 100644
87
--- a/target/arm/vfp.decode
88
+++ b/target/arm/vfp.decode
89
@@ -XXX,XX +XXX,XX @@ VNEG_sp ---- 1110 1.11 0001 .... 1010 01.0 .... \
90
vd=%vd_sp vm=%vm_sp
91
VNEG_dp ---- 1110 1.11 0001 .... 1011 01.0 .... \
92
vd=%vd_dp vm=%vm_dp
93
+
94
+VSQRT_sp ---- 1110 1.11 0001 .... 1010 11.0 .... \
95
+ vd=%vd_sp vm=%vm_sp
96
+VSQRT_dp ---- 1110 1.11 0001 .... 1011 11.0 .... \
97
+ vd=%vd_dp vm=%vm_dp
98
--
99
2.20.1
100
101
diff view generated by jsdifflib
Convert the VFP comparison instructions to decodetree.

Note that comparison instructions should not honour the VFP
short-vector length and stride information: they are scalar-only
operations. This applies to all the 2-operand instructions except
for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is
implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 75 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 51 +----------------------
 target/arm/vfp.decode          |  5 +++
 3 files changed, 81 insertions(+), 50 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VSQRT_dp(DisasContext *s, arg_VSQRT_dp *a)
 {
     return do_vfp_2op_dp(s, gen_VSQRT_dp, a->vd, a->vm);
 }
+
+static bool trans_VCMP_sp(DisasContext *s, arg_VCMP_sp *a)
+{
+    TCGv_i32 vd, vm;
+
+    /* Vm/M bits must be zero for the Z variant */
+    if (a->z && a->vm != 0) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vd = tcg_temp_new_i32();
+    vm = tcg_temp_new_i32();
+
+    neon_load_reg32(vd, a->vd);
+    if (a->z) {
+        tcg_gen_movi_i32(vm, 0);
+    } else {
+        neon_load_reg32(vm, a->vm);
+    }
+
+    if (a->e) {
+        gen_helper_vfp_cmpes(vd, vm, cpu_env);
+    } else {
+        gen_helper_vfp_cmps(vd, vm, cpu_env);
+    }
+
+    tcg_temp_free_i32(vd);
+    tcg_temp_free_i32(vm);
+
+    return true;
+}
+
+static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
+{
+    TCGv_i64 vd, vm;
+
+    /* Vm/M bits must be zero for the Z variant */
+    if (a->z && a->vm != 0) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    vd = tcg_temp_new_i64();
+    vm = tcg_temp_new_i64();
+
+    neon_load_reg64(vd, a->vd);
+    if (a->z) {
+        tcg_gen_movi_i64(vm, 0);
+    } else {
+        neon_load_reg64(vm, a->vm);
+    }
+
+    if (a->e) {
+        gen_helper_vfp_cmped(vd, vm, cpu_env);
+    } else {
+        gen_helper_vfp_cmpd(vd, vm, cpu_env);
+    }
+
+    tcg_temp_free_i64(vd);
+    tcg_temp_free_i64(vm);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_vfp_neg(int dp)
     gen_helper_vfp_negs(cpu_F0s, cpu_F0s);
 }
 
-static inline void gen_vfp_cmp(int dp)
-{
-    if (dp)
-        gen_helper_vfp_cmpd(cpu_F0d, cpu_F1d, cpu_env);
-    else
-        gen_helper_vfp_cmps(cpu_F0s, cpu_F1s, cpu_env);
-}
-
-static inline void gen_vfp_cmpe(int dp)
-{
-    if (dp)
-        gen_helper_vfp_cmped(cpu_F0d, cpu_F1d, cpu_env);
-    else
-        gen_helper_vfp_cmpes(cpu_F0s, cpu_F1s, cpu_env);
-}
-
-static inline void gen_vfp_F1_ld0(int dp)
-{
-    if (dp)
-        tcg_gen_movi_i64(cpu_F1d, 0);
-    else
-        tcg_gen_movi_i32(cpu_F1s, 0);
-}
-
 #define VFP_GEN_ITOF(name) \
 static inline void gen_vfp_##name(int dp, int neon) \
 { \
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     case 15:
         switch (rn) {
         case 0 ... 3:
+        case 8 ... 11:
             /* Already handled by decodetree */
             return 1;
         default:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             rd_is_dp = false;
             break;
 
-        case 0x08: case 0x0a: /* vcmp, vcmpz */
-        case 0x09: case 0x0b: /* vcmpe, vcmpez */
-            no_output = true;
-            break;
-
         case 0x0c: /* vrintr */
         case 0x0d: /* vrintz */
         case 0x0e: /* vrintx */
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         /* Load the initial operands. */
         if (op == 15) {
             switch (rn) {
-            case 0x08: case 0x09: /* Compare */
-                gen_mov_F0_vreg(dp, rd);
-                gen_mov_F1_vreg(dp, rm);
-                break;
-            case 0x0a: case 0x0b: /* Compare with zero */
-                gen_mov_F0_vreg(dp, rd);
-                gen_vfp_F1_ld0(dp);
-                break;
             case 0x14: /* vcvt fp <-> fixed */
             case 0x15:
             case 0x16:
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 gen_vfp_msr(tmp);
                 break;
             }
-            case 8: /* cmp */
-                gen_vfp_cmp(dp);
-                break;
-            case 9: /* cmpe */
-                gen_vfp_cmpe(dp);
-                break;
-            case 10: /* cmpz */
-                gen_vfp_cmp(dp);
-                break;
-            case 11: /* cmpez */
-                gen_vfp_F1_ld0(dp);
-                gen_vfp_cmpe(dp);
-                break;
             case 12: /* vrintr */
             {
                 TCGv_ptr fpst = get_fpstatus_ptr(0);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VSQRT_sp     ---- 1110 1.11 0001 .... 1010 11.0 .... \
              vd=%vd_sp vm=%vm_sp
 VSQRT_dp     ---- 1110 1.11 0001 .... 1011 11.0 .... \
              vd=%vd_dp vm=%vm_dp
+
+VCMP_sp      ---- 1110 1.11 010 z:1 .... 1010 e:1 1.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VCMP_dp      ---- 1110 1.11 010 z:1 .... 1011 e:1 1.0 .... \
+             vd=%vd_dp vm=%vm_dp
--
2.20.1
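For reference, the e bit here selects VCMPE (gen_helper_vfp_cmpes),
which raises Invalid Operation even for quiet NaN operands, whereas
plain VCMP only does so for signalling NaNs. Host C has the same
quiet-versus-signalling split, which makes for a rough analogy (a
sketch only: it needs FENV_ACCESS honoured and optimisations off to be
reliable, and nan() may need -lm):

    #include <fenv.h>
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double qnan = nan("");

        feclearexcept(FE_INVALID);
        (void)isgreater(qnan, 0.0);   /* quiet predicate, VCMP-like */
        printf("quiet:      FE_INVALID=%d\n", !!fetestexcept(FE_INVALID));

        feclearexcept(FE_INVALID);
        (void)(qnan > 0.0);           /* signalling predicate, VCMPE-like */
        printf("signalling: FE_INVALID=%d\n", !!fetestexcept(FE_INVALID));
        return 0;
    }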
Convert the VCVTT and VCVTB instructions that convert from
half-precision floats to f32 or f64 to decodetree.

Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d, we can perform a direct 16-bit
load of the right half of the input single-precision register
rather than loading the full 32 bits and then doing a
separate shift or sign-extension.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 82 ++++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 56 +----------------------
 target/arm/vfp.decode          |  6 +++
 3 files changed, 89 insertions(+), 55 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@
 #include "decode-vfp.inc.c"
 #include "decode-vfp-uncond.inc.c"
 
+/*
+ * Return the offset of a 16-bit half of the specified VFP single-precision
+ * register. If top is true, returns the top 16 bits; otherwise the bottom
+ * 16 bits.
+ */
+static inline long vfp_f16_offset(unsigned reg, bool top)
+{
+    long offs = vfp_reg_offset(false, reg);
+#ifdef HOST_WORDS_BIGENDIAN
+    if (!top) {
+        offs += 2;
+    }
+#else
+    if (top) {
+        offs += 2;
+    }
+#endif
+    return offs;
+}
+
 /*
  * Check that VFP access is enabled. If it is, do the necessary
  * M-profile lazy-FP handling and then return true.
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
 
     return true;
 }
+
+static bool trans_VCVT_f32_f16(DisasContext *s, arg_VCVT_f32_f16 *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 ahp_mode;
+    TCGv_i32 tmp;
+
+    if (!dc_isar_feature(aa32_fp16_spconv, s)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(false);
+    ahp_mode = get_ahp_flag();
+    tmp = tcg_temp_new_i32();
+    /* The T bit tells us if we want the low or high 16 bits of Vm */
+    tcg_gen_ld16u_i32(tmp, cpu_env, vfp_f16_offset(a->vm, a->t));
+    gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp_mode);
+    neon_store_reg32(tmp, a->vd);
+    tcg_temp_free_i32(ahp_mode);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tmp);
+    return true;
+}
+
+static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 ahp_mode;
+    TCGv_i32 tmp;
+    TCGv_i64 vd;
+
+    if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(false);
+    ahp_mode = get_ahp_flag();
+    tmp = tcg_temp_new_i32();
+    /* The T bit tells us if we want the low or high 16 bits of Vm */
+    tcg_gen_ld16u_i32(tmp, cpu_env, vfp_f16_offset(a->vm, a->t));
+    vd = tcg_temp_new_i64();
+    gen_helper_vfp_fcvt_f16_to_f64(vd, tmp, fpst, ahp_mode);
+    neon_store_reg64(vd, a->vd);
+    tcg_temp_free_i32(ahp_mode);
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_i64(vd);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         return 1;
     case 15:
         switch (rn) {
-        case 0 ... 3:
+        case 0 ... 5:
         case 8 ... 11:
             /* Already handled by decodetree */
             return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     if (op == 15) {
         /* rn is opcode, encoded as per VFP_SREG_N. */
         switch (rn) {
-        case 0x04: /* vcvtb.f64.f16, vcvtb.f32.f16 */
-        case 0x05: /* vcvtt.f64.f16, vcvtt.f32.f16 */
-            /*
-             * VCVTB, VCVTT: only present with the halfprec extension
-             * UNPREDICTABLE if bit 8 is set prior to ARMv8
-             * (we choose to UNDEF)
-             */
-            if (dp) {
-                if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
-                    return 1;
-                }
-            } else {
-                if (!dc_isar_feature(aa32_fp16_spconv, s)) {
-                    return 1;
-                }
-            }
-            rm_is_dp = false;
-            break;
         case 0x06: /* vcvtb.f16.f32, vcvtb.f16.f64 */
         case 0x07: /* vcvtt.f16.f32, vcvtt.f16.f64 */
             if (dp) {
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         switch (op) {
         case 15: /* extension space */
             switch (rn) {
-            case 4: /* vcvtb.f32.f16, vcvtb.f64.f16 */
-            {
-                TCGv_ptr fpst = get_fpstatus_ptr(false);
-                TCGv_i32 ahp_mode = get_ahp_flag();
-                tmp = gen_vfp_mrs();
-                tcg_gen_ext16u_i32(tmp, tmp);
-                if (dp) {
-                    gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp,
-                                                   fpst, ahp_mode);
-                } else {
-                    gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp,
-                                                   fpst, ahp_mode);
-                }
-                tcg_temp_free_i32(ahp_mode);
-                tcg_temp_free_ptr(fpst);
-                tcg_temp_free_i32(tmp);
-                break;
-            }
-            case 5: /* vcvtt.f32.f16, vcvtt.f64.f16 */
-            {
-                TCGv_ptr fpst = get_fpstatus_ptr(false);
-                TCGv_i32 ahp = get_ahp_flag();
-                tmp = gen_vfp_mrs();
-                tcg_gen_shri_i32(tmp, tmp, 16);
-                if (dp) {
-                    gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp,
-                                                   fpst, ahp);
-                } else {
-                    gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp,
-                                                   fpst, ahp);
-                }
-                tcg_temp_free_i32(tmp);
-                tcg_temp_free_i32(ahp);
-                tcg_temp_free_ptr(fpst);
-                break;
-            }
             case 6: /* vcvtb.f16.f32, vcvtb.f16.f64 */
             {
                 TCGv_ptr fpst = get_fpstatus_ptr(false);
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -XXX,XX +XXX,XX @@ VCMP_sp      ---- 1110 1.11 010 z:1 .... 1010 e:1 1.0 .... \
              vd=%vd_sp vm=%vm_sp
 VCMP_dp      ---- 1110 1.11 010 z:1 .... 1011 e:1 1.0 .... \
              vd=%vd_dp vm=%vm_dp
+
+# VCVTT and VCVTB from f16: Vd format depends on size bit; Vm is always vm_sp
+VCVT_f32_f16 ---- 1110 1.11 0010 .... 1010 t:1 1.0 .... \
+             vd=%vd_sp vm=%vm_sp
+VCVT_f64_f16 ---- 1110 1.11 0010 .... 1011 t:1 1.0 .... \
+             vd=%vd_dp vm=%vm_sp
--
2.20.1
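The endianness handling in vfp_f16_offset() above can be exercised
standalone. A host-side sketch in plain C (BIG_ENDIAN_HOST here is a
stand-in for QEMU's HOST_WORDS_BIGENDIAN, which is not available
outside the QEMU tree):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Byte offset of one 16-bit half of a 32-bit register in host
     * memory, mirroring the vfp_f16_offset() logic in the patch above. */
    static size_t f16_offset(int top)
    {
    #ifdef BIG_ENDIAN_HOST
        return top ? 0 : 2;
    #else
        return top ? 2 : 0;
    #endif
    }

    int main(void)
    {
        uint32_t sreg = 0xAAAA5555;  /* VCVTT reads 0xAAAA, VCVTB 0x5555 */
        uint16_t half;

        memcpy(&half, (char *)&sreg + f16_offset(1), sizeof(half));
        printf("top:    0x%04x\n", half);
        memcpy(&half, (char *)&sreg + f16_offset(0), sizeof(half));
        printf("bottom: 0x%04x\n", half);
        return 0;
    }

This is the same trick tcg_gen_ld16u_i32() relies on: pick the right
byte offset once, then do a plain 16-bit load instead of a 32-bit load
followed by a shift or mask.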