1
First set of arm patches for 6.2. I have a lot more in my
1
Some arm patches; my to-review queue is by no means empty, but
2
to-review queue still...
2
this is a big enough set of patches to be getting on with...
3
3
4
-- PMM
4
-- PMM
5
5
6
The following changes since commit d42685765653ec155fdf60910662f8830bdb2cef:
6
The following changes since commit cb9c6a8e5ad6a1f0ce164d352e3102df46986e22:
7
7
8
Open 6.2 development tree (2021-08-25 10:25:12 +0100)
8
.gitlab-ci.d/windows: Work-around timeout and OpenGL problems of the MSYS2 jobs (2023-01-04 18:58:33 +0000)
9
9
10
are available in the Git repository at:
10
are available in the Git repository at:
11
11
12
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20210825
12
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230105
13
13
14
for you to fetch changes up to 24b1a6aa43615be22c7ee66bd68ec5675f6a6a9a:
14
for you to fetch changes up to 93c9678de9dc7d2e68f9e8477da072bac30ef132:
15
15
16
docs: Document how to use gdb with unix sockets (2021-08-25 10:48:51 +0100)
16
hw/net: Fix read of uninitialized memory in imx_fec. (2023-01-05 15:33:00 +0000)
17
17
18
----------------------------------------------------------------
18
----------------------------------------------------------------
19
target-arm queue:
19
target-arm queue:
20
* More MVE emulation work
20
* Implement AArch32 ARMv8-R support
21
* Implement M-profile trapping on division by zero
21
* Add Cortex-R52 CPU
22
* kvm: use RCU_READ_LOCK_GUARD() in kvm_arch_fixup_msi_route()
22
* fix handling of HLT semihosting in system mode
23
* hw/char/pl011: add support for sending break
23
* hw/timer/ixm_epit: cleanup and fix bug in compare handling
24
* fsl-imx6ul: Instantiate SAI1/2/3 and ASRC as unimplemented devices
24
* target/arm: Coding style fixes
25
* hw/dma/pl330: Add memory region to replace default
25
* target/arm: Clean up includes
26
* sbsa-ref: Rename SBSA_GWDT enum value
26
* nseries: minor code cleanups
27
* fsl-imx7: Instantiate SAI1/2/3 as unimplemented devices
27
* target/arm: align exposed ID registers with Linux
28
* docs: Document how to use gdb with unix sockets
28
* hw/arm/smmu-common: remove unnecessary inlines
29
* i.MX7D: Handle GPT timers
30
* i.MX7D: Connect IRQs to GPIO devices
31
* i.MX6UL: Add a specific GPT timer instance
32
* hw/net: Fix read of uninitialized memory in imx_fec
29
33
30
----------------------------------------------------------------
34
----------------------------------------------------------------
31
Eduardo Habkost (1):
35
Alex Bennée (1):
32
sbsa-ref: Rename SBSA_GWDT enum value
36
target/arm: fix handling of HLT semihosting in system mode
33
37
34
Guenter Roeck (2):
38
Axel Heider (8):
35
fsl-imx6ul: Instantiate SAI1/2/3 and ASRC as unimplemented devices
39
hw/timer/imx_epit: improve comments
36
fsl-imx7: Instantiate SAI1/2/3 as unimplemented devices
40
hw/timer/imx_epit: cleanup CR defines
41
hw/timer/imx_epit: define SR_OCIF
42
hw/timer/imx_epit: update interrupt state on CR write access
43
hw/timer/imx_epit: hard reset initializes CR with 0
44
hw/timer/imx_epit: factor out register write handlers
45
hw/timer/imx_epit: remove explicit fields cnt and freq
46
hw/timer/imx_epit: fix compare timer handling
37
47
38
Hamza Mahfooz (1):
48
Claudio Fontana (1):
39
target/arm: kvm: use RCU_READ_LOCK_GUARD() in kvm_arch_fixup_msi_route()
49
target/arm: cleanup cpu includes
40
50
41
Jan Luebbe (1):
51
Fabiano Rosas (5):
42
hw/char/pl011: add support for sending break
52
target/arm: Fix checkpatch comment style warnings in helper.c
53
target/arm: Fix checkpatch space errors in helper.c
54
target/arm: Fix checkpatch brace errors in helper.c
55
target/arm: Remove unused includes from m_helper.c
56
target/arm: Remove unused includes from helper.c
43
57
44
Peter Maydell (37):
58
Jean-Christophe Dubois (4):
45
target/arm: Note that we handle VMOVL as a special case of VSHLL
59
i.MX7D: Connect GPT timers to IRQ
46
target/arm: Print MVE VPR in CPU dumps
60
i.MX7D: Compute clock frequency for the fixed frequency clocks.
47
target/arm: Fix MVE VSLI by 0 and VSRI by <dt>
61
i.MX6UL: Add a specific GPT timer instance for the i.MX6UL
48
target/arm: Fix signed VADDV
62
i.MX7D: Connect IRQs to GPIO devices.
49
target/arm: Fix mask handling for MVE narrowing operations
50
target/arm: Fix 48-bit saturating shifts
51
target/arm: Fix MVE 48-bit SQRSHRL for small right shifts
52
target/arm: Fix calculation of LTP mask when LR is 0
53
target/arm: Factor out mve_eci_mask()
54
target/arm: Fix VPT advance when ECI is non-zero
55
target/arm: Fix VLDRB/H/W for predicated elements
56
target/arm: Implement MVE VMULL (polynomial)
57
target/arm: Implement MVE incrementing/decrementing dup insns
58
target/arm: Factor out gen_vpst()
59
target/arm: Implement MVE integer vector comparisons
60
target/arm: Implement MVE integer vector-vs-scalar comparisons
61
target/arm: Implement MVE VPSEL
62
target/arm: Implement MVE VMLAS
63
target/arm: Implement MVE shift-by-scalar
64
target/arm: Move 'x' and 'a' bit definitions into vmlaldav formats
65
target/arm: Implement MVE integer min/max across vector
66
target/arm: Implement MVE VABAV
67
target/arm: Implement MVE narrowing moves
68
target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn
69
target/arm: Implement MVE VMLADAV and VMLSLDAV
70
target/arm: Implement MVE VMLA
71
target/arm: Implement MVE saturating doubling multiply accumulates
72
target/arm: Implement MVE VQABS, VQNEG
73
target/arm: Implement MVE VMAXA, VMINA
74
target/arm: Implement MVE VMOV to/from 2 general-purpose registers
75
target/arm: Implement MVE VPNOT
76
target/arm: Implement MVE VCTP
77
target/arm: Implement MVE scatter-gather insns
78
target/arm: Implement MVE scatter-gather immediate forms
79
target/arm: Implement MVE interleaving loads/stores
80
target/arm: Re-indent sdiv and udiv helpers
81
target/arm: Implement M-profile trapping on division by zero
82
63
83
Sebastian Meyer (1):
64
Peter Maydell (1):
84
docs: Document how to use gdb with unix sockets
65
target/arm:Set lg_page_size to 0 if either S1 or S2 asks for it
85
66
86
Wen, Jianxian (1):
67
Philippe Mathieu-Daudé (5):
87
hw/dma/pl330: Add memory region to replace default
68
hw/input/tsc2xxx: Constify set_transform()'s MouseTransformInfo arg
69
hw/arm/nseries: Constify various read-only arrays
70
hw/arm/nseries: Silent -Wmissing-field-initializers warning
71
hw/arm/smmu-common: Reduce smmu_inv_notifiers_mr() scope
72
hw/arm/smmu-common: Avoid using inlined functions with external linkage
88
73
89
docs/system/gdb.rst | 26 +-
74
Stephen Longfield (1):
90
include/hw/arm/fsl-imx7.h | 5 +
75
hw/net: Fix read of uninitialized memory in imx_fec.
91
target/arm/cpu.h | 1 +
92
target/arm/helper-mve.h | 283 ++++++++++
93
target/arm/helper.h | 4 +-
94
target/arm/translate-a32.h | 2 +
95
target/arm/vec_internal.h | 11 +
96
target/arm/mve.decode | 226 +++++++-
97
target/arm/t32.decode | 1 +
98
hw/arm/exynos4210.c | 3 +
99
hw/arm/fsl-imx6ul.c | 12 +
100
hw/arm/fsl-imx7.c | 7 +
101
hw/arm/sbsa-ref.c | 6 +-
102
hw/arm/xilinx_zynq.c | 3 +
103
hw/char/pl011.c | 6 +
104
hw/dma/pl330.c | 26 +-
105
target/arm/cpu.c | 3 +
106
target/arm/helper.c | 34 +-
107
target/arm/kvm.c | 17 +-
108
target/arm/m_helper.c | 4 +
109
target/arm/mve_helper.c | 1254 ++++++++++++++++++++++++++++++++++++++++++--
110
target/arm/translate-mve.c | 877 ++++++++++++++++++++++++++++++-
111
target/arm/translate-vfp.c | 2 +-
112
target/arm/translate.c | 37 +-
113
target/arm/vec_helper.c | 14 +-
114
25 files changed, 2746 insertions(+), 118 deletions(-)
115
76
77
Tobias Röhmel (7):
78
target/arm: Don't add all MIDR aliases for cores that implement PMSA
79
target/arm: Make RVBAR available for all ARMv8 CPUs
80
target/arm: Make stage_2_format for cache attributes optional
81
target/arm: Enable TTBCR_EAE for ARMv8-R AArch32
82
target/arm: Add PMSAv8r registers
83
target/arm: Add PMSAv8r functionality
84
target/arm: Add ARM Cortex-R52 CPU
85
86
Zhuojia Shen (1):
87
target/arm: align exposed ID registers with Linux
88
89
include/hw/arm/fsl-imx7.h | 20 +
90
include/hw/arm/smmu-common.h | 3 -
91
include/hw/input/tsc2xxx.h | 4 +-
92
include/hw/timer/imx_epit.h | 8 +-
93
include/hw/timer/imx_gpt.h | 1 +
94
target/arm/cpu.h | 6 +
95
target/arm/internals.h | 4 +
96
hw/arm/fsl-imx6ul.c | 2 +-
97
hw/arm/fsl-imx7.c | 41 +-
98
hw/arm/nseries.c | 28 +-
99
hw/arm/smmu-common.c | 15 +-
100
hw/input/tsc2005.c | 2 +-
101
hw/input/tsc210x.c | 3 +-
102
hw/misc/imx6ul_ccm.c | 6 -
103
hw/misc/imx7_ccm.c | 49 ++-
104
hw/net/imx_fec.c | 8 +-
105
hw/timer/imx_epit.c | 376 +++++++++-------
106
hw/timer/imx_gpt.c | 25 ++
107
target/arm/cpu.c | 35 +-
108
target/arm/cpu64.c | 6 -
109
target/arm/cpu_tcg.c | 42 ++
110
target/arm/debug_helper.c | 3 +
111
target/arm/helper.c | 871 +++++++++++++++++++++++++++++---------
112
target/arm/m_helper.c | 16 -
113
target/arm/machine.c | 28 ++
114
target/arm/ptw.c | 152 +++++--
115
target/arm/tlb_helper.c | 4 +
116
target/arm/translate.c | 2 +-
117
tests/tcg/aarch64/sysregs.c | 24 +-
118
tests/tcg/aarch64/Makefile.target | 7 +-
119
30 files changed, 1330 insertions(+), 461 deletions(-)
120
diff view generated by jsdifflib
Deleted patch
1
Although the architecture doesn't define it as an alias, VMOVL
2
(vector move long) is encoded as a VSHLL with a zero shift.
3
Add a comment in the decode file noting that we handle VMOVL
4
as part of VSHLL.
5
1
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
target/arm/mve.decode | 2 ++
10
1 file changed, 2 insertions(+)
11
12
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/mve.decode
15
+++ b/target/arm/mve.decode
16
@@ -XXX,XX +XXX,XX @@ VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h
17
VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
18
19
# VSHLL T1 encoding; the T2 VSHLL encoding is elsewhere in this file
20
+# Note that VMOVL is encoded as "VSHLL with a zero shift count"; we
21
+# implement it that way rather than special-casing it in the decode.
22
VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b
23
VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h
24
25
--
26
2.20.1
27
28
diff view generated by jsdifflib
1
Implement the MVE VLDR/VSTR insns which do scatter-gather using base
1
In get_phys_addr_twostage() we set the lg_page_size of the result to
2
addresses from Qm plus or minus an immediate offset (possibly with
2
the maximum of the stage 1 and stage 2 page sizes. This works for
3
writeback). Note that writeback is not predicated but it does have
3
the case where we do want to create a TLB entry, because we know the
4
to honour ECI state, so we have to add an eci_mask check to the
4
common TLB code only creates entries of the TARGET_PAGE_SIZE and
5
VSTR_SG macros (the VLDR_SG macros already needed this to be able
5
asking for a size larger than that only means that invalidations
6
to distinguish "skip beat" from "set predicated element to 0").
6
invalidate the whole larger area. However, if lg_page_size is
7
smaller than TARGET_PAGE_SIZE this effectively means "don't create a
8
TLB entry"; in this case if either S1 or S2 said "this covers less
9
than a page and can't go in a TLB" then the final result also should
10
be marked that way. Set the resulting page size to 0 if either
11
stage asked for a less-than-a-page entry, and expand the comment
12
to explain what's going on.
13
14
This has no effect for VMSA because currently the VMSA lookup always
15
returns results that cover at least TARGET_PAGE_SIZE; however when we
16
add v8R support it will reuse this code path, and for v8R the S1 and
17
S2 results can be smaller than TARGET_PAGE_SIZE.
7
18
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
20
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
21
Message-id: 20221212142708.610090-1-peter.maydell@linaro.org
10
---
22
---
11
target/arm/helper-mve.h | 5 +++
23
target/arm/ptw.c | 16 +++++++++++++---
12
target/arm/mve.decode | 10 +++++
24
1 file changed, 13 insertions(+), 3 deletions(-)
13
target/arm/mve_helper.c | 91 ++++++++++++++++++++++++--------------
14
target/arm/translate-mve.c | 72 ++++++++++++++++++++++++++++++
15
4 files changed, 146 insertions(+), 32 deletions(-)
16
25
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
26
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
18
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
28
--- a/target/arm/ptw.c
20
+++ b/target/arm/helper-mve.h
29
+++ b/target/arm/ptw.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
22
DEF_HELPER_FLAGS_4(mve_vstrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
DEF_HELPER_FLAGS_4(mve_vstrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
25
+DEF_HELPER_FLAGS_4(mve_vldrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(mve_vldrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_4(mve_vstrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(mve_vstrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+
30
DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32)
31
32
DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
36
+++ b/target/arm/mve.decode
37
@@ -XXX,XX +XXX,XX @@
38
&vmaxv qm rda size
39
&vabav qn qm rda size
40
&vldst_sg qd qm rn size msize os
41
+&vldst_sg_imm qd qm a w imm
42
43
# scatter-gather memory size is in bits 6:4
44
%sg_msize 6:1 4:1
45
@@ -XXX,XX +XXX,XX @@
46
@vldst_sg .... .... .... rn:4 .... ... size:2 ... ... os:1 &vldst_sg \
47
qd=%qd qm=%qm msize=%sg_msize
48
49
+# Qm is in the fields usually labeled Qn
50
+@vldst_sg_imm .... .... a:1 . w:1 . .... .... .... . imm:7 &vldst_sg_imm \
51
+ qd=%qd qm=%qn
52
+
53
@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
54
@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
55
@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
56
@@ -XXX,XX +XXX,XX @@ VLDR_S_sg 111 0 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg
57
VLDR_U_sg 111 1 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg
58
VSTR_sg 111 0 1100 1 . 00 .... ... 0 111 . .... .... @vldst_sg
59
60
+VLDRW_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1110 .... .... @vldst_sg_imm
61
+VLDRD_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1111 .... .... @vldst_sg_imm
62
+VSTRW_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1110 .... .... @vldst_sg_imm
63
+VSTRD_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1111 .... .... @vldst_sg_imm
64
+
65
# Moves between 2 32-bit vector lanes and 2 general purpose registers
66
VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
67
VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
68
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
69
index XXXXXXX..XXXXXXX 100644
70
--- a/target/arm/mve_helper.c
71
+++ b/target/arm/mve_helper.c
72
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
73
* For loads, predicated lanes are zeroed instead of retaining
74
* their previous values.
75
*/
76
-#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN) \
77
+#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN, WB) \
78
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
79
uint32_t base) \
80
{ \
81
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
82
addr = ADDRFN(base, m[H##ESIZE(e)]); \
83
d[H##ESIZE(e)] = (mask & 1) ? \
84
cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \
85
+ if (WB) { \
86
+ m[H##ESIZE(e)] = addr; \
87
+ } \
88
} \
89
mve_advance_vpt(env); \
90
}
31
}
91
32
92
/* We know here TYPE is unsigned so always the same as the offset type */
33
/*
93
-#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN) \
34
- * Use the maximum of the S1 & S2 page size, so that invalidation
94
+#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN, WB) \
35
- * of pages > TARGET_PAGE_SIZE works correctly.
95
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
36
+ * If either S1 or S2 returned a result smaller than TARGET_PAGE_SIZE,
96
uint32_t base) \
37
+ * this means "don't put this in the TLB"; in this case, return a
97
{ \
38
+ * result with lg_page_size == 0 to achieve that. Otherwise,
98
TYPE *d = vd; \
39
+ * use the maximum of the S1 & S2 page size, so that invalidation
99
TYPE *m = vm; \
40
+ * of pages > TARGET_PAGE_SIZE works correctly. (This works even though
100
uint16_t mask = mve_element_mask(env); \
41
+ * we know the combined result permissions etc only cover the minimum
101
+ uint16_t eci_mask = mve_eci_mask(env); \
42
+ * of the S1 and S2 page size, because we know that the common TLB code
102
unsigned e; \
43
+ * never actually creates TLB entries bigger than TARGET_PAGE_SIZE,
103
uint32_t addr; \
44
+ * and passing a larger page size value only affects invalidations.)
104
- for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
45
*/
105
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \
46
- if (result->f.lg_page_size < s1_lgpgsz) {
106
+ if (!(eci_mask & 1)) { \
47
+ if (result->f.lg_page_size < TARGET_PAGE_BITS ||
107
+ continue; \
48
+ s1_lgpgsz < TARGET_PAGE_BITS) {
108
+ } \
49
+ result->f.lg_page_size = 0;
109
addr = ADDRFN(base, m[H##ESIZE(e)]); \
50
+ } else if (result->f.lg_page_size < s1_lgpgsz) {
110
if (mask & 1) { \
51
result->f.lg_page_size = s1_lgpgsz;
111
cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \
112
} \
113
+ if (WB) { \
114
+ m[H##ESIZE(e)] = addr; \
115
+ } \
116
} \
117
mve_advance_vpt(env); \
118
}
52
}
119
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
53
120
* accesses, controlled by the predicate mask for the relevant beat,
121
* and with a single 32-bit offset in the first of the two Qm elements.
122
* Note that for QEMU our IMPDEF AIRCR.ENDIANNESS is always 0 (little).
123
+ * Address writeback happens on the odd beats and updates the address
124
+ * stored in the even-beat element.
125
*/
126
-#define DO_VLDR64_SG(OP, ADDRFN) \
127
+#define DO_VLDR64_SG(OP, ADDRFN, WB) \
128
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
129
uint32_t base) \
130
{ \
131
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
132
addr = ADDRFN(base, m[H4(e & ~1)]); \
133
addr += 4 * (e & 1); \
134
d[H4(e)] = (mask & 1) ? cpu_ldl_data_ra(env, addr, GETPC()) : 0; \
135
+ if (WB && (e & 1)) { \
136
+ m[H4(e & ~1)] = addr - 4; \
137
+ } \
138
} \
139
mve_advance_vpt(env); \
140
}
141
142
-#define DO_VSTR64_SG(OP, ADDRFN) \
143
+#define DO_VSTR64_SG(OP, ADDRFN, WB) \
144
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
145
uint32_t base) \
146
{ \
147
uint32_t *d = vd; \
148
uint32_t *m = vm; \
149
uint16_t mask = mve_element_mask(env); \
150
+ uint16_t eci_mask = mve_eci_mask(env); \
151
unsigned e; \
152
uint32_t addr; \
153
- for (e = 0; e < 16 / 4; e++, mask >>= 4) { \
154
+ for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
155
+ if (!(eci_mask & 1)) { \
156
+ continue; \
157
+ } \
158
addr = ADDRFN(base, m[H4(e & ~1)]); \
159
addr += 4 * (e & 1); \
160
if (mask & 1) { \
161
cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \
162
} \
163
+ if (WB && (e & 1)) { \
164
+ m[H4(e & ~1)] = addr - 4; \
165
+ } \
166
} \
167
mve_advance_vpt(env); \
168
}
169
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
170
#define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2))
171
#define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3))
172
173
-DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD)
174
-DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD)
175
-DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD)
176
+DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD, false)
177
+DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD, false)
178
+DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD, false)
179
180
-DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD)
181
-DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD)
182
-DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD)
183
-DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD)
184
-DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD)
185
-DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD)
186
-DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD)
187
+DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD, false)
188
+DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD, false)
189
+DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD, false)
190
+DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD, false)
191
+DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD, false)
192
+DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, false)
193
+DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD, false)
194
195
-DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH)
196
-DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH)
197
-DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH)
198
-DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW)
199
-DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD)
200
+DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH, false)
201
+DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH, false)
202
+DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH, false)
203
+DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW, false)
204
+DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD, false)
205
206
-DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD)
207
-DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD)
208
-DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD)
209
-DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD)
210
-DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD)
211
-DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD)
212
-DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD)
213
+DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD, false)
214
+DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD, false)
215
+DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD, false)
216
+DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD, false)
217
+DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD, false)
218
+DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD, false)
219
+DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD, false)
220
221
-DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH)
222
-DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH)
223
-DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW)
224
-DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD)
225
+DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH, false)
226
+DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH, false)
227
+DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW, false)
228
+DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD, false)
229
+
230
+DO_VLDR_SG(vldrw_sg_wb_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, true)
231
+DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true)
232
+DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true)
233
+DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
234
235
/*
236
* The mergemask(D, R, M) macro performs the operation "*D = R" but
237
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
238
index XXXXXXX..XXXXXXX 100644
239
--- a/target/arm/translate-mve.c
240
+++ b/target/arm/translate-mve.c
241
@@ -XXX,XX +XXX,XX @@ static bool trans_VSTR_sg(DisasContext *s, arg_vldst_sg *a)
242
243
#undef F
244
245
+static bool do_ldst_sg_imm(DisasContext *s, arg_vldst_sg_imm *a,
246
+ MVEGenLdStSGFn *fn, unsigned msize)
247
+{
248
+ uint32_t offset;
249
+ TCGv_ptr qd, qm;
250
+
251
+ if (!dc_isar_feature(aa32_mve, s) ||
252
+ !mve_check_qreg_bank(s, a->qd | a->qm) ||
253
+ !fn) {
254
+ return false;
255
+ }
256
+
257
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
258
+ return true;
259
+ }
260
+
261
+ offset = a->imm << msize;
262
+ if (!a->a) {
263
+ offset = -offset;
264
+ }
265
+
266
+ qd = mve_qreg_ptr(a->qd);
267
+ qm = mve_qreg_ptr(a->qm);
268
+ fn(cpu_env, qd, qm, tcg_constant_i32(offset));
269
+ tcg_temp_free_ptr(qd);
270
+ tcg_temp_free_ptr(qm);
271
+ mve_update_eci(s);
272
+ return true;
273
+}
274
+
275
+static bool trans_VLDRW_sg_imm(DisasContext *s, arg_vldst_sg_imm *a)
276
+{
277
+ static MVEGenLdStSGFn * const fns[] = {
278
+ gen_helper_mve_vldrw_sg_uw,
279
+ gen_helper_mve_vldrw_sg_wb_uw,
280
+ };
281
+ if (a->qd == a->qm) {
282
+ return false; /* UNPREDICTABLE */
283
+ }
284
+ return do_ldst_sg_imm(s, a, fns[a->w], MO_32);
285
+}
286
+
287
+static bool trans_VLDRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a)
288
+{
289
+ static MVEGenLdStSGFn * const fns[] = {
290
+ gen_helper_mve_vldrd_sg_ud,
291
+ gen_helper_mve_vldrd_sg_wb_ud,
292
+ };
293
+ if (a->qd == a->qm) {
294
+ return false; /* UNPREDICTABLE */
295
+ }
296
+ return do_ldst_sg_imm(s, a, fns[a->w], MO_64);
297
+}
298
+
299
+static bool trans_VSTRW_sg_imm(DisasContext *s, arg_vldst_sg_imm *a)
300
+{
301
+ static MVEGenLdStSGFn * const fns[] = {
302
+ gen_helper_mve_vstrw_sg_uw,
303
+ gen_helper_mve_vstrw_sg_wb_uw,
304
+ };
305
+ return do_ldst_sg_imm(s, a, fns[a->w], MO_32);
306
+}
307
+
308
+static bool trans_VSTRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a)
309
+{
310
+ static MVEGenLdStSGFn * const fns[] = {
311
+ gen_helper_mve_vstrd_sg_ud,
312
+ gen_helper_mve_vstrd_sg_wb_ud,
313
+ };
314
+ return do_ldst_sg_imm(s, a, fns[a->w], MO_64);
315
+}
316
+
317
static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
318
{
319
TCGv_ptr qd;
320
--
54
--
321
2.20.1
55
2.25.1
322
323
diff view generated by jsdifflib
1
Implement the MVE interleaving load/store functions VLD2, VLD4, VST2
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
and VST4. VLD2 loads 16 bytes of data from memory and writes to 2
3
consecutive Qregs; VLD4 loads 16 bytes of data from memory and writes
4
to 4 consecutive Qregs. The 'pattern' field in the encoding
5
determines the offset into memory which is accessed and also which
6
elements in the Qregs are written to. (The intention is that a
7
sequence of four consecutive VLD4 with different pattern values
8
performs a complete de-interleaving load of 64 bytes into all
9
elements of the 4 Qregs.) VST2 and VST4 do the same, but for stores.
10
2
3
Cores with PMSA have the MPUIR register which has the
4
same encoding as the MIDR alias with opc2=4. So we only
5
add that alias if we are not realizing a core that
6
implements PMSA.
7
8
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20221206102504.165775-2-tobias.roehmel@rwth-aachen.de
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
---
13
---
14
target/arm/helper-mve.h | 48 ++++++
14
target/arm/helper.c | 13 +++++++++----
15
target/arm/mve.decode | 11 ++
15
1 file changed, 9 insertions(+), 4 deletions(-)
16
target/arm/mve_helper.c | 342 +++++++++++++++++++++++++++++++++++++
17
target/arm/translate-mve.c | 94 ++++++++++
18
4 files changed, 495 insertions(+)
19
16
20
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
17
diff --git a/target/arm/helper.c b/target/arm/helper.c
21
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/helper-mve.h
19
--- a/target/arm/helper.c
23
+++ b/target/arm/helper-mve.h
20
+++ b/target/arm/helper.c
24
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vldrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
25
DEF_HELPER_FLAGS_4(mve_vstrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
.access = PL1_R, .type = ARM_CP_NO_RAW, .resetvalue = cpu->midr,
26
DEF_HELPER_FLAGS_4(mve_vstrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
.fieldoffset = offsetof(CPUARMState, cp15.c0_cpuid),
27
24
.readfn = midr_read },
28
+DEF_HELPER_FLAGS_3(mve_vld20b, TCG_CALL_NO_WG, void, env, i32, i32)
25
- /* crn = 0 op1 = 0 crm = 0 op2 = 4,7 : AArch32 aliases of MIDR */
29
+DEF_HELPER_FLAGS_3(mve_vld20h, TCG_CALL_NO_WG, void, env, i32, i32)
26
- { .name = "MIDR", .type = ARM_CP_ALIAS | ARM_CP_CONST,
30
+DEF_HELPER_FLAGS_3(mve_vld20w, TCG_CALL_NO_WG, void, env, i32, i32)
27
- .cp = 15, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 4,
31
+
28
- .access = PL1_R, .resetvalue = cpu->midr },
32
+DEF_HELPER_FLAGS_3(mve_vld21b, TCG_CALL_NO_WG, void, env, i32, i32)
29
+ /* crn = 0 op1 = 0 crm = 0 op2 = 7 : AArch32 aliases of MIDR */
33
+DEF_HELPER_FLAGS_3(mve_vld21h, TCG_CALL_NO_WG, void, env, i32, i32)
30
{ .name = "MIDR", .type = ARM_CP_ALIAS | ARM_CP_CONST,
34
+DEF_HELPER_FLAGS_3(mve_vld21w, TCG_CALL_NO_WG, void, env, i32, i32)
31
.cp = 15, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 7,
35
+
32
.access = PL1_R, .resetvalue = cpu->midr },
36
+DEF_HELPER_FLAGS_3(mve_vld40b, TCG_CALL_NO_WG, void, env, i32, i32)
33
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
37
+DEF_HELPER_FLAGS_3(mve_vld40h, TCG_CALL_NO_WG, void, env, i32, i32)
34
.accessfn = access_aa64_tid1,
38
+DEF_HELPER_FLAGS_3(mve_vld40w, TCG_CALL_NO_WG, void, env, i32, i32)
35
.type = ARM_CP_CONST, .resetvalue = cpu->revidr },
39
+
36
};
40
+DEF_HELPER_FLAGS_3(mve_vld41b, TCG_CALL_NO_WG, void, env, i32, i32)
37
+ ARMCPRegInfo id_v8_midr_alias_cp_reginfo = {
41
+DEF_HELPER_FLAGS_3(mve_vld41h, TCG_CALL_NO_WG, void, env, i32, i32)
38
+ .name = "MIDR", .type = ARM_CP_ALIAS | ARM_CP_CONST,
42
+DEF_HELPER_FLAGS_3(mve_vld41w, TCG_CALL_NO_WG, void, env, i32, i32)
39
+ .cp = 15, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 4,
43
+
40
+ .access = PL1_R, .resetvalue = cpu->midr
44
+DEF_HELPER_FLAGS_3(mve_vld42b, TCG_CALL_NO_WG, void, env, i32, i32)
41
+ };
45
+DEF_HELPER_FLAGS_3(mve_vld42h, TCG_CALL_NO_WG, void, env, i32, i32)
42
ARMCPRegInfo id_cp_reginfo[] = {
46
+DEF_HELPER_FLAGS_3(mve_vld42w, TCG_CALL_NO_WG, void, env, i32, i32)
43
/* These are common to v8 and pre-v8 */
47
+
44
{ .name = "CTR",
48
+DEF_HELPER_FLAGS_3(mve_vld43b, TCG_CALL_NO_WG, void, env, i32, i32)
45
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
49
+DEF_HELPER_FLAGS_3(mve_vld43h, TCG_CALL_NO_WG, void, env, i32, i32)
46
}
50
+DEF_HELPER_FLAGS_3(mve_vld43w, TCG_CALL_NO_WG, void, env, i32, i32)
47
if (arm_feature(env, ARM_FEATURE_V8)) {
51
+
48
define_arm_cp_regs(cpu, id_v8_midr_cp_reginfo);
52
+DEF_HELPER_FLAGS_3(mve_vst20b, TCG_CALL_NO_WG, void, env, i32, i32)
49
+ if (!arm_feature(env, ARM_FEATURE_PMSA)) {
53
+DEF_HELPER_FLAGS_3(mve_vst20h, TCG_CALL_NO_WG, void, env, i32, i32)
50
+ define_one_arm_cp_reg(cpu, &id_v8_midr_alias_cp_reginfo);
54
+DEF_HELPER_FLAGS_3(mve_vst20w, TCG_CALL_NO_WG, void, env, i32, i32)
51
+ }
55
+
52
} else {
56
+DEF_HELPER_FLAGS_3(mve_vst21b, TCG_CALL_NO_WG, void, env, i32, i32)
53
define_arm_cp_regs(cpu, id_pre_v8_midr_cp_reginfo);
57
+DEF_HELPER_FLAGS_3(mve_vst21h, TCG_CALL_NO_WG, void, env, i32, i32)
54
}
58
+DEF_HELPER_FLAGS_3(mve_vst21w, TCG_CALL_NO_WG, void, env, i32, i32)
59
+
60
+DEF_HELPER_FLAGS_3(mve_vst40b, TCG_CALL_NO_WG, void, env, i32, i32)
61
+DEF_HELPER_FLAGS_3(mve_vst40h, TCG_CALL_NO_WG, void, env, i32, i32)
62
+DEF_HELPER_FLAGS_3(mve_vst40w, TCG_CALL_NO_WG, void, env, i32, i32)
63
+
64
+DEF_HELPER_FLAGS_3(mve_vst41b, TCG_CALL_NO_WG, void, env, i32, i32)
65
+DEF_HELPER_FLAGS_3(mve_vst41h, TCG_CALL_NO_WG, void, env, i32, i32)
66
+DEF_HELPER_FLAGS_3(mve_vst41w, TCG_CALL_NO_WG, void, env, i32, i32)
67
+
68
+DEF_HELPER_FLAGS_3(mve_vst42b, TCG_CALL_NO_WG, void, env, i32, i32)
69
+DEF_HELPER_FLAGS_3(mve_vst42h, TCG_CALL_NO_WG, void, env, i32, i32)
70
+DEF_HELPER_FLAGS_3(mve_vst42w, TCG_CALL_NO_WG, void, env, i32, i32)
71
+
72
+DEF_HELPER_FLAGS_3(mve_vst43b, TCG_CALL_NO_WG, void, env, i32, i32)
73
+DEF_HELPER_FLAGS_3(mve_vst43h, TCG_CALL_NO_WG, void, env, i32, i32)
74
+DEF_HELPER_FLAGS_3(mve_vst43w, TCG_CALL_NO_WG, void, env, i32, i32)
75
+
76
DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32)
77
78
DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
79
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
80
index XXXXXXX..XXXXXXX 100644
81
--- a/target/arm/mve.decode
82
+++ b/target/arm/mve.decode
83
@@ -XXX,XX +XXX,XX @@
84
&vabav qn qm rda size
85
&vldst_sg qd qm rn size msize os
86
&vldst_sg_imm qd qm a w imm
87
+&vldst_il qd rn size pat w
88
89
# scatter-gather memory size is in bits 6:4
90
%sg_msize 6:1 4:1
91
@@ -XXX,XX +XXX,XX @@
92
@vldst_sg_imm .... .... a:1 . w:1 . .... .... .... . imm:7 &vldst_sg_imm \
93
qd=%qd qm=%qn
94
95
+# Deinterleaving load/interleaving store
96
+@vldst_il .... .... .. w:1 . rn:4 .... ... size:2 pat:2 ..... &vldst_il \
97
+ qd=%qd
98
+
99
@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
100
@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
101
@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
102
@@ -XXX,XX +XXX,XX @@ VLDRD_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1111 .... .... @vldst_sg_imm
103
VSTRW_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1110 .... .... @vldst_sg_imm
104
VSTRD_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1111 .... .... @vldst_sg_imm
105
106
+# deinterleaving loads/interleaving stores
107
+VLD2 1111 1100 1 .. 1 .... ... 1 111 .. .. 00000 @vldst_il
108
+VLD4 1111 1100 1 .. 1 .... ... 1 111 .. .. 00001 @vldst_il
109
+VST2 1111 1100 1 .. 0 .... ... 1 111 .. .. 00000 @vldst_il
110
+VST4 1111 1100 1 .. 0 .... ... 1 111 .. .. 00001 @vldst_il
111
+
112
# Moves between 2 32-bit vector lanes and 2 general purpose registers
113
VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
114
VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
115
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
116
index XXXXXXX..XXXXXXX 100644
117
--- a/target/arm/mve_helper.c
118
+++ b/target/arm/mve_helper.c
119
@@ -XXX,XX +XXX,XX @@ DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true)
120
DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true)
121
DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
122
123
+/*
124
+ * Deinterleaving loads/interleaving stores.
125
+ *
126
+ * For these helpers we are passed the index of the first Qreg
127
+ * (VLD2/VST2 will also access Qn+1, VLD4/VST4 access Qn .. Qn+3)
128
+ * and the value of the base address register Rn.
129
+ * The helpers are specialized for pattern and element size, so
130
+ * for instance vld42h is VLD4 with pattern 2, element size MO_16.
131
+ *
132
+ * These insns are beatwise but not predicated, so we must honour ECI,
133
+ * but need not look at mve_element_mask().
134
+ *
135
+ * The pseudocode implements these insns with multiple memory accesses
136
+ * of the element size, but rules R_VVVG and R_FXDM permit us to make
137
+ * one 32-bit memory access per beat.
138
+ */
139
+#define DO_VLD4B(OP, O1, O2, O3, O4) \
140
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
141
+ uint32_t base) \
142
+ { \
143
+ int beat, e; \
144
+ uint16_t mask = mve_eci_mask(env); \
145
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
146
+ uint32_t addr, data; \
147
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
148
+ if ((mask & 1) == 0) { \
149
+ /* ECI says skip this beat */ \
150
+ continue; \
151
+ } \
152
+ addr = base + off[beat] * 4; \
153
+ data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
154
+ for (e = 0; e < 4; e++, data >>= 8) { \
155
+ uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \
156
+ qd[H1(off[beat])] = data; \
157
+ } \
158
+ } \
159
+ }
160
+
161
+#define DO_VLD4H(OP, O1, O2) \
162
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
163
+ uint32_t base) \
164
+ { \
165
+ int beat; \
166
+ uint16_t mask = mve_eci_mask(env); \
167
+ static const uint8_t off[4] = { O1, O1, O2, O2 }; \
168
+ uint32_t addr, data; \
169
+ int y; /* y counts 0 2 0 2 */ \
170
+ uint16_t *qd; \
171
+ for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \
172
+ if ((mask & 1) == 0) { \
173
+ /* ECI says skip this beat */ \
174
+ continue; \
175
+ } \
176
+ addr = base + off[beat] * 8 + (beat & 1) * 4; \
177
+ data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
178
+ qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \
179
+ qd[H2(off[beat])] = data; \
180
+ data >>= 16; \
181
+ qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \
182
+ qd[H2(off[beat])] = data; \
183
+ } \
184
+ }
185
+
186
+#define DO_VLD4W(OP, O1, O2, O3, O4) \
187
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
188
+ uint32_t base) \
189
+ { \
190
+ int beat; \
191
+ uint16_t mask = mve_eci_mask(env); \
192
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
193
+ uint32_t addr, data; \
194
+ uint32_t *qd; \
195
+ int y; \
196
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
197
+ if ((mask & 1) == 0) { \
198
+ /* ECI says skip this beat */ \
199
+ continue; \
200
+ } \
201
+ addr = base + off[beat] * 4; \
202
+ data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
203
+ y = (beat + (O1 & 2)) & 3; \
204
+ qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \
205
+ qd[H4(off[beat] >> 2)] = data; \
206
+ } \
207
+ }
208
+
209
+DO_VLD4B(vld40b, 0, 1, 10, 11)
210
+DO_VLD4B(vld41b, 2, 3, 12, 13)
211
+DO_VLD4B(vld42b, 4, 5, 14, 15)
212
+DO_VLD4B(vld43b, 6, 7, 8, 9)
213
+
214
+DO_VLD4H(vld40h, 0, 5)
215
+DO_VLD4H(vld41h, 1, 6)
216
+DO_VLD4H(vld42h, 2, 7)
217
+DO_VLD4H(vld43h, 3, 4)
218
+
219
+DO_VLD4W(vld40w, 0, 1, 10, 11)
220
+DO_VLD4W(vld41w, 2, 3, 12, 13)
221
+DO_VLD4W(vld42w, 4, 5, 14, 15)
222
+DO_VLD4W(vld43w, 6, 7, 8, 9)
223
+
224
+#define DO_VLD2B(OP, O1, O2, O3, O4) \
225
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
226
+ uint32_t base) \
227
+ { \
228
+ int beat, e; \
229
+ uint16_t mask = mve_eci_mask(env); \
230
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
231
+ uint32_t addr, data; \
232
+ uint8_t *qd; \
233
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
234
+ if ((mask & 1) == 0) { \
235
+ /* ECI says skip this beat */ \
236
+ continue; \
237
+ } \
238
+ addr = base + off[beat] * 2; \
239
+ data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
240
+ for (e = 0; e < 4; e++, data >>= 8) { \
241
+ qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \
242
+ qd[H1(off[beat] + (e >> 1))] = data; \
243
+ } \
244
+ } \
245
+ }
246
+
247
+#define DO_VLD2H(OP, O1, O2, O3, O4) \
248
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
249
+ uint32_t base) \
250
+ { \
251
+ int beat; \
252
+ uint16_t mask = mve_eci_mask(env); \
253
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
254
+ uint32_t addr, data; \
255
+ int e; \
256
+ uint16_t *qd; \
257
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
258
+ if ((mask & 1) == 0) { \
259
+ /* ECI says skip this beat */ \
260
+ continue; \
261
+ } \
262
+ addr = base + off[beat] * 4; \
263
+ data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
264
+ for (e = 0; e < 2; e++, data >>= 16) { \
265
+ qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \
266
+ qd[H2(off[beat])] = data; \
267
+ } \
268
+ } \
269
+ }
270
+
271
+#define DO_VLD2W(OP, O1, O2, O3, O4) \
272
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
273
+ uint32_t base) \
274
+ { \
275
+ int beat; \
276
+ uint16_t mask = mve_eci_mask(env); \
277
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
278
+ uint32_t addr, data; \
279
+ uint32_t *qd; \
280
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
281
+ if ((mask & 1) == 0) { \
282
+ /* ECI says skip this beat */ \
283
+ continue; \
284
+ } \
285
+ addr = base + off[beat]; \
286
+ data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
287
+ qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \
288
+ qd[H4(off[beat] >> 3)] = data; \
289
+ } \
290
+ }
291
+
292
+DO_VLD2B(vld20b, 0, 2, 12, 14)
293
+DO_VLD2B(vld21b, 4, 6, 8, 10)
294
+
295
+DO_VLD2H(vld20h, 0, 1, 6, 7)
296
+DO_VLD2H(vld21h, 2, 3, 4, 5)
297
+
298
+DO_VLD2W(vld20w, 0, 4, 24, 28)
299
+DO_VLD2W(vld21w, 8, 12, 16, 20)
300
+
301
+#define DO_VST4B(OP, O1, O2, O3, O4) \
302
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
303
+ uint32_t base) \
304
+ { \
305
+ int beat, e; \
306
+ uint16_t mask = mve_eci_mask(env); \
307
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
308
+ uint32_t addr, data; \
309
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
310
+ if ((mask & 1) == 0) { \
311
+ /* ECI says skip this beat */ \
312
+ continue; \
313
+ } \
314
+ addr = base + off[beat] * 4; \
315
+ data = 0; \
316
+ for (e = 3; e >= 0; e--) { \
317
+ uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \
318
+ data = (data << 8) | qd[H1(off[beat])]; \
319
+ } \
320
+ cpu_stl_le_data_ra(env, addr, data, GETPC()); \
321
+ } \
322
+ }
323
+
324
+#define DO_VST4H(OP, O1, O2) \
325
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
326
+ uint32_t base) \
327
+ { \
328
+ int beat; \
329
+ uint16_t mask = mve_eci_mask(env); \
330
+ static const uint8_t off[4] = { O1, O1, O2, O2 }; \
331
+ uint32_t addr, data; \
332
+ int y; /* y counts 0 2 0 2 */ \
333
+ uint16_t *qd; \
334
+ for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \
335
+ if ((mask & 1) == 0) { \
336
+ /* ECI says skip this beat */ \
337
+ continue; \
338
+ } \
339
+ addr = base + off[beat] * 8 + (beat & 1) * 4; \
340
+ qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \
341
+ data = qd[H2(off[beat])]; \
342
+ qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \
343
+ data |= qd[H2(off[beat])] << 16; \
344
+ cpu_stl_le_data_ra(env, addr, data, GETPC()); \
345
+ } \
346
+ }
347
+
348
+#define DO_VST4W(OP, O1, O2, O3, O4) \
349
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
350
+ uint32_t base) \
351
+ { \
352
+ int beat; \
353
+ uint16_t mask = mve_eci_mask(env); \
354
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
355
+ uint32_t addr, data; \
356
+ uint32_t *qd; \
357
+ int y; \
358
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
359
+ if ((mask & 1) == 0) { \
360
+ /* ECI says skip this beat */ \
361
+ continue; \
362
+ } \
363
+ addr = base + off[beat] * 4; \
364
+ y = (beat + (O1 & 2)) & 3; \
365
+ qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \
366
+ data = qd[H4(off[beat] >> 2)]; \
367
+ cpu_stl_le_data_ra(env, addr, data, GETPC()); \
368
+ } \
369
+ }
370
+
371
+DO_VST4B(vst40b, 0, 1, 10, 11)
372
+DO_VST4B(vst41b, 2, 3, 12, 13)
373
+DO_VST4B(vst42b, 4, 5, 14, 15)
374
+DO_VST4B(vst43b, 6, 7, 8, 9)
375
+
376
+DO_VST4H(vst40h, 0, 5)
377
+DO_VST4H(vst41h, 1, 6)
378
+DO_VST4H(vst42h, 2, 7)
379
+DO_VST4H(vst43h, 3, 4)
380
+
381
+DO_VST4W(vst40w, 0, 1, 10, 11)
382
+DO_VST4W(vst41w, 2, 3, 12, 13)
383
+DO_VST4W(vst42w, 4, 5, 14, 15)
384
+DO_VST4W(vst43w, 6, 7, 8, 9)
385
+
386
+#define DO_VST2B(OP, O1, O2, O3, O4) \
387
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
388
+ uint32_t base) \
389
+ { \
390
+ int beat, e; \
391
+ uint16_t mask = mve_eci_mask(env); \
392
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
393
+ uint32_t addr, data; \
394
+ uint8_t *qd; \
395
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
396
+ if ((mask & 1) == 0) { \
397
+ /* ECI says skip this beat */ \
398
+ continue; \
399
+ } \
400
+ addr = base + off[beat] * 2; \
401
+ data = 0; \
402
+ for (e = 3; e >= 0; e--) { \
403
+ qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \
404
+ data = (data << 8) | qd[H1(off[beat] + (e >> 1))]; \
405
+ } \
406
+ cpu_stl_le_data_ra(env, addr, data, GETPC()); \
407
+ } \
408
+ }
409
+
410
+#define DO_VST2H(OP, O1, O2, O3, O4) \
411
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
412
+ uint32_t base) \
413
+ { \
414
+ int beat; \
415
+ uint16_t mask = mve_eci_mask(env); \
416
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
417
+ uint32_t addr, data; \
418
+ int e; \
419
+ uint16_t *qd; \
420
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
421
+ if ((mask & 1) == 0) { \
422
+ /* ECI says skip this beat */ \
423
+ continue; \
424
+ } \
425
+ addr = base + off[beat] * 4; \
426
+ data = 0; \
427
+ for (e = 1; e >= 0; e--) { \
428
+ qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \
429
+ data = (data << 16) | qd[H2(off[beat])]; \
430
+ } \
431
+ cpu_stl_le_data_ra(env, addr, data, GETPC()); \
432
+ } \
433
+ }
434
+
435
+#define DO_VST2W(OP, O1, O2, O3, O4) \
436
+ void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
437
+ uint32_t base) \
438
+ { \
439
+ int beat; \
440
+ uint16_t mask = mve_eci_mask(env); \
441
+ static const uint8_t off[4] = { O1, O2, O3, O4 }; \
442
+ uint32_t addr, data; \
443
+ uint32_t *qd; \
444
+ for (beat = 0; beat < 4; beat++, mask >>= 4) { \
445
+ if ((mask & 1) == 0) { \
446
+ /* ECI says skip this beat */ \
447
+ continue; \
448
+ } \
449
+ addr = base + off[beat]; \
450
+ qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \
451
+ data = qd[H4(off[beat] >> 3)]; \
452
+ cpu_stl_le_data_ra(env, addr, data, GETPC()); \
453
+ } \
454
+ }
455
+
456
+DO_VST2B(vst20b, 0, 2, 12, 14)
457
+DO_VST2B(vst21b, 4, 6, 8, 10)
458
+
459
+DO_VST2H(vst20h, 0, 1, 6, 7)
460
+DO_VST2H(vst21h, 2, 3, 4, 5)
461
+
462
+DO_VST2W(vst20w, 0, 4, 24, 28)
463
+DO_VST2W(vst21w, 8, 12, 16, 20)
464
+
465
/*
466
* The mergemask(D, R, M) macro performs the operation "*D = R" but
467
* storing only the bytes which correspond to 1 bits in M,
468
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
469
index XXXXXXX..XXXXXXX 100644
470
--- a/target/arm/translate-mve.c
471
+++ b/target/arm/translate-mve.c
472
@@ -XXX,XX +XXX,XX @@ static inline int vidup_imm(DisasContext *s, int x)
473
474
typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
475
typedef void MVEGenLdStSGFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
476
+typedef void MVEGenLdStIlFn(TCGv_ptr, TCGv_i32, TCGv_i32);
477
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
478
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
479
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
480
@@ -XXX,XX +XXX,XX @@ static bool trans_VSTRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a)
481
return do_ldst_sg_imm(s, a, fns[a->w], MO_64);
482
}
483
484
+static bool do_vldst_il(DisasContext *s, arg_vldst_il *a, MVEGenLdStIlFn *fn,
485
+ int addrinc)
486
+{
487
+ TCGv_i32 rn;
488
+
489
+ if (!dc_isar_feature(aa32_mve, s) ||
490
+ !mve_check_qreg_bank(s, a->qd) ||
491
+ !fn || (a->rn == 13 && a->w) || a->rn == 15) {
492
+ /* Variously UNPREDICTABLE or UNDEF or related-encoding */
493
+ return false;
494
+ }
495
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
496
+ return true;
497
+ }
498
+
499
+ rn = load_reg(s, a->rn);
500
+ /*
501
+ * We pass the index of Qd, not a pointer, because the helper must
502
+ * access multiple Q registers starting at Qd and working up.
503
+ */
504
+ fn(cpu_env, tcg_constant_i32(a->qd), rn);
505
+
506
+ if (a->w) {
507
+ tcg_gen_addi_i32(rn, rn, addrinc);
508
+ store_reg(s, a->rn, rn);
509
+ } else {
510
+ tcg_temp_free_i32(rn);
511
+ }
512
+ mve_update_and_store_eci(s);
513
+ return true;
514
+}
515
+
516
+/* This macro is just to make the arrays more compact in these functions */
517
+#define F(N) gen_helper_mve_##N
518
+
519
+static bool trans_VLD2(DisasContext *s, arg_vldst_il *a)
520
+{
521
+ static MVEGenLdStIlFn * const fns[4][4] = {
522
+ { F(vld20b), F(vld20h), F(vld20w), NULL, },
523
+ { F(vld21b), F(vld21h), F(vld21w), NULL, },
524
+ { NULL, NULL, NULL, NULL },
525
+ { NULL, NULL, NULL, NULL },
526
+ };
527
+ if (a->qd > 6) {
528
+ return false;
529
+ }
530
+ return do_vldst_il(s, a, fns[a->pat][a->size], 32);
531
+}
532
+
533
+static bool trans_VLD4(DisasContext *s, arg_vldst_il *a)
534
+{
535
+ static MVEGenLdStIlFn * const fns[4][4] = {
536
+ { F(vld40b), F(vld40h), F(vld40w), NULL, },
537
+ { F(vld41b), F(vld41h), F(vld41w), NULL, },
538
+ { F(vld42b), F(vld42h), F(vld42w), NULL, },
539
+ { F(vld43b), F(vld43h), F(vld43w), NULL, },
540
+ };
541
+ if (a->qd > 4) {
542
+ return false;
543
+ }
544
+ return do_vldst_il(s, a, fns[a->pat][a->size], 64);
545
+}
546
+
547
+static bool trans_VST2(DisasContext *s, arg_vldst_il *a)
548
+{
549
+ static MVEGenLdStIlFn * const fns[4][4] = {
550
+ { F(vst20b), F(vst20h), F(vst20w), NULL, },
551
+ { F(vst21b), F(vst21h), F(vst21w), NULL, },
552
+ { NULL, NULL, NULL, NULL },
553
+ { NULL, NULL, NULL, NULL },
554
+ };
555
+ if (a->qd > 6) {
556
+ return false;
557
+ }
558
+ return do_vldst_il(s, a, fns[a->pat][a->size], 32);
559
+}
560
+
561
+static bool trans_VST4(DisasContext *s, arg_vldst_il *a)
562
+{
563
+ static MVEGenLdStIlFn * const fns[4][4] = {
564
+ { F(vst40b), F(vst40h), F(vst40w), NULL, },
565
+ { F(vst41b), F(vst41h), F(vst41w), NULL, },
566
+ { F(vst42b), F(vst42h), F(vst42w), NULL, },
567
+ { F(vst43b), F(vst43h), F(vst43w), NULL, },
568
+ };
569
+ if (a->qd > 4) {
570
+ return false;
571
+ }
572
+ return do_vldst_il(s, a, fns[a->pat][a->size], 64);
573
+}
574
+
575
+#undef F
576
+
577
static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
578
{
579
TCGv_ptr qd;
580
--
55
--
581
2.20.1
56
2.25.1
582
57
583
58
diff view generated by jsdifflib
1
Include the MVE VPR register value in the CPU dumps produced by
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
arm_cpu_dump_state() if we are printing FPU information. This
3
makes it easier to interpret debug logs when predication is
4
active.
5
2
3
RVBAR shadows RVBAR_ELx where x is the highest exception
4
level if the highest EL is not EL3. This patch also allows
5
ARMv8 CPUs to change the reset address with
6
the rvbar property.
7
8
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Message-id: 20221206102504.165775-3-tobias.roehmel@rwth-aachen.de
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
---
12
---
9
target/arm/cpu.c | 3 +++
13
target/arm/cpu.c | 6 +++++-
10
1 file changed, 3 insertions(+)
14
target/arm/helper.c | 21 ++++++++++++++-------
15
2 files changed, 19 insertions(+), 8 deletions(-)
11
16
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
17
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.c
19
--- a/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
20
+++ b/target/arm/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags)
21
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset_hold(Object *obj)
17
i, v);
22
env->cp15.cpacr_el1 = FIELD_DP64(env->cp15.cpacr_el1,
18
}
23
CPACR, CP11, 3);
19
qemu_fprintf(f, "FPSCR: %08x\n", vfp_get_fpscr(env));
24
#endif
20
+ if (cpu_isar_feature(aa32_mve, cpu)) {
25
+ if (arm_feature(env, ARM_FEATURE_V8)) {
21
+ qemu_fprintf(f, "VPR: %08x\n", env->v7m.vpr);
26
+ env->cp15.rvbar = cpu->rvbar_prop;
27
+ env->regs[15] = cpu->rvbar_prop;
22
+ }
28
+ }
23
}
29
}
24
}
30
31
#if defined(CONFIG_USER_ONLY)
32
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
33
qdev_property_add_static(DEVICE(obj), &arm_cpu_reset_hivecs_property);
34
}
35
36
- if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
37
+ if (arm_feature(&cpu->env, ARM_FEATURE_V8)) {
38
object_property_add_uint64_ptr(obj, "rvbar",
39
&cpu->rvbar_prop,
40
OBJ_PROP_FLAG_READWRITE);
41
diff --git a/target/arm/helper.c b/target/arm/helper.c
42
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/helper.c
44
+++ b/target/arm/helper.c
45
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
46
if (!arm_feature(env, ARM_FEATURE_EL3) &&
47
!arm_feature(env, ARM_FEATURE_EL2)) {
48
ARMCPRegInfo rvbar = {
49
- .name = "RVBAR_EL1", .state = ARM_CP_STATE_AA64,
50
+ .name = "RVBAR_EL1", .state = ARM_CP_STATE_BOTH,
51
.opc0 = 3, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 1,
52
.access = PL1_R,
53
.fieldoffset = offsetof(CPUARMState, cp15.rvbar),
54
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
55
}
56
/* RVBAR_EL2 is only implemented if EL2 is the highest EL */
57
if (!arm_feature(env, ARM_FEATURE_EL3)) {
58
- ARMCPRegInfo rvbar = {
59
- .name = "RVBAR_EL2", .state = ARM_CP_STATE_AA64,
60
- .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 0, .opc2 = 1,
61
- .access = PL2_R,
62
- .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
63
+ ARMCPRegInfo rvbar[] = {
64
+ {
65
+ .name = "RVBAR_EL2", .state = ARM_CP_STATE_AA64,
66
+ .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 0, .opc2 = 1,
67
+ .access = PL2_R,
68
+ .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
69
+ },
70
+ { .name = "RVBAR", .type = ARM_CP_ALIAS,
71
+ .cp = 15, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 1,
72
+ .access = PL2_R,
73
+ .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
74
+ },
75
};
76
- define_one_arm_cp_reg(cpu, &rvbar);
77
+ define_arm_cp_regs(cpu, rvbar);
78
}
79
}
25
80
26
--
81
--
27
2.20.1
82
2.25.1
28
83
29
84
diff view generated by jsdifflib
Deleted patch
1
In the MVE shift-and-insert insns, we special case VSLI by 0
2
and VSRI by <dt>. VSRI by <dt> means "don't update the destination",
3
which is what we've implemented. However VSLI by 0 is "set
4
destination to the input", so we don't want to use the same
5
special-casing that we do for VSRI by <dt>.
6
1
7
Since the generic logic gives the right answer for a shift
8
by 0, just use that.
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
---
13
target/arm/mve_helper.c | 9 +++++----
14
1 file changed, 5 insertions(+), 4 deletions(-)
15
16
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/mve_helper.c
19
+++ b/target/arm/mve_helper.c
20
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
21
uint16_t mask; \
22
uint64_t shiftmask; \
23
unsigned e; \
24
- if (shift == 0 || shift == ESIZE * 8) { \
25
+ if (shift == ESIZE * 8) { \
26
/* \
27
- * Only VSLI can shift by 0; only VSRI can shift by <dt>. \
28
- * The generic logic would give the right answer for 0 but \
29
- * fails for <dt>. \
30
+ * Only VSRI can shift by <dt>; it should mean "don't \
31
+ * update the destination". The generic logic can't handle \
32
+ * this because it would try to shift by an out-of-range \
33
+ * amount, so special case it here. \
34
*/ \
35
goto done; \
36
} \
37
--
38
2.20.1
39
40
diff view generated by jsdifflib
Deleted patch
1
A cut-and-paste error meant we handled signed VADDV like
2
unsigned VADDV; fix the type used.
3
1
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
target/arm/mve_helper.c | 6 +++---
8
1 file changed, 3 insertions(+), 3 deletions(-)
9
10
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/target/arm/mve_helper.c
13
+++ b/target/arm/mve_helper.c
14
@@ -XXX,XX +XXX,XX @@ DO_LDAVH(vrmlsldavhxsw, int32_t, int64_t, true, true)
15
return ra; \
16
} \
17
18
-DO_VADDV(vaddvsb, 1, uint8_t)
19
-DO_VADDV(vaddvsh, 2, uint16_t)
20
-DO_VADDV(vaddvsw, 4, uint32_t)
21
+DO_VADDV(vaddvsb, 1, int8_t)
22
+DO_VADDV(vaddvsh, 2, int16_t)
23
+DO_VADDV(vaddvsw, 4, int32_t)
24
DO_VADDV(vaddvub, 1, uint8_t)
25
DO_VADDV(vaddvuh, 2, uint16_t)
26
DO_VADDV(vaddvuw, 4, uint32_t)
27
--
28
2.20.1
29
30
diff view generated by jsdifflib
Deleted patch
1
In the MVE helpers for the narrowing operations (DO_VSHRN and
2
DO_VSHRN_SAT) we were using the wrong bits of the predicate mask for
3
the 'top' versions of the insn. This is because the loop works over
4
the double-sized input elements and shifts the predicate mask by that
5
many bits each time, but when we write out the half-sized output we
6
must look at the mask bits for whichever half of the element we are
7
writing to.
8
1
9
Correct this by shifting the whole mask right by ESIZE bits for the
10
'top' insns. This allows us also to simplify the saturation bit
11
checking (where we had noticed that we needed to look at a different
12
mask bit for the 'top' insn.)
13
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
16
---
17
target/arm/mve_helper.c | 4 +++-
18
1 file changed, 3 insertions(+), 1 deletion(-)
19
20
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
21
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/mve_helper.c
23
+++ b/target/arm/mve_helper.c
24
@@ -XXX,XX +XXX,XX @@ DO_VSHLL_ALL(vshllt, true)
25
TYPE *d = vd; \
26
uint16_t mask = mve_element_mask(env); \
27
unsigned le; \
28
+ mask >>= ESIZE * TOP; \
29
for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
30
TYPE r = FN(m[H##LESIZE(le)], shift); \
31
mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \
32
@@ -XXX,XX +XXX,XX @@ static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max,
33
uint16_t mask = mve_element_mask(env); \
34
bool qc = false; \
35
unsigned le; \
36
+ mask >>= ESIZE * TOP; \
37
for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
38
bool sat = false; \
39
TYPE r = FN(m[H##LESIZE(le)], shift, &sat); \
40
mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \
41
- qc |= sat && (mask & 1 << (TOP * ESIZE)); \
42
+ qc |= sat & mask & 1; \
43
} \
44
if (qc) { \
45
env->vfp.qc[0] = qc; \
46
--
47
2.20.1
48
49
diff view generated by jsdifflib
1
Implement the MVE gather-loads and scatter-stores which
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
form the address by adding a base value from a scalar
3
register to an offset in each element of a vector.
4
2
3
The v8R PMSAv8 has a two-stage MPU translation process, but, unlike
4
VMSAv8, the stage 2 attributes are in the same format as the stage 1
5
attributes (8-bit MAIR format). Rather than converting the MAIR
6
format to the format used for VMSA stage 2 (bits [5:2] of a VMSA
7
stage 2 descriptor) and then converting back to do the attribute
8
combination, allow combined_attrs_nofwb() to accept s2 attributes
9
that are already in the MAIR format.
10
11
We move the assert() to combined_attrs_fwb(), because that function
12
really does require a VMSA stage 2 attribute format. (We will never
13
get there for v8R, because PMSAv8 does not implement FEAT_S2FWB.)
14
15
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
16
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
17
Message-id: 20221206102504.165775-4-tobias.roehmel@rwth-aachen.de
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
---
19
---
8
target/arm/helper-mve.h | 32 +++++++++
20
target/arm/ptw.c | 10 ++++++++--
9
target/arm/mve.decode | 12 ++++
21
1 file changed, 8 insertions(+), 2 deletions(-)
10
target/arm/mve_helper.c | 129 +++++++++++++++++++++++++++++++++++++
11
target/arm/translate-mve.c | 97 ++++++++++++++++++++++++++++
12
4 files changed, 270 insertions(+)
13
22
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
23
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
15
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
25
--- a/target/arm/ptw.c
17
+++ b/target/arm/helper-mve.h
26
+++ b/target/arm/ptw.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32)
27
@@ -XXX,XX +XXX,XX @@ static uint8_t combined_attrs_nofwb(uint64_t hcr,
19
DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32)
28
{
20
DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32)
29
uint8_t s1lo, s2lo, s1hi, s2hi, s2_mair_attrs, ret_attrs;
21
30
22
+DEF_HELPER_FLAGS_4(mve_vldrb_sg_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
- s2_mair_attrs = convert_stage2_attrs(hcr, s2.attrs);
23
+DEF_HELPER_FLAGS_4(mve_vldrb_sg_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
+ if (s2.is_s2_format) {
24
+DEF_HELPER_FLAGS_4(mve_vldrh_sg_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
+ s2_mair_attrs = convert_stage2_attrs(hcr, s2.attrs);
34
+ } else {
35
+ s2_mair_attrs = s2.attrs;
36
+ }
37
38
s1lo = extract32(s1.attrs, 0, 4);
39
s2lo = extract32(s2_mair_attrs, 0, 4);
40
@@ -XXX,XX +XXX,XX @@ static uint8_t force_cacheattr_nibble_wb(uint8_t attr)
41
*/
42
static uint8_t combined_attrs_fwb(ARMCacheAttrs s1, ARMCacheAttrs s2)
43
{
44
+ assert(s2.is_s2_format && !s1.is_s2_format);
25
+
45
+
26
+DEF_HELPER_FLAGS_4(mve_vldrb_sg_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
46
switch (s2.attrs) {
27
+DEF_HELPER_FLAGS_4(mve_vldrb_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
47
case 7:
28
+DEF_HELPER_FLAGS_4(mve_vldrb_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
48
/* Use stage 1 attributes */
29
+DEF_HELPER_FLAGS_4(mve_vldrh_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
49
@@ -XXX,XX +XXX,XX @@ static ARMCacheAttrs combine_cacheattrs(uint64_t hcr,
30
+DEF_HELPER_FLAGS_4(mve_vldrh_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
50
ARMCacheAttrs ret;
31
+DEF_HELPER_FLAGS_4(mve_vldrw_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
51
bool tagged = false;
32
+DEF_HELPER_FLAGS_4(mve_vldrd_sg_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
52
33
+
53
- assert(s2.is_s2_format && !s1.is_s2_format);
34
+DEF_HELPER_FLAGS_4(mve_vstrb_sg_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
54
+ assert(!s1.is_s2_format);
35
+DEF_HELPER_FLAGS_4(mve_vstrb_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
55
ret.is_s2_format = false;
36
+DEF_HELPER_FLAGS_4(mve_vstrb_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
56
37
+DEF_HELPER_FLAGS_4(mve_vstrh_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
57
if (s1.attrs == 0xf0) {
38
+DEF_HELPER_FLAGS_4(mve_vstrh_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(mve_vstrw_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(mve_vstrd_sg_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
+
42
+DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
43
+
44
+DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_4(mve_vldrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(mve_vldrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
48
+
49
+DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
51
+DEF_HELPER_FLAGS_4(mve_vstrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_4(mve_vstrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
53
+
54
DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32)
55
56
DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
57
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/mve.decode
60
+++ b/target/arm/mve.decode
61
@@ -XXX,XX +XXX,XX @@
62
&shl_scalar qda rm size
63
&vmaxv qm rda size
64
&vabav qn qm rda size
65
+&vldst_sg qd qm rn size msize os
66
+
67
+# scatter-gather memory size is in bits 6:4
68
+%sg_msize 6:1 4:1
69
70
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
71
# Note that both Rn and Qd are 3 bits only (no D bit)
72
@vldst_wn ... u:1 ... . . . . l:1 . rn:3 qd:3 . ... .. imm:7 &vldr_vstr
73
74
+@vldst_sg .... .... .... rn:4 .... ... size:2 ... ... os:1 &vldst_sg \
75
+ qd=%qd qm=%qm msize=%sg_msize
76
+
77
@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
78
@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
79
@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
80
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \
81
VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
82
size=2 p=1
83
84
+# gather loads/scatter stores
85
+VLDR_S_sg 111 0 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg
86
+VLDR_U_sg 111 1 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg
87
+VSTR_sg 111 0 1100 1 . 00 .... ... 0 111 . .... .... @vldst_sg
88
+
89
# Moves between 2 32-bit vector lanes and 2 general purpose registers
90
VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
91
VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
92
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
93
index XXXXXXX..XXXXXXX 100644
94
--- a/target/arm/mve_helper.c
95
+++ b/target/arm/mve_helper.c
96
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
97
#undef DO_VLDR
98
#undef DO_VSTR
99
100
+/*
101
+ * Gather loads/scatter stores. Here each element of Qm specifies
102
+ * an offset to use from the base register Rm. In the _os_ versions
103
+ * that offset is scaled by the element size.
104
+ * For loads, predicated lanes are zeroed instead of retaining
105
+ * their previous values.
106
+ */
107
+#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN) \
108
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
109
+ uint32_t base) \
110
+ { \
111
+ TYPE *d = vd; \
112
+ OFFTYPE *m = vm; \
113
+ uint16_t mask = mve_element_mask(env); \
114
+ uint16_t eci_mask = mve_eci_mask(env); \
115
+ unsigned e; \
116
+ uint32_t addr; \
117
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \
118
+ if (!(eci_mask & 1)) { \
119
+ continue; \
120
+ } \
121
+ addr = ADDRFN(base, m[H##ESIZE(e)]); \
122
+ d[H##ESIZE(e)] = (mask & 1) ? \
123
+ cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \
124
+ } \
125
+ mve_advance_vpt(env); \
126
+ }
127
+
128
+/* We know here TYPE is unsigned so always the same as the offset type */
129
+#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN) \
130
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
131
+ uint32_t base) \
132
+ { \
133
+ TYPE *d = vd; \
134
+ TYPE *m = vm; \
135
+ uint16_t mask = mve_element_mask(env); \
136
+ unsigned e; \
137
+ uint32_t addr; \
138
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
139
+ addr = ADDRFN(base, m[H##ESIZE(e)]); \
140
+ if (mask & 1) { \
141
+ cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \
142
+ } \
143
+ } \
144
+ mve_advance_vpt(env); \
145
+ }
146
+
147
+/*
148
+ * 64-bit accesses are slightly different: they are done as two 32-bit
149
+ * accesses, controlled by the predicate mask for the relevant beat,
150
+ * and with a single 32-bit offset in the first of the two Qm elements.
151
+ * Note that for QEMU our IMPDEF AIRCR.ENDIANNESS is always 0 (little).
152
+ */
153
+#define DO_VLDR64_SG(OP, ADDRFN) \
154
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
155
+ uint32_t base) \
156
+ { \
157
+ uint32_t *d = vd; \
158
+ uint32_t *m = vm; \
159
+ uint16_t mask = mve_element_mask(env); \
160
+ uint16_t eci_mask = mve_eci_mask(env); \
161
+ unsigned e; \
162
+ uint32_t addr; \
163
+ for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
164
+ if (!(eci_mask & 1)) { \
165
+ continue; \
166
+ } \
167
+ addr = ADDRFN(base, m[H4(e & ~1)]); \
168
+ addr += 4 * (e & 1); \
169
+ d[H4(e)] = (mask & 1) ? cpu_ldl_data_ra(env, addr, GETPC()) : 0; \
170
+ } \
171
+ mve_advance_vpt(env); \
172
+ }
173
+
174
+#define DO_VSTR64_SG(OP, ADDRFN) \
175
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
176
+ uint32_t base) \
177
+ { \
178
+ uint32_t *d = vd; \
179
+ uint32_t *m = vm; \
180
+ uint16_t mask = mve_element_mask(env); \
181
+ unsigned e; \
182
+ uint32_t addr; \
183
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) { \
184
+ addr = ADDRFN(base, m[H4(e & ~1)]); \
185
+ addr += 4 * (e & 1); \
186
+ if (mask & 1) { \
187
+ cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \
188
+ } \
189
+ } \
190
+ mve_advance_vpt(env); \
191
+ }
192
+
193
+#define ADDR_ADD(BASE, OFFSET) ((BASE) + (OFFSET))
194
+#define ADDR_ADD_OSH(BASE, OFFSET) ((BASE) + ((OFFSET) << 1))
195
+#define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2))
196
+#define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3))
197
+
198
+DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD)
199
+DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD)
200
+DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD)
201
+
202
+DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD)
203
+DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD)
204
+DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD)
205
+DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD)
206
+DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD)
207
+DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD)
208
+DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD)
209
+
210
+DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH)
211
+DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH)
212
+DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH)
213
+DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW)
214
+DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD)
215
+
216
+DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD)
217
+DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD)
218
+DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD)
219
+DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD)
220
+DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD)
221
+DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD)
222
+DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD)
223
+
224
+DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH)
225
+DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH)
226
+DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW)
227
+DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD)
228
+
229
/*
230
* The mergemask(D, R, M) macro performs the operation "*D = R" but
231
* storing only the bytes which correspond to 1 bits in M,
232
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
233
index XXXXXXX..XXXXXXX 100644
234
--- a/target/arm/translate-mve.c
235
+++ b/target/arm/translate-mve.c
236
@@ -XXX,XX +XXX,XX @@ static inline int vidup_imm(DisasContext *s, int x)
237
#include "decode-mve.c.inc"
238
239
typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
240
+typedef void MVEGenLdStSGFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
241
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
242
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
243
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
244
@@ -XXX,XX +XXX,XX @@ DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h, MO_8)
245
DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w, MO_8)
246
DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w, MO_16)
247
248
+static bool do_ldst_sg(DisasContext *s, arg_vldst_sg *a, MVEGenLdStSGFn fn)
249
+{
250
+ TCGv_i32 addr;
251
+ TCGv_ptr qd, qm;
252
+
253
+ if (!dc_isar_feature(aa32_mve, s) ||
254
+ !mve_check_qreg_bank(s, a->qd | a->qm) ||
255
+ !fn || a->rn == 15) {
256
+ /* Rn case is UNPREDICTABLE */
257
+ return false;
258
+ }
259
+
260
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
261
+ return true;
262
+ }
263
+
264
+ addr = load_reg(s, a->rn);
265
+
266
+ qd = mve_qreg_ptr(a->qd);
267
+ qm = mve_qreg_ptr(a->qm);
268
+ fn(cpu_env, qd, qm, addr);
269
+ tcg_temp_free_ptr(qd);
270
+ tcg_temp_free_ptr(qm);
271
+ tcg_temp_free_i32(addr);
272
+ mve_update_eci(s);
273
+ return true;
274
+}
275
+
276
+/*
277
+ * The naming scheme here is "vldrb_sg_sh == in-memory byte loads
278
+ * signextended to halfword elements in register". _os_ indicates that
279
+ * the offsets in Qm should be scaled by the element size.
280
+ */
281
+/* This macro is just to make the arrays more compact in these functions */
282
+#define F(N) gen_helper_mve_##N
283
+
284
+/* VLDRB/VSTRB (ie msize 1) with OS=1 is UNPREDICTABLE; we UNDEF */
285
+static bool trans_VLDR_S_sg(DisasContext *s, arg_vldst_sg *a)
286
+{
287
+ static MVEGenLdStSGFn * const fns[2][4][4] = { {
288
+ { NULL, F(vldrb_sg_sh), F(vldrb_sg_sw), NULL },
289
+ { NULL, NULL, F(vldrh_sg_sw), NULL },
290
+ { NULL, NULL, NULL, NULL },
291
+ { NULL, NULL, NULL, NULL }
292
+ }, {
293
+ { NULL, NULL, NULL, NULL },
294
+ { NULL, NULL, F(vldrh_sg_os_sw), NULL },
295
+ { NULL, NULL, NULL, NULL },
296
+ { NULL, NULL, NULL, NULL }
297
+ }
298
+ };
299
+ if (a->qd == a->qm) {
300
+ return false; /* UNPREDICTABLE */
301
+ }
302
+ return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]);
303
+}
304
+
305
+static bool trans_VLDR_U_sg(DisasContext *s, arg_vldst_sg *a)
306
+{
307
+ static MVEGenLdStSGFn * const fns[2][4][4] = { {
308
+ { F(vldrb_sg_ub), F(vldrb_sg_uh), F(vldrb_sg_uw), NULL },
309
+ { NULL, F(vldrh_sg_uh), F(vldrh_sg_uw), NULL },
310
+ { NULL, NULL, F(vldrw_sg_uw), NULL },
311
+ { NULL, NULL, NULL, F(vldrd_sg_ud) }
312
+ }, {
313
+ { NULL, NULL, NULL, NULL },
314
+ { NULL, F(vldrh_sg_os_uh), F(vldrh_sg_os_uw), NULL },
315
+ { NULL, NULL, F(vldrw_sg_os_uw), NULL },
316
+ { NULL, NULL, NULL, F(vldrd_sg_os_ud) }
317
+ }
318
+ };
319
+ if (a->qd == a->qm) {
320
+ return false; /* UNPREDICTABLE */
321
+ }
322
+ return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]);
323
+}
324
+
325
+static bool trans_VSTR_sg(DisasContext *s, arg_vldst_sg *a)
326
+{
327
+ static MVEGenLdStSGFn * const fns[2][4][4] = { {
328
+ { F(vstrb_sg_ub), F(vstrb_sg_uh), F(vstrb_sg_uw), NULL },
329
+ { NULL, F(vstrh_sg_uh), F(vstrh_sg_uw), NULL },
330
+ { NULL, NULL, F(vstrw_sg_uw), NULL },
331
+ { NULL, NULL, NULL, F(vstrd_sg_ud) }
332
+ }, {
333
+ { NULL, NULL, NULL, NULL },
334
+ { NULL, F(vstrh_sg_os_uh), F(vstrh_sg_os_uw), NULL },
335
+ { NULL, NULL, F(vstrw_sg_os_uw), NULL },
336
+ { NULL, NULL, NULL, F(vstrd_sg_os_ud) }
337
+ }
338
+ };
339
+ return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]);
340
+}
341
+
342
+#undef F
343
+
344
static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
345
{
346
TCGv_ptr qd;
347
--
58
--
348
2.20.1
59
2.25.1
349
60
350
61
diff view generated by jsdifflib
1
Implement the MVE incrementing/decrementing dup insns VIDUP, VDDUP,
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
VIWDUP and VDWDUP. These fill the elements of a vector with
3
successively incrementing values, starting at the offset specified in
4
a general purpose register. The final value of the offset is written
5
back to this register. The wrapping variants take a second general
6
purpose register which specifies the point where the count should
7
wrap back to 0.
8
2
3
ARMv8-R AArch32 CPUs behave as if TTBCR.EAE is always 1 even
4
tough they don't have the TTBCR register.
5
See ARM Architecture Reference Manual Supplement - ARMv8, for the ARMv8-R
6
AArch32 architecture profile Version:A.c section C1.2.
7
8
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Message-id: 20221206102504.165775-5-tobias.roehmel@rwth-aachen.de
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
---
12
---
12
target/arm/helper-mve.h | 12 ++++
13
target/arm/internals.h | 4 ++++
13
target/arm/mve.decode | 25 ++++++++
14
target/arm/debug_helper.c | 3 +++
14
target/arm/mve_helper.c | 63 +++++++++++++++++++
15
target/arm/tlb_helper.c | 4 ++++
15
target/arm/translate-mve.c | 120 +++++++++++++++++++++++++++++++++++++
16
3 files changed, 11 insertions(+)
16
4 files changed, 220 insertions(+)
17
17
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
18
diff --git a/target/arm/internals.h b/target/arm/internals.h
19
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-mve.h
20
--- a/target/arm/internals.h
21
+++ b/target/arm/helper-mve.h
21
+++ b/target/arm/internals.h
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32)
22
@@ -XXX,XX +XXX,XX @@ unsigned int arm_pamax(ARMCPU *cpu);
23
23
static inline bool extended_addresses_enabled(CPUARMState *env)
24
DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32)
25
26
+DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
27
+DEF_HELPER_FLAGS_4(mve_viduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
28
+DEF_HELPER_FLAGS_4(mve_vidupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
29
+
30
+DEF_HELPER_FLAGS_5(mve_viwdupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32)
31
+DEF_HELPER_FLAGS_5(mve_viwduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32)
32
+DEF_HELPER_FLAGS_5(mve_viwdupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32)
33
+
34
+DEF_HELPER_FLAGS_5(mve_vdwdupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32)
35
+DEF_HELPER_FLAGS_5(mve_vdwduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32)
36
+DEF_HELPER_FLAGS_5(mve_vdwdupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32)
37
+
38
DEF_HELPER_FLAGS_3(mve_vclsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
39
DEF_HELPER_FLAGS_3(mve_vclsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
40
DEF_HELPER_FLAGS_3(mve_vclsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
41
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
42
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/mve.decode
44
+++ b/target/arm/mve.decode
45
@@ -XXX,XX +XXX,XX @@
46
&2scalar qd qn rm size
47
&1imm qd imm cmode op
48
&2shift qd qm shift size
49
+&vidup qd rn size imm
50
+&viwdup qd rn rm size imm
51
52
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
53
# Note that both Rn and Qd are 3 bits only (no D bit)
54
@@ -XXX,XX +XXX,XX @@ VDUP 1110 1110 1 1 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=0
55
VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 1 1 0000 @vdup size=1
56
VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
57
58
+# Incrementing and decrementing dup
59
+
60
+# VIDUP, VDDUP format immediate: 1 << (immh:imml)
61
+%imm_vidup 7:1 0:1 !function=vidup_imm
62
+
63
+# VIDUP, VDDUP registers: Rm bits [3:1] from insn, bit 0 is 1;
64
+# Rn bits [3:1] from insn, bit 0 is 0
65
+%vidup_rm 1:3 !function=times_2_plus_1
66
+%vidup_rn 17:3 !function=times_2
67
+
68
+@vidup .... .... . . size:2 .... .... .... .... .... \
69
+ qd=%qd imm=%imm_vidup rn=%vidup_rn &vidup
70
+@viwdup .... .... . . size:2 .... .... .... .... .... \
71
+ qd=%qd imm=%imm_vidup rm=%vidup_rm rn=%vidup_rn &viwdup
72
+{
73
+ VIDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 111 . @vidup
74
+ VIWDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 ... . @viwdup
75
+}
76
+{
77
+ VDDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 111 . @vidup
78
+ VDWDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 ... . @viwdup
79
+}
80
+
81
# multiply-add long dual accumulate
82
# rdahi: bits [3:1] from insn, bit 0 is 1
83
# rdalo: bits [3:1] from insn, bit 0 is 0
84
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
85
index XXXXXXX..XXXXXXX 100644
86
--- a/target/arm/mve_helper.c
87
+++ b/target/arm/mve_helper.c
88
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mve_sqrshr)(CPUARMState *env, uint32_t n, uint32_t shift)
89
{
24
{
90
return do_sqrshl_bhs(n, -(int8_t)shift, 32, true, &env->QF);
25
uint64_t tcr = env->cp15.tcr_el[arm_is_secure(env) ? 3 : 1];
91
}
26
+ if (arm_feature(env, ARM_FEATURE_PMSA) &&
92
+
27
+ arm_feature(env, ARM_FEATURE_V8)) {
93
+#define DO_VIDUP(OP, ESIZE, TYPE, FN) \
94
+ uint32_t HELPER(mve_##OP)(CPUARMState *env, void *vd, \
95
+ uint32_t offset, uint32_t imm) \
96
+ { \
97
+ TYPE *d = vd; \
98
+ uint16_t mask = mve_element_mask(env); \
99
+ unsigned e; \
100
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
101
+ mergemask(&d[H##ESIZE(e)], offset, mask); \
102
+ offset = FN(offset, imm); \
103
+ } \
104
+ mve_advance_vpt(env); \
105
+ return offset; \
106
+ }
107
+
108
+#define DO_VIWDUP(OP, ESIZE, TYPE, FN) \
109
+ uint32_t HELPER(mve_##OP)(CPUARMState *env, void *vd, \
110
+ uint32_t offset, uint32_t wrap, \
111
+ uint32_t imm) \
112
+ { \
113
+ TYPE *d = vd; \
114
+ uint16_t mask = mve_element_mask(env); \
115
+ unsigned e; \
116
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
117
+ mergemask(&d[H##ESIZE(e)], offset, mask); \
118
+ offset = FN(offset, wrap, imm); \
119
+ } \
120
+ mve_advance_vpt(env); \
121
+ return offset; \
122
+ }
123
+
124
+#define DO_VIDUP_ALL(OP, FN) \
125
+ DO_VIDUP(OP##b, 1, int8_t, FN) \
126
+ DO_VIDUP(OP##h, 2, int16_t, FN) \
127
+ DO_VIDUP(OP##w, 4, int32_t, FN)
128
+
129
+#define DO_VIWDUP_ALL(OP, FN) \
130
+ DO_VIWDUP(OP##b, 1, int8_t, FN) \
131
+ DO_VIWDUP(OP##h, 2, int16_t, FN) \
132
+ DO_VIWDUP(OP##w, 4, int32_t, FN)
133
+
134
+static uint32_t do_add_wrap(uint32_t offset, uint32_t wrap, uint32_t imm)
135
+{
136
+ offset += imm;
137
+ if (offset == wrap) {
138
+ offset = 0;
139
+ }
140
+ return offset;
141
+}
142
+
143
+static uint32_t do_sub_wrap(uint32_t offset, uint32_t wrap, uint32_t imm)
144
+{
145
+ if (offset == 0) {
146
+ offset = wrap;
147
+ }
148
+ offset -= imm;
149
+ return offset;
150
+}
151
+
152
+DO_VIDUP_ALL(vidup, DO_ADD)
153
+DO_VIWDUP_ALL(viwdup, do_add_wrap)
154
+DO_VIWDUP_ALL(vdwdup, do_sub_wrap)
155
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
156
index XXXXXXX..XXXXXXX 100644
157
--- a/target/arm/translate-mve.c
158
+++ b/target/arm/translate-mve.c
159
@@ -XXX,XX +XXX,XX @@
160
#include "translate.h"
161
#include "translate-a32.h"
162
163
+static inline int vidup_imm(DisasContext *s, int x)
164
+{
165
+ return 1 << x;
166
+}
167
+
168
/* Include the generated decoder */
169
#include "decode-mve.c.inc"
170
171
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
172
typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
173
typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
174
typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
175
+typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32);
176
+typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32);
177
178
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
179
static inline long mve_qreg_offset(unsigned reg)
180
@@ -XXX,XX +XXX,XX @@ static bool trans_VSHLC(DisasContext *s, arg_VSHLC *a)
181
mve_update_eci(s);
182
return true;
183
}
184
+
185
+static bool do_vidup(DisasContext *s, arg_vidup *a, MVEGenVIDUPFn *fn)
186
+{
187
+ TCGv_ptr qd;
188
+ TCGv_i32 rn;
189
+
190
+ /*
191
+ * Vector increment/decrement with wrap and duplicate (VIDUP, VDDUP).
192
+ * This fills the vector with elements of successively increasing
193
+ * or decreasing values, starting from Rn.
194
+ */
195
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) {
196
+ return false;
197
+ }
198
+ if (a->size == MO_64) {
199
+ /* size 0b11 is another encoding */
200
+ return false;
201
+ }
202
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
203
+ return true;
28
+ return true;
204
+ }
29
+ }
205
+
30
return arm_el_is_aa64(env, 1) ||
206
+ qd = mve_qreg_ptr(a->qd);
31
(arm_feature(env, ARM_FEATURE_LPAE) && (tcr & TTBCR_EAE));
207
+ rn = load_reg(s, a->rn);
32
}
208
+ fn(rn, cpu_env, qd, rn, tcg_constant_i32(a->imm));
33
diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
209
+ store_reg(s, a->rn, rn);
34
index XXXXXXX..XXXXXXX 100644
210
+ tcg_temp_free_ptr(qd);
35
--- a/target/arm/debug_helper.c
211
+ mve_update_eci(s);
36
+++ b/target/arm/debug_helper.c
212
+ return true;
37
@@ -XXX,XX +XXX,XX @@ static uint32_t arm_debug_exception_fsr(CPUARMState *env)
213
+}
38
214
+
39
if (target_el == 2 || arm_el_is_aa64(env, target_el)) {
215
+static bool do_viwdup(DisasContext *s, arg_viwdup *a, MVEGenVIWDUPFn *fn)
40
using_lpae = true;
216
+{
41
+ } else if (arm_feature(env, ARM_FEATURE_PMSA) &&
217
+ TCGv_ptr qd;
42
+ arm_feature(env, ARM_FEATURE_V8)) {
218
+ TCGv_i32 rn, rm;
43
+ using_lpae = true;
219
+
44
} else {
220
+ /*
45
if (arm_feature(env, ARM_FEATURE_LPAE) &&
221
+ * Vector increment/decrement with wrap and duplicate (VIWDUp, VDWDUP)
46
(env->cp15.tcr_el[target_el] & TTBCR_EAE)) {
222
+ * This fills the vector with elements of successively increasing
47
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
223
+ * or decreasing values, starting from Rn. Rm specifies a point where
48
index XXXXXXX..XXXXXXX 100644
224
+ * the count wraps back around to 0. The updated offset is written back
49
--- a/target/arm/tlb_helper.c
225
+ * to Rn.
50
+++ b/target/arm/tlb_helper.c
226
+ */
51
@@ -XXX,XX +XXX,XX @@ bool regime_using_lpae_format(CPUARMState *env, ARMMMUIdx mmu_idx)
227
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) {
52
if (el == 2 || arm_el_is_aa64(env, el)) {
228
+ return false;
53
return true;
229
+ }
54
}
230
+ if (!fn || a->rm == 13 || a->rm == 15) {
55
+ if (arm_feature(env, ARM_FEATURE_PMSA) &&
231
+ /*
56
+ arm_feature(env, ARM_FEATURE_V8)) {
232
+ * size 0b11 is another encoding; Rm == 13 is UNPREDICTABLE;
233
+ * Rm == 13 is VIWDUP, VDWDUP.
234
+ */
235
+ return false;
236
+ }
237
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
238
+ return true;
57
+ return true;
239
+ }
58
+ }
240
+
59
if (arm_feature(env, ARM_FEATURE_LPAE)
241
+ qd = mve_qreg_ptr(a->qd);
60
&& (regime_tcr(env, mmu_idx) & TTBCR_EAE)) {
242
+ rn = load_reg(s, a->rn);
61
return true;
243
+ rm = load_reg(s, a->rm);
244
+ fn(rn, cpu_env, qd, rn, rm, tcg_constant_i32(a->imm));
245
+ store_reg(s, a->rn, rn);
246
+ tcg_temp_free_ptr(qd);
247
+ tcg_temp_free_i32(rm);
248
+ mve_update_eci(s);
249
+ return true;
250
+}
251
+
252
+static bool trans_VIDUP(DisasContext *s, arg_vidup *a)
253
+{
254
+ static MVEGenVIDUPFn * const fns[] = {
255
+ gen_helper_mve_vidupb,
256
+ gen_helper_mve_viduph,
257
+ gen_helper_mve_vidupw,
258
+ NULL,
259
+ };
260
+ return do_vidup(s, a, fns[a->size]);
261
+}
262
+
263
+static bool trans_VDDUP(DisasContext *s, arg_vidup *a)
264
+{
265
+ static MVEGenVIDUPFn * const fns[] = {
266
+ gen_helper_mve_vidupb,
267
+ gen_helper_mve_viduph,
268
+ gen_helper_mve_vidupw,
269
+ NULL,
270
+ };
271
+ /* VDDUP is just like VIDUP but with a negative immediate */
272
+ a->imm = -a->imm;
273
+ return do_vidup(s, a, fns[a->size]);
274
+}
275
+
276
+static bool trans_VIWDUP(DisasContext *s, arg_viwdup *a)
277
+{
278
+ static MVEGenVIWDUPFn * const fns[] = {
279
+ gen_helper_mve_viwdupb,
280
+ gen_helper_mve_viwduph,
281
+ gen_helper_mve_viwdupw,
282
+ NULL,
283
+ };
284
+ return do_viwdup(s, a, fns[a->size]);
285
+}
286
+
287
+static bool trans_VDWDUP(DisasContext *s, arg_viwdup *a)
288
+{
289
+ static MVEGenVIWDUPFn * const fns[] = {
290
+ gen_helper_mve_vdwdupb,
291
+ gen_helper_mve_vdwduph,
292
+ gen_helper_mve_vdwdupw,
293
+ NULL,
294
+ };
295
+ return do_viwdup(s, a, fns[a->size]);
296
+}
297
--
62
--
298
2.20.1
63
2.25.1
299
64
300
65
diff view generated by jsdifflib
1
Unlike A-profile, for M-profile the UDIV and SDIV insns can be
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
configured to raise an exception on division by zero, using the CCR
3
DIV_0_TRP bit.
4
2
5
Implement support for setting this bit by making the helper functions
3
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
6
raise the appropriate exception.
4
Message-id: 20221206102504.165775-6-tobias.roehmel@rwth-aachen.de
7
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20210730151636.17254-3-peter.maydell@linaro.org
11
---
6
---
12
target/arm/cpu.h | 1 +
7
target/arm/cpu.h | 6 +
13
target/arm/helper.h | 4 ++--
8
target/arm/cpu.c | 28 +++-
14
target/arm/helper.c | 19 +++++++++++++++++--
9
target/arm/helper.c | 302 +++++++++++++++++++++++++++++++++++++++++++
15
target/arm/m_helper.c | 4 ++++
10
target/arm/machine.c | 28 ++++
16
target/arm/translate.c | 4 ++--
11
4 files changed, 360 insertions(+), 4 deletions(-)
17
5 files changed, 26 insertions(+), 6 deletions(-)
18
12
19
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
13
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
20
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/cpu.h
15
--- a/target/arm/cpu.h
22
+++ b/target/arm/cpu.h
16
+++ b/target/arm/cpu.h
23
@@ -XXX,XX +XXX,XX @@
17
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
24
#define EXCP_LAZYFP 20 /* v7M fault during lazy FP stacking */
18
};
25
#define EXCP_LSERR 21 /* v8M LSERR SecureFault */
19
uint64_t sctlr_el[4];
26
#define EXCP_UNALIGNED 22 /* v7M UNALIGNED UsageFault */
20
};
27
+#define EXCP_DIVBYZERO 23 /* v7M DIVBYZERO UsageFault */
21
+ uint64_t vsctlr; /* Virtualization System control register. */
28
/* NB: add new EXCP_ defines to the array in arm_log_exception() too */
22
uint64_t cpacr_el1; /* Architectural feature access control register */
29
23
uint64_t cptr_el[4]; /* ARMv8 feature trap registers */
30
#define ARMV7M_EXCP_RESET 1
24
uint32_t c1_xscaleauxcr; /* XScale auxiliary control register. */
31
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
26
*/
27
uint32_t *rbar[M_REG_NUM_BANKS];
28
uint32_t *rlar[M_REG_NUM_BANKS];
29
+ uint32_t *hprbar;
30
+ uint32_t *hprlar;
31
uint32_t mair0[M_REG_NUM_BANKS];
32
uint32_t mair1[M_REG_NUM_BANKS];
33
+ uint32_t hprselr;
34
} pmsav8;
35
36
/* v8M SAU */
37
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
38
bool has_mpu;
39
/* PMSAv7 MPU number of supported regions */
40
uint32_t pmsav7_dregion;
41
+ /* PMSAv8 MPU number of supported hyp regions */
42
+ uint32_t pmsav8r_hdregion;
43
/* v8M SAU number of supported regions */
44
uint32_t sau_sregion;
45
46
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
32
index XXXXXXX..XXXXXXX 100644
47
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/helper.h
48
--- a/target/arm/cpu.c
34
+++ b/target/arm/helper.h
49
+++ b/target/arm/cpu.c
35
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(add_saturate, i32, env, i32, i32)
50
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset_hold(Object *obj)
36
DEF_HELPER_3(sub_saturate, i32, env, i32, i32)
51
sizeof(*env->pmsav7.dracr) * cpu->pmsav7_dregion);
37
DEF_HELPER_3(add_usaturate, i32, env, i32, i32)
52
}
38
DEF_HELPER_3(sub_usaturate, i32, env, i32, i32)
53
}
39
-DEF_HELPER_FLAGS_2(sdiv, TCG_CALL_NO_RWG_SE, s32, s32, s32)
54
+
40
-DEF_HELPER_FLAGS_2(udiv, TCG_CALL_NO_RWG_SE, i32, i32, i32)
55
+ if (cpu->pmsav8r_hdregion > 0) {
41
+DEF_HELPER_FLAGS_3(sdiv, TCG_CALL_NO_RWG, s32, env, s32, s32)
56
+ memset(env->pmsav8.hprbar, 0,
42
+DEF_HELPER_FLAGS_3(udiv, TCG_CALL_NO_RWG, i32, env, i32, i32)
57
+ sizeof(*env->pmsav8.hprbar) * cpu->pmsav8r_hdregion);
43
DEF_HELPER_FLAGS_1(rbit, TCG_CALL_NO_RWG_SE, i32, i32)
58
+ memset(env->pmsav8.hprlar, 0,
44
59
+ sizeof(*env->pmsav8.hprlar) * cpu->pmsav8r_hdregion);
45
#define PAS_OP(pfx) \
60
+ }
61
+
62
env->pmsav7.rnr[M_REG_NS] = 0;
63
env->pmsav7.rnr[M_REG_S] = 0;
64
env->pmsav8.mair0[M_REG_NS] = 0;
65
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
66
/* MPU can be configured out of a PMSA CPU either by setting has-mpu
67
* to false or by setting pmsav7-dregion to 0.
68
*/
69
- if (!cpu->has_mpu) {
70
- cpu->pmsav7_dregion = 0;
71
- }
72
- if (cpu->pmsav7_dregion == 0) {
73
+ if (!cpu->has_mpu || cpu->pmsav7_dregion == 0) {
74
cpu->has_mpu = false;
75
+ cpu->pmsav7_dregion = 0;
76
+ cpu->pmsav8r_hdregion = 0;
77
}
78
79
if (arm_feature(env, ARM_FEATURE_PMSA) &&
80
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
81
env->pmsav7.dracr = g_new0(uint32_t, nr);
82
}
83
}
84
+
85
+ if (cpu->pmsav8r_hdregion > 0xff) {
86
+ error_setg(errp, "PMSAv8 MPU EL2 #regions invalid %" PRIu32,
87
+ cpu->pmsav8r_hdregion);
88
+ return;
89
+ }
90
+
91
+ if (cpu->pmsav8r_hdregion) {
92
+ env->pmsav8.hprbar = g_new0(uint32_t,
93
+ cpu->pmsav8r_hdregion);
94
+ env->pmsav8.hprlar = g_new0(uint32_t,
95
+ cpu->pmsav8r_hdregion);
96
+ }
97
}
98
99
if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
46
diff --git a/target/arm/helper.c b/target/arm/helper.c
100
diff --git a/target/arm/helper.c b/target/arm/helper.c
47
index XXXXXXX..XXXXXXX 100644
101
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/helper.c
102
--- a/target/arm/helper.c
49
+++ b/target/arm/helper.c
103
+++ b/target/arm/helper.c
50
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sxtb16)(uint32_t x)
104
@@ -XXX,XX +XXX,XX @@ static void pmsav7_rgnr_write(CPUARMState *env, const ARMCPRegInfo *ri,
51
return res;
105
raw_write(env, ri, value);
52
}
106
}
53
107
54
+static void handle_possible_div0_trap(CPUARMState *env, uintptr_t ra)
108
+static void prbar_write(CPUARMState *env, const ARMCPRegInfo *ri,
55
+{
109
+ uint64_t value)
110
+{
111
+ ARMCPU *cpu = env_archcpu(env);
112
+
113
+ tlb_flush(CPU(cpu)); /* Mappings may have changed - purge! */
114
+ env->pmsav8.rbar[M_REG_NS][env->pmsav7.rnr[M_REG_NS]] = value;
115
+}
116
+
117
+static uint64_t prbar_read(CPUARMState *env, const ARMCPRegInfo *ri)
118
+{
119
+ return env->pmsav8.rbar[M_REG_NS][env->pmsav7.rnr[M_REG_NS]];
120
+}
121
+
122
+static void prlar_write(CPUARMState *env, const ARMCPRegInfo *ri,
123
+ uint64_t value)
124
+{
125
+ ARMCPU *cpu = env_archcpu(env);
126
+
127
+ tlb_flush(CPU(cpu)); /* Mappings may have changed - purge! */
128
+ env->pmsav8.rlar[M_REG_NS][env->pmsav7.rnr[M_REG_NS]] = value;
129
+}
130
+
131
+static uint64_t prlar_read(CPUARMState *env, const ARMCPRegInfo *ri)
132
+{
133
+ return env->pmsav8.rlar[M_REG_NS][env->pmsav7.rnr[M_REG_NS]];
134
+}
135
+
136
+static void prselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
137
+ uint64_t value)
138
+{
139
+ ARMCPU *cpu = env_archcpu(env);
140
+
56
+ /*
141
+ /*
57
+ * Take a division-by-zero exception if necessary; otherwise return
142
+ * Ignore writes that would select not implemented region.
58
+ * to get the usual non-trapping division behaviour (result of 0)
143
+ * This is architecturally UNPREDICTABLE.
59
+ */
144
+ */
60
+ if (arm_feature(env, ARM_FEATURE_M)
145
+ if (value >= cpu->pmsav7_dregion) {
61
+ && (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_DIV_0_TRP_MASK)) {
146
+ return;
62
+ raise_exception_ra(env, EXCP_DIVBYZERO, 0, 1, ra);
147
+ }
63
+ }
148
+
64
+}
149
+ env->pmsav7.rnr[M_REG_NS] = value;
65
+
150
+}
66
uint32_t HELPER(uxtb16)(uint32_t x)
151
+
67
{
152
+static void hprbar_write(CPUARMState *env, const ARMCPRegInfo *ri,
68
uint32_t res;
153
+ uint64_t value)
69
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(uxtb16)(uint32_t x)
154
+{
70
return res;
155
+ ARMCPU *cpu = env_archcpu(env);
156
+
157
+ tlb_flush(CPU(cpu)); /* Mappings may have changed - purge! */
158
+ env->pmsav8.hprbar[env->pmsav8.hprselr] = value;
159
+}
160
+
161
+static uint64_t hprbar_read(CPUARMState *env, const ARMCPRegInfo *ri)
162
+{
163
+ return env->pmsav8.hprbar[env->pmsav8.hprselr];
164
+}
165
+
166
+static void hprlar_write(CPUARMState *env, const ARMCPRegInfo *ri,
167
+ uint64_t value)
168
+{
169
+ ARMCPU *cpu = env_archcpu(env);
170
+
171
+ tlb_flush(CPU(cpu)); /* Mappings may have changed - purge! */
172
+ env->pmsav8.hprlar[env->pmsav8.hprselr] = value;
173
+}
174
+
175
+static uint64_t hprlar_read(CPUARMState *env, const ARMCPRegInfo *ri)
176
+{
177
+ return env->pmsav8.hprlar[env->pmsav8.hprselr];
178
+}
179
+
180
+static void hprenr_write(CPUARMState *env, const ARMCPRegInfo *ri,
181
+ uint64_t value)
182
+{
183
+ uint32_t n;
184
+ uint32_t bit;
185
+ ARMCPU *cpu = env_archcpu(env);
186
+
187
+ /* Ignore writes to unimplemented regions */
188
+ int rmax = MIN(cpu->pmsav8r_hdregion, 32);
189
+ value &= MAKE_64BIT_MASK(0, rmax);
190
+
191
+ tlb_flush(CPU(cpu)); /* Mappings may have changed - purge! */
192
+
193
+ /* Register alias is only valid for first 32 indexes */
194
+ for (n = 0; n < rmax; ++n) {
195
+ bit = extract32(value, n, 1);
196
+ env->pmsav8.hprlar[n] = deposit32(
197
+ env->pmsav8.hprlar[n], 0, 1, bit);
198
+ }
199
+}
200
+
201
+static uint64_t hprenr_read(CPUARMState *env, const ARMCPRegInfo *ri)
202
+{
203
+ uint32_t n;
204
+ uint32_t result = 0x0;
205
+ ARMCPU *cpu = env_archcpu(env);
206
+
207
+ /* Register alias is only valid for first 32 indexes */
208
+ for (n = 0; n < MIN(cpu->pmsav8r_hdregion, 32); ++n) {
209
+ if (env->pmsav8.hprlar[n] & 0x1) {
210
+ result |= (0x1 << n);
211
+ }
212
+ }
213
+ return result;
214
+}
215
+
216
+static void hprselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
217
+ uint64_t value)
218
+{
219
+ ARMCPU *cpu = env_archcpu(env);
220
+
221
+ /*
222
+ * Ignore writes that would select not implemented region.
223
+ * This is architecturally UNPREDICTABLE.
224
+ */
225
+ if (value >= cpu->pmsav8r_hdregion) {
226
+ return;
227
+ }
228
+
229
+ env->pmsav8.hprselr = value;
230
+}
231
+
232
+static void pmsav8r_regn_write(CPUARMState *env, const ARMCPRegInfo *ri,
233
+ uint64_t value)
234
+{
235
+ ARMCPU *cpu = env_archcpu(env);
236
+ uint8_t index = (extract32(ri->opc0, 0, 1) << 4) |
237
+ (extract32(ri->crm, 0, 3) << 1) | extract32(ri->opc2, 2, 1);
238
+
239
+ tlb_flush(CPU(cpu)); /* Mappings may have changed - purge! */
240
+
241
+ if (ri->opc1 & 4) {
242
+ if (index >= cpu->pmsav8r_hdregion) {
243
+ return;
244
+ }
245
+ if (ri->opc2 & 0x1) {
246
+ env->pmsav8.hprlar[index] = value;
247
+ } else {
248
+ env->pmsav8.hprbar[index] = value;
249
+ }
250
+ } else {
251
+ if (index >= cpu->pmsav7_dregion) {
252
+ return;
253
+ }
254
+ if (ri->opc2 & 0x1) {
255
+ env->pmsav8.rlar[M_REG_NS][index] = value;
256
+ } else {
257
+ env->pmsav8.rbar[M_REG_NS][index] = value;
258
+ }
259
+ }
260
+}
261
+
262
+static uint64_t pmsav8r_regn_read(CPUARMState *env, const ARMCPRegInfo *ri)
263
+{
264
+ ARMCPU *cpu = env_archcpu(env);
265
+ uint8_t index = (extract32(ri->opc0, 0, 1) << 4) |
266
+ (extract32(ri->crm, 0, 3) << 1) | extract32(ri->opc2, 2, 1);
267
+
268
+ if (ri->opc1 & 4) {
269
+ if (index >= cpu->pmsav8r_hdregion) {
270
+ return 0x0;
271
+ }
272
+ if (ri->opc2 & 0x1) {
273
+ return env->pmsav8.hprlar[index];
274
+ } else {
275
+ return env->pmsav8.hprbar[index];
276
+ }
277
+ } else {
278
+ if (index >= cpu->pmsav7_dregion) {
279
+ return 0x0;
280
+ }
281
+ if (ri->opc2 & 0x1) {
282
+ return env->pmsav8.rlar[M_REG_NS][index];
283
+ } else {
284
+ return env->pmsav8.rbar[M_REG_NS][index];
285
+ }
286
+ }
287
+}
288
+
289
+static const ARMCPRegInfo pmsav8r_cp_reginfo[] = {
290
+ { .name = "PRBAR",
291
+ .cp = 15, .opc1 = 0, .crn = 6, .crm = 3, .opc2 = 0,
292
+ .access = PL1_RW, .type = ARM_CP_NO_RAW,
293
+ .accessfn = access_tvm_trvm,
294
+ .readfn = prbar_read, .writefn = prbar_write },
295
+ { .name = "PRLAR",
296
+ .cp = 15, .opc1 = 0, .crn = 6, .crm = 3, .opc2 = 1,
297
+ .access = PL1_RW, .type = ARM_CP_NO_RAW,
298
+ .accessfn = access_tvm_trvm,
299
+ .readfn = prlar_read, .writefn = prlar_write },
300
+ { .name = "PRSELR", .resetvalue = 0,
301
+ .cp = 15, .opc1 = 0, .crn = 6, .crm = 2, .opc2 = 1,
302
+ .access = PL1_RW, .accessfn = access_tvm_trvm,
303
+ .writefn = prselr_write,
304
+ .fieldoffset = offsetof(CPUARMState, pmsav7.rnr[M_REG_NS]) },
305
+ { .name = "HPRBAR", .resetvalue = 0,
306
+ .cp = 15, .opc1 = 4, .crn = 6, .crm = 3, .opc2 = 0,
307
+ .access = PL2_RW, .type = ARM_CP_NO_RAW,
308
+ .readfn = hprbar_read, .writefn = hprbar_write },
309
+ { .name = "HPRLAR",
310
+ .cp = 15, .opc1 = 4, .crn = 6, .crm = 3, .opc2 = 1,
311
+ .access = PL2_RW, .type = ARM_CP_NO_RAW,
312
+ .readfn = hprlar_read, .writefn = hprlar_write },
313
+ { .name = "HPRSELR", .resetvalue = 0,
314
+ .cp = 15, .opc1 = 4, .crn = 6, .crm = 2, .opc2 = 1,
315
+ .access = PL2_RW,
316
+ .writefn = hprselr_write,
317
+ .fieldoffset = offsetof(CPUARMState, pmsav8.hprselr) },
318
+ { .name = "HPRENR",
319
+ .cp = 15, .opc1 = 4, .crn = 6, .crm = 1, .opc2 = 1,
320
+ .access = PL2_RW, .type = ARM_CP_NO_RAW,
321
+ .readfn = hprenr_read, .writefn = hprenr_write },
322
+};
323
+
324
static const ARMCPRegInfo pmsav7_cp_reginfo[] = {
325
/* Reset for all these registers is handled in arm_cpu_reset(),
326
* because the PMSAv7 is also used by M-profile CPUs, which do
327
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
328
.access = PL1_R, .type = ARM_CP_CONST,
329
.resetvalue = cpu->pmsav7_dregion << 8
330
};
331
+ /* HMPUIR is specific to PMSA V8 */
332
+ ARMCPRegInfo id_hmpuir_reginfo = {
333
+ .name = "HMPUIR",
334
+ .cp = 15, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 4,
335
+ .access = PL2_R, .type = ARM_CP_CONST,
336
+ .resetvalue = cpu->pmsav8r_hdregion
337
+ };
338
static const ARMCPRegInfo crn0_wi_reginfo = {
339
.name = "CRN0_WI", .cp = 15, .crn = 0, .crm = CP_ANY,
340
.opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_W,
341
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
342
define_arm_cp_regs(cpu, id_cp_reginfo);
343
if (!arm_feature(env, ARM_FEATURE_PMSA)) {
344
define_one_arm_cp_reg(cpu, &id_tlbtr_reginfo);
345
+ } else if (arm_feature(env, ARM_FEATURE_PMSA) &&
346
+ arm_feature(env, ARM_FEATURE_V8)) {
347
+ uint32_t i = 0;
348
+ char *tmp_string;
349
+
350
+ define_one_arm_cp_reg(cpu, &id_mpuir_reginfo);
351
+ define_one_arm_cp_reg(cpu, &id_hmpuir_reginfo);
352
+ define_arm_cp_regs(cpu, pmsav8r_cp_reginfo);
353
+
354
+ /* Register alias is only valid for first 32 indexes */
355
+ for (i = 0; i < MIN(cpu->pmsav7_dregion, 32); ++i) {
356
+ uint8_t crm = 0b1000 | extract32(i, 1, 3);
357
+ uint8_t opc1 = extract32(i, 4, 1);
358
+ uint8_t opc2 = extract32(i, 0, 1) << 2;
359
+
360
+ tmp_string = g_strdup_printf("PRBAR%u", i);
361
+ ARMCPRegInfo tmp_prbarn_reginfo = {
362
+ .name = tmp_string, .type = ARM_CP_ALIAS | ARM_CP_NO_RAW,
363
+ .cp = 15, .opc1 = opc1, .crn = 6, .crm = crm, .opc2 = opc2,
364
+ .access = PL1_RW, .resetvalue = 0,
365
+ .accessfn = access_tvm_trvm,
366
+ .writefn = pmsav8r_regn_write, .readfn = pmsav8r_regn_read
367
+ };
368
+ define_one_arm_cp_reg(cpu, &tmp_prbarn_reginfo);
369
+ g_free(tmp_string);
370
+
371
+ opc2 = extract32(i, 0, 1) << 2 | 0x1;
372
+ tmp_string = g_strdup_printf("PRLAR%u", i);
373
+ ARMCPRegInfo tmp_prlarn_reginfo = {
374
+ .name = tmp_string, .type = ARM_CP_ALIAS | ARM_CP_NO_RAW,
375
+ .cp = 15, .opc1 = opc1, .crn = 6, .crm = crm, .opc2 = opc2,
376
+ .access = PL1_RW, .resetvalue = 0,
377
+ .accessfn = access_tvm_trvm,
378
+ .writefn = pmsav8r_regn_write, .readfn = pmsav8r_regn_read
379
+ };
380
+ define_one_arm_cp_reg(cpu, &tmp_prlarn_reginfo);
381
+ g_free(tmp_string);
382
+ }
383
+
384
+ /* Register alias is only valid for first 32 indexes */
385
+ for (i = 0; i < MIN(cpu->pmsav8r_hdregion, 32); ++i) {
386
+ uint8_t crm = 0b1000 | extract32(i, 1, 3);
387
+ uint8_t opc1 = 0b100 | extract32(i, 4, 1);
388
+ uint8_t opc2 = extract32(i, 0, 1) << 2;
389
+
390
+ tmp_string = g_strdup_printf("HPRBAR%u", i);
391
+ ARMCPRegInfo tmp_hprbarn_reginfo = {
392
+ .name = tmp_string,
393
+ .type = ARM_CP_NO_RAW,
394
+ .cp = 15, .opc1 = opc1, .crn = 6, .crm = crm, .opc2 = opc2,
395
+ .access = PL2_RW, .resetvalue = 0,
396
+ .writefn = pmsav8r_regn_write, .readfn = pmsav8r_regn_read
397
+ };
398
+ define_one_arm_cp_reg(cpu, &tmp_hprbarn_reginfo);
399
+ g_free(tmp_string);
400
+
401
+ opc2 = extract32(i, 0, 1) << 2 | 0x1;
402
+ tmp_string = g_strdup_printf("HPRLAR%u", i);
403
+ ARMCPRegInfo tmp_hprlarn_reginfo = {
404
+ .name = tmp_string,
405
+ .type = ARM_CP_NO_RAW,
406
+ .cp = 15, .opc1 = opc1, .crn = 6, .crm = crm, .opc2 = opc2,
407
+ .access = PL2_RW, .resetvalue = 0,
408
+ .writefn = pmsav8r_regn_write, .readfn = pmsav8r_regn_read
409
+ };
410
+ define_one_arm_cp_reg(cpu, &tmp_hprlarn_reginfo);
411
+ g_free(tmp_string);
412
+ }
413
} else if (arm_feature(env, ARM_FEATURE_V7)) {
414
define_one_arm_cp_reg(cpu, &id_mpuir_reginfo);
415
}
416
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
417
sctlr.type |= ARM_CP_SUPPRESS_TB_END;
418
}
419
define_one_arm_cp_reg(cpu, &sctlr);
420
+
421
+ if (arm_feature(env, ARM_FEATURE_PMSA) &&
422
+ arm_feature(env, ARM_FEATURE_V8)) {
423
+ ARMCPRegInfo vsctlr = {
424
+ .name = "VSCTLR", .state = ARM_CP_STATE_AA32,
425
+ .cp = 15, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 0,
426
+ .access = PL2_RW, .resetvalue = 0x0,
427
+ .fieldoffset = offsetoflow32(CPUARMState, cp15.vsctlr),
428
+ };
429
+ define_one_arm_cp_reg(cpu, &vsctlr);
430
+ }
431
}
432
433
if (cpu_isar_feature(aa64_lor, cpu)) {
434
diff --git a/target/arm/machine.c b/target/arm/machine.c
435
index XXXXXXX..XXXXXXX 100644
436
--- a/target/arm/machine.c
437
+++ b/target/arm/machine.c
438
@@ -XXX,XX +XXX,XX @@ static bool pmsav8_needed(void *opaque)
439
arm_feature(env, ARM_FEATURE_V8);
71
}
440
}
72
441
73
-int32_t HELPER(sdiv)(int32_t num, int32_t den)
442
+static bool pmsav8r_needed(void *opaque)
74
+int32_t HELPER(sdiv)(CPUARMState *env, int32_t num, int32_t den)
443
+{
75
{
444
+ ARMCPU *cpu = opaque;
76
if (den == 0) {
445
+ CPUARMState *env = &cpu->env;
77
+ handle_possible_div0_trap(env, GETPC());
446
+
78
return 0;
447
+ return arm_feature(env, ARM_FEATURE_PMSA) &&
448
+ arm_feature(env, ARM_FEATURE_V8) &&
449
+ !arm_feature(env, ARM_FEATURE_M);
450
+}
451
+
452
+static const VMStateDescription vmstate_pmsav8r = {
453
+ .name = "cpu/pmsav8/pmsav8r",
454
+ .version_id = 1,
455
+ .minimum_version_id = 1,
456
+ .needed = pmsav8r_needed,
457
+ .fields = (VMStateField[]) {
458
+ VMSTATE_VARRAY_UINT32(env.pmsav8.hprbar, ARMCPU,
459
+ pmsav8r_hdregion, 0, vmstate_info_uint32, uint32_t),
460
+ VMSTATE_VARRAY_UINT32(env.pmsav8.hprlar, ARMCPU,
461
+ pmsav8r_hdregion, 0, vmstate_info_uint32, uint32_t),
462
+ VMSTATE_END_OF_LIST()
463
+ },
464
+};
465
+
466
static const VMStateDescription vmstate_pmsav8 = {
467
.name = "cpu/pmsav8",
468
.version_id = 1,
469
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_pmsav8 = {
470
VMSTATE_UINT32(env.pmsav8.mair0[M_REG_NS], ARMCPU),
471
VMSTATE_UINT32(env.pmsav8.mair1[M_REG_NS], ARMCPU),
472
VMSTATE_END_OF_LIST()
473
+ },
474
+ .subsections = (const VMStateDescription * []) {
475
+ &vmstate_pmsav8r,
476
+ NULL
79
}
477
}
80
if (num == INT_MIN && den == -1) {
478
};
81
@@ -XXX,XX +XXX,XX @@ int32_t HELPER(sdiv)(int32_t num, int32_t den)
479
82
return num / den;
83
}
84
85
-uint32_t HELPER(udiv)(uint32_t num, uint32_t den)
86
+uint32_t HELPER(udiv)(CPUARMState *env, uint32_t num, uint32_t den)
87
{
88
if (den == 0) {
89
+ handle_possible_div0_trap(env, GETPC());
90
return 0;
91
}
92
return num / den;
93
@@ -XXX,XX +XXX,XX @@ void arm_log_exception(int idx)
94
[EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
95
[EXCP_LSERR] = "v8M LSERR UsageFault",
96
[EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
97
+ [EXCP_DIVBYZERO] = "v7M DIVBYZERO UsageFault",
98
};
99
100
if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
101
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
102
index XXXXXXX..XXXXXXX 100644
103
--- a/target/arm/m_helper.c
104
+++ b/target/arm/m_helper.c
105
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
106
armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
107
env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNALIGNED_MASK;
108
break;
109
+ case EXCP_DIVBYZERO:
110
+ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
111
+ env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_DIVBYZERO_MASK;
112
+ break;
113
case EXCP_SWI:
114
/* The PC already points to the next instruction. */
115
armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC, env->v7m.secure);
116
diff --git a/target/arm/translate.c b/target/arm/translate.c
117
index XXXXXXX..XXXXXXX 100644
118
--- a/target/arm/translate.c
119
+++ b/target/arm/translate.c
120
@@ -XXX,XX +XXX,XX @@ static bool op_div(DisasContext *s, arg_rrr *a, bool u)
121
t1 = load_reg(s, a->rn);
122
t2 = load_reg(s, a->rm);
123
if (u) {
124
- gen_helper_udiv(t1, t1, t2);
125
+ gen_helper_udiv(t1, cpu_env, t1, t2);
126
} else {
127
- gen_helper_sdiv(t1, t1, t2);
128
+ gen_helper_sdiv(t1, cpu_env, t1, t2);
129
}
130
tcg_temp_free_i32(t2);
131
store_reg(s, a->rd, t1);
132
--
480
--
133
2.20.1
481
2.25.1
134
482
135
483
diff view generated by jsdifflib
1
Implement the MVE VPNOT insn, which inverts the bits in VPR.P0
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
(subject to both predication and to beatwise execution).
2
3
3
Add PMSAv8r translation.
4
5
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20221206102504.165775-7-tobias.roehmel@rwth-aachen.de
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
---
9
---
7
target/arm/helper-mve.h | 1 +
10
target/arm/ptw.c | 126 ++++++++++++++++++++++++++++++++++++++---------
8
target/arm/mve.decode | 1 +
11
1 file changed, 104 insertions(+), 22 deletions(-)
9
target/arm/mve_helper.c | 17 +++++++++++++++++
12
10
target/arm/translate-mve.c | 19 +++++++++++++++++++
13
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
11
4 files changed, 38 insertions(+)
12
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
15
--- a/target/arm/ptw.c
16
+++ b/target/arm/helper-mve.h
16
+++ b/target/arm/ptw.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
@@ -XXX,XX +XXX,XX @@ static bool pmsav7_use_background_region(ARMCPU *cpu, ARMMMUIdx mmu_idx,
18
DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
19
19
if (arm_feature(env, ARM_FEATURE_M)) {
20
DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
return env->v7m.mpu_ctrl[is_secure] & R_V7M_MPU_CTRL_PRIVDEFENA_MASK;
21
+DEF_HELPER_FLAGS_1(mve_vpnot, TCG_CALL_NO_WG, void, env)
21
- } else {
22
22
- return regime_sctlr(env, mmu_idx) & SCTLR_BR;
23
DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
}
24
DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
+
25
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
25
+ if (mmu_idx == ARMMMUIdx_Stage2) {
26
index XXXXXXX..XXXXXXX 100644
26
+ return false;
27
--- a/target/arm/mve.decode
27
+ }
28
+++ b/target/arm/mve.decode
28
+
29
@@ -XXX,XX +XXX,XX @@ VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp
29
+ return regime_sctlr(env, mmu_idx) & SCTLR_BR;
30
VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp
31
32
{
33
+ VPNOT 1111 1110 0 0 11 000 1 000 0 1111 0100 1101
34
VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
35
VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar
36
}
30
}
37
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
31
38
index XXXXXXX..XXXXXXX 100644
32
static bool get_phys_addr_pmsav7(CPUARMState *env, uint32_t address,
39
--- a/target/arm/mve_helper.c
33
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_pmsav7(CPUARMState *env, uint32_t address,
40
+++ b/target/arm/mve_helper.c
34
return !(result->f.prot & (1 << access_type));
41
@@ -XXX,XX +XXX,XX @@ void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm)
42
mve_advance_vpt(env);
43
}
35
}
44
36
45
+void HELPER(mve_vpnot)(CPUARMState *env)
37
+static uint32_t *regime_rbar(CPUARMState *env, ARMMMUIdx mmu_idx,
38
+ uint32_t secure)
46
+{
39
+{
47
+ /*
40
+ if (regime_el(env, mmu_idx) == 2) {
48
+ * P0 bits for unexecuted beats (where eci_mask is 0) are unchanged.
41
+ return env->pmsav8.hprbar;
49
+ * P0 bits for predicated lanes in executed bits (where mask is 0) are 0.
42
+ } else {
50
+ * P0 bits otherwise are inverted.
43
+ return env->pmsav8.rbar[secure];
51
+ * (This is the same logic as VCMP.)
44
+ }
52
+ * This insn is itself subject to predication and to beat-wise execution,
53
+ * and after it executes VPT state advances in the usual way.
54
+ */
55
+ uint16_t mask = mve_element_mask(env);
56
+ uint16_t eci_mask = mve_eci_mask(env);
57
+ uint16_t beatpred = ~env->v7m.vpr & mask;
58
+ env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | (beatpred & eci_mask);
59
+ mve_advance_vpt(env);
60
+}
45
+}
61
+
46
+
62
#define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \
47
+static uint32_t *regime_rlar(CPUARMState *env, ARMMMUIdx mmu_idx,
63
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
48
+ uint32_t secure)
64
{ \
49
+{
65
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
50
+ if (regime_el(env, mmu_idx) == 2) {
66
index XXXXXXX..XXXXXXX 100644
51
+ return env->pmsav8.hprlar;
67
--- a/target/arm/translate-mve.c
52
+ } else {
68
+++ b/target/arm/translate-mve.c
53
+ return env->pmsav8.rlar[secure];
69
@@ -XXX,XX +XXX,XX @@ static bool trans_VPST(DisasContext *s, arg_VPST *a)
54
+ }
70
return true;
55
+}
56
+
57
bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
58
MMUAccessType access_type, ARMMMUIdx mmu_idx,
59
bool secure, GetPhysAddrResult *result,
60
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
61
bool hit = false;
62
uint32_t addr_page_base = address & TARGET_PAGE_MASK;
63
uint32_t addr_page_limit = addr_page_base + (TARGET_PAGE_SIZE - 1);
64
+ int region_counter;
65
+
66
+ if (regime_el(env, mmu_idx) == 2) {
67
+ region_counter = cpu->pmsav8r_hdregion;
68
+ } else {
69
+ region_counter = cpu->pmsav7_dregion;
70
+ }
71
72
result->f.lg_page_size = TARGET_PAGE_BITS;
73
result->f.phys_addr = address;
74
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
75
*mregion = -1;
76
}
77
78
+ if (mmu_idx == ARMMMUIdx_Stage2) {
79
+ fi->stage2 = true;
80
+ }
81
+
82
/*
83
* Unlike the ARM ARM pseudocode, we don't need to check whether this
84
* was an exception vector read from the vector table (which is always
85
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
86
hit = true;
87
}
88
89
- for (n = (int)cpu->pmsav7_dregion - 1; n >= 0; n--) {
90
+ uint32_t bitmask;
91
+ if (arm_feature(env, ARM_FEATURE_M)) {
92
+ bitmask = 0x1f;
93
+ } else {
94
+ bitmask = 0x3f;
95
+ fi->level = 0;
96
+ }
97
+
98
+ for (n = region_counter - 1; n >= 0; n--) {
99
/* region search */
100
/*
101
- * Note that the base address is bits [31:5] from the register
102
- * with bits [4:0] all zeroes, but the limit address is bits
103
- * [31:5] from the register with bits [4:0] all ones.
104
+ * Note that the base address is bits [31:x] from the register
105
+ * with bits [x-1:0] all zeroes, but the limit address is bits
106
+ * [31:x] from the register with bits [x:0] all ones. Where x is
107
+ * 5 for Cortex-M and 6 for Cortex-R
108
*/
109
- uint32_t base = env->pmsav8.rbar[secure][n] & ~0x1f;
110
- uint32_t limit = env->pmsav8.rlar[secure][n] | 0x1f;
111
+ uint32_t base = regime_rbar(env, mmu_idx, secure)[n] & ~bitmask;
112
+ uint32_t limit = regime_rlar(env, mmu_idx, secure)[n] | bitmask;
113
114
- if (!(env->pmsav8.rlar[secure][n] & 0x1)) {
115
+ if (!(regime_rlar(env, mmu_idx, secure)[n] & 0x1)) {
116
/* Region disabled */
117
continue;
118
}
119
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
120
* PMSAv7 where highest-numbered-region wins)
121
*/
122
fi->type = ARMFault_Permission;
123
- fi->level = 1;
124
+ if (arm_feature(env, ARM_FEATURE_M)) {
125
+ fi->level = 1;
126
+ }
127
return true;
128
}
129
130
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
131
}
132
133
if (!hit) {
134
- /* background fault */
135
- fi->type = ARMFault_Background;
136
+ if (arm_feature(env, ARM_FEATURE_M)) {
137
+ fi->type = ARMFault_Background;
138
+ } else {
139
+ fi->type = ARMFault_Permission;
140
+ }
141
return true;
142
}
143
144
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
145
/* hit using the background region */
146
get_phys_addr_pmsav7_default(env, mmu_idx, address, &result->f.prot);
147
} else {
148
- uint32_t ap = extract32(env->pmsav8.rbar[secure][matchregion], 1, 2);
149
- uint32_t xn = extract32(env->pmsav8.rbar[secure][matchregion], 0, 1);
150
+ uint32_t matched_rbar = regime_rbar(env, mmu_idx, secure)[matchregion];
151
+ uint32_t matched_rlar = regime_rlar(env, mmu_idx, secure)[matchregion];
152
+ uint32_t ap = extract32(matched_rbar, 1, 2);
153
+ uint32_t xn = extract32(matched_rbar, 0, 1);
154
bool pxn = false;
155
156
if (arm_feature(env, ARM_FEATURE_V8_1M)) {
157
- pxn = extract32(env->pmsav8.rlar[secure][matchregion], 4, 1);
158
+ pxn = extract32(matched_rlar, 4, 1);
159
}
160
161
if (m_is_system_region(env, address)) {
162
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
163
xn = 1;
164
}
165
166
- result->f.prot = simple_ap_to_rw_prot(env, mmu_idx, ap);
167
+ if (regime_el(env, mmu_idx) == 2) {
168
+ result->f.prot = simple_ap_to_rw_prot_is_user(ap,
169
+ mmu_idx != ARMMMUIdx_E2);
170
+ } else {
171
+ result->f.prot = simple_ap_to_rw_prot(env, mmu_idx, ap);
172
+ }
173
+
174
+ if (!arm_feature(env, ARM_FEATURE_M)) {
175
+ uint8_t attrindx = extract32(matched_rlar, 1, 3);
176
+ uint64_t mair = env->cp15.mair_el[regime_el(env, mmu_idx)];
177
+ uint8_t sh = extract32(matched_rlar, 3, 2);
178
+
179
+ if (regime_sctlr(env, mmu_idx) & SCTLR_WXN &&
180
+ result->f.prot & PAGE_WRITE && mmu_idx != ARMMMUIdx_Stage2) {
181
+ xn = 0x1;
182
+ }
183
+
184
+ if ((regime_el(env, mmu_idx) == 1) &&
185
+ regime_sctlr(env, mmu_idx) & SCTLR_UWXN && ap == 0x1) {
186
+ pxn = 0x1;
187
+ }
188
+
189
+ result->cacheattrs.is_s2_format = false;
190
+ result->cacheattrs.attrs = extract64(mair, attrindx * 8, 8);
191
+ result->cacheattrs.shareability = sh;
192
+ }
193
+
194
if (result->f.prot && !xn && !(pxn && !is_user)) {
195
result->f.prot |= PAGE_EXEC;
196
}
197
- /*
198
- * We don't need to look the attribute up in the MAIR0/MAIR1
199
- * registers because that only tells us about cacheability.
200
- */
201
+
202
if (mregion) {
203
*mregion = matchregion;
204
}
205
}
206
207
fi->type = ARMFault_Permission;
208
- fi->level = 1;
209
+ if (arm_feature(env, ARM_FEATURE_M)) {
210
+ fi->level = 1;
211
+ }
212
return !(result->f.prot & (1 << access_type));
71
}
213
}
72
214
73
+static bool trans_VPNOT(DisasContext *s, arg_VPNOT *a)
215
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
74
+{
216
cacheattrs1 = result->cacheattrs;
75
+ /*
217
memset(result, 0, sizeof(*result));
76
+ * Invert the predicate in VPR.P0. We have call out to
218
77
+ * a helper because this insn itself is beatwise and can
219
- ret = get_phys_addr_lpae(env, ptw, ipa, access_type, is_el0, result, fi);
78
+ * be predicated.
220
+ if (arm_feature(env, ARM_FEATURE_PMSA)) {
79
+ */
221
+ ret = get_phys_addr_pmsav8(env, ipa, access_type,
80
+ if (!dc_isar_feature(aa32_mve, s)) {
222
+ ptw->in_mmu_idx, is_secure, result, fi);
81
+ return false;
223
+ } else {
82
+ }
224
+ ret = get_phys_addr_lpae(env, ptw, ipa, access_type,
83
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
225
+ is_el0, result, fi);
84
+ return true;
226
+ }
85
+ }
227
fi->s2addr = ipa;
86
+
228
87
+ gen_helper_mve_vpnot(cpu_env);
229
/* Combine the S1 and S2 perms. */
88
+ mve_update_eci(s);
89
+ return true;
90
+}
91
+
92
static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
93
{
94
/* VADDV: vector add across vector */
95
--
230
--
96
2.20.1
231
2.25.1
97
232
98
233
diff view generated by jsdifflib
1
Implement the MVE saturating doubling multiply accumulate insns
1
From: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
2
VQDMLAH, VQRDMLAH, VQDMLASH and VQRDMLASH. These perform a multiply,
3
double, add the accumulator shifted by the element size, possibly
4
round, saturate to twice the element size, then take the high half of
5
the result. The *MLAH insns do vector * scalar + vector, and the
6
*MLASH insns do vector * vector + scalar.
7
2
3
All constants are taken from the ARM Cortex-R52 Processor TRM Revision: r1p3
4
5
Signed-off-by: Tobias Röhmel <tobias.roehmel@rwth-aachen.de>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20221206102504.165775-8-tobias.roehmel@rwth-aachen.de
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
---
9
---
11
target/arm/helper-mve.h | 16 +++++++
10
target/arm/cpu_tcg.c | 42 ++++++++++++++++++++++++++++++++++++++++++
12
target/arm/mve.decode | 5 ++
11
1 file changed, 42 insertions(+)
13
target/arm/mve_helper.c | 95 ++++++++++++++++++++++++++++++++++++++
14
target/arm/translate-mve.c | 4 ++
15
4 files changed, 120 insertions(+)
16
12
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
13
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
18
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
15
--- a/target/arm/cpu_tcg.c
20
+++ b/target/arm/helper-mve.h
16
+++ b/target/arm/cpu_tcg.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
22
DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
23
DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
}
24
20
25
+DEF_HELPER_FLAGS_4(mve_vqdmlahb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
+static void cortex_r52_initfn(Object *obj)
26
+DEF_HELPER_FLAGS_4(mve_vqdmlahh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
+{
27
+DEF_HELPER_FLAGS_4(mve_vqdmlahw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
+ ARMCPU *cpu = ARM_CPU(obj);
28
+
24
+
29
+DEF_HELPER_FLAGS_4(mve_vqrdmlahb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
+ set_feature(&cpu->env, ARM_FEATURE_V8);
30
+DEF_HELPER_FLAGS_4(mve_vqrdmlahh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
+ set_feature(&cpu->env, ARM_FEATURE_EL2);
31
+DEF_HELPER_FLAGS_4(mve_vqrdmlahw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
+ set_feature(&cpu->env, ARM_FEATURE_PMSA);
28
+ set_feature(&cpu->env, ARM_FEATURE_NEON);
29
+ set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
30
+ cpu->midr = 0x411fd133; /* r1p3 */
31
+ cpu->revidr = 0x00000000;
32
+ cpu->reset_fpsid = 0x41034023;
33
+ cpu->isar.mvfr0 = 0x10110222;
34
+ cpu->isar.mvfr1 = 0x12111111;
35
+ cpu->isar.mvfr2 = 0x00000043;
36
+ cpu->ctr = 0x8144c004;
37
+ cpu->reset_sctlr = 0x30c50838;
38
+ cpu->isar.id_pfr0 = 0x00000131;
39
+ cpu->isar.id_pfr1 = 0x10111001;
40
+ cpu->isar.id_dfr0 = 0x03010006;
41
+ cpu->id_afr0 = 0x00000000;
42
+ cpu->isar.id_mmfr0 = 0x00211040;
43
+ cpu->isar.id_mmfr1 = 0x40000000;
44
+ cpu->isar.id_mmfr2 = 0x01200000;
45
+ cpu->isar.id_mmfr3 = 0xf0102211;
46
+ cpu->isar.id_mmfr4 = 0x00000010;
47
+ cpu->isar.id_isar0 = 0x02101110;
48
+ cpu->isar.id_isar1 = 0x13112111;
49
+ cpu->isar.id_isar2 = 0x21232142;
50
+ cpu->isar.id_isar3 = 0x01112131;
51
+ cpu->isar.id_isar4 = 0x00010142;
52
+ cpu->isar.id_isar5 = 0x00010001;
53
+ cpu->isar.dbgdidr = 0x77168000;
54
+ cpu->clidr = (1 << 27) | (1 << 24) | 0x3;
55
+ cpu->ccsidr[0] = 0x700fe01a; /* 32KB L1 dcache */
56
+ cpu->ccsidr[1] = 0x201fe00a; /* 32KB L1 icache */
32
+
57
+
33
+DEF_HELPER_FLAGS_4(mve_vqdmlashb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
58
+ cpu->pmsav7_dregion = 16;
34
+DEF_HELPER_FLAGS_4(mve_vqdmlashh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
59
+ cpu->pmsav8r_hdregion = 16;
35
+DEF_HELPER_FLAGS_4(mve_vqdmlashw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_4(mve_vqrdmlashb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(mve_vqrdmlashh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(mve_vqrdmlashw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
+
41
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
42
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
43
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
44
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
45
index XXXXXXX..XXXXXXX 100644
46
--- a/target/arm/mve.decode
47
+++ b/target/arm/mve.decode
48
@@ -XXX,XX +XXX,XX @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
49
VMLA 111- 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar
50
VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar
51
52
+VQRDMLAH 1110 1110 0 . .. ... 0 ... 0 1110 . 100 .... @2scalar
53
+VQRDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 100 .... @2scalar
54
+VQDMLAH 1110 1110 0 . .. ... 0 ... 0 1110 . 110 .... @2scalar
55
+VQDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 110 .... @2scalar
56
+
57
# Vector add across vector
58
{
59
VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
60
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/mve_helper.c
63
+++ b/target/arm/mve_helper.c
64
@@ -XXX,XX +XXX,XX @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w)
65
mve_advance_vpt(env); \
66
}
67
68
+#define DO_2OP_SAT_ACC_SCALAR(OP, ESIZE, TYPE, FN) \
69
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
70
+ uint32_t rm) \
71
+ { \
72
+ TYPE *d = vd, *n = vn; \
73
+ TYPE m = rm; \
74
+ uint16_t mask = mve_element_mask(env); \
75
+ unsigned e; \
76
+ bool qc = false; \
77
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
78
+ bool sat = false; \
79
+ mergemask(&d[H##ESIZE(e)], \
80
+ FN(d[H##ESIZE(e)], n[H##ESIZE(e)], m, &sat), \
81
+ mask); \
82
+ qc |= sat & mask & 1; \
83
+ } \
84
+ if (qc) { \
85
+ env->vfp.qc[0] = qc; \
86
+ } \
87
+ mve_advance_vpt(env); \
88
+ }
89
+
90
/* provide unsigned 2-op scalar helpers for all sizes */
91
#define DO_2OP_SCALAR_U(OP, FN) \
92
DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \
93
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B)
94
DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H)
95
DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W)
96
97
+static int8_t do_vqdmlah_b(int8_t a, int8_t b, int8_t c, int round, bool *sat)
98
+{
99
+ int64_t r = (int64_t)a * b * 2 + ((int64_t)c << 8) + (round << 7);
100
+ return do_sat_bhw(r, INT16_MIN, INT16_MAX, sat) >> 8;
101
+}
60
+}
102
+
61
+
103
+static int16_t do_vqdmlah_h(int16_t a, int16_t b, int16_t c,
62
static void cortex_r5f_initfn(Object *obj)
104
+ int round, bool *sat)
105
+{
106
+ int64_t r = (int64_t)a * b * 2 + ((int64_t)c << 16) + (round << 15);
107
+ return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat) >> 16;
108
+}
109
+
110
+static int32_t do_vqdmlah_w(int32_t a, int32_t b, int32_t c,
111
+ int round, bool *sat)
112
+{
113
+ /*
114
+ * Architecturally we should do the entire add, double, round
115
+ * and then check for saturation. We do three saturating adds,
116
+ * but we need to be careful about the order. If the first
117
+ * m1 + m2 saturates then it's impossible for the *2+rc to
118
+ * bring it back into the non-saturated range. However, if
119
+ * m1 + m2 is negative then it's possible that doing the doubling
120
+ * would take the intermediate result below INT64_MAX and the
121
+ * addition of the rounding constant then brings it back in range.
122
+ * So we add half the rounding constant and half the "c << esize"
123
+ * before doubling rather than adding the rounding constant after
124
+ * the doubling.
125
+ */
126
+ int64_t m1 = (int64_t)a * b;
127
+ int64_t m2 = (int64_t)c << 31;
128
+ int64_t r;
129
+ if (sadd64_overflow(m1, m2, &r) ||
130
+ sadd64_overflow(r, (round << 30), &r) ||
131
+ sadd64_overflow(r, r, &r)) {
132
+ *sat = true;
133
+ return r < 0 ? INT32_MAX : INT32_MIN;
134
+ }
135
+ return r >> 32;
136
+}
137
+
138
+/*
139
+ * The *MLAH insns are vector * scalar + vector;
140
+ * the *MLASH insns are vector * vector + scalar
141
+ */
142
+#define DO_VQDMLAH_B(D, N, M, S) do_vqdmlah_b(N, M, D, 0, S)
143
+#define DO_VQDMLAH_H(D, N, M, S) do_vqdmlah_h(N, M, D, 0, S)
144
+#define DO_VQDMLAH_W(D, N, M, S) do_vqdmlah_w(N, M, D, 0, S)
145
+#define DO_VQRDMLAH_B(D, N, M, S) do_vqdmlah_b(N, M, D, 1, S)
146
+#define DO_VQRDMLAH_H(D, N, M, S) do_vqdmlah_h(N, M, D, 1, S)
147
+#define DO_VQRDMLAH_W(D, N, M, S) do_vqdmlah_w(N, M, D, 1, S)
148
+
149
+#define DO_VQDMLASH_B(D, N, M, S) do_vqdmlah_b(N, D, M, 0, S)
150
+#define DO_VQDMLASH_H(D, N, M, S) do_vqdmlah_h(N, D, M, 0, S)
151
+#define DO_VQDMLASH_W(D, N, M, S) do_vqdmlah_w(N, D, M, 0, S)
152
+#define DO_VQRDMLASH_B(D, N, M, S) do_vqdmlah_b(N, D, M, 1, S)
153
+#define DO_VQRDMLASH_H(D, N, M, S) do_vqdmlah_h(N, D, M, 1, S)
154
+#define DO_VQRDMLASH_W(D, N, M, S) do_vqdmlah_w(N, D, M, 1, S)
155
+
156
+DO_2OP_SAT_ACC_SCALAR(vqdmlahb, 1, int8_t, DO_VQDMLAH_B)
157
+DO_2OP_SAT_ACC_SCALAR(vqdmlahh, 2, int16_t, DO_VQDMLAH_H)
158
+DO_2OP_SAT_ACC_SCALAR(vqdmlahw, 4, int32_t, DO_VQDMLAH_W)
159
+DO_2OP_SAT_ACC_SCALAR(vqrdmlahb, 1, int8_t, DO_VQRDMLAH_B)
160
+DO_2OP_SAT_ACC_SCALAR(vqrdmlahh, 2, int16_t, DO_VQRDMLAH_H)
161
+DO_2OP_SAT_ACC_SCALAR(vqrdmlahw, 4, int32_t, DO_VQRDMLAH_W)
162
+
163
+DO_2OP_SAT_ACC_SCALAR(vqdmlashb, 1, int8_t, DO_VQDMLASH_B)
164
+DO_2OP_SAT_ACC_SCALAR(vqdmlashh, 2, int16_t, DO_VQDMLASH_H)
165
+DO_2OP_SAT_ACC_SCALAR(vqdmlashw, 4, int32_t, DO_VQDMLASH_W)
166
+DO_2OP_SAT_ACC_SCALAR(vqrdmlashb, 1, int8_t, DO_VQRDMLASH_B)
167
+DO_2OP_SAT_ACC_SCALAR(vqrdmlashh, 2, int16_t, DO_VQRDMLASH_H)
168
+DO_2OP_SAT_ACC_SCALAR(vqrdmlashw, 4, int32_t, DO_VQRDMLASH_W)
169
+
170
/* Vector by scalar plus vector */
171
#define DO_VMLA(D, N, M) ((N) * (M) + (D))
172
173
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
174
index XXXXXXX..XXXXXXX 100644
175
--- a/target/arm/translate-mve.c
176
+++ b/target/arm/translate-mve.c
177
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar)
178
DO_2OP_SCALAR(VBRSR, vbrsr)
179
DO_2OP_SCALAR(VMLA, vmla)
180
DO_2OP_SCALAR(VMLAS, vmlas)
181
+DO_2OP_SCALAR(VQDMLAH, vqdmlah)
182
+DO_2OP_SCALAR(VQRDMLAH, vqrdmlah)
183
+DO_2OP_SCALAR(VQDMLASH, vqdmlash)
184
+DO_2OP_SCALAR(VQRDMLASH, vqrdmlash)
185
186
static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a)
187
{
63
{
64
ARMCPU *cpu = ARM_CPU(obj);
65
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_tcg_cpus[] = {
66
.class_init = arm_v7m_class_init },
67
{ .name = "cortex-r5", .initfn = cortex_r5_initfn },
68
{ .name = "cortex-r5f", .initfn = cortex_r5f_initfn },
69
+ { .name = "cortex-r52", .initfn = cortex_r52_initfn },
70
{ .name = "ti925t", .initfn = ti925t_initfn },
71
{ .name = "sa1100", .initfn = sa1100_initfn },
72
{ .name = "sa1110", .initfn = sa1110_initfn },
188
--
73
--
189
2.20.1
74
2.25.1
190
75
191
76
diff view generated by jsdifflib
1
Implement the MVE VCTP insn, which sets the VPR.P0 predicate bits so
1
From: Alex Bennée <alex.bennee@linaro.org>
2
as to predicate any element at index Rn or greater is predicated. As
3
with VPNOT, this insn itself is predicable and subject to beatwise
4
execution.
5
2
6
The calculation of the mask is the same as is used to determine
3
The check semihosting_enabled() wants to know if the guest is
7
ltpmask in mve_element_mask(), but we precalculate masklen in
4
currently in user mode. Unlike the other cases the test was inverted
8
generated code to avoid having to have 4 helpers specialized by size.
5
causing us to block semihosting calls in non-EL0 modes.
9
6
10
We put the decode line in with the low-overhead-loop insns in
7
Cc: qemu-stable@nongnu.org
11
t32.decode because it's logically part of that collection of insn
8
Fixes: 19b26317e9 (target/arm: Honour -semihosting-config userspace=on)
12
patterns, even though it is an MVE only insn.
9
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
target/arm/translate.c | 2 +-
14
1 file changed, 1 insertion(+), 1 deletion(-)
13
15
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
16
---
17
target/arm/helper-mve.h | 2 ++
18
target/arm/translate-a32.h | 1 +
19
target/arm/t32.decode | 1 +
20
target/arm/mve_helper.c | 20 ++++++++++++++++++++
21
target/arm/translate-mve.c | 2 +-
22
target/arm/translate.c | 33 +++++++++++++++++++++++++++++++++
23
6 files changed, 58 insertions(+), 1 deletion(-)
24
25
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/helper-mve.h
28
+++ b/target/arm/helper-mve.h
29
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
DEF_HELPER_FLAGS_1(mve_vpnot, TCG_CALL_NO_WG, void, env)
32
33
+DEF_HELPER_FLAGS_2(mve_vctp, TCG_CALL_NO_WG, void, env, i32)
34
+
35
DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
36
DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
37
DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
38
diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate-a32.h
41
+++ b/target/arm/translate-a32.h
42
@@ -XXX,XX +XXX,XX @@ long neon_element_offset(int reg, int element, MemOp memop);
43
void gen_rev16(TCGv_i32 dest, TCGv_i32 var);
44
void clear_eci_state(DisasContext *s);
45
bool mve_eci_check(DisasContext *s);
46
+void mve_update_eci(DisasContext *s);
47
void mve_update_and_store_eci(DisasContext *s);
48
bool mve_skip_vmov(DisasContext *s, int vn, int index, int size);
49
50
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
51
index XXXXXXX..XXXXXXX 100644
52
--- a/target/arm/t32.decode
53
+++ b/target/arm/t32.decode
54
@@ -XXX,XX +XXX,XX @@ BL 1111 0. .......... 11.1 ............ @branch24
55
# This is DLSTP
56
DLS 1111 0 0000 0 size:2 rn:4 1110 0000 0000 0001
57
}
58
+ VCTP 1111 0 0000 0 size:2 rn:4 1110 1000 0000 0001
59
]
60
}
61
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/mve_helper.c
64
+++ b/target/arm/mve_helper.c
65
@@ -XXX,XX +XXX,XX @@ void HELPER(mve_vpnot)(CPUARMState *env)
66
mve_advance_vpt(env);
67
}
68
69
+/*
70
+ * VCTP: P0 unexecuted bits unchanged, predicated bits zeroed,
71
+ * otherwise set according to value of Rn. The calculation of
72
+ * newmask here works in the same way as the calculation of the
73
+ * ltpmask in mve_element_mask(), but we have pre-calculated
74
+ * the masklen in the generated code.
75
+ */
76
+void HELPER(mve_vctp)(CPUARMState *env, uint32_t masklen)
77
+{
78
+ uint16_t mask = mve_element_mask(env);
79
+ uint16_t eci_mask = mve_eci_mask(env);
80
+ uint16_t newmask;
81
+
82
+ assert(masklen <= 16);
83
+ newmask = masklen ? MAKE_64BIT_MASK(0, masklen) : 0;
84
+ newmask &= mask;
85
+ env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | (newmask & eci_mask);
86
+ mve_advance_vpt(env);
87
+}
88
+
89
#define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \
90
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
91
{ \
92
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
93
index XXXXXXX..XXXXXXX 100644
94
--- a/target/arm/translate-mve.c
95
+++ b/target/arm/translate-mve.c
96
@@ -XXX,XX +XXX,XX @@ bool mve_eci_check(DisasContext *s)
97
}
98
}
99
100
-static void mve_update_eci(DisasContext *s)
101
+void mve_update_eci(DisasContext *s)
102
{
103
/*
104
* The helper function will always update the CPUState field,
105
diff --git a/target/arm/translate.c b/target/arm/translate.c
16
diff --git a/target/arm/translate.c b/target/arm/translate.c
106
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
107
--- a/target/arm/translate.c
18
--- a/target/arm/translate.c
108
+++ b/target/arm/translate.c
19
+++ b/target/arm/translate.c
109
@@ -XXX,XX +XXX,XX @@ static bool trans_LCTP(DisasContext *s, arg_LCTP *a)
20
@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
110
return true;
21
* semihosting, to provide some semblance of security
111
}
22
* (and for consistency with our 32-bit semihosting).
112
23
*/
113
+static bool trans_VCTP(DisasContext *s, arg_VCTP *a)
24
- if (semihosting_enabled(s->current_el != 0) &&
114
+{
25
+ if (semihosting_enabled(s->current_el == 0) &&
115
+ /*
26
(imm == (s->thumb ? 0x3c : 0xf000))) {
116
+ * M-profile Create Vector Tail Predicate. This insn is itself
27
gen_exception_internal_insn(s, EXCP_SEMIHOST);
117
+ * predicated and is subject to beatwise execution.
28
return;
118
+ */
119
+ TCGv_i32 rn_shifted, masklen;
120
+
121
+ if (!dc_isar_feature(aa32_mve, s) || a->rn == 13 || a->rn == 15) {
122
+ return false;
123
+ }
124
+
125
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
126
+ return true;
127
+ }
128
+
129
+ /*
130
+ * We pre-calculate the mask length here to avoid having
131
+ * to have multiple helpers specialized for size.
132
+ * We pass the helper "rn <= (1 << (4 - size)) ? (rn << size) : 16".
133
+ */
134
+ rn_shifted = tcg_temp_new_i32();
135
+ masklen = load_reg(s, a->rn);
136
+ tcg_gen_shli_i32(rn_shifted, masklen, a->size);
137
+ tcg_gen_movcond_i32(TCG_COND_LEU, masklen,
138
+ masklen, tcg_constant_i32(1 << (4 - a->size)),
139
+ rn_shifted, tcg_constant_i32(16));
140
+ gen_helper_mve_vctp(cpu_env, masklen);
141
+ tcg_temp_free_i32(masklen);
142
+ tcg_temp_free_i32(rn_shifted);
143
+ mve_update_eci(s);
144
+ return true;
145
+}
146
147
static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half)
148
{
149
--
29
--
150
2.20.1
30
2.25.1
151
31
152
32
diff view generated by jsdifflib
1
Implement the MVE VMAXA and VMINA insns, which take the absolute
1
From: Axel Heider <axel.heider@hensoldt.net>
2
value of the signed elements in the input vector and then accumulate
3
the unsigned max or min into the destination vector.
4
2
3
Fix typos, add background information
4
5
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
---
8
target/arm/helper-mve.h | 8 ++++++++
9
hw/timer/imx_epit.c | 20 ++++++++++++++++----
9
target/arm/mve.decode | 4 ++++
10
1 file changed, 16 insertions(+), 4 deletions(-)
10
target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++
11
target/arm/translate-mve.c | 2 ++
12
4 files changed, 40 insertions(+)
13
11
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
14
--- a/hw/timer/imx_epit.c
17
+++ b/target/arm/helper-mve.h
15
+++ b/hw/timer/imx_epit.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vqnegb, TCG_CALL_NO_WG, void, env, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ static void imx_epit_set_freq(IMXEPITState *s)
19
DEF_HELPER_FLAGS_3(mve_vqnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
17
}
20
DEF_HELPER_FLAGS_3(mve_vqnegw, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
22
+DEF_HELPER_FLAGS_3(mve_vmaxab, TCG_CALL_NO_WG, void, env, ptr, ptr)
23
+DEF_HELPER_FLAGS_3(mve_vmaxah, TCG_CALL_NO_WG, void, env, ptr, ptr)
24
+DEF_HELPER_FLAGS_3(mve_vmaxaw, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
+
26
+DEF_HELPER_FLAGS_3(mve_vminab, TCG_CALL_NO_WG, void, env, ptr, ptr)
27
+DEF_HELPER_FLAGS_3(mve_vminah, TCG_CALL_NO_WG, void, env, ptr, ptr)
28
+DEF_HELPER_FLAGS_3(mve_vminaw, TCG_CALL_NO_WG, void, env, ptr, ptr)
29
+
30
DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr)
31
DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr)
32
DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
36
+++ b/target/arm/mve.decode
37
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
38
VQMOVUNB 111 0 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op
39
VQMOVN_BS 111 0 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op
40
41
+ VMAXA 111 0 1110 0 . 11 .. 11 ... 0 1110 1 0 . 0 ... 1 @1op
42
+
43
VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
44
}
18
}
45
19
46
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
20
+/*
47
VQMOVUNT 111 0 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op
21
+ * This is called both on hardware (device) reset and software reset.
48
VQMOVN_TS 111 0 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op
22
+ */
49
23
static void imx_epit_reset(DeviceState *dev)
50
+ VMINA 111 0 1110 0 . 11 .. 11 ... 1 1110 1 0 . 0 ... 1 @1op
24
{
51
+
25
IMXEPITState *s = IMX_EPIT(dev);
52
VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
26
27
- /*
28
- * Soft reset doesn't touch some bits; hard reset clears them
29
- */
30
+ /* Soft reset doesn't touch some bits; hard reset clears them */
31
s->cr &= (CR_EN|CR_ENMOD|CR_STOPEN|CR_DOZEN|CR_WAITEN|CR_DBGEN);
32
s->sr = 0;
33
s->lr = EPIT_TIMER_MAX;
34
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write(void *opaque, hwaddr offset, uint64_t value,
35
ptimer_transaction_begin(s->timer_cmp);
36
ptimer_transaction_begin(s->timer_reload);
37
38
+ /* Update the frequency. Has been done already in case of a reset. */
39
if (!(s->cr & CR_SWR)) {
40
imx_epit_set_freq(s);
41
}
42
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write(void *opaque, hwaddr offset, uint64_t value,
43
break;
44
45
case 1: /* SR - ACK*/
46
- /* writing 1 to OCIF clear the OCIF bit */
47
+ /* writing 1 to OCIF clears the OCIF bit */
48
if (value & 0x01) {
49
s->sr = 0;
50
imx_epit_update_int(s);
51
@@ -XXX,XX +XXX,XX @@ static void imx_epit_realize(DeviceState *dev, Error **errp)
52
0x00001000);
53
sysbus_init_mmio(sbd, &s->iomem);
54
55
+ /*
56
+ * The reload timer keeps running when the peripheral is enabled. It is a
57
+ * kind of wall clock that does not generate any interrupts. The callback
58
+ * needs to be provided, but it does nothing as the ptimer already supports
59
+ * all necessary reloading functionality.
60
+ */
61
s->timer_reload = ptimer_init(imx_epit_reload, s, PTIMER_POLICY_LEGACY);
62
63
+ /*
64
+ * The compare timer is running only when the peripheral configuration is
65
+ * in a state that will generate compare interrupts.
66
+ */
67
s->timer_cmp = ptimer_init(imx_epit_cmp, s, PTIMER_POLICY_LEGACY);
53
}
68
}
54
69
55
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/mve_helper.c
58
+++ b/target/arm/mve_helper.c
59
@@ -XXX,XX +XXX,XX @@ DO_1OP_SAT(vqabsw, 4, int32_t, DO_VQABS_W)
60
DO_1OP_SAT(vqnegb, 1, int8_t, DO_VQNEG_B)
61
DO_1OP_SAT(vqnegh, 2, int16_t, DO_VQNEG_H)
62
DO_1OP_SAT(vqnegw, 4, int32_t, DO_VQNEG_W)
63
+
64
+/*
65
+ * VMAXA, VMINA: vd is unsigned; vm is signed, and we take its
66
+ * absolute value; we then do an unsigned comparison.
67
+ */
68
+#define DO_VMAXMINA(OP, ESIZE, STYPE, UTYPE, FN) \
69
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
70
+ { \
71
+ UTYPE *d = vd; \
72
+ STYPE *m = vm; \
73
+ uint16_t mask = mve_element_mask(env); \
74
+ unsigned e; \
75
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
76
+ UTYPE r = DO_ABS(m[H##ESIZE(e)]); \
77
+ r = FN(d[H##ESIZE(e)], r); \
78
+ mergemask(&d[H##ESIZE(e)], r, mask); \
79
+ } \
80
+ mve_advance_vpt(env); \
81
+ }
82
+
83
+DO_VMAXMINA(vmaxab, 1, int8_t, uint8_t, DO_MAX)
84
+DO_VMAXMINA(vmaxah, 2, int16_t, uint16_t, DO_MAX)
85
+DO_VMAXMINA(vmaxaw, 4, int32_t, uint32_t, DO_MAX)
86
+DO_VMAXMINA(vminab, 1, int8_t, uint8_t, DO_MIN)
87
+DO_VMAXMINA(vminah, 2, int16_t, uint16_t, DO_MIN)
88
+DO_VMAXMINA(vminaw, 4, int32_t, uint32_t, DO_MIN)
89
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
90
index XXXXXXX..XXXXXXX 100644
91
--- a/target/arm/translate-mve.c
92
+++ b/target/arm/translate-mve.c
93
@@ -XXX,XX +XXX,XX @@ DO_1OP(VABS, vabs)
94
DO_1OP(VNEG, vneg)
95
DO_1OP(VQABS, vqabs)
96
DO_1OP(VQNEG, vqneg)
97
+DO_1OP(VMAXA, vmaxa)
98
+DO_1OP(VMINA, vmina)
99
100
/* Narrowing moves: only size 0 and 1 are valid */
101
#define DO_VMOVN(INSN, FN) \
102
--
70
--
103
2.20.1
71
2.25.1
104
105
diff view generated by jsdifflib
1
From: Jan Luebbe <jlu@pengutronix.de>
1
From: Axel Heider <axel.heider@hensoldt.net>
2
2
3
Break events are currently only handled by chardev/char-serial.c, so we
3
remove unused defines, add needed defines
4
just ignore errors, which results in no behaviour change for other
5
chardevs.
6
4
7
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
5
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
8
Message-id: 20210806144700.3751979-1-jlu@pengutronix.de
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
8
---
12
hw/char/pl011.c | 6 ++++++
9
include/hw/timer/imx_epit.h | 4 ++--
13
1 file changed, 6 insertions(+)
10
hw/timer/imx_epit.c | 4 ++--
11
2 files changed, 4 insertions(+), 4 deletions(-)
14
12
15
diff --git a/hw/char/pl011.c b/hw/char/pl011.c
13
diff --git a/include/hw/timer/imx_epit.h b/include/hw/timer/imx_epit.h
16
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/char/pl011.c
15
--- a/include/hw/timer/imx_epit.h
18
+++ b/hw/char/pl011.c
16
+++ b/include/hw/timer/imx_epit.h
19
@@ -XXX,XX +XXX,XX @@
17
@@ -XXX,XX +XXX,XX @@
20
#include "hw/qdev-properties-system.h"
18
#define CR_OCIEN (1 << 2)
21
#include "migration/vmstate.h"
19
#define CR_RLD (1 << 3)
22
#include "chardev/char-fe.h"
20
#define CR_PRESCALE_SHIFT (4)
23
+#include "chardev/char-serial.h"
21
-#define CR_PRESCALE_MASK (0xfff)
24
#include "qemu/log.h"
22
+#define CR_PRESCALE_BITS (12)
25
#include "qemu/module.h"
23
#define CR_SWR (1 << 16)
26
#include "trace.h"
24
#define CR_IOVW (1 << 17)
27
@@ -XXX,XX +XXX,XX @@ static void pl011_write(void *opaque, hwaddr offset,
25
#define CR_DBGEN (1 << 18)
28
s->read_count = 0;
26
@@ -XXX,XX +XXX,XX @@
29
s->read_pos = 0;
27
#define CR_DOZEN (1 << 20)
30
}
28
#define CR_STOPEN (1 << 21)
31
+ if ((s->lcr ^ value) & 0x1) {
29
#define CR_CLKSRC_SHIFT (24)
32
+ int break_enable = value & 0x1;
30
-#define CR_CLKSRC_MASK (0x3 << CR_CLKSRC_SHIFT)
33
+ qemu_chr_fe_ioctl(&s->chr, CHR_IOCTL_SERIAL_SET_BREAK,
31
+#define CR_CLKSRC_BITS (2)
34
+ &break_enable);
32
35
+ }
33
#define EPIT_TIMER_MAX 0XFFFFFFFFUL
36
s->lcr = value;
34
37
pl011_set_read_trigger(s);
35
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
38
break;
36
index XXXXXXX..XXXXXXX 100644
37
--- a/hw/timer/imx_epit.c
38
+++ b/hw/timer/imx_epit.c
39
@@ -XXX,XX +XXX,XX @@ static void imx_epit_set_freq(IMXEPITState *s)
40
uint32_t clksrc;
41
uint32_t prescaler;
42
43
- clksrc = extract32(s->cr, CR_CLKSRC_SHIFT, 2);
44
- prescaler = 1 + extract32(s->cr, CR_PRESCALE_SHIFT, 12);
45
+ clksrc = extract32(s->cr, CR_CLKSRC_SHIFT, CR_CLKSRC_BITS);
46
+ prescaler = 1 + extract32(s->cr, CR_PRESCALE_SHIFT, CR_PRESCALE_BITS);
47
48
s->freq = imx_ccm_get_clock_frequency(s->ccm,
49
imx_epit_clocks[clksrc]) / prescaler;
39
--
50
--
40
2.20.1
51
2.25.1
41
42
diff view generated by jsdifflib
1
In some situations we need a mask telling us which parts of the
1
From: Axel Heider <axel.heider@hensoldt.net>
2
vector correspond to beats that are not being executed because of
3
ECI, separately from the combined "which bytes are predicated away"
4
mask. Factor this mask calculation out of mve_element_mask() into
5
its own function.
6
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
---
5
---
10
target/arm/mve_helper.c | 58 ++++++++++++++++++++++++-----------------
6
include/hw/timer/imx_epit.h | 2 ++
11
1 file changed, 34 insertions(+), 24 deletions(-)
7
hw/timer/imx_epit.c | 12 ++++++------
8
2 files changed, 8 insertions(+), 6 deletions(-)
12
9
13
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
10
diff --git a/include/hw/timer/imx_epit.h b/include/hw/timer/imx_epit.h
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/mve_helper.c
12
--- a/include/hw/timer/imx_epit.h
16
+++ b/target/arm/mve_helper.c
13
+++ b/include/hw/timer/imx_epit.h
17
@@ -XXX,XX +XXX,XX @@
14
@@ -XXX,XX +XXX,XX @@
18
#include "exec/exec-all.h"
15
#define CR_CLKSRC_SHIFT (24)
19
#include "tcg/tcg.h"
16
#define CR_CLKSRC_BITS (2)
20
17
21
+static uint16_t mve_eci_mask(CPUARMState *env)
18
+#define SR_OCIF (1 << 0)
22
+{
23
+ /*
24
+ * Return the mask of which elements in the MVE vector correspond
25
+ * to beats being executed. The mask has 1 bits for executed lanes
26
+ * and 0 bits where ECI says this beat was already executed.
27
+ */
28
+ int eci;
29
+
19
+
30
+ if ((env->condexec_bits & 0xf) != 0) {
20
#define EPIT_TIMER_MAX 0XFFFFFFFFUL
31
+ return 0xffff;
21
32
+ }
22
#define TYPE_IMX_EPIT "imx.epit"
33
+
23
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
34
+ eci = env->condexec_bits >> 4;
24
index XXXXXXX..XXXXXXX 100644
35
+ switch (eci) {
25
--- a/hw/timer/imx_epit.c
36
+ case ECI_NONE:
26
+++ b/hw/timer/imx_epit.c
37
+ return 0xffff;
27
@@ -XXX,XX +XXX,XX @@ static const IMXClk imx_epit_clocks[] = {
38
+ case ECI_A0:
28
*/
39
+ return 0xfff0;
29
static void imx_epit_update_int(IMXEPITState *s)
40
+ case ECI_A0A1:
41
+ return 0xff00;
42
+ case ECI_A0A1A2:
43
+ case ECI_A0A1A2B0:
44
+ return 0xf000;
45
+ default:
46
+ g_assert_not_reached();
47
+ }
48
+}
49
+
50
static uint16_t mve_element_mask(CPUARMState *env)
51
{
30
{
52
/*
31
- if (s->sr && (s->cr & CR_OCIEN) && (s->cr & CR_EN)) {
53
@@ -XXX,XX +XXX,XX @@ static uint16_t mve_element_mask(CPUARMState *env)
32
+ if ((s->sr & SR_OCIF) && (s->cr & CR_OCIEN) && (s->cr & CR_EN)) {
54
mask &= ltpmask;
33
qemu_irq_raise(s->irq);
55
}
34
} else {
56
35
qemu_irq_lower(s->irq);
57
- if ((env->condexec_bits & 0xf) == 0) {
36
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write(void *opaque, hwaddr offset, uint64_t value,
58
- /*
37
break;
59
- * ECI bits indicate which beats are already executed;
38
60
- * we handle this by effectively predicating them out.
39
case 1: /* SR - ACK*/
61
- */
40
- /* writing 1 to OCIF clears the OCIF bit */
62
- int eci = env->condexec_bits >> 4;
41
- if (value & 0x01) {
63
- switch (eci) {
42
- s->sr = 0;
64
- case ECI_NONE:
43
+ /* writing 1 to SR.OCIF clears this bit and turns the interrupt off */
65
- break;
44
+ if (value & SR_OCIF) {
66
- case ECI_A0:
45
+ s->sr = 0; /* SR.OCIF is the only bit in this register anyway */
67
- mask &= 0xfff0;
46
imx_epit_update_int(s);
68
- break;
47
}
69
- case ECI_A0A1:
48
break;
70
- mask &= 0xff00;
49
@@ -XXX,XX +XXX,XX @@ static void imx_epit_cmp(void *opaque)
71
- break;
50
IMXEPITState *s = IMX_EPIT(opaque);
72
- case ECI_A0A1A2:
51
73
- case ECI_A0A1A2B0:
52
DPRINTF("sr was %d\n", s->sr);
74
- mask &= 0xf000;
75
- break;
76
- default:
77
- g_assert_not_reached();
78
- }
79
- }
80
-
53
-
81
+ /*
54
- s->sr = 1;
82
+ * ECI bits indicate which beats are already executed;
55
+ /* Set interrupt status bit SR.OCIF and update the interrupt state */
83
+ * we handle this by effectively predicating them out.
56
+ s->sr |= SR_OCIF;
84
+ */
57
imx_epit_update_int(s);
85
+ mask &= mve_eci_mask(env);
86
return mask;
87
}
58
}
88
59
89
--
60
--
90
2.20.1
61
2.25.1
91
92
diff view generated by jsdifflib
1
From: Sebastian Meyer <meyer@absint.com>
1
From: Axel Heider <axel.heider@hensoldt.net>
2
2
3
With gdb 9.0 and better it is possible to connect to a gdbstub
3
The interrupt state can change due to:
4
over unix sockets, which is better than a TCP socket connection
4
- reset clears both SR.OCIF and CR.OCIE
5
in some situations. The QEMU command line to set this up is
5
- write to CR.EN or CR.OCIE
6
non-obvious; document it.
7
6
8
Signed-off-by: Sebastian Meyer <meyer@absint.com>
7
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
9
Message-id: 162867284829.27377.4784930719350564918-0@git.sr.ht
10
[PMM: Tweaked commit message; adjusted wording in a couple of
11
places; fixed rST formatting issue; moved section up out of
12
the 'advanced debugging options' subsection]
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
10
---
17
docs/system/gdb.rst | 26 +++++++++++++++++++++++++-
11
hw/timer/imx_epit.c | 16 ++++++++++++----
18
1 file changed, 25 insertions(+), 1 deletion(-)
12
1 file changed, 12 insertions(+), 4 deletions(-)
19
13
20
diff --git a/docs/system/gdb.rst b/docs/system/gdb.rst
14
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
21
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
22
--- a/docs/system/gdb.rst
16
--- a/hw/timer/imx_epit.c
23
+++ b/docs/system/gdb.rst
17
+++ b/hw/timer/imx_epit.c
24
@@ -XXX,XX +XXX,XX @@ The ``-s`` option will make QEMU listen for an incoming connection
18
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write(void *opaque, hwaddr offset, uint64_t value,
25
from gdb on TCP port 1234, and ``-S`` will make QEMU not start the
19
if (s->cr & CR_SWR) {
26
guest until you tell it to from gdb. (If you want to specify which
20
/* handle the reset */
27
TCP port to use or to use something other than TCP for the gdbstub
21
imx_epit_reset(DEVICE(s));
28
-connection, use the ``-gdb dev`` option instead of ``-s``.)
22
- /*
29
+connection, use the ``-gdb dev`` option instead of ``-s``. See
23
- * TODO: could we 'break' here? following operations appear
30
+`Using unix sockets`_ for an example.)
24
- * to duplicate the work imx_epit_reset() already did.
31
25
- */
32
.. parsed-literal::
26
}
33
27
34
@@ -XXX,XX +XXX,XX @@ not just those in the cluster you are currently working on::
28
+ /*
35
29
+ * The interrupt state can change due to:
36
(gdb) set schedule-multiple on
30
+ * - reset clears both SR.OCIF and CR.OCIE
37
31
+ * - write to CR.EN or CR.OCIE
38
+Using unix sockets
32
+ */
39
+==================
33
+ imx_epit_update_int(s);
40
+
34
+
41
+An alternate method for connecting gdb to the QEMU gdbstub is to use
35
+ /*
42
+a unix socket (if supported by your operating system). This is useful when
36
+ * TODO: could we 'break' here for reset? following operations appear
43
+running several tests in parallel, or if you do not have a known free TCP
37
+ * to duplicate the work imx_epit_reset() already did.
44
+port (e.g. when running automated tests).
38
+ */
45
+
39
+
46
+First create a chardev with the appropriate options, then
40
ptimer_transaction_begin(s->timer_cmp);
47
+instruct the gdbserver to use that device:
41
ptimer_transaction_begin(s->timer_reload);
48
+
49
+.. parsed-literal::
50
+
51
+ |qemu_system| -chardev socket,path=/tmp/gdb-socket,server=on,wait=off,id=gdb0 -gdb chardev:gdb0 -S ...
52
+
53
+Start gdb as before, but this time connect using the path to
54
+the socket::
55
+
56
+ (gdb) target remote /tmp/gdb-socket
57
+
58
+Note that to use a unix socket for the connection you will need
59
+gdb version 9.0 or newer.
60
+
61
Advanced debugging options
62
==========================
63
42
64
--
43
--
65
2.20.1
44
2.25.1
66
67
diff view generated by jsdifflib
1
Implement the MVE VABAV insn, which computes absolute differences
1
From: Axel Heider <axel.heider@hensoldt.net>
2
between elements of two vectors and accumulates the result into
3
a general purpose register.
4
2
3
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
---
6
---
8
target/arm/helper-mve.h | 7 +++++++
7
hw/timer/imx_epit.c | 20 ++++++++++++++------
9
target/arm/mve.decode | 6 ++++++
8
1 file changed, 14 insertions(+), 6 deletions(-)
10
target/arm/mve_helper.c | 26 +++++++++++++++++++++++
11
target/arm/translate-mve.c | 43 ++++++++++++++++++++++++++++++++++++++
12
4 files changed, 82 insertions(+)
13
9
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
12
--- a/hw/timer/imx_epit.c
17
+++ b/target/arm/helper-mve.h
13
+++ b/hw/timer/imx_epit.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32)
14
@@ -XXX,XX +XXX,XX @@ static void imx_epit_set_freq(IMXEPITState *s)
19
DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64)
15
/*
20
DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64)
16
* This is called both on hardware (device) reset and software reset.
21
17
*/
22
+DEF_HELPER_FLAGS_4(mve_vabavsb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
18
-static void imx_epit_reset(DeviceState *dev)
23
+DEF_HELPER_FLAGS_4(mve_vabavsh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
19
+static void imx_epit_reset(IMXEPITState *s, bool is_hard_reset)
24
+DEF_HELPER_FLAGS_4(mve_vabavsw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
20
{
25
+DEF_HELPER_FLAGS_4(mve_vabavub, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
21
- IMXEPITState *s = IMX_EPIT(dev);
26
+DEF_HELPER_FLAGS_4(mve_vabavuh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
22
-
27
+DEF_HELPER_FLAGS_4(mve_vabavuw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
23
/* Soft reset doesn't touch some bits; hard reset clears them */
28
+
24
- s->cr &= (CR_EN|CR_ENMOD|CR_STOPEN|CR_DOZEN|CR_WAITEN|CR_DBGEN);
29
DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
25
+ if (is_hard_reset) {
30
DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
26
+ s->cr = 0;
31
DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
27
+ } else {
32
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
28
+ s->cr &= (CR_EN|CR_ENMOD|CR_STOPEN|CR_DOZEN|CR_WAITEN|CR_DBGEN);
33
index XXXXXXX..XXXXXXX 100644
29
+ }
34
--- a/target/arm/mve.decode
30
s->sr = 0;
35
+++ b/target/arm/mve.decode
31
s->lr = EPIT_TIMER_MAX;
36
@@ -XXX,XX +XXX,XX @@
32
s->cmp = 0;
37
&vcmp_scalar qn rm size mask
33
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write(void *opaque, hwaddr offset, uint64_t value,
38
&shl_scalar qda rm size
34
s->cr = value & 0x03ffffff;
39
&vmaxv qm rda size
35
if (s->cr & CR_SWR) {
40
+&vabav qn qm rda size
36
/* handle the reset */
41
37
- imx_epit_reset(DEVICE(s));
42
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
38
+ imx_epit_reset(s, false);
43
# Note that both Rn and Qd are 3 bits only (no D bit)
39
}
44
@@ -XXX,XX +XXX,XX @@ VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar
40
45
rdahi=%rdahi rdalo=%rdalo
41
/*
42
@@ -XXX,XX +XXX,XX @@ static void imx_epit_realize(DeviceState *dev, Error **errp)
43
s->timer_cmp = ptimer_init(imx_epit_cmp, s, PTIMER_POLICY_LEGACY);
46
}
44
}
47
45
48
+@vabav .... .... .. size:2 .... rda:4 .... .... .... &vabav qn=%qn qm=%qm
46
+static void imx_epit_dev_reset(DeviceState *dev)
49
+
50
+VABAV_S 111 0 1110 10 .. ... 0 .... 1111 . 0 . 0 ... 1 @vabav
51
+VABAV_U 111 1 1110 10 .. ... 0 .... 1111 . 0 . 0 ... 1 @vabav
52
+
53
# Logical immediate operations (1 reg and modified-immediate)
54
55
# The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but
56
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/mve_helper.c
59
+++ b/target/arm/mve_helper.c
60
@@ -XXX,XX +XXX,XX @@ DO_VMAXMINV(vminavb, 1, int8_t, uint8_t, do_mina)
61
DO_VMAXMINV(vminavh, 2, int16_t, uint16_t, do_mina)
62
DO_VMAXMINV(vminavw, 4, int32_t, uint32_t, do_mina)
63
64
+#define DO_VABAV(OP, ESIZE, TYPE) \
65
+ uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
66
+ void *vm, uint32_t ra) \
67
+ { \
68
+ uint16_t mask = mve_element_mask(env); \
69
+ unsigned e; \
70
+ TYPE *m = vm, *n = vn; \
71
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
72
+ if (mask & 1) { \
73
+ int64_t n0 = n[H##ESIZE(e)]; \
74
+ int64_t m0 = m[H##ESIZE(e)]; \
75
+ uint32_t r = n0 >= m0 ? (n0 - m0) : (m0 - n0); \
76
+ ra += r; \
77
+ } \
78
+ } \
79
+ mve_advance_vpt(env); \
80
+ return ra; \
81
+ }
82
+
83
+DO_VABAV(vabavsb, 1, int8_t)
84
+DO_VABAV(vabavsh, 2, int16_t)
85
+DO_VABAV(vabavsw, 4, int32_t)
86
+DO_VABAV(vabavub, 1, uint8_t)
87
+DO_VABAV(vabavuh, 2, uint16_t)
88
+DO_VABAV(vabavuw, 4, uint32_t)
89
+
90
#define DO_VADDLV(OP, TYPE, LTYPE) \
91
uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
92
uint64_t ra) \
93
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/translate-mve.c
96
+++ b/target/arm/translate-mve.c
97
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32);
98
typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32);
99
typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
100
typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
101
+typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
102
103
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
104
static inline long mve_qreg_offset(unsigned reg)
105
@@ -XXX,XX +XXX,XX @@ DO_VMAXV(VMAXAV, vmaxav)
106
DO_VMAXV(VMINV_S, vminvs)
107
DO_VMAXV(VMINV_U, vminvu)
108
DO_VMAXV(VMINAV, vminav)
109
+
110
+static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn)
111
+{
47
+{
112
+ /* Absolute difference accumulated across vector */
48
+ IMXEPITState *s = IMX_EPIT(dev);
113
+ TCGv_ptr qn, qm;
49
+ imx_epit_reset(s, true);
114
+ TCGv_i32 rda;
115
+
116
+ if (!dc_isar_feature(aa32_mve, s) ||
117
+ !mve_check_qreg_bank(s, a->qm | a->qn) ||
118
+ !fn || a->rda == 13 || a->rda == 15) {
119
+ /* Rda cases are UNPREDICTABLE */
120
+ return false;
121
+ }
122
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
123
+ return true;
124
+ }
125
+
126
+ qm = mve_qreg_ptr(a->qm);
127
+ qn = mve_qreg_ptr(a->qn);
128
+ rda = load_reg(s, a->rda);
129
+ fn(rda, cpu_env, qn, qm, rda);
130
+ store_reg(s, a->rda, rda);
131
+ tcg_temp_free_ptr(qm);
132
+ tcg_temp_free_ptr(qn);
133
+ mve_update_eci(s);
134
+ return true;
135
+}
50
+}
136
+
51
+
137
+#define DO_VABAV(INSN, FN) \
52
static void imx_epit_class_init(ObjectClass *klass, void *data)
138
+ static bool trans_##INSN(DisasContext *s, arg_vabav *a) \
53
{
139
+ { \
54
DeviceClass *dc = DEVICE_CLASS(klass);
140
+ static MVEGenVABAVFn * const fns[] = { \
55
141
+ gen_helper_mve_##FN##b, \
56
dc->realize = imx_epit_realize;
142
+ gen_helper_mve_##FN##h, \
57
- dc->reset = imx_epit_reset;
143
+ gen_helper_mve_##FN##w, \
58
+ dc->reset = imx_epit_dev_reset;
144
+ NULL, \
59
dc->vmsd = &vmstate_imx_timer_epit;
145
+ }; \
60
dc->desc = "i.MX periodic timer";
146
+ return do_vabav(s, a, fns[a->size]); \
61
}
147
+ }
148
+
149
+DO_VABAV(VABAV_S, vabavs)
150
+DO_VABAV(VABAV_U, vabavu)
151
--
62
--
152
2.20.1
63
2.25.1
153
154
diff view generated by jsdifflib
1
Implement the MVE integer vector comparison instructions. These are
1
From: Axel Heider <axel.heider@hensoldt.net>
2
"VCMP (vector)" encodings T1, T2 and T3, and "VPT (vector)" encodings
3
T1, T2 and T3.
4
2
5
These insns compare corresponding elements in each vector, and update
3
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
6
the VPR.P0 predicate bits with the results of the comparison. VPT
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
also sets the VPR.MASK01 and VPR.MASK23 fields -- it is effectively
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
"VCMP then VPST".
6
---
7
hw/timer/imx_epit.c | 215 ++++++++++++++++++++++++--------------------
8
1 file changed, 117 insertions(+), 98 deletions(-)
9
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
---
13
target/arm/helper-mve.h | 32 ++++++++++++++++++++++
14
target/arm/mve.decode | 18 +++++++++++-
15
target/arm/mve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++
16
target/arm/translate-mve.c | 47 ++++++++++++++++++++++++++++++++
17
4 files changed, 152 insertions(+), 1 deletion(-)
18
19
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
20
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper-mve.h
12
--- a/hw/timer/imx_epit.c
22
+++ b/target/arm/helper-mve.h
13
+++ b/hw/timer/imx_epit.c
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_uqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
14
@@ -XXX,XX +XXX,XX @@ static void imx_epit_reload_compare_timer(IMXEPITState *s)
24
DEF_HELPER_FLAGS_3(mve_sqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
15
}
25
DEF_HELPER_FLAGS_3(mve_uqrshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
26
DEF_HELPER_FLAGS_3(mve_sqrshr, TCG_CALL_NO_RWG, i32, env, i32, i32)
27
+
28
+DEF_HELPER_FLAGS_3(mve_vcmpeqb, TCG_CALL_NO_WG, void, env, ptr, ptr)
29
+DEF_HELPER_FLAGS_3(mve_vcmpeqh, TCG_CALL_NO_WG, void, env, ptr, ptr)
30
+DEF_HELPER_FLAGS_3(mve_vcmpeqw, TCG_CALL_NO_WG, void, env, ptr, ptr)
31
+
32
+DEF_HELPER_FLAGS_3(mve_vcmpneb, TCG_CALL_NO_WG, void, env, ptr, ptr)
33
+DEF_HELPER_FLAGS_3(mve_vcmpneh, TCG_CALL_NO_WG, void, env, ptr, ptr)
34
+DEF_HELPER_FLAGS_3(mve_vcmpnew, TCG_CALL_NO_WG, void, env, ptr, ptr)
35
+
36
+DEF_HELPER_FLAGS_3(mve_vcmpcsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
37
+DEF_HELPER_FLAGS_3(mve_vcmpcsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
38
+DEF_HELPER_FLAGS_3(mve_vcmpcsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
39
+
40
+DEF_HELPER_FLAGS_3(mve_vcmphib, TCG_CALL_NO_WG, void, env, ptr, ptr)
41
+DEF_HELPER_FLAGS_3(mve_vcmphih, TCG_CALL_NO_WG, void, env, ptr, ptr)
42
+DEF_HELPER_FLAGS_3(mve_vcmphiw, TCG_CALL_NO_WG, void, env, ptr, ptr)
43
+
44
+DEF_HELPER_FLAGS_3(mve_vcmpgeb, TCG_CALL_NO_WG, void, env, ptr, ptr)
45
+DEF_HELPER_FLAGS_3(mve_vcmpgeh, TCG_CALL_NO_WG, void, env, ptr, ptr)
46
+DEF_HELPER_FLAGS_3(mve_vcmpgew, TCG_CALL_NO_WG, void, env, ptr, ptr)
47
+
48
+DEF_HELPER_FLAGS_3(mve_vcmpltb, TCG_CALL_NO_WG, void, env, ptr, ptr)
49
+DEF_HELPER_FLAGS_3(mve_vcmplth, TCG_CALL_NO_WG, void, env, ptr, ptr)
50
+DEF_HELPER_FLAGS_3(mve_vcmpltw, TCG_CALL_NO_WG, void, env, ptr, ptr)
51
+
52
+DEF_HELPER_FLAGS_3(mve_vcmpgtb, TCG_CALL_NO_WG, void, env, ptr, ptr)
53
+DEF_HELPER_FLAGS_3(mve_vcmpgth, TCG_CALL_NO_WG, void, env, ptr, ptr)
54
+DEF_HELPER_FLAGS_3(mve_vcmpgtw, TCG_CALL_NO_WG, void, env, ptr, ptr)
55
+
56
+DEF_HELPER_FLAGS_3(mve_vcmpleb, TCG_CALL_NO_WG, void, env, ptr, ptr)
57
+DEF_HELPER_FLAGS_3(mve_vcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr)
58
+DEF_HELPER_FLAGS_3(mve_vcmplew, TCG_CALL_NO_WG, void, env, ptr, ptr)
59
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
60
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/mve.decode
62
+++ b/target/arm/mve.decode
63
@@ -XXX,XX +XXX,XX @@
64
&2shift qd qm shift size
65
&vidup qd rn size imm
66
&viwdup qd rn rm size imm
67
+&vcmp qm qn size mask
68
69
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
70
# Note that both Rn and Qd are 3 bits only (no D bit)
71
@@ -XXX,XX +XXX,XX @@
72
@2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \
73
size=2 shift=%rshift_i5
74
75
+# Vector comparison; 4-bit Qm but 3-bit Qn
76
+%mask_22_13 22:1 13:3
77
+@vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13
78
+
79
# Vector loads and stores
80
81
# Widening loads and narrowing stores:
82
@@ -XXX,XX +XXX,XX @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
83
}
16
}
84
17
85
# Predicate operations
18
+static void imx_epit_write_cr(IMXEPITState *s, uint32_t value)
86
-%mask_22_13 22:1 13:3
19
+{
87
VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
20
+ uint32_t oldcr = s->cr;
88
21
+
89
# Logical immediate operations (1 reg and modified-immediate)
22
+ s->cr = value & 0x03ffffff;
90
@@ -XXX,XX +XXX,XX @@ VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
23
+
91
VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
24
+ if (s->cr & CR_SWR) {
92
25
+ /* handle the reset */
93
VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd
26
+ imx_epit_reset(s, false);
94
+
27
+ }
95
+# Comparisons. We expand out the conditions which are split across
28
+
96
+# encodings T1, T2, T3 and the fc bits. These include VPT, which is
29
+ /*
97
+# effectively "VCMP then VPST". A plain "VCMP" has a mask field of zero.
30
+ * The interrupt state can change due to:
98
+VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp
31
+ * - reset clears both SR.OCIF and CR.OCIE
99
+VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp
32
+ * - write to CR.EN or CR.OCIE
100
+VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp
33
+ */
101
+VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp
34
+ imx_epit_update_int(s);
102
+VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp
35
+
103
+VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp
36
+ /*
104
+VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp
37
+ * TODO: could we 'break' here for reset? following operations appear
105
+VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp
38
+ * to duplicate the work imx_epit_reset() already did.
106
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
39
+ */
107
index XXXXXXX..XXXXXXX 100644
40
+
108
--- a/target/arm/mve_helper.c
41
+ ptimer_transaction_begin(s->timer_cmp);
109
+++ b/target/arm/mve_helper.c
42
+ ptimer_transaction_begin(s->timer_reload);
110
@@ -XXX,XX +XXX,XX @@ static uint32_t do_sub_wrap(uint32_t offset, uint32_t wrap, uint32_t imm)
43
+
111
DO_VIDUP_ALL(vidup, DO_ADD)
44
+ /* Update the frequency. Has been done already in case of a reset. */
112
DO_VIWDUP_ALL(viwdup, do_add_wrap)
45
+ if (!(s->cr & CR_SWR)) {
113
DO_VIWDUP_ALL(vdwdup, do_sub_wrap)
46
+ imx_epit_set_freq(s);
114
+
47
+ }
115
+/*
48
+
116
+ * Vector comparison.
49
+ if (s->freq && (s->cr & CR_EN) && !(oldcr & CR_EN)) {
117
+ * P0 bits for non-executed beats (where eci_mask is 0) are unchanged.
50
+ if (s->cr & CR_ENMOD) {
118
+ * P0 bits for predicated lanes in executed beats (where mask is 0) are 0.
51
+ if (s->cr & CR_RLD) {
119
+ * P0 bits otherwise are updated with the results of the comparisons.
52
+ ptimer_set_limit(s->timer_reload, s->lr, 1);
120
+ * We must also keep unchanged the MASK fields at the top of v7m.vpr.
53
+ ptimer_set_limit(s->timer_cmp, s->lr, 1);
121
+ */
54
+ } else {
122
+#define DO_VCMP(OP, ESIZE, TYPE, FN) \
55
+ ptimer_set_limit(s->timer_reload, EPIT_TIMER_MAX, 1);
123
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, void *vm) \
56
+ ptimer_set_limit(s->timer_cmp, EPIT_TIMER_MAX, 1);
124
+ { \
57
+ }
125
+ TYPE *n = vn, *m = vm; \
58
+ }
126
+ uint16_t mask = mve_element_mask(env); \
59
+
127
+ uint16_t eci_mask = mve_eci_mask(env); \
60
+ imx_epit_reload_compare_timer(s);
128
+ uint16_t beatpred = 0; \
61
+ ptimer_run(s->timer_reload, 0);
129
+ uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \
62
+ if (s->cr & CR_OCIEN) {
130
+ unsigned e; \
63
+ ptimer_run(s->timer_cmp, 0);
131
+ for (e = 0; e < 16 / ESIZE; e++) { \
64
+ } else {
132
+ bool r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)]); \
65
+ ptimer_stop(s->timer_cmp);
133
+ /* Comparison sets 0/1 bits for each byte in the element */ \
66
+ }
134
+ beatpred |= r * emask; \
67
+ } else if (!(s->cr & CR_EN)) {
135
+ emask <<= ESIZE; \
68
+ /* stop both timers */
136
+ } \
69
+ ptimer_stop(s->timer_reload);
137
+ beatpred &= mask; \
70
+ ptimer_stop(s->timer_cmp);
138
+ env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \
71
+ } else if (s->cr & CR_OCIEN) {
139
+ (beatpred & eci_mask); \
72
+ if (!(oldcr & CR_OCIEN)) {
140
+ mve_advance_vpt(env); \
73
+ imx_epit_reload_compare_timer(s);
141
+ }
74
+ ptimer_run(s->timer_cmp, 0);
142
+
75
+ }
143
+#define DO_VCMP_S(OP, FN) \
76
+ } else {
144
+ DO_VCMP(OP##b, 1, int8_t, FN) \
77
+ ptimer_stop(s->timer_cmp);
145
+ DO_VCMP(OP##h, 2, int16_t, FN) \
78
+ }
146
+ DO_VCMP(OP##w, 4, int32_t, FN)
79
+
147
+
80
+ ptimer_transaction_commit(s->timer_cmp);
148
+#define DO_VCMP_U(OP, FN) \
81
+ ptimer_transaction_commit(s->timer_reload);
149
+ DO_VCMP(OP##b, 1, uint8_t, FN) \
82
+}
150
+ DO_VCMP(OP##h, 2, uint16_t, FN) \
83
+
151
+ DO_VCMP(OP##w, 4, uint32_t, FN)
84
+static void imx_epit_write_sr(IMXEPITState *s, uint32_t value)
152
+
85
+{
153
+#define DO_EQ(N, M) ((N) == (M))
86
+ /* writing 1 to SR.OCIF clears this bit and turns the interrupt off */
154
+#define DO_NE(N, M) ((N) != (M))
87
+ if (value & SR_OCIF) {
155
+#define DO_EQ(N, M) ((N) == (M))
88
+ s->sr = 0; /* SR.OCIF is the only bit in this register anyway */
156
+#define DO_EQ(N, M) ((N) == (M))
89
+ imx_epit_update_int(s);
157
+#define DO_GE(N, M) ((N) >= (M))
90
+ }
158
+#define DO_LT(N, M) ((N) < (M))
91
+}
159
+#define DO_GT(N, M) ((N) > (M))
92
+
160
+#define DO_LE(N, M) ((N) <= (M))
93
+static void imx_epit_write_lr(IMXEPITState *s, uint32_t value)
161
+
94
+{
162
+DO_VCMP_U(vcmpeq, DO_EQ)
95
+ s->lr = value;
163
+DO_VCMP_U(vcmpne, DO_NE)
96
+
164
+DO_VCMP_U(vcmpcs, DO_GE)
97
+ ptimer_transaction_begin(s->timer_cmp);
165
+DO_VCMP_U(vcmphi, DO_GT)
98
+ ptimer_transaction_begin(s->timer_reload);
166
+DO_VCMP_S(vcmpge, DO_GE)
99
+ if (s->cr & CR_RLD) {
167
+DO_VCMP_S(vcmplt, DO_LT)
100
+ /* Also set the limit if the LRD bit is set */
168
+DO_VCMP_S(vcmpgt, DO_GT)
101
+ /* If IOVW bit is set then set the timer value */
169
+DO_VCMP_S(vcmple, DO_LE)
102
+ ptimer_set_limit(s->timer_reload, s->lr, s->cr & CR_IOVW);
170
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
103
+ ptimer_set_limit(s->timer_cmp, s->lr, 0);
171
index XXXXXXX..XXXXXXX 100644
104
+ } else if (s->cr & CR_IOVW) {
172
--- a/target/arm/translate-mve.c
105
+ /* If IOVW bit is set then set the timer value */
173
+++ b/target/arm/translate-mve.c
106
+ ptimer_set_count(s->timer_reload, s->lr);
174
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
107
+ }
175
typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
108
+ /*
176
typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32);
109
+ * Commit the change to s->timer_reload, so it can propagate. Otherwise
177
typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32);
110
+ * the timer interrupt may not fire properly. The commit must happen
178
+typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
111
+ * before calling imx_epit_reload_compare_timer(), which reads
179
112
+ * s->timer_reload internally again.
180
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
113
+ */
181
static inline long mve_qreg_offset(unsigned reg)
114
+ ptimer_transaction_commit(s->timer_reload);
182
@@ -XXX,XX +XXX,XX @@ static bool trans_VDWDUP(DisasContext *s, arg_viwdup *a)
115
+ imx_epit_reload_compare_timer(s);
183
};
116
+ ptimer_transaction_commit(s->timer_cmp);
184
return do_viwdup(s, a, fns[a->size]);
117
+}
118
+
119
+static void imx_epit_write_cmp(IMXEPITState *s, uint32_t value)
120
+{
121
+ s->cmp = value;
122
+
123
+ ptimer_transaction_begin(s->timer_cmp);
124
+ imx_epit_reload_compare_timer(s);
125
+ ptimer_transaction_commit(s->timer_cmp);
126
+}
127
+
128
static void imx_epit_write(void *opaque, hwaddr offset, uint64_t value,
129
unsigned size)
130
{
131
IMXEPITState *s = IMX_EPIT(opaque);
132
- uint64_t oldcr;
133
134
DPRINTF("(%s, value = 0x%08x)\n", imx_epit_reg_name(offset >> 2),
135
(uint32_t)value);
136
137
switch (offset >> 2) {
138
case 0: /* CR */
139
-
140
- oldcr = s->cr;
141
- s->cr = value & 0x03ffffff;
142
- if (s->cr & CR_SWR) {
143
- /* handle the reset */
144
- imx_epit_reset(s, false);
145
- }
146
-
147
- /*
148
- * The interrupt state can change due to:
149
- * - reset clears both SR.OCIF and CR.OCIE
150
- * - write to CR.EN or CR.OCIE
151
- */
152
- imx_epit_update_int(s);
153
-
154
- /*
155
- * TODO: could we 'break' here for reset? following operations appear
156
- * to duplicate the work imx_epit_reset() already did.
157
- */
158
-
159
- ptimer_transaction_begin(s->timer_cmp);
160
- ptimer_transaction_begin(s->timer_reload);
161
-
162
- /* Update the frequency. Has been done already in case of a reset. */
163
- if (!(s->cr & CR_SWR)) {
164
- imx_epit_set_freq(s);
165
- }
166
-
167
- if (s->freq && (s->cr & CR_EN) && !(oldcr & CR_EN)) {
168
- if (s->cr & CR_ENMOD) {
169
- if (s->cr & CR_RLD) {
170
- ptimer_set_limit(s->timer_reload, s->lr, 1);
171
- ptimer_set_limit(s->timer_cmp, s->lr, 1);
172
- } else {
173
- ptimer_set_limit(s->timer_reload, EPIT_TIMER_MAX, 1);
174
- ptimer_set_limit(s->timer_cmp, EPIT_TIMER_MAX, 1);
175
- }
176
- }
177
-
178
- imx_epit_reload_compare_timer(s);
179
- ptimer_run(s->timer_reload, 0);
180
- if (s->cr & CR_OCIEN) {
181
- ptimer_run(s->timer_cmp, 0);
182
- } else {
183
- ptimer_stop(s->timer_cmp);
184
- }
185
- } else if (!(s->cr & CR_EN)) {
186
- /* stop both timers */
187
- ptimer_stop(s->timer_reload);
188
- ptimer_stop(s->timer_cmp);
189
- } else if (s->cr & CR_OCIEN) {
190
- if (!(oldcr & CR_OCIEN)) {
191
- imx_epit_reload_compare_timer(s);
192
- ptimer_run(s->timer_cmp, 0);
193
- }
194
- } else {
195
- ptimer_stop(s->timer_cmp);
196
- }
197
-
198
- ptimer_transaction_commit(s->timer_cmp);
199
- ptimer_transaction_commit(s->timer_reload);
200
+ imx_epit_write_cr(s, (uint32_t)value);
201
break;
202
203
- case 1: /* SR - ACK*/
204
- /* writing 1 to SR.OCIF clears this bit and turns the interrupt off */
205
- if (value & SR_OCIF) {
206
- s->sr = 0; /* SR.OCIF is the only bit in this register anyway */
207
- imx_epit_update_int(s);
208
- }
209
+ case 1: /* SR */
210
+ imx_epit_write_sr(s, (uint32_t)value);
211
break;
212
213
- case 2: /* LR - set ticks */
214
- s->lr = value;
215
-
216
- ptimer_transaction_begin(s->timer_cmp);
217
- ptimer_transaction_begin(s->timer_reload);
218
- if (s->cr & CR_RLD) {
219
- /* Also set the limit if the LRD bit is set */
220
- /* If IOVW bit is set then set the timer value */
221
- ptimer_set_limit(s->timer_reload, s->lr, s->cr & CR_IOVW);
222
- ptimer_set_limit(s->timer_cmp, s->lr, 0);
223
- } else if (s->cr & CR_IOVW) {
224
- /* If IOVW bit is set then set the timer value */
225
- ptimer_set_count(s->timer_reload, s->lr);
226
- }
227
- /*
228
- * Commit the change to s->timer_reload, so it can propagate. Otherwise
229
- * the timer interrupt may not fire properly. The commit must happen
230
- * before calling imx_epit_reload_compare_timer(), which reads
231
- * s->timer_reload internally again.
232
- */
233
- ptimer_transaction_commit(s->timer_reload);
234
- imx_epit_reload_compare_timer(s);
235
- ptimer_transaction_commit(s->timer_cmp);
236
+ case 2: /* LR */
237
+ imx_epit_write_lr(s, (uint32_t)value);
238
break;
239
240
case 3: /* CMP */
241
- s->cmp = value;
242
-
243
- ptimer_transaction_begin(s->timer_cmp);
244
- imx_epit_reload_compare_timer(s);
245
- ptimer_transaction_commit(s->timer_cmp);
246
-
247
+ imx_epit_write_cmp(s, (uint32_t)value);
248
break;
249
250
default:
251
qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Bad register at offset 0x%"
252
HWADDR_PRIx "\n", TYPE_IMX_EPIT, __func__, offset);
253
-
254
break;
255
}
185
}
256
}
186
+
257
+
187
+static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn)
258
static void imx_epit_cmp(void *opaque)
188
+{
259
{
189
+ TCGv_ptr qn, qm;
260
IMXEPITState *s = IMX_EPIT(opaque);
190
+
191
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qm) ||
192
+ !fn) {
193
+ return false;
194
+ }
195
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
196
+ return true;
197
+ }
198
+
199
+ qn = mve_qreg_ptr(a->qn);
200
+ qm = mve_qreg_ptr(a->qm);
201
+ fn(cpu_env, qn, qm);
202
+ tcg_temp_free_ptr(qn);
203
+ tcg_temp_free_ptr(qm);
204
+ if (a->mask) {
205
+ /* VPT */
206
+ gen_vpst(s, a->mask);
207
+ }
208
+ mve_update_eci(s);
209
+ return true;
210
+}
211
+
212
+#define DO_VCMP(INSN, FN) \
213
+ static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \
214
+ { \
215
+ static MVEGenCmpFn * const fns[] = { \
216
+ gen_helper_mve_##FN##b, \
217
+ gen_helper_mve_##FN##h, \
218
+ gen_helper_mve_##FN##w, \
219
+ NULL, \
220
+ }; \
221
+ return do_vcmp(s, a, fns[a->size]); \
222
+ }
223
+
224
+DO_VCMP(VCMPEQ, vcmpeq)
225
+DO_VCMP(VCMPNE, vcmpne)
226
+DO_VCMP(VCMPCS, vcmpcs)
227
+DO_VCMP(VCMPHI, vcmphi)
228
+DO_VCMP(VCMPGE, vcmpge)
229
+DO_VCMP(VCMPLT, vcmplt)
230
+DO_VCMP(VCMPGT, vcmpgt)
231
+DO_VCMP(VCMPLE, vcmple)
232
--
261
--
233
2.20.1
262
2.25.1
234
235
diff view generated by jsdifflib
1
Implement the MVE instructions which perform shifts by a scalar.
1
From: Axel Heider <axel.heider@hensoldt.net>
2
These are VSHL T2, VRSHL T2, VQSHL T1 and VQRSHL T2. They take the
3
shift amount in a general purpose register and shift every element in
4
the vector by that amount.
5
2
6
Mostly we can reuse the helper functions for shift-by-immediate; we
3
The CNT register is a read-only register. There is no need to
7
do need two new helpers for VQRSHL.
4
store it's value, it can be calculated on demand.
5
The calculated frequency is needed temporarily only.
8
6
7
Note that this is a migration compatibility break for all boards
8
types that use the EPIT peripheral.
9
10
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
---
13
---
12
target/arm/helper-mve.h | 8 +++++++
14
include/hw/timer/imx_epit.h | 2 -
13
target/arm/mve.decode | 23 ++++++++++++++++---
15
hw/timer/imx_epit.c | 73 ++++++++++++++-----------------------
14
target/arm/mve_helper.c | 2 ++
16
2 files changed, 28 insertions(+), 47 deletions(-)
15
target/arm/translate-mve.c | 46 ++++++++++++++++++++++++++++++++++++++
16
4 files changed, 76 insertions(+), 3 deletions(-)
17
17
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
18
diff --git a/include/hw/timer/imx_epit.h b/include/hw/timer/imx_epit.h
19
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-mve.h
20
--- a/include/hw/timer/imx_epit.h
21
+++ b/target/arm/helper-mve.h
21
+++ b/include/hw/timer/imx_epit.h
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
@@ -XXX,XX +XXX,XX @@ struct IMXEPITState {
23
DEF_HELPER_FLAGS_4(mve_vrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
uint32_t sr;
24
DEF_HELPER_FLAGS_4(mve_vrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
uint32_t lr;
25
25
uint32_t cmp;
26
+DEF_HELPER_FLAGS_4(mve_vqrshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
- uint32_t cnt;
27
+DEF_HELPER_FLAGS_4(mve_vqrshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
28
+DEF_HELPER_FLAGS_4(mve_vqrshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
- uint32_t freq;
29
qemu_irq irq;
30
};
31
32
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/hw/timer/imx_epit.c
35
+++ b/hw/timer/imx_epit.c
36
@@ -XXX,XX +XXX,XX @@ static void imx_epit_update_int(IMXEPITState *s)
37
}
38
}
39
40
-/*
41
- * Must be called from within a ptimer_transaction_begin/commit block
42
- * for both s->timer_cmp and s->timer_reload.
43
- */
44
-static void imx_epit_set_freq(IMXEPITState *s)
45
+static uint32_t imx_epit_get_freq(IMXEPITState *s)
46
{
47
- uint32_t clksrc;
48
- uint32_t prescaler;
49
-
50
- clksrc = extract32(s->cr, CR_CLKSRC_SHIFT, CR_CLKSRC_BITS);
51
- prescaler = 1 + extract32(s->cr, CR_PRESCALE_SHIFT, CR_PRESCALE_BITS);
52
-
53
- s->freq = imx_ccm_get_clock_frequency(s->ccm,
54
- imx_epit_clocks[clksrc]) / prescaler;
55
-
56
- DPRINTF("Setting ptimer frequency to %u\n", s->freq);
57
-
58
- if (s->freq) {
59
- ptimer_set_freq(s->timer_reload, s->freq);
60
- ptimer_set_freq(s->timer_cmp, s->freq);
61
- }
62
+ uint32_t clksrc = extract32(s->cr, CR_CLKSRC_SHIFT, CR_CLKSRC_BITS);
63
+ uint32_t prescaler = 1 + extract32(s->cr, CR_PRESCALE_SHIFT, CR_PRESCALE_BITS);
64
+ uint32_t f_in = imx_ccm_get_clock_frequency(s->ccm, imx_epit_clocks[clksrc]);
65
+ uint32_t freq = f_in / prescaler;
66
+ DPRINTF("ptimer frequency is %u\n", freq);
67
+ return freq;
68
}
69
70
/*
71
@@ -XXX,XX +XXX,XX @@ static void imx_epit_reset(IMXEPITState *s, bool is_hard_reset)
72
s->sr = 0;
73
s->lr = EPIT_TIMER_MAX;
74
s->cmp = 0;
75
- s->cnt = 0;
76
ptimer_transaction_begin(s->timer_cmp);
77
ptimer_transaction_begin(s->timer_reload);
78
- /* stop both timers */
29
+
79
+
30
+DEF_HELPER_FLAGS_4(mve_vqrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
80
+ /*
31
+DEF_HELPER_FLAGS_4(mve_vqrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
81
+ * The reset switches off the input clock, so even if the CR.EN is still
32
+DEF_HELPER_FLAGS_4(mve_vqrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
82
+ * set, the timers are no longer running.
33
+
83
+ */
34
DEF_HELPER_FLAGS_4(mve_vshllbsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
84
+ assert(imx_epit_get_freq(s) == 0);
35
DEF_HELPER_FLAGS_4(mve_vshllbsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
85
ptimer_stop(s->timer_cmp);
36
DEF_HELPER_FLAGS_4(mve_vshllbub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
86
ptimer_stop(s->timer_reload);
37
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
87
- /* compute new frequency */
38
index XXXXXXX..XXXXXXX 100644
88
- imx_epit_set_freq(s);
39
--- a/target/arm/mve.decode
89
/* init both timers to EPIT_TIMER_MAX */
40
+++ b/target/arm/mve.decode
90
ptimer_set_limit(s->timer_cmp, EPIT_TIMER_MAX, 1);
41
@@ -XXX,XX +XXX,XX @@
91
ptimer_set_limit(s->timer_reload, EPIT_TIMER_MAX, 1);
42
&viwdup qd rn rm size imm
92
- if (s->freq && (s->cr & CR_EN)) {
43
&vcmp qm qn size mask
93
- /* if the timer is still enabled, restart it */
44
&vcmp_scalar qn rm size mask
94
- ptimer_run(s->timer_reload, 0);
45
+&shl_scalar qda rm size
95
- }
46
96
ptimer_transaction_commit(s->timer_cmp);
47
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
97
ptimer_transaction_commit(s->timer_reload);
48
# Note that both Rn and Qd are 3 bits only (no D bit)
49
@@ -XXX,XX +XXX,XX @@
50
@2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \
51
size=2 shift=%rshift_i5
52
53
+@shl_scalar .... .... .... size:2 .. .... .... .... rm:4 &shl_scalar qda=%qd
54
+
55
# Vector comparison; 4-bit Qm but 3-bit Qn
56
%mask_22_13 22:1 13:3
57
@vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13
58
@@ -XXX,XX +XXX,XX @@ VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_no
59
60
VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
61
VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar
62
-VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
63
+
64
+{
65
+ VSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar
66
+ VRSHL_S_scalar 1110 1110 0 . 11 .. 11 ... 1 1110 0110 .... @shl_scalar
67
+ VQSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 1110 .... @shl_scalar
68
+ VQRSHL_S_scalar 1110 1110 0 . 11 .. 11 ... 1 1110 1110 .... @shl_scalar
69
+ VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
70
+}
71
+
72
+{
73
+ VSHL_U_scalar 1111 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar
74
+ VRSHL_U_scalar 1111 1110 0 . 11 .. 11 ... 1 1110 0110 .... @shl_scalar
75
+ VQSHL_U_scalar 1111 1110 0 . 11 .. 01 ... 1 1110 1110 .... @shl_scalar
76
+ VQRSHL_U_scalar 1111 1110 0 . 11 .. 11 ... 1 1110 1110 .... @shl_scalar
77
+ VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
78
+}
79
+
80
VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
81
VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
82
VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
83
@@ -XXX,XX +XXX,XX @@ VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
84
size=%size_28
85
}
98
}
86
99
87
-VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
100
-static uint32_t imx_epit_update_count(IMXEPITState *s)
101
-{
102
- s->cnt = ptimer_get_count(s->timer_reload);
88
-
103
-
89
VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
104
- return s->cnt;
90
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
105
-}
91
106
-
92
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
107
static uint64_t imx_epit_read(void *opaque, hwaddr offset, unsigned size)
93
index XXXXXXX..XXXXXXX 100644
108
{
94
--- a/target/arm/mve_helper.c
109
IMXEPITState *s = IMX_EPIT(opaque);
95
+++ b/target/arm/mve_helper.c
110
@@ -XXX,XX +XXX,XX @@ static uint64_t imx_epit_read(void *opaque, hwaddr offset, unsigned size)
96
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP)
111
break;
97
DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
112
98
DO_2SHIFT_U(vrshli_u, DO_VRSHLU)
113
case 4: /* CNT */
99
DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
114
- imx_epit_update_count(s);
100
+DO_2SHIFT_SAT_U(vqrshli_u, DO_UQRSHL_OP)
115
- reg_value = s->cnt;
101
+DO_2SHIFT_SAT_S(vqrshli_s, DO_SQRSHL_OP)
116
+ reg_value = ptimer_get_count(s->timer_reload);
102
117
break;
103
/* Shift-and-insert; we always work with 64 bits at a time */
118
104
#define DO_2SHIFT_INSERT(OP, ESIZE, SHIFTFN, MASKFN) \
119
default:
105
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
120
@@ -XXX,XX +XXX,XX @@ static void imx_epit_reload_compare_timer(IMXEPITState *s)
106
index XXXXXXX..XXXXXXX 100644
121
{
107
--- a/target/arm/translate-mve.c
122
if ((s->cr & (CR_EN | CR_OCIEN)) == (CR_EN | CR_OCIEN)) {
108
+++ b/target/arm/translate-mve.c
123
/* if the compare feature is on and timers are running */
109
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT(VRSHRI_U, vrshli_u, true)
124
- uint32_t tmp = imx_epit_update_count(s);
110
DO_2SHIFT(VSRI, vsri, false)
125
+ uint32_t tmp = ptimer_get_count(s->timer_reload);
111
DO_2SHIFT(VSLI, vsli, false)
126
uint64_t next;
112
127
if (tmp > s->cmp) {
113
+static bool do_2shift_scalar(DisasContext *s, arg_shl_scalar *a,
128
/* It'll fire in this round of the timer */
114
+ MVEGenTwoOpShiftFn *fn)
129
@@ -XXX,XX +XXX,XX @@ static void imx_epit_reload_compare_timer(IMXEPITState *s)
115
+{
130
116
+ TCGv_ptr qda;
131
static void imx_epit_write_cr(IMXEPITState *s, uint32_t value)
117
+ TCGv_i32 rm;
132
{
118
+
133
+ uint32_t freq = 0;
119
+ if (!dc_isar_feature(aa32_mve, s) ||
134
uint32_t oldcr = s->cr;
120
+ !mve_check_qreg_bank(s, a->qda) ||
135
121
+ a->rm == 13 || a->rm == 15 || !fn) {
136
s->cr = value & 0x03ffffff;
122
+ /* Rm cases are UNPREDICTABLE */
137
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write_cr(IMXEPITState *s, uint32_t value)
123
+ return false;
138
ptimer_transaction_begin(s->timer_cmp);
124
+ }
139
ptimer_transaction_begin(s->timer_reload);
125
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
140
126
+ return true;
141
- /* Update the frequency. Has been done already in case of a reset. */
127
+ }
142
+ /*
128
+
143
+ * Update the frequency. In case of a reset the input clock was
129
+ qda = mve_qreg_ptr(a->qda);
144
+ * switched off, so this can be skipped.
130
+ rm = load_reg(s, a->rm);
145
+ */
131
+ fn(cpu_env, qda, qda, rm);
146
if (!(s->cr & CR_SWR)) {
132
+ tcg_temp_free_ptr(qda);
147
- imx_epit_set_freq(s);
133
+ tcg_temp_free_i32(rm);
148
+ freq = imx_epit_get_freq(s);
134
+ mve_update_eci(s);
149
+ if (freq) {
135
+ return true;
150
+ ptimer_set_freq(s->timer_reload, freq);
136
+}
151
+ ptimer_set_freq(s->timer_cmp, freq);
137
+
152
+ }
138
+#define DO_2SHIFT_SCALAR(INSN, FN) \
153
}
139
+ static bool trans_##INSN(DisasContext *s, arg_shl_scalar *a) \
154
140
+ { \
155
- if (s->freq && (s->cr & CR_EN) && !(oldcr & CR_EN)) {
141
+ static MVEGenTwoOpShiftFn * const fns[] = { \
156
+ if (freq && (s->cr & CR_EN) && !(oldcr & CR_EN)) {
142
+ gen_helper_mve_##FN##b, \
157
if (s->cr & CR_ENMOD) {
143
+ gen_helper_mve_##FN##h, \
158
if (s->cr & CR_RLD) {
144
+ gen_helper_mve_##FN##w, \
159
ptimer_set_limit(s->timer_reload, s->lr, 1);
145
+ NULL, \
160
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps imx_epit_ops = {
146
+ }; \
161
147
+ return do_2shift_scalar(s, a, fns[a->size]); \
162
static const VMStateDescription vmstate_imx_timer_epit = {
148
+ }
163
.name = TYPE_IMX_EPIT,
149
+
164
- .version_id = 2,
150
+DO_2SHIFT_SCALAR(VSHL_S_scalar, vshli_s)
165
- .minimum_version_id = 2,
151
+DO_2SHIFT_SCALAR(VSHL_U_scalar, vshli_u)
166
+ .version_id = 3,
152
+DO_2SHIFT_SCALAR(VRSHL_S_scalar, vrshli_s)
167
+ .minimum_version_id = 3,
153
+DO_2SHIFT_SCALAR(VRSHL_U_scalar, vrshli_u)
168
.fields = (VMStateField[]) {
154
+DO_2SHIFT_SCALAR(VQSHL_S_scalar, vqshli_s)
169
VMSTATE_UINT32(cr, IMXEPITState),
155
+DO_2SHIFT_SCALAR(VQSHL_U_scalar, vqshli_u)
170
VMSTATE_UINT32(sr, IMXEPITState),
156
+DO_2SHIFT_SCALAR(VQRSHL_S_scalar, vqrshli_s)
171
VMSTATE_UINT32(lr, IMXEPITState),
157
+DO_2SHIFT_SCALAR(VQRSHL_U_scalar, vqrshli_u)
172
VMSTATE_UINT32(cmp, IMXEPITState),
158
+
173
- VMSTATE_UINT32(cnt, IMXEPITState),
159
#define DO_VSHLL(INSN, FN) \
174
- VMSTATE_UINT32(freq, IMXEPITState),
160
static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
175
VMSTATE_PTIMER(timer_reload, IMXEPITState),
161
{ \
176
VMSTATE_PTIMER(timer_cmp, IMXEPITState),
177
VMSTATE_END_OF_LIST()
162
--
178
--
163
2.20.1
179
2.25.1
164
165
diff view generated by jsdifflib
1
From: Hamza Mahfooz <someguy@effective-light.com>
1
From: Axel Heider <axel.heider@hensoldt.net>
2
2
3
As per commit 5626f8c6d468 ("rcu: Add automatically released rcu_read_lock
3
- fix #1263 for CR writes
4
variants"), RCU_READ_LOCK_GUARD() should be used instead of
4
- rework compare time handling
5
rcu_read_{un}lock().
5
- The compare timer has to run even if CR.OCIEN is not set,
6
as SR.OCIF must be updated.
7
- The compare timer fires exactly once when the
8
compare value is less than the current value, but the
9
reload values is less than the compare value.
10
- The compare timer will never fire if the reload value is
11
less than the compare value. Disable it in this case.
6
12
7
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
13
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
8
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
14
[PMM: fixed minor style nits]
9
Message-id: 20210727235201.11491-1-someguy@effective-light.com
15
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
17
---
12
target/arm/kvm.c | 17 ++++++++---------
18
hw/timer/imx_epit.c | 192 ++++++++++++++++++++++++++------------------
13
1 file changed, 8 insertions(+), 9 deletions(-)
19
1 file changed, 116 insertions(+), 76 deletions(-)
14
20
15
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
21
diff --git a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c
16
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/kvm.c
23
--- a/hw/timer/imx_epit.c
18
+++ b/target/arm/kvm.c
24
+++ b/hw/timer/imx_epit.c
19
@@ -XXX,XX +XXX,XX @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
25
@@ -XXX,XX +XXX,XX @@
20
hwaddr xlat, len, doorbell_gpa;
26
* Originally written by Hans Jiang
21
MemoryRegionSection mrs;
27
* Updated by Peter Chubb
22
MemoryRegion *mr;
28
* Updated by Jean-Christophe Dubois <jcd@tribudubois.net>
23
- int ret = 1;
29
+ * Updated by Axel Heider
24
30
*
25
if (as == &address_space_memory) {
31
* This code is licensed under GPL version 2 or later. See
26
return 0;
32
* the COPYING file in the top-level directory.
27
@@ -XXX,XX +XXX,XX @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
33
@@ -XXX,XX +XXX,XX @@ static uint64_t imx_epit_read(void *opaque, hwaddr offset, unsigned size)
28
34
return reg_value;
29
/* MSI doorbell address is translated by an IOMMU */
35
}
30
36
31
- rcu_read_lock();
37
-/* Must be called from ptimer_transaction_begin/commit block for s->timer_cmp */
32
+ RCU_READ_LOCK_GUARD();
38
-static void imx_epit_reload_compare_timer(IMXEPITState *s)
33
+
39
+/*
34
mr = address_space_translate(as, address, &xlat, &len, true,
40
+ * Must be called from a ptimer_transaction_begin/commit block for
35
MEMTXATTRS_UNSPECIFIED);
41
+ * s->timer_cmp, but outside of a transaction block of s->timer_reload,
36
+
42
+ * so the proper counter value is read.
37
if (!mr) {
43
+ */
38
- goto unlock;
44
+static void imx_epit_update_compare_timer(IMXEPITState *s)
39
+ return 1;
45
{
46
- if ((s->cr & (CR_EN | CR_OCIEN)) == (CR_EN | CR_OCIEN)) {
47
- /* if the compare feature is on and timers are running */
48
- uint32_t tmp = ptimer_get_count(s->timer_reload);
49
- uint64_t next;
50
- if (tmp > s->cmp) {
51
- /* It'll fire in this round of the timer */
52
- next = tmp - s->cmp;
53
- } else { /* catch it next time around */
54
- next = tmp - s->cmp + ((s->cr & CR_RLD) ? EPIT_TIMER_MAX : s->lr);
55
+ uint64_t counter = 0;
56
+ bool is_oneshot = false;
57
+ /*
58
+ * The compare timer only has to run if the timer peripheral is active
59
+ * and there is an input clock, Otherwise it can be switched off.
60
+ */
61
+ bool is_active = (s->cr & CR_EN) && imx_epit_get_freq(s);
62
+ if (is_active) {
63
+ /*
64
+ * Calculate next timeout for compare timer. Reading the reload
65
+ * counter returns proper results only if pending transactions
66
+ * on it are committed here. Otherwise stale values are be read.
67
+ */
68
+ counter = ptimer_get_count(s->timer_reload);
69
+ uint64_t limit = ptimer_get_limit(s->timer_cmp);
70
+ /*
71
+ * The compare timer is a periodic timer if the limit is at least
72
+ * the compare value. Otherwise it may fire at most once in the
73
+ * current round.
74
+ */
75
+ bool is_oneshot = (limit >= s->cmp);
76
+ if (counter >= s->cmp) {
77
+ /* The compare timer fires in the current round. */
78
+ counter -= s->cmp;
79
+ } else if (!is_oneshot) {
80
+ /*
81
+ * The compare timer fires after a reload, as it is below the
82
+ * compare value already in this round. Note that the counter
83
+ * value calculated below can be above the 32-bit limit, which
84
+ * is legal here because the compare timer is an internal
85
+ * helper ptimer only.
86
+ */
87
+ counter += limit - s->cmp;
88
+ } else {
89
+ /*
90
+ * The compare timer won't fire in this round, and the limit is
91
+ * set to a value below the compare value. This practically means
92
+ * it will never fire, so it can be switched off.
93
+ */
94
+ is_active = false;
95
}
96
- ptimer_set_count(s->timer_cmp, next);
40
}
97
}
41
+
98
+
42
mrs = memory_region_find(mr, xlat, 1);
99
+ /*
43
+
100
+ * Set the compare timer and let it run, or stop it. This is agnostic
44
if (!mrs.mr) {
101
+ * of CR.OCIEN bit, as this bit affects interrupt generation only. The
45
- goto unlock;
102
+ * compare timer needs to run even if no interrupts are to be generated,
46
+ return 1;
103
+ * because the SR.OCIF bit must be updated also.
104
+ * Note that the timer might already be stopped or be running with
105
+ * counter values. However, finding out when an update is needed and
106
+ * when not is not trivial. It's much easier applying the setting again,
107
+ * as this does not harm either and the overhead is negligible.
108
+ */
109
+ if (is_active) {
110
+ ptimer_set_count(s->timer_cmp, counter);
111
+ ptimer_run(s->timer_cmp, is_oneshot ? 1 : 0);
112
+ } else {
113
+ ptimer_stop(s->timer_cmp);
114
+ }
115
+
116
}
117
118
static void imx_epit_write_cr(IMXEPITState *s, uint32_t value)
119
{
120
- uint32_t freq = 0;
121
uint32_t oldcr = s->cr;
122
123
s->cr = value & 0x03ffffff;
124
125
if (s->cr & CR_SWR) {
126
- /* handle the reset */
127
+ /*
128
+ * Reset clears CR.SWR again. It does not touch CR.EN, but the timers
129
+ * are still stopped because the input clock is disabled.
130
+ */
131
imx_epit_reset(s, false);
132
+ } else {
133
+ uint32_t freq;
134
+ uint32_t toggled_cr_bits = oldcr ^ s->cr;
135
+ /* re-initialize the limits if CR.RLD has changed */
136
+ bool set_limit = toggled_cr_bits & CR_RLD;
137
+ /* set the counter if the timer got just enabled and CR.ENMOD is set */
138
+ bool is_switched_on = (toggled_cr_bits & s->cr) & CR_EN;
139
+ bool set_counter = is_switched_on && (s->cr & CR_ENMOD);
140
+
141
+ ptimer_transaction_begin(s->timer_cmp);
142
+ ptimer_transaction_begin(s->timer_reload);
143
+ freq = imx_epit_get_freq(s);
144
+ if (freq) {
145
+ ptimer_set_freq(s->timer_reload, freq);
146
+ ptimer_set_freq(s->timer_cmp, freq);
147
+ }
148
+
149
+ if (set_limit || set_counter) {
150
+ uint64_t limit = (s->cr & CR_RLD) ? s->lr : EPIT_TIMER_MAX;
151
+ ptimer_set_limit(s->timer_reload, limit, set_counter ? 1 : 0);
152
+ if (set_limit) {
153
+ ptimer_set_limit(s->timer_cmp, limit, 0);
154
+ }
155
+ }
156
+ /*
157
+ * If there is an input clock and the peripheral is enabled, then
158
+ * ensure the wall clock timer is ticking. Otherwise stop the timers.
159
+ * The compare timer will be updated later.
160
+ */
161
+ if (freq && (s->cr & CR_EN)) {
162
+ ptimer_run(s->timer_reload, 0);
163
+ } else {
164
+ ptimer_stop(s->timer_reload);
165
+ }
166
+ /* Commit changes to reload timer, so they can propagate. */
167
+ ptimer_transaction_commit(s->timer_reload);
168
+ /* Update compare timer based on the committed reload timer value. */
169
+ imx_epit_update_compare_timer(s);
170
+ ptimer_transaction_commit(s->timer_cmp);
47
}
171
}
48
172
49
doorbell_gpa = mrs.offset_within_address_space;
173
/*
50
@@ -XXX,XX +XXX,XX @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
174
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write_cr(IMXEPITState *s, uint32_t value)
51
175
* - write to CR.EN or CR.OCIE
52
trace_kvm_arm_fixup_msi_route(address, doorbell_gpa);
176
*/
53
177
imx_epit_update_int(s);
54
- ret = 0;
178
-
55
-
179
- /*
56
-unlock:
180
- * TODO: could we 'break' here for reset? following operations appear
57
- rcu_read_unlock();
181
- * to duplicate the work imx_epit_reset() already did.
58
- return ret;
182
- */
59
+ return 0;
183
-
60
}
184
- ptimer_transaction_begin(s->timer_cmp);
61
185
- ptimer_transaction_begin(s->timer_reload);
62
int kvm_arch_add_msi_route_post(struct kvm_irq_routing_entry *route,
186
-
187
- /*
188
- * Update the frequency. In case of a reset the input clock was
189
- * switched off, so this can be skipped.
190
- */
191
- if (!(s->cr & CR_SWR)) {
192
- freq = imx_epit_get_freq(s);
193
- if (freq) {
194
- ptimer_set_freq(s->timer_reload, freq);
195
- ptimer_set_freq(s->timer_cmp, freq);
196
- }
197
- }
198
-
199
- if (freq && (s->cr & CR_EN) && !(oldcr & CR_EN)) {
200
- if (s->cr & CR_ENMOD) {
201
- if (s->cr & CR_RLD) {
202
- ptimer_set_limit(s->timer_reload, s->lr, 1);
203
- ptimer_set_limit(s->timer_cmp, s->lr, 1);
204
- } else {
205
- ptimer_set_limit(s->timer_reload, EPIT_TIMER_MAX, 1);
206
- ptimer_set_limit(s->timer_cmp, EPIT_TIMER_MAX, 1);
207
- }
208
- }
209
-
210
- imx_epit_reload_compare_timer(s);
211
- ptimer_run(s->timer_reload, 0);
212
- if (s->cr & CR_OCIEN) {
213
- ptimer_run(s->timer_cmp, 0);
214
- } else {
215
- ptimer_stop(s->timer_cmp);
216
- }
217
- } else if (!(s->cr & CR_EN)) {
218
- /* stop both timers */
219
- ptimer_stop(s->timer_reload);
220
- ptimer_stop(s->timer_cmp);
221
- } else if (s->cr & CR_OCIEN) {
222
- if (!(oldcr & CR_OCIEN)) {
223
- imx_epit_reload_compare_timer(s);
224
- ptimer_run(s->timer_cmp, 0);
225
- }
226
- } else {
227
- ptimer_stop(s->timer_cmp);
228
- }
229
-
230
- ptimer_transaction_commit(s->timer_cmp);
231
- ptimer_transaction_commit(s->timer_reload);
232
}
233
234
static void imx_epit_write_sr(IMXEPITState *s, uint32_t value)
235
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write_lr(IMXEPITState *s, uint32_t value)
236
/* If IOVW bit is set then set the timer value */
237
ptimer_set_count(s->timer_reload, s->lr);
238
}
239
- /*
240
- * Commit the change to s->timer_reload, so it can propagate. Otherwise
241
- * the timer interrupt may not fire properly. The commit must happen
242
- * before calling imx_epit_reload_compare_timer(), which reads
243
- * s->timer_reload internally again.
244
- */
245
+ /* Commit the changes to s->timer_reload, so they can propagate. */
246
ptimer_transaction_commit(s->timer_reload);
247
- imx_epit_reload_compare_timer(s);
248
+ /* Update the compare timer based on the committed reload timer value. */
249
+ imx_epit_update_compare_timer(s);
250
ptimer_transaction_commit(s->timer_cmp);
251
}
252
253
@@ -XXX,XX +XXX,XX @@ static void imx_epit_write_cmp(IMXEPITState *s, uint32_t value)
254
{
255
s->cmp = value;
256
257
+ /* Update the compare timer based on the committed reload timer value. */
258
ptimer_transaction_begin(s->timer_cmp);
259
- imx_epit_reload_compare_timer(s);
260
+ imx_epit_update_compare_timer(s);
261
ptimer_transaction_commit(s->timer_cmp);
262
}
263
264
@@ -XXX,XX +XXX,XX @@ static void imx_epit_cmp(void *opaque)
265
{
266
IMXEPITState *s = IMX_EPIT(opaque);
267
268
+ /* The cmp ptimer can't be running when the peripheral is disabled */
269
+ assert(s->cr & CR_EN);
270
+
271
DPRINTF("sr was %d\n", s->sr);
272
/* Set interrupt status bit SR.OCIF and update the interrupt state */
273
s->sr |= SR_OCIF;
63
--
274
--
64
2.20.1
275
2.25.1
65
66
diff view generated by jsdifflib
1
Factor out the "generate code to update VPR.MASK01/MASK23" part of
1
From: Fabiano Rosas <farosas@suse.de>
2
trans_VPST(); we are going to want to reuse it for the VPT insns.
3
2
3
Fix these:
4
5
WARNING: Block comments use a leading /* on a separate line
6
WARNING: Block comments use * on subsequent lines
7
WARNING: Block comments use a trailing */ on a separate line
8
9
Signed-off-by: Fabiano Rosas <farosas@suse.de>
10
Reviewed-by: Claudio Fontana <cfontana@suse.de>
11
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
12
Message-id: 20221213190537.511-2-farosas@suse.de
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
---
14
---
7
target/arm/translate-mve.c | 31 +++++++++++++++++--------------
15
target/arm/helper.c | 323 +++++++++++++++++++++++++++++---------------
8
1 file changed, 17 insertions(+), 14 deletions(-)
16
1 file changed, 215 insertions(+), 108 deletions(-)
9
17
10
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
18
diff --git a/target/arm/helper.c b/target/arm/helper.c
11
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
12
--- a/target/arm/translate-mve.c
20
--- a/target/arm/helper.c
13
+++ b/target/arm/translate-mve.c
21
+++ b/target/arm/helper.c
14
@@ -XXX,XX +XXX,XX @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a)
22
@@ -XXX,XX +XXX,XX @@ uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri)
15
return do_long_dual_acc(s, a, fns[a->x]);
23
static void write_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri,
16
}
24
uint64_t v)
17
25
{
18
-static bool trans_VPST(DisasContext *s, arg_VPST *a)
26
- /* Raw write of a coprocessor register (as needed for migration, etc).
19
+static void gen_vpst(DisasContext *s, uint32_t mask)
27
+ /*
20
{
28
+ * Raw write of a coprocessor register (as needed for migration, etc).
21
- TCGv_i32 vpr;
29
* Note that constant registers are treated as write-ignored; the
22
-
30
* caller should check for success by whether a readback gives the
23
- /* mask == 0 is a "related encoding" */
31
* value written.
24
- if (!dc_isar_feature(aa32_mve, s) || !a->mask) {
32
@@ -XXX,XX +XXX,XX @@ static void write_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri,
25
- return false;
33
26
- }
34
static bool raw_accessors_invalid(const ARMCPRegInfo *ri)
27
- if (!mve_eci_check(s) || !vfp_access_check(s)) {
35
{
28
- return true;
36
- /* Return true if the regdef would cause an assertion if you called
29
- }
37
+ /*
30
/*
38
+ * Return true if the regdef would cause an assertion if you called
31
* Set the VPR mask fields. We take advantage of MASK01 and MASK23
39
* read_raw_cp_reg() or write_raw_cp_reg() on it (ie if it is a
32
* being adjacent fields in the register.
40
* program bug for it not to have the NO_RAW flag).
41
* NB that returning false here doesn't necessarily mean that calling
42
@@ -XXX,XX +XXX,XX @@ bool write_list_to_cpustate(ARMCPU *cpu)
43
if (ri->type & ARM_CP_NO_RAW) {
44
continue;
45
}
46
- /* Write value and confirm it reads back as written
47
+ /*
48
+ * Write value and confirm it reads back as written
49
* (to catch read-only registers and partially read-only
50
* registers where the incoming migration value doesn't match)
51
*/
52
@@ -XXX,XX +XXX,XX @@ static gint cpreg_key_compare(gconstpointer a, gconstpointer b)
53
54
void init_cpreg_list(ARMCPU *cpu)
55
{
56
- /* Initialise the cpreg_tuples[] array based on the cp_regs hash.
57
+ /*
58
+ * Initialise the cpreg_tuples[] array based on the cp_regs hash.
59
* Note that we require cpreg_tuples[] to be sorted by key ID.
60
*/
61
GList *keys;
62
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_el3_aa32ns(CPUARMState *env,
63
return CP_ACCESS_OK;
64
}
65
66
-/* Some secure-only AArch32 registers trap to EL3 if used from
67
+/*
68
+ * Some secure-only AArch32 registers trap to EL3 if used from
69
* Secure EL1 (but are just ordinary UNDEF in other non-EL3 contexts).
70
* Note that an access from Secure EL1 can only happen if EL3 is AArch64.
71
* We assume that the .access field is set to PL1_RW.
72
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_trap_aa32s_el1(CPUARMState *env,
73
return CP_ACCESS_TRAP_UNCATEGORIZED;
74
}
75
76
-/* Check for traps to performance monitor registers, which are controlled
77
+/*
78
+ * Check for traps to performance monitor registers, which are controlled
79
* by MDCR_EL2.TPM for EL2 and MDCR_EL3.TPM for EL3.
80
*/
81
static CPAccessResult access_tpm(CPUARMState *env, const ARMCPRegInfo *ri,
82
@@ -XXX,XX +XXX,XX @@ static void fcse_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
83
ARMCPU *cpu = env_archcpu(env);
84
85
if (raw_read(env, ri) != value) {
86
- /* Unlike real hardware the qemu TLB uses virtual addresses,
87
+ /*
88
+ * Unlike real hardware the qemu TLB uses virtual addresses,
89
* not modified virtual addresses, so this causes a TLB flush.
90
*/
91
tlb_flush(CPU(cpu));
92
@@ -XXX,XX +XXX,XX @@ static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
93
94
if (raw_read(env, ri) != value && !arm_feature(env, ARM_FEATURE_PMSA)
95
&& !extended_addresses_enabled(env)) {
96
- /* For VMSA (when not using the LPAE long descriptor page table
97
+ /*
98
+ * For VMSA (when not using the LPAE long descriptor page table
99
* format) this register includes the ASID, so do a TLB flush.
100
* For PMSA it is purely a process ID and no action is needed.
101
*/
102
@@ -XXX,XX +XXX,XX @@ static void tlbiipas2is_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
103
}
104
105
static const ARMCPRegInfo cp_reginfo[] = {
106
- /* Define the secure and non-secure FCSE identifier CP registers
107
+ /*
108
+ * Define the secure and non-secure FCSE identifier CP registers
109
* separately because there is no secure bank in V8 (no _EL3). This allows
110
* the secure register to be properly reset and migrated. There is also no
111
* v8 EL1 version of the register so the non-secure instance stands alone.
112
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
113
.access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
114
.fieldoffset = offsetof(CPUARMState, cp15.fcseidr_s),
115
.resetvalue = 0, .writefn = fcse_write, .raw_writefn = raw_write, },
116
- /* Define the secure and non-secure context identifier CP registers
117
+ /*
118
+ * Define the secure and non-secure context identifier CP registers
119
* separately because there is no secure bank in V8 (no _EL3). This allows
120
* the secure register to be properly reset and migrated. In the
121
* non-secure case, the 32-bit register will have reset and migration
122
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
123
};
124
125
static const ARMCPRegInfo not_v8_cp_reginfo[] = {
126
- /* NB: Some of these registers exist in v8 but with more precise
127
+ /*
128
+ * NB: Some of these registers exist in v8 but with more precise
129
* definitions that don't use CP_ANY wildcards (mostly in v8_cp_reginfo[]).
130
*/
131
/* MMU Domain access control / MPU write buffer control */
132
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
133
.writefn = dacr_write, .raw_writefn = raw_write,
134
.bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dacr_s),
135
offsetoflow32(CPUARMState, cp15.dacr_ns) } },
136
- /* ARMv7 allocates a range of implementation defined TLB LOCKDOWN regs.
137
+ /*
138
+ * ARMv7 allocates a range of implementation defined TLB LOCKDOWN regs.
139
* For v6 and v5, these mappings are overly broad.
140
*/
141
{ .name = "TLB_LOCKDOWN", .cp = 15, .crn = 10, .crm = 0,
142
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
143
};
144
145
static const ARMCPRegInfo not_v6_cp_reginfo[] = {
146
- /* Not all pre-v6 cores implemented this WFI, so this is slightly
147
+ /*
148
+ * Not all pre-v6 cores implemented this WFI, so this is slightly
149
* over-broad.
150
*/
151
{ .name = "WFI_v5", .cp = 15, .crn = 7, .crm = 8, .opc1 = 0, .opc2 = 2,
152
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v6_cp_reginfo[] = {
153
};
154
155
static const ARMCPRegInfo not_v7_cp_reginfo[] = {
156
- /* Standard v6 WFI (also used in some pre-v6 cores); not in v7 (which
157
+ /*
158
+ * Standard v6 WFI (also used in some pre-v6 cores); not in v7 (which
159
* is UNPREDICTABLE; we choose to NOP as most implementations do).
160
*/
161
{ .name = "WFI_v6", .cp = 15, .crn = 7, .crm = 0, .opc1 = 0, .opc2 = 4,
162
.access = PL1_W, .type = ARM_CP_WFI },
163
- /* L1 cache lockdown. Not architectural in v6 and earlier but in practice
164
+ /*
165
+ * L1 cache lockdown. Not architectural in v6 and earlier but in practice
166
* implemented in 926, 946, 1026, 1136, 1176 and 11MPCore. StrongARM and
167
* OMAPCP will override this space.
168
*/
169
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v7_cp_reginfo[] = {
170
{ .name = "DUMMY", .cp = 15, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = CP_ANY,
171
.access = PL1_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
172
.resetvalue = 0 },
173
- /* We don't implement pre-v7 debug but most CPUs had at least a DBGDIDR;
174
+ /*
175
+ * We don't implement pre-v7 debug but most CPUs had at least a DBGDIDR;
176
* implementing it as RAZ means the "debug architecture version" bits
177
* will read as a reserved value, which should cause Linux to not try
178
* to use the debug hardware.
179
*/
180
{ .name = "DBGDIDR", .cp = 14, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 0,
181
.access = PL0_R, .type = ARM_CP_CONST, .resetvalue = 0 },
182
- /* MMU TLB control. Note that the wildcarding means we cover not just
183
+ /*
184
+ * MMU TLB control. Note that the wildcarding means we cover not just
185
* the unified TLB ops but also the dside/iside/inner-shareable variants.
186
*/
187
{ .name = "TLBIALL", .cp = 15, .crn = 8, .crm = CP_ANY,
188
@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
189
190
/* In ARMv8 most bits of CPACR_EL1 are RES0. */
191
if (!arm_feature(env, ARM_FEATURE_V8)) {
192
- /* ARMv7 defines bits for unimplemented coprocessors as RAZ/WI.
193
+ /*
194
+ * ARMv7 defines bits for unimplemented coprocessors as RAZ/WI.
195
* ASEDIS [31] and D32DIS [30] are both UNK/SBZP without VFP.
196
* TRCDIS [28] is RAZ/WI since we do not implement a trace macrocell.
197
*/
198
@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
199
value |= R_CPACR_ASEDIS_MASK;
200
}
201
202
- /* VFPv3 and upwards with NEON implement 32 double precision
203
+ /*
204
+ * VFPv3 and upwards with NEON implement 32 double precision
205
* registers (D0-D31).
206
*/
207
if (!cpu_isar_feature(aa32_simd_r32, env_archcpu(env))) {
208
@@ -XXX,XX +XXX,XX @@ static uint64_t cpacr_read(CPUARMState *env, const ARMCPRegInfo *ri)
209
210
static void cpacr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
211
{
212
- /* Call cpacr_write() so that we reset with the correct RAO bits set
213
+ /*
214
+ * Call cpacr_write() so that we reset with the correct RAO bits set
215
* for our CPU features.
216
*/
217
cpacr_write(env, ri, 0);
218
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
219
{ .name = "MVA_prefetch",
220
.cp = 15, .crn = 7, .crm = 13, .opc1 = 0, .opc2 = 1,
221
.access = PL1_W, .type = ARM_CP_NOP },
222
- /* We need to break the TB after ISB to execute self-modifying code
223
+ /*
224
+ * We need to break the TB after ISB to execute self-modifying code
225
* correctly and also to take any pending interrupts immediately.
226
* So use arm_cp_write_ignore() function instead of ARM_CP_NOP flag.
227
*/
228
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
229
.bank_fieldoffsets = { offsetof(CPUARMState, cp15.ifar_s),
230
offsetof(CPUARMState, cp15.ifar_ns) },
231
.resetvalue = 0, },
232
- /* Watchpoint Fault Address Register : should actually only be present
233
+ /*
234
+ * Watchpoint Fault Address Register : should actually only be present
235
* for 1136, 1176, 11MPCore.
236
*/
237
{ .name = "WFAR", .cp = 15, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 1,
238
@@ -XXX,XX +XXX,XX @@ static bool event_supported(uint16_t number)
239
static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
240
bool isread)
241
{
242
- /* Performance monitor registers user accessibility is controlled
243
+ /*
244
+ * Performance monitor registers user accessibility is controlled
245
* by PMUSERENR. MDCR_EL2.TPM and MDCR_EL3.TPM allow configurable
246
* trapping to EL2 or EL3 for other accesses.
247
*/
248
@@ -XXX,XX +XXX,XX @@ static CPAccessResult pmreg_access_ccntr(CPUARMState *env,
249
(MDCR_HPME | MDCR_HPMD | MDCR_HPMN | MDCR_HCCD | MDCR_HLP)
250
#define MDCR_EL3_PMU_ENABLE_BITS (MDCR_SPME | MDCR_SCCD)
251
252
-/* Returns true if the counter (pass 31 for PMCCNTR) should count events using
253
+/*
254
+ * Returns true if the counter (pass 31 for PMCCNTR) should count events using
255
* the current EL, security state, and register configuration.
256
*/
257
static bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
258
@@ -XXX,XX +XXX,XX @@ static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
259
static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
260
uint64_t value)
261
{
262
- /* The value of PMSELR.SEL affects the behavior of PMXEVTYPER and
263
+ /*
264
+ * The value of PMSELR.SEL affects the behavior of PMXEVTYPER and
265
* PMXEVCNTR. We allow [0..31] to be written to PMSELR here; in the
266
* meanwhile, we check PMSELR.SEL when PMXEVTYPER and PMXEVCNTR are
267
* accessed.
268
@@ -XXX,XX +XXX,XX @@ static void pmevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
269
env->cp15.c14_pmevtyper[counter] = value & PMXEVTYPER_MASK;
270
pmevcntr_op_finish(env, counter);
271
}
272
- /* Attempts to access PMXEVTYPER are CONSTRAINED UNPREDICTABLE when
273
+ /*
274
+ * Attempts to access PMXEVTYPER are CONSTRAINED UNPREDICTABLE when
275
* PMSELR value is equal to or greater than the number of implemented
276
* counters, but not equal to 0x1f. We opt to behave as a RAZ/WI.
277
*/
278
@@ -XXX,XX +XXX,XX @@ static uint64_t pmevcntr_read(CPUARMState *env, const ARMCPRegInfo *ri,
279
}
280
return ret;
281
} else {
282
- /* We opt to behave as a RAZ/WI when attempts to access PM[X]EVCNTR
283
- * are CONSTRAINED UNPREDICTABLE. */
284
+ /*
285
+ * We opt to behave as a RAZ/WI when attempts to access PM[X]EVCNTR
286
+ * are CONSTRAINED UNPREDICTABLE.
287
+ */
288
return 0;
289
}
290
}
291
@@ -XXX,XX +XXX,XX @@ static void pmintenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
292
static void vbar_write(CPUARMState *env, const ARMCPRegInfo *ri,
293
uint64_t value)
294
{
295
- /* Note that even though the AArch64 view of this register has bits
296
+ /*
297
+ * Note that even though the AArch64 view of this register has bits
298
* [10:0] all RES0 we can only mask the bottom 5, to comply with the
299
* architectural requirements for bits which are RES0 only in some
300
* contexts. (ARMv8 would permit us to do no masking at all, but ARMv7
301
@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
302
if (!arm_feature(env, ARM_FEATURE_EL2)) {
303
valid_mask &= ~SCR_HCE;
304
305
- /* On ARMv7, SMD (or SCD as it is called in v7) is only
306
+ /*
307
+ * On ARMv7, SMD (or SCD as it is called in v7) is only
308
* supported if EL2 exists. The bit is UNK/SBZP when
309
* EL2 is unavailable. In QEMU ARMv7, we force it to always zero
310
* when EL2 is unavailable.
311
@@ -XXX,XX +XXX,XX @@ static uint64_t ccsidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
312
{
313
ARMCPU *cpu = env_archcpu(env);
314
315
- /* Acquire the CSSELR index from the bank corresponding to the CCSIDR
316
+ /*
317
+ * Acquire the CSSELR index from the bank corresponding to the CCSIDR
318
* bank
319
*/
320
uint32_t index = A32_BANKED_REG_GET(env, csselr,
321
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
322
/* the old v6 WFI, UNPREDICTABLE in v7 but we choose to NOP */
323
{ .name = "NOP", .cp = 15, .crn = 7, .crm = 0, .opc1 = 0, .opc2 = 4,
324
.access = PL1_W, .type = ARM_CP_NOP },
325
- /* Performance monitors are implementation defined in v7,
326
+ /*
327
+ * Performance monitors are implementation defined in v7,
328
* but with an ARM recommended set of registers, which we
329
* follow.
33
*
330
*
34
- * This insn is not predicated, but it is subject to beat-wise
331
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
35
+ * Updating the masks is not predicated, but it is subject to beat-wise
332
.writefn = csselr_write, .resetvalue = 0,
36
* execution, and the mask is updated on the odd-numbered beats.
333
.bank_fieldoffsets = { offsetof(CPUARMState, cp15.csselr_s),
37
* So if PSR.ECI says we should skip beat 1, we mustn't update the
334
offsetof(CPUARMState, cp15.csselr_ns) } },
38
* 01 mask field.
335
- /* Auxiliary ID register: this actually has an IMPDEF value but for now
39
*/
336
+ /*
40
- vpr = load_cpu_field(v7m.vpr);
337
+ * Auxiliary ID register: this actually has an IMPDEF value but for now
41
+ TCGv_i32 vpr = load_cpu_field(v7m.vpr);
338
* just RAZ for all cores:
42
switch (s->eci) {
339
*/
43
case ECI_NONE:
340
{ .name = "AIDR", .state = ARM_CP_STATE_BOTH,
44
case ECI_A0:
341
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
45
/* Update both 01 and 23 fields */
342
.access = PL1_R, .type = ARM_CP_CONST,
46
tcg_gen_deposit_i32(vpr, vpr,
343
.accessfn = access_aa64_tid1,
47
- tcg_constant_i32(a->mask | (a->mask << 4)),
344
.resetvalue = 0 },
48
+ tcg_constant_i32(mask | (mask << 4)),
345
- /* Auxiliary fault status registers: these also are IMPDEF, and we
49
R_V7M_VPR_MASK01_SHIFT,
346
+ /*
50
R_V7M_VPR_MASK01_LENGTH + R_V7M_VPR_MASK23_LENGTH);
347
+ * Auxiliary fault status registers: these also are IMPDEF, and we
51
break;
348
* choose to RAZ/WI for all cores.
52
@@ -XXX,XX +XXX,XX @@ static bool trans_VPST(DisasContext *s, arg_VPST *a)
349
*/
53
case ECI_A0A1A2B0:
350
{ .name = "AFSR0_EL1", .state = ARM_CP_STATE_BOTH,
54
/* Update only the 23 mask field */
351
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
55
tcg_gen_deposit_i32(vpr, vpr,
352
.opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 1,
56
- tcg_constant_i32(a->mask),
353
.access = PL1_RW, .accessfn = access_tvm_trvm,
57
+ tcg_constant_i32(mask),
354
.type = ARM_CP_CONST, .resetvalue = 0 },
58
R_V7M_VPR_MASK23_SHIFT, R_V7M_VPR_MASK23_LENGTH);
355
- /* MAIR can just read-as-written because we don't implement caches
59
break;
356
+ /*
357
+ * MAIR can just read-as-written because we don't implement caches
358
* and so don't need to care about memory attributes.
359
*/
360
{ .name = "MAIR_EL1", .state = ARM_CP_STATE_AA64,
361
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
362
.opc0 = 3, .opc1 = 6, .crn = 10, .crm = 2, .opc2 = 0,
363
.access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.mair_el[3]),
364
.resetvalue = 0 },
365
- /* For non-long-descriptor page tables these are PRRR and NMRR;
366
+ /*
367
+ * For non-long-descriptor page tables these are PRRR and NMRR;
368
* regardless they still act as reads-as-written for QEMU.
369
*/
370
- /* MAIR0/1 are defined separately from their 64-bit counterpart which
371
+ /*
372
+ * MAIR0/1 are defined separately from their 64-bit counterpart which
373
* allows them to assign the correct fieldoffset based on the endianness
374
* handled in the field definitions.
375
*/
376
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
377
static CPAccessResult gt_cntfrq_access(CPUARMState *env, const ARMCPRegInfo *ri,
378
bool isread)
379
{
380
- /* CNTFRQ: not visible from PL0 if both PL0PCTEN and PL0VCTEN are zero.
381
+ /*
382
+ * CNTFRQ: not visible from PL0 if both PL0PCTEN and PL0VCTEN are zero.
383
* Writable only at the highest implemented exception level.
384
*/
385
int el = arm_current_el(env);
386
@@ -XXX,XX +XXX,XX @@ static CPAccessResult gt_stimer_access(CPUARMState *env,
387
const ARMCPRegInfo *ri,
388
bool isread)
389
{
390
- /* The AArch64 register view of the secure physical timer is
391
+ /*
392
+ * The AArch64 register view of the secure physical timer is
393
* always accessible from EL3, and configurably accessible from
394
* Secure EL1.
395
*/
396
@@ -XXX,XX +XXX,XX @@ static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
397
ARMGenericTimer *gt = &cpu->env.cp15.c14_timer[timeridx];
398
399
if (gt->ctl & 1) {
400
- /* Timer enabled: calculate and set current ISTATUS, irq, and
401
+ /*
402
+ * Timer enabled: calculate and set current ISTATUS, irq, and
403
* reset timer to when ISTATUS next has to change
404
*/
405
uint64_t offset = timeridx == GTIMER_VIRT ?
406
@@ -XXX,XX +XXX,XX @@ static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
407
/* Next transition is when we hit cval */
408
nexttick = gt->cval + offset;
409
}
410
- /* Note that the desired next expiry time might be beyond the
411
+ /*
412
+ * Note that the desired next expiry time might be beyond the
413
* signed-64-bit range of a QEMUTimer -- in this case we just
414
* set the timer for as far in the future as possible. When the
415
* timer expires we will reset the timer for any remaining period.
416
@@ -XXX,XX +XXX,XX @@ static void gt_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
417
/* Enable toggled */
418
gt_recalc_timer(cpu, timeridx);
419
} else if ((oldval ^ value) & 2) {
420
- /* IMASK toggled: don't need to recalculate,
421
+ /*
422
+ * IMASK toggled: don't need to recalculate,
423
* just set the interrupt line based on ISTATUS
424
*/
425
int irqstate = (oldval & 4) && !(value & 2);
426
@@ -XXX,XX +XXX,XX @@ static void arm_gt_cntfrq_reset(CPUARMState *env, const ARMCPRegInfo *opaque)
427
}
428
429
static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
430
- /* Note that CNTFRQ is purely reads-as-written for the benefit
431
+ /*
432
+ * Note that CNTFRQ is purely reads-as-written for the benefit
433
* of software; writing it doesn't actually change the timer frequency.
434
* Our reset value matches the fixed frequency we implement the timer at.
435
*/
436
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
437
.readfn = gt_virt_redir_cval_read, .raw_readfn = raw_read,
438
.writefn = gt_virt_redir_cval_write, .raw_writefn = raw_write,
439
},
440
- /* Secure timer -- this is actually restricted to only EL3
441
+ /*
442
+ * Secure timer -- this is actually restricted to only EL3
443
* and configurably Secure-EL1 via the accessfn.
444
*/
445
{ .name = "CNTPS_TVAL_EL1", .state = ARM_CP_STATE_AA64,
446
@@ -XXX,XX +XXX,XX @@ static CPAccessResult e2h_access(CPUARMState *env, const ARMCPRegInfo *ri,
447
448
#else
449
450
-/* In user-mode most of the generic timer registers are inaccessible
451
+/*
452
+ * In user-mode most of the generic timer registers are inaccessible
453
* however modern kernels (4.12+) allow access to cntvct_el0
454
*/
455
456
@@ -XXX,XX +XXX,XX @@ static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
457
{
458
ARMCPU *cpu = env_archcpu(env);
459
460
- /* Currently we have no support for QEMUTimer in linux-user so we
461
+ /*
462
+ * Currently we have no support for QEMUTimer in linux-user so we
463
* can't call gt_get_countervalue(env), instead we directly
464
* call the lower level functions.
465
*/
466
@@ -XXX,XX +XXX,XX @@ static CPAccessResult ats_access(CPUARMState *env, const ARMCPRegInfo *ri,
467
bool isread)
468
{
469
if (ri->opc2 & 4) {
470
- /* The ATS12NSO* operations must trap to EL3 or EL2 if executed in
471
+ /*
472
+ * The ATS12NSO* operations must trap to EL3 or EL2 if executed in
473
* Secure EL1 (which can only happen if EL3 is AArch64).
474
* They are simply UNDEF if executed from NS EL1.
475
* They function normally from EL2 or EL3.
476
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
477
}
478
}
479
} else {
480
- /* fsr is a DFSR/IFSR value for the short descriptor
481
+ /*
482
+ * fsr is a DFSR/IFSR value for the short descriptor
483
* translation table format (with WnR always clear).
484
* Convert it to a 32-bit PAR.
485
*/
486
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmsav8r_cp_reginfo[] = {
487
};
488
489
static const ARMCPRegInfo pmsav7_cp_reginfo[] = {
490
- /* Reset for all these registers is handled in arm_cpu_reset(),
491
+ /*
492
+ * Reset for all these registers is handled in arm_cpu_reset(),
493
* because the PMSAv7 is also used by M-profile CPUs, which do
494
* not register cpregs but still need the state to be reset.
495
*/
496
@@ -XXX,XX +XXX,XX @@ static void vmsa_ttbcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
497
}
498
499
if (arm_feature(env, ARM_FEATURE_LPAE)) {
500
- /* With LPAE the TTBCR could result in a change of ASID
501
+ /*
502
+ * With LPAE the TTBCR could result in a change of ASID
503
* via the TTBCR.A1 bit, so do a TLB flush.
504
*/
505
tlb_flush(CPU(cpu));
506
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
507
offsetoflow32(CPUARMState, cp15.tcr_el[1])} },
508
};
509
510
-/* Note that unlike TTBCR, writing to TTBCR2 does not require flushing
511
+/*
512
+ * Note that unlike TTBCR, writing to TTBCR2 does not require flushing
513
* qemu tlbs nor adjusting cached masks.
514
*/
515
static const ARMCPRegInfo ttbcr2_reginfo = {
516
@@ -XXX,XX +XXX,XX @@ static void omap_wfi_write(CPUARMState *env, const ARMCPRegInfo *ri,
517
static void omap_cachemaint_write(CPUARMState *env, const ARMCPRegInfo *ri,
518
uint64_t value)
519
{
520
- /* On OMAP there are registers indicating the max/min index of dcache lines
521
+ /*
522
+ * On OMAP there are registers indicating the max/min index of dcache lines
523
* containing a dirty line; cache flush operations have to reset these.
524
*/
525
env->cp15.c15_i_max = 0x000;
526
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo omap_cp_reginfo[] = {
527
.crm = 8, .opc1 = 0, .opc2 = 0, .access = PL1_RW,
528
.type = ARM_CP_NO_RAW,
529
.readfn = arm_cp_read_zero, .writefn = omap_wfi_write, },
530
- /* TODO: Peripheral port remap register:
531
+ /*
532
+ * TODO: Peripheral port remap register:
533
* On OMAP2 mcr p15, 0, rn, c15, c2, 4 sets up the interrupt controller
534
* base address at $rn & ~0xfff and map size of 0x200 << ($rn & 0xfff),
535
* when MMU is off.
536
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo xscale_cp_reginfo[] = {
537
.cp = 15, .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 1, .access = PL1_RW,
538
.fieldoffset = offsetof(CPUARMState, cp15.c1_xscaleauxcr),
539
.resetvalue = 0, },
540
- /* XScale specific cache-lockdown: since we have no cache we NOP these
541
+ /*
542
+ * XScale specific cache-lockdown: since we have no cache we NOP these
543
* and hope the guest does not really rely on cache behaviour.
544
*/
545
{ .name = "XSCALE_LOCK_ICACHE_LINE",
546
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo xscale_cp_reginfo[] = {
547
};
548
549
static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
550
- /* RAZ/WI the whole crn=15 space, when we don't have a more specific
551
+ /*
552
+ * RAZ/WI the whole crn=15 space, when we don't have a more specific
553
* implementation of this implementation-defined space.
554
* Ideally this should eventually disappear in favour of actually
555
* implementing the correct behaviour for all cores.
556
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
557
};
558
559
static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
560
- /* The cache test-and-clean instructions always return (1 << 30)
561
+ /*
562
+ * The cache test-and-clean instructions always return (1 << 30)
563
* to indicate that there are no dirty cache lines.
564
*/
565
{ .name = "TC_DCACHE", .cp = 15, .crn = 7, .crm = 10, .opc1 = 0, .opc2 = 3,
566
@@ -XXX,XX +XXX,XX @@ static uint64_t mpidr_read_val(CPUARMState *env)
567
568
if (arm_feature(env, ARM_FEATURE_V7MP)) {
569
mpidr |= (1U << 31);
570
- /* Cores which are uniprocessor (non-coherent)
571
+ /*
572
+ * Cores which are uniprocessor (non-coherent)
573
* but still implement the MP extensions set
574
* bit 30. (For instance, Cortex-R5).
575
*/
576
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tocu(CPUARMState *env, const ARMCPRegInfo *ri,
577
return do_cacheop_pou_access(env, HCR_TOCU | HCR_TPU);
578
}
579
580
-/* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
581
+/*
582
+ * See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
583
* Page D4-1736 (DDI0487A.b)
584
*/
585
586
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
587
static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
588
uint64_t value)
589
{
590
- /* Invalidate by VA, EL2
591
+ /*
592
+ * Invalidate by VA, EL2
593
* Currently handles both VAE2 and VALE2, since we don't support
594
* flush-last-level-only.
595
*/
596
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
597
static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
598
uint64_t value)
599
{
600
- /* Invalidate by VA, EL3
601
+ /*
602
+ * Invalidate by VA, EL3
603
* Currently handles both VAE3 and VALE3, since we don't support
604
* flush-last-level-only.
605
*/
606
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
607
static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
608
uint64_t value)
609
{
610
- /* Invalidate by VA, EL1&0 (AArch64 version).
611
+ /*
612
+ * Invalidate by VA, EL1&0 (AArch64 version).
613
* Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
614
* since we don't support flush-for-specific-ASID-only or
615
* flush-last-level-only.
616
@@ -XXX,XX +XXX,XX @@ static CPAccessResult sp_el0_access(CPUARMState *env, const ARMCPRegInfo *ri,
617
bool isread)
618
{
619
if (!(env->pstate & PSTATE_SP)) {
620
- /* Access to SP_EL0 is undefined if it's being used as
621
+ /*
622
+ * Access to SP_EL0 is undefined if it's being used as
623
* the stack pointer.
624
*/
625
return CP_ACCESS_TRAP_UNCATEGORIZED;
626
@@ -XXX,XX +XXX,XX @@ static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri,
627
}
628
629
if (raw_read(env, ri) == value) {
630
- /* Skip the TLB flush if nothing actually changed; Linux likes
631
+ /*
632
+ * Skip the TLB flush if nothing actually changed; Linux likes
633
* to do a lot of pointless SCTLR writes.
634
*/
635
return;
636
@@ -XXX,XX +XXX,XX @@ static void mdcr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
637
}
638
639
static const ARMCPRegInfo v8_cp_reginfo[] = {
640
- /* Minimal set of EL0-visible registers. This will need to be expanded
641
+ /*
642
+ * Minimal set of EL0-visible registers. This will need to be expanded
643
* significantly for system emulation of AArch64 CPUs.
644
*/
645
{ .name = "NZCV", .state = ARM_CP_STATE_AA64,
646
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
647
.opc0 = 3, .opc1 = 0, .crn = 4, .crm = 0, .opc2 = 0,
648
.access = PL1_RW,
649
.fieldoffset = offsetof(CPUARMState, banked_spsr[BANK_SVC]) },
650
- /* We rely on the access checks not allowing the guest to write to the
651
+ /*
652
+ * We rely on the access checks not allowing the guest to write to the
653
* state field when SPSel indicates that it's being used as the stack
654
* pointer.
655
*/
656
@@ -XXX,XX +XXX,XX @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
657
if (arm_feature(env, ARM_FEATURE_EL3)) {
658
valid_mask &= ~HCR_HCD;
659
} else if (cpu->psci_conduit != QEMU_PSCI_CONDUIT_SMC) {
660
- /* Architecturally HCR.TSC is RES0 if EL3 is not implemented.
661
+ /*
662
+ * Architecturally HCR.TSC is RES0 if EL3 is not implemented.
663
* However, if we're using the SMC PSCI conduit then QEMU is
664
* effectively acting like EL3 firmware and so the guest at
665
* EL2 should retain the ability to prevent EL1 from being
666
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
667
.access = PL2_W, .type = ARM_CP_NO_RAW | ARM_CP_EL3_NO_EL2_UNDEF,
668
.writefn = tlbi_aa64_vae2is_write },
669
#ifndef CONFIG_USER_ONLY
670
- /* Unlike the other EL2-related AT operations, these must
671
+ /*
672
+ * Unlike the other EL2-related AT operations, these must
673
* UNDEF from EL3 if EL2 is not implemented, which is why we
674
* define them here rather than with the rest of the AT ops.
675
*/
676
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
677
.access = PL2_W, .accessfn = at_s1e2_access,
678
.type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC | ARM_CP_EL3_NO_EL2_UNDEF,
679
.writefn = ats_write64 },
680
- /* The AArch32 ATS1H* operations are CONSTRAINED UNPREDICTABLE
681
+ /*
682
+ * The AArch32 ATS1H* operations are CONSTRAINED UNPREDICTABLE
683
* if EL2 is not implemented; we choose to UNDEF. Behaviour at EL3
684
* with SCR.NS == 0 outside Monitor mode is UNPREDICTABLE; we choose
685
* to behave as if SCR.NS was 1.
686
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
687
.writefn = ats1h_write, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC },
688
{ .name = "CNTHCTL_EL2", .state = ARM_CP_STATE_BOTH,
689
.opc0 = 3, .opc1 = 4, .crn = 14, .crm = 1, .opc2 = 0,
690
- /* ARMv7 requires bit 0 and 1 to reset to 1. ARMv8 defines the
691
+ /*
692
+ * ARMv7 requires bit 0 and 1 to reset to 1. ARMv8 defines the
693
* reset values as IMPDEF. We choose to reset to 3 to comply with
694
* both ARMv7 and ARMv8.
695
*/
696
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_sec_cp_reginfo[] = {
697
static CPAccessResult nsacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
698
bool isread)
699
{
700
- /* The NSACR is RW at EL3, and RO for NS EL1 and NS EL2.
701
+ /*
702
+ * The NSACR is RW at EL3, and RO for NS EL1 and NS EL2.
703
* At Secure EL1 it traps to EL3 or EL2.
704
*/
705
if (arm_current_el(env) == 3) {
706
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
707
}
708
}
709
710
-/* We don't know until after realize whether there's a GICv3
711
+/*
712
+ * We don't know until after realize whether there's a GICv3
713
* attached, and that is what registers the gicv3 sysregs.
714
* So we have to fill in the GIC fields in ID_PFR/ID_PFR1_EL1/ID_AA64PFR0_EL1
715
* at runtime.
716
@@ -XXX,XX +XXX,XX @@ static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
717
}
718
#endif
719
720
-/* Shared logic between LORID and the rest of the LOR* registers.
721
+/*
722
+ * Shared logic between LORID and the rest of the LOR* registers.
723
* Secure state exclusion has already been dealt with.
724
*/
725
static CPAccessResult access_lor_ns(CPUARMState *env,
726
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
727
728
define_arm_cp_regs(cpu, cp_reginfo);
729
if (!arm_feature(env, ARM_FEATURE_V8)) {
730
- /* Must go early as it is full of wildcards that may be
731
+ /*
732
+ * Must go early as it is full of wildcards that may be
733
* overridden by later definitions.
734
*/
735
define_arm_cp_regs(cpu, not_v8_cp_reginfo);
736
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
737
.access = PL1_R, .type = ARM_CP_CONST,
738
.accessfn = access_aa32_tid3,
739
.resetvalue = cpu->isar.id_pfr0 },
740
- /* ID_PFR1 is not a plain ARM_CP_CONST because we don't know
741
+ /*
742
+ * ID_PFR1 is not a plain ARM_CP_CONST because we don't know
743
* the value of the GIC field until after we define these regs.
744
*/
745
{ .name = "ID_PFR1", .state = ARM_CP_STATE_BOTH,
746
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
747
748
define_arm_cp_regs(cpu, el3_regs);
749
}
750
- /* The behaviour of NSACR is sufficiently various that we don't
751
+ /*
752
+ * The behaviour of NSACR is sufficiently various that we don't
753
* try to describe it in a single reginfo:
754
* if EL3 is 64 bit, then trap to EL3 from S EL1,
755
* reads as constant 0xc00 from NS EL1 and NS EL2
756
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
757
if (cpu_isar_feature(aa32_jazelle, cpu)) {
758
define_arm_cp_regs(cpu, jazelle_regs);
759
}
760
- /* Slightly awkwardly, the OMAP and StrongARM cores need all of
761
+ /*
762
+ * Slightly awkwardly, the OMAP and StrongARM cores need all of
763
* cp15 crn=0 to be writes-ignored, whereas for other cores they should
764
* be read-only (ie write causes UNDEF exception).
765
*/
766
{
767
ARMCPRegInfo id_pre_v8_midr_cp_reginfo[] = {
768
- /* Pre-v8 MIDR space.
769
+ /*
770
+ * Pre-v8 MIDR space.
771
* Note that the MIDR isn't a simple constant register because
772
* of the TI925 behaviour where writes to another register can
773
* cause the MIDR value to change.
774
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
775
if (arm_feature(env, ARM_FEATURE_OMAPCP) ||
776
arm_feature(env, ARM_FEATURE_STRONGARM)) {
777
size_t i;
778
- /* Register the blanket "writes ignored" value first to cover the
779
+ /*
780
+ * Register the blanket "writes ignored" value first to cover the
781
* whole space. Then update the specific ID registers to allow write
782
* access, so that they ignore writes rather than causing them to
783
* UNDEF.
784
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
785
.raw_writefn = raw_write,
786
};
787
if (arm_feature(env, ARM_FEATURE_XSCALE)) {
788
- /* Normally we would always end the TB on an SCTLR write, but Linux
789
+ /*
790
+ * Normally we would always end the TB on an SCTLR write, but Linux
791
* arch/arm/mach-pxa/sleep.S expects two instructions following
792
* an MMU enable to execute from cache. Imitate this behaviour.
793
*/
794
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
795
void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
796
const ARMCPRegInfo *r, void *opaque)
797
{
798
- /* Define implementations of coprocessor registers.
799
+ /*
800
+ * Define implementations of coprocessor registers.
801
* We store these in a hashtable because typically
802
* there are less than 150 registers in a space which
803
* is 16*16*16*8*8 = 262144 in size.
804
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
60
default:
805
default:
61
g_assert_not_reached();
806
g_assert_not_reached();
62
}
807
}
63
store_cpu_field(vpr, v7m.vpr);
808
- /* The AArch64 pseudocode CheckSystemAccess() specifies that op1
64
+}
809
+ /*
65
+
810
+ * The AArch64 pseudocode CheckSystemAccess() specifies that op1
66
+static bool trans_VPST(DisasContext *s, arg_VPST *a)
811
* encodes a minimum access level for the register. We roll this
67
+{
812
* runtime check into our general permission check code, so check
68
+ /* mask == 0 is a "related encoding" */
813
* here that the reginfo's specified permissions are strict enough
69
+ if (!dc_isar_feature(aa32_mve, s) || !a->mask) {
814
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
70
+ return false;
815
assert((r->access & ~mask) == 0);
71
+ }
816
}
72
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
817
73
+ return true;
818
- /* Check that the register definition has enough info to handle
74
+ }
819
+ /*
75
+ gen_vpst(s, a->mask);
820
+ * Check that the register definition has enough info to handle
76
mve_update_and_store_eci(s);
821
* reads and writes if they are permitted.
77
return true;
822
*/
78
}
823
if (!(r->type & (ARM_CP_SPECIAL_MASK | ARM_CP_CONST))) {
824
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
825
continue;
826
}
827
if (state == ARM_CP_STATE_AA32) {
828
- /* Under AArch32 CP registers can be common
829
+ /*
830
+ * Under AArch32 CP registers can be common
831
* (same for secure and non-secure world) or banked.
832
*/
833
char *name;
834
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
835
g_assert_not_reached();
836
}
837
} else {
838
- /* AArch64 registers get mapped to non-secure instance
839
- * of AArch32 */
840
+ /*
841
+ * AArch64 registers get mapped to non-secure instance
842
+ * of AArch32
843
+ */
844
add_cpreg_to_hashtable(cpu, r, opaque, state,
845
ARM_CP_SECSTATE_NS,
846
crm, opc1, opc2, r->name);
847
@@ -XXX,XX +XXX,XX @@ void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque)
848
849
static int bad_mode_switch(CPUARMState *env, int mode, CPSRWriteType write_type)
850
{
851
- /* Return true if it is not valid for us to switch to
852
+ /*
853
+ * Return true if it is not valid for us to switch to
854
* this CPU mode (ie all the UNPREDICTABLE cases in
855
* the ARM ARM CPSRWriteByInstr pseudocode).
856
*/
857
@@ -XXX,XX +XXX,XX @@ static int bad_mode_switch(CPUARMState *env, int mode, CPSRWriteType write_type)
858
case ARM_CPU_MODE_UND:
859
case ARM_CPU_MODE_IRQ:
860
case ARM_CPU_MODE_FIQ:
861
- /* Note that we don't implement the IMPDEF NSACR.RFR which in v7
862
+ /*
863
+ * Note that we don't implement the IMPDEF NSACR.RFR which in v7
864
* allows FIQ mode to be Secure-only. (In v8 this doesn't exist.)
865
*/
866
- /* If HCR.TGE is set then changes from Monitor to NS PL1 via MSR
867
+ /*
868
+ * If HCR.TGE is set then changes from Monitor to NS PL1 via MSR
869
* and CPS are treated as illegal mode changes.
870
*/
871
if (write_type == CPSRWriteByInstr &&
872
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
873
env->GE = (val >> 16) & 0xf;
874
}
875
876
- /* In a V7 implementation that includes the security extensions but does
877
+ /*
878
+ * In a V7 implementation that includes the security extensions but does
879
* not include Virtualization Extensions the SCR.FW and SCR.AW bits control
880
* whether non-secure software is allowed to change the CPSR_F and CPSR_A
881
* bits respectively.
882
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
883
changed_daif = (env->daif ^ val) & mask;
884
885
if (changed_daif & CPSR_A) {
886
- /* Check to see if we are allowed to change the masking of async
887
+ /*
888
+ * Check to see if we are allowed to change the masking of async
889
* abort exceptions from a non-secure state.
890
*/
891
if (!(env->cp15.scr_el3 & SCR_AW)) {
892
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
893
}
894
895
if (changed_daif & CPSR_F) {
896
- /* Check to see if we are allowed to change the masking of FIQ
897
+ /*
898
+ * Check to see if we are allowed to change the masking of FIQ
899
* exceptions from a non-secure state.
900
*/
901
if (!(env->cp15.scr_el3 & SCR_FW)) {
902
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
903
mask &= ~CPSR_F;
904
}
905
906
- /* Check whether non-maskable FIQ (NMFI) support is enabled.
907
+ /*
908
+ * Check whether non-maskable FIQ (NMFI) support is enabled.
909
* If this bit is set software is not allowed to mask
910
* FIQs, but is allowed to set CPSR_F to 0.
911
*/
912
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
913
if (write_type != CPSRWriteRaw &&
914
((env->uncached_cpsr ^ val) & mask & CPSR_M)) {
915
if ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_USR) {
916
- /* Note that we can only get here in USR mode if this is a
917
+ /*
918
+ * Note that we can only get here in USR mode if this is a
919
* gdb stub write; for this case we follow the architectural
920
* behaviour for guest writes in USR mode of ignoring an attempt
921
* to switch mode. (Those are caught by translate.c for writes
922
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
923
*/
924
mask &= ~CPSR_M;
925
} else if (bad_mode_switch(env, val & CPSR_M, write_type)) {
926
- /* Attempt to switch to an invalid mode: this is UNPREDICTABLE in
927
+ /*
928
+ * Attempt to switch to an invalid mode: this is UNPREDICTABLE in
929
* v7, and has defined behaviour in v8:
930
* + leave CPSR.M untouched
931
* + allow changes to the other CPSR fields
932
@@ -XXX,XX +XXX,XX @@ static void switch_mode(CPUARMState *env, int mode)
933
env->regs[14] = env->banked_r14[r14_bank_number(mode)];
934
}
935
936
-/* Physical Interrupt Target EL Lookup Table
937
+/*
938
+ * Physical Interrupt Target EL Lookup Table
939
*
940
* [ From ARM ARM section G1.13.4 (Table G1-15) ]
941
*
942
@@ -XXX,XX +XXX,XX @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
943
if (arm_feature(env, ARM_FEATURE_EL3)) {
944
rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
945
} else {
946
- /* Either EL2 is the highest EL (and so the EL2 register width
947
+ /*
948
+ * Either EL2 is the highest EL (and so the EL2 register width
949
* is given by is64); or there is no EL2 or EL3, in which case
950
* the value of 'rw' does not affect the table lookup anyway.
951
*/
952
@@ -XXX,XX +XXX,XX @@ void aarch64_sync_64_to_32(CPUARMState *env)
953
env->banked_r13[bank_number(ARM_CPU_MODE_UND)] = env->xregs[23];
954
}
955
956
- /* Registers x24-x30 are mapped to r8-r14 in FIQ mode. If we are in FIQ
957
+ /*
958
+ * Registers x24-x30 are mapped to r8-r14 in FIQ mode. If we are in FIQ
959
* mode, then we can copy to r8-r14. Otherwise, we copy to the
960
* FIQ bank for r8-r14.
961
*/
962
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
963
/* High vectors. When enabled, base address cannot be remapped. */
964
addr += 0xffff0000;
965
} else {
966
- /* ARM v7 architectures provide a vector base address register to remap
967
+ /*
968
+ * ARM v7 architectures provide a vector base address register to remap
969
* the interrupt vector table.
970
* This register is only followed in non-monitor mode, and is banked.
971
* Note: only bits 31:5 are valid.
972
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
973
aarch64_sve_change_el(env, cur_el, new_el, is_a64(env));
974
975
if (cur_el < new_el) {
976
- /* Entry vector offset depends on whether the implemented EL
977
+ /*
978
+ * Entry vector offset depends on whether the implemented EL
979
* immediately lower than the target level is using AArch32 or AArch64
980
*/
981
bool is_aa64;
982
@@ -XXX,XX +XXX,XX @@ static void handle_semihosting(CPUState *cs)
983
}
984
#endif
985
986
-/* Handle a CPU exception for A and R profile CPUs.
987
+/*
988
+ * Handle a CPU exception for A and R profile CPUs.
989
* Do any appropriate logging, handle PSCI calls, and then hand off
990
* to the AArch64-entry or AArch32-entry function depending on the
991
* target exception level's register width.
992
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_interrupt(CPUState *cs)
993
}
994
#endif
995
996
- /* Hooks may change global state so BQL should be held, also the
997
+ /*
998
+ * Hooks may change global state so BQL should be held, also the
999
* BQL needs to be held for any modification of
1000
* cs->interrupt_request.
1001
*/
1002
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
1003
};
1004
}
1005
1006
-/* Note that signed overflow is undefined in C. The following routines are
1007
- careful to use unsigned types where modulo arithmetic is required.
1008
- Failure to do so _will_ break on newer gcc. */
1009
+/*
1010
+ * Note that signed overflow is undefined in C. The following routines are
1011
+ * careful to use unsigned types where modulo arithmetic is required.
1012
+ * Failure to do so _will_ break on newer gcc.
1013
+ */
1014
1015
/* Signed saturating arithmetic. */
1016
1017
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sel_flags)(uint32_t flags, uint32_t a, uint32_t b)
1018
return (a & mask) | (b & ~mask);
1019
}
1020
1021
-/* CRC helpers.
1022
+/*
1023
+ * CRC helpers.
1024
* The upper bytes of val (above the number specified by 'bytes') must have
1025
* been zeroed out by the caller.
1026
*/
1027
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
1028
return crc32c(acc, buf, bytes) ^ 0xffffffff;
1029
}
1030
1031
-/* Return the exception level to which FP-disabled exceptions should
1032
+/*
1033
+ * Return the exception level to which FP-disabled exceptions should
1034
* be taken, or 0 if FP is enabled.
1035
*/
1036
int fp_exception_el(CPUARMState *env, int cur_el)
1037
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
1038
#ifndef CONFIG_USER_ONLY
1039
uint64_t hcr_el2;
1040
1041
- /* CPACR and the CPTR registers don't exist before v6, so FP is
1042
+ /*
1043
+ * CPACR and the CPTR registers don't exist before v6, so FP is
1044
* always accessible
1045
*/
1046
if (!arm_feature(env, ARM_FEATURE_V6)) {
1047
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
1048
1049
hcr_el2 = arm_hcr_el2_eff(env);
1050
1051
- /* The CPACR controls traps to EL1, or PL1 if we're 32 bit:
1052
+ /*
1053
+ * The CPACR controls traps to EL1, or PL1 if we're 32 bit:
1054
* 0, 2 : trap EL0 and EL1/PL1 accesses
1055
* 1 : trap only EL0 accesses
1056
* 3 : trap no accesses
79
--
1057
--
80
2.20.1
1058
2.25.1
81
82
diff view generated by jsdifflib
1
We were not paying attention to the ECI state when advancing the VPT
1
From: Fabiano Rosas <farosas@suse.de>
2
state. Architecturally, VPT state advance happens for every beat
3
(see the pseudocode VPTAdvance()), so on every beat the 4 bits of
4
VPR.P0 corresponding to the current beat are inverted if required,
5
and at the end of beats 1 and 3 the VPR MASK fields are updated.
6
This means that if the ECI state says we should not be executing all
7
4 beats then we need to skip some of the updating of the VPR that we
8
currently do in mve_advance_vpt().
9
2
3
Fix the following:
4
5
ERROR: spaces required around that '|' (ctx:VxV)
6
ERROR: space required before the open parenthesis '('
7
ERROR: spaces required around that '+' (ctx:VxB)
8
ERROR: space prohibited between function name and open parenthesis '('
9
10
(the last two still have some occurrences in macros which I left
11
behind because it might impact readability)
12
13
Signed-off-by: Fabiano Rosas <farosas@suse.de>
14
Reviewed-by: Claudio Fontana <cfontana@suse.de>
15
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
16
Message-id: 20221213190537.511-3-farosas@suse.de
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
---
18
---
13
target/arm/mve_helper.c | 24 +++++++++++++++++-------
19
target/arm/helper.c | 42 +++++++++++++++++++++---------------------
14
1 file changed, 17 insertions(+), 7 deletions(-)
20
1 file changed, 21 insertions(+), 21 deletions(-)
15
21
16
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
22
diff --git a/target/arm/helper.c b/target/arm/helper.c
17
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/mve_helper.c
24
--- a/target/arm/helper.c
19
+++ b/target/arm/mve_helper.c
25
+++ b/target/arm/helper.c
20
@@ -XXX,XX +XXX,XX @@ static void mve_advance_vpt(CPUARMState *env)
26
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_list(gpointer key, gpointer opaque)
21
/* Advance the VPT and ECI state if necessary */
27
uint32_t regidx = (uintptr_t)key;
22
uint32_t vpr = env->v7m.vpr;
28
const ARMCPRegInfo *ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
23
unsigned mask01, mask23;
29
24
+ uint16_t inv_mask;
30
- if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
25
+ uint16_t eci_mask = mve_eci_mask(env);
31
+ if (!(ri->type & (ARM_CP_NO_RAW | ARM_CP_ALIAS))) {
26
32
cpu->cpreg_indexes[cpu->cpreg_array_len] = cpreg_to_kvm_id(regidx);
27
if ((env->condexec_bits & 0xf) == 0) {
33
/* The value array need not be initialized at this point */
28
env->condexec_bits = (env->condexec_bits == (ECI_A0A1A2B0 << 4)) ?
34
cpu->cpreg_array_len++;
29
@@ -XXX,XX +XXX,XX @@ static void mve_advance_vpt(CPUARMState *env)
35
@@ -XXX,XX +XXX,XX @@ static void count_cpreg(gpointer key, gpointer opaque)
36
37
ri = g_hash_table_lookup(cpu->cp_regs, key);
38
39
- if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
40
+ if (!(ri->type & (ARM_CP_NO_RAW | ARM_CP_ALIAS))) {
41
cpu->cpreg_array_len++;
42
}
43
}
44
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
45
.resetfn = arm_cp_reset_ignore },
46
{ .name = "TPIDRRO_EL0", .state = ARM_CP_STATE_AA64,
47
.opc0 = 3, .opc1 = 3, .opc2 = 3, .crn = 13, .crm = 0,
48
- .access = PL0_R|PL1_W,
49
+ .access = PL0_R | PL1_W,
50
.fieldoffset = offsetof(CPUARMState, cp15.tpidrro_el[0]),
51
.resetvalue = 0},
52
{ .name = "TPIDRURO", .cp = 15, .crn = 13, .crm = 0, .opc1 = 0, .opc2 = 3,
53
- .access = PL0_R|PL1_W,
54
+ .access = PL0_R | PL1_W,
55
.bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tpidruro_s),
56
offsetoflow32(CPUARMState, cp15.tpidruro_ns) },
57
.resetfn = arm_cp_reset_ignore },
58
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
59
.resetvalue = 0 },
60
/* The cache ops themselves: these all NOP for QEMU */
61
{ .name = "IICR", .cp = 15, .crm = 5, .opc1 = 0,
62
- .access = PL1_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
63
+ .access = PL1_W, .type = ARM_CP_NOP | ARM_CP_64BIT },
64
{ .name = "IDCR", .cp = 15, .crm = 6, .opc1 = 0,
65
- .access = PL1_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
66
+ .access = PL1_W, .type = ARM_CP_NOP | ARM_CP_64BIT },
67
{ .name = "CDCR", .cp = 15, .crm = 12, .opc1 = 0,
68
- .access = PL0_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
69
+ .access = PL0_W, .type = ARM_CP_NOP | ARM_CP_64BIT },
70
{ .name = "PIR", .cp = 15, .crm = 12, .opc1 = 1,
71
- .access = PL0_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
72
+ .access = PL0_W, .type = ARM_CP_NOP | ARM_CP_64BIT },
73
{ .name = "PDR", .cp = 15, .crm = 12, .opc1 = 2,
74
- .access = PL0_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
75
+ .access = PL0_W, .type = ARM_CP_NOP | ARM_CP_64BIT },
76
{ .name = "CIDCR", .cp = 15, .crm = 14, .opc1 = 0,
77
- .access = PL1_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
78
+ .access = PL1_W, .type = ARM_CP_NOP | ARM_CP_64BIT },
79
};
80
81
static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
82
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
83
ARMCPRegInfo cbar = {
84
.name = "CBAR",
85
.cp = 15, .crn = 15, .crm = 0, .opc1 = 4, .opc2 = 0,
86
- .access = PL1_R|PL3_W, .resetvalue = cpu->reset_cbar,
87
+ .access = PL1_R | PL3_W, .resetvalue = cpu->reset_cbar,
88
.fieldoffset = offsetof(CPUARMState,
89
cp15.c15_config_base_address)
90
};
91
@@ -XXX,XX +XXX,XX @@ static void switch_mode(CPUARMState *env, int mode)
30
return;
92
return;
93
94
if (old_mode == ARM_CPU_MODE_FIQ) {
95
- memcpy (env->fiq_regs, env->regs + 8, 5 * sizeof(uint32_t));
96
- memcpy (env->regs + 8, env->usr_regs, 5 * sizeof(uint32_t));
97
+ memcpy(env->fiq_regs, env->regs + 8, 5 * sizeof(uint32_t));
98
+ memcpy(env->regs + 8, env->usr_regs, 5 * sizeof(uint32_t));
99
} else if (mode == ARM_CPU_MODE_FIQ) {
100
- memcpy (env->usr_regs, env->regs + 8, 5 * sizeof(uint32_t));
101
- memcpy (env->regs + 8, env->fiq_regs, 5 * sizeof(uint32_t));
102
+ memcpy(env->usr_regs, env->regs + 8, 5 * sizeof(uint32_t));
103
+ memcpy(env->regs + 8, env->fiq_regs, 5 * sizeof(uint32_t));
31
}
104
}
32
105
33
+ /* Invert P0 bits if needed, but only for beats we actually executed */
106
i = bank_number(old_mode);
34
mask01 = FIELD_EX32(vpr, V7M_VPR, MASK01);
107
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
35
mask23 = FIELD_EX32(vpr, V7M_VPR, MASK23);
108
RESULT(sum, n, 16); \
36
- if (mask01 > 8) {
109
if (sum >= 0) \
37
- /* high bit set, but not 0b1000: invert the relevant half of P0 */
110
ge |= 3 << (n * 2); \
38
- vpr ^= 0xff;
111
- } while(0)
39
+ /* Start by assuming we invert all bits corresponding to executed beats */
112
+ } while (0)
40
+ inv_mask = eci_mask;
113
41
+ if (mask01 <= 8) {
114
#define SARITH8(a, b, n, op) do { \
42
+ /* MASK01 says don't invert low half of P0 */
115
int32_t sum; \
43
+ inv_mask &= ~0xff;
116
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
44
}
117
RESULT(sum, n, 8); \
45
- if (mask23 > 8) {
118
if (sum >= 0) \
46
- /* high bit set, but not 0b1000: invert the relevant half of P0 */
119
ge |= 1 << n; \
47
- vpr ^= 0xff00;
120
- } while(0)
48
+ if (mask23 <= 8) {
121
+ } while (0)
49
+ /* MASK23 says don't invert high half of P0 */
122
50
+ inv_mask &= ~0xff00;
123
51
}
124
#define ADD16(a, b, n) SARITH16(a, b, n, +)
52
- vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1);
125
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
53
+ vpr ^= inv_mask;
126
RESULT(sum, n, 16); \
54
+ /* Only update MASK01 if beat 1 executed */
127
if ((sum >> 16) == 1) \
55
+ if (eci_mask & 0xf0) {
128
ge |= 3 << (n * 2); \
56
+ vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1);
129
- } while(0)
57
+ }
130
+ } while (0)
58
+ /* Beat 3 always executes, so update MASK23 */
131
59
vpr = FIELD_DP32(vpr, V7M_VPR, MASK23, mask23 << 1);
132
#define ADD8(a, b, n) do { \
60
env->v7m.vpr = vpr;
133
uint32_t sum; \
61
}
134
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
135
RESULT(sum, n, 8); \
136
if ((sum >> 8) == 1) \
137
ge |= 1 << n; \
138
- } while(0)
139
+ } while (0)
140
141
#define SUB16(a, b, n) do { \
142
uint32_t sum; \
143
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
144
RESULT(sum, n, 16); \
145
if ((sum >> 16) == 0) \
146
ge |= 3 << (n * 2); \
147
- } while(0)
148
+ } while (0)
149
150
#define SUB8(a, b, n) do { \
151
uint32_t sum; \
152
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
153
RESULT(sum, n, 8); \
154
if ((sum >> 8) == 0) \
155
ge |= 1 << n; \
156
- } while(0)
157
+ } while (0)
158
159
#define PFX u
160
#define ARITH_GE
62
--
161
--
63
2.20.1
162
2.25.1
64
65
diff view generated by jsdifflib
1
Implement the MVE VMLADAV and VMLSLDAV insns. Like the VMLALDAV and
1
From: Fabiano Rosas <farosas@suse.de>
2
VMLSLDAV insns already implemented, these accumulate multiplied
3
vector elements; but they accumulate a 32-bit result rather than a
4
64-bit one.
5
2
6
Note that these encodings overlap with what would be RdaHi=0b111 for
3
Fix this:
7
VMLALDAV, VMLSLDAV, VRMLALDAVH and VRMLSLDAVH.
4
ERROR: braces {} are necessary for all arms of this statement
8
5
6
Signed-off-by: Fabiano Rosas <farosas@suse.de>
7
Reviewed-by: Claudio Fontana <cfontana@suse.de>
8
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
9
Message-id: 20221213190537.511-4-farosas@suse.de
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
---
11
---
12
target/arm/helper-mve.h | 17 ++++++++++
12
target/arm/helper.c | 67 ++++++++++++++++++++++++++++-----------------
13
target/arm/mve.decode | 33 +++++++++++++++++---
13
1 file changed, 42 insertions(+), 25 deletions(-)
14
target/arm/mve_helper.c | 41 ++++++++++++++++++++++++
15
target/arm/translate-mve.c | 64 ++++++++++++++++++++++++++++++++++++++
16
4 files changed, 150 insertions(+), 5 deletions(-)
17
14
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
15
diff --git a/target/arm/helper.c b/target/arm/helper.c
19
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-mve.h
17
--- a/target/arm/helper.c
21
+++ b/target/arm/helper-mve.h
18
+++ b/target/arm/helper.c
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrmlaldavhuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
19
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
23
DEF_HELPER_FLAGS_4(mve_vrmlsldavhsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
20
env->CF = (val >> 29) & 1;
24
DEF_HELPER_FLAGS_4(mve_vrmlsldavhxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
21
env->VF = (val << 3) & 0x80000000;
25
22
}
26
+DEF_HELPER_FLAGS_4(mve_vmladavsb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
23
- if (mask & CPSR_Q)
27
+DEF_HELPER_FLAGS_4(mve_vmladavsh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
24
+ if (mask & CPSR_Q) {
28
+DEF_HELPER_FLAGS_4(mve_vmladavsw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
25
env->QF = ((val & CPSR_Q) != 0);
29
+DEF_HELPER_FLAGS_4(mve_vmladavub, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
26
- if (mask & CPSR_T)
30
+DEF_HELPER_FLAGS_4(mve_vmladavuh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
27
+ }
31
+DEF_HELPER_FLAGS_4(mve_vmladavuw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
28
+ if (mask & CPSR_T) {
32
+DEF_HELPER_FLAGS_4(mve_vmlsdavb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
29
env->thumb = ((val & CPSR_T) != 0);
33
+DEF_HELPER_FLAGS_4(mve_vmlsdavh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
30
+ }
34
+DEF_HELPER_FLAGS_4(mve_vmlsdavw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
31
if (mask & CPSR_IT_0_1) {
35
+
32
env->condexec_bits &= ~3;
36
+DEF_HELPER_FLAGS_4(mve_vmladavsxb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
33
env->condexec_bits |= (val >> 25) & 3;
37
+DEF_HELPER_FLAGS_4(mve_vmladavsxh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
34
@@ -XXX,XX +XXX,XX @@ static void switch_mode(CPUARMState *env, int mode)
38
+DEF_HELPER_FLAGS_4(mve_vmladavsxw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
35
int i;
39
+DEF_HELPER_FLAGS_4(mve_vmlsdavxb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
36
40
+DEF_HELPER_FLAGS_4(mve_vmlsdavxh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
37
old_mode = env->uncached_cpsr & CPSR_M;
41
+DEF_HELPER_FLAGS_4(mve_vmlsdavxw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32)
38
- if (mode == old_mode)
42
+
39
+ if (mode == old_mode) {
43
DEF_HELPER_FLAGS_3(mve_vaddvsb, TCG_CALL_NO_WG, i32, env, ptr, i32)
40
return;
44
DEF_HELPER_FLAGS_3(mve_vaddvub, TCG_CALL_NO_WG, i32, env, ptr, i32)
41
+ }
45
DEF_HELPER_FLAGS_3(mve_vaddvsh, TCG_CALL_NO_WG, i32, env, ptr, i32)
42
46
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
43
if (old_mode == ARM_CPU_MODE_FIQ) {
47
index XXXXXXX..XXXXXXX 100644
44
memcpy(env->fiq_regs, env->regs + 8, 5 * sizeof(uint32_t));
48
--- a/target/arm/mve.decode
45
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
49
+++ b/target/arm/mve.decode
46
new_mode = ARM_CPU_MODE_UND;
50
@@ -XXX,XX +XXX,XX @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
47
addr = 0x04;
51
%size_16 16:1 !function=plus_1
48
mask = CPSR_I;
52
49
- if (env->thumb)
53
&vmlaldav rdahi rdalo size qn qm x a
50
+ if (env->thumb) {
54
+&vmladav rda size qn qm x a
51
offset = 2;
55
52
- else
56
@vmlaldav .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \
53
+ } else {
57
qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav
54
offset = 4;
58
@vmlaldav_nosz .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \
55
+ }
59
qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav
56
break;
60
-VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
57
case EXCP_SWI:
61
-VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
58
new_mode = ARM_CPU_MODE_SVC;
62
+@vmladav .... .... .... ... . ... x:1 .... . . a:1 . qm:3 . \
59
@@ -XXX,XX +XXX,XX @@ static inline uint16_t add16_sat(uint16_t a, uint16_t b)
63
+ qn=%qn rda=%rdalo size=%size_16 &vmladav
60
64
+@vmladav_nosz .... .... .... ... . ... x:1 .... . . a:1 . qm:3 . \
61
res = a + b;
65
+ qn=%qn rda=%rdalo size=0 &vmladav
62
if (((res ^ a) & 0x8000) && !((a ^ b) & 0x8000)) {
66
63
- if (a & 0x8000)
67
-VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav
64
+ if (a & 0x8000) {
68
+{
65
res = 0x8000;
69
+ VMLADAV_S 1110 1110 1111 ... . ... . 1110 . 0 . 0 ... 0 @vmladav
66
- else
70
+ VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
67
+ } else {
71
+}
68
res = 0x7fff;
72
+{
69
+ }
73
+ VMLADAV_U 1111 1110 1111 ... . ... . 1110 . 0 . 0 ... 0 @vmladav
70
}
74
+ VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
71
return res;
75
+}
72
}
76
+
73
@@ -XXX,XX +XXX,XX @@ static inline uint8_t add8_sat(uint8_t a, uint8_t b)
77
+{
74
78
+ VMLSDAV 1110 1110 1111 ... . ... . 1110 . 0 . 0 ... 1 @vmladav
75
res = a + b;
79
+ VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav
76
if (((res ^ a) & 0x80) && !((a ^ b) & 0x80)) {
80
+}
77
- if (a & 0x80)
81
+
78
+ if (a & 0x80) {
82
+{
79
res = 0x80;
83
+ VMLSDAV 1111 1110 1111 ... 0 ... . 1110 . 0 . 0 ... 1 @vmladav_nosz
80
- else
84
+ VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz
81
+ } else {
85
+}
82
res = 0x7f;
86
+
83
+ }
87
+VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz
84
}
88
+VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz
85
return res;
89
86
}
87
@@ -XXX,XX +XXX,XX @@ static inline uint16_t sub16_sat(uint16_t a, uint16_t b)
88
89
res = a - b;
90
if (((res ^ a) & 0x8000) && ((a ^ b) & 0x8000)) {
91
- if (a & 0x8000)
92
+ if (a & 0x8000) {
93
res = 0x8000;
94
- else
95
+ } else {
96
res = 0x7fff;
97
+ }
98
}
99
return res;
100
}
101
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_sat(uint8_t a, uint8_t b)
102
103
res = a - b;
104
if (((res ^ a) & 0x80) && ((a ^ b) & 0x80)) {
105
- if (a & 0x80)
106
+ if (a & 0x80) {
107
res = 0x80;
108
- else
109
+ } else {
110
res = 0x7f;
111
+ }
112
}
113
return res;
114
}
115
@@ -XXX,XX +XXX,XX @@ static inline uint16_t add16_usat(uint16_t a, uint16_t b)
90
{
116
{
91
VMAXV_S 1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv
117
uint16_t res;
92
VMINV_S 1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv
118
res = a + b;
93
VMAXAV 1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv
119
- if (res < a)
94
VMINAV 1110 1110 1110 .. 00 .... 1111 1 0 . 0 ... 0 @vmaxv
120
+ if (res < a) {
95
+ VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz
121
res = 0xffff;
96
VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
122
+ }
123
return res;
97
}
124
}
98
125
126
static inline uint16_t sub16_usat(uint16_t a, uint16_t b)
99
{
127
{
100
VMAXV_U 1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv
128
- if (a > b)
101
VMINV_U 1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv
129
+ if (a > b) {
102
+ VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz
130
return a - b;
103
VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
131
- else
132
+ } else {
133
return 0;
134
+ }
104
}
135
}
105
136
106
-VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz
137
static inline uint8_t add8_usat(uint8_t a, uint8_t b)
107
-
138
{
108
# Scalar operations
139
uint8_t res;
109
140
res = a + b;
110
VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
141
- if (res < a)
111
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
142
+ if (res < a) {
112
index XXXXXXX..XXXXXXX 100644
143
res = 0xff;
113
--- a/target/arm/mve_helper.c
114
+++ b/target/arm/mve_helper.c
115
@@ -XXX,XX +XXX,XX @@ DO_LDAV(vmlsldavxsh, 2, int16_t, true, +=, -=)
116
DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=)
117
DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=)
118
119
+/*
120
+ * Multiply add dual accumulate ops
121
+ */
122
+#define DO_DAV(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC) \
123
+ uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
124
+ void *vm, uint32_t a) \
125
+ { \
126
+ uint16_t mask = mve_element_mask(env); \
127
+ unsigned e; \
128
+ TYPE *n = vn, *m = vm; \
129
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
130
+ if (mask & 1) { \
131
+ if (e & 1) { \
132
+ a ODDACC \
133
+ n[H##ESIZE(e - 1 * XCHG)] * m[H##ESIZE(e)]; \
134
+ } else { \
135
+ a EVENACC \
136
+ n[H##ESIZE(e + 1 * XCHG)] * m[H##ESIZE(e)]; \
137
+ } \
138
+ } \
139
+ } \
140
+ mve_advance_vpt(env); \
141
+ return a; \
142
+ }
144
+ }
143
+
145
return res;
144
+#define DO_DAV_S(INSN, XCHG, EVENACC, ODDACC) \
145
+ DO_DAV(INSN##b, 1, int8_t, XCHG, EVENACC, ODDACC) \
146
+ DO_DAV(INSN##h, 2, int16_t, XCHG, EVENACC, ODDACC) \
147
+ DO_DAV(INSN##w, 4, int32_t, XCHG, EVENACC, ODDACC)
148
+
149
+#define DO_DAV_U(INSN, XCHG, EVENACC, ODDACC) \
150
+ DO_DAV(INSN##b, 1, uint8_t, XCHG, EVENACC, ODDACC) \
151
+ DO_DAV(INSN##h, 2, uint16_t, XCHG, EVENACC, ODDACC) \
152
+ DO_DAV(INSN##w, 4, uint32_t, XCHG, EVENACC, ODDACC)
153
+
154
+DO_DAV_S(vmladavs, false, +=, +=)
155
+DO_DAV_U(vmladavu, false, +=, +=)
156
+DO_DAV_S(vmlsdav, false, +=, -=)
157
+DO_DAV_S(vmladavsx, true, +=, +=)
158
+DO_DAV_S(vmlsdavx, true, +=, -=)
159
+
160
/*
161
* Rounding multiply add long dual accumulate high. In the pseudocode
162
* this is implemented with a 72-bit internal accumulator value of which
163
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
164
index XXXXXXX..XXXXXXX 100644
165
--- a/target/arm/translate-mve.c
166
+++ b/target/arm/translate-mve.c
167
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TC
168
typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
169
typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
170
typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
171
+typedef void MVEGenDualAccOpFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
172
173
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
174
static inline long mve_qreg_offset(unsigned reg)
175
@@ -XXX,XX +XXX,XX @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a)
176
return do_long_dual_acc(s, a, fns[a->x]);
177
}
146
}
178
147
179
+static bool do_dual_acc(DisasContext *s, arg_vmladav *a, MVEGenDualAccOpFn *fn)
148
static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
180
+{
149
{
181
+ TCGv_ptr qn, qm;
150
- if (a > b)
182
+ TCGv_i32 rda;
151
+ if (a > b) {
183
+
152
return a - b;
184
+ if (!dc_isar_feature(aa32_mve, s) ||
153
- else
185
+ !mve_check_qreg_bank(s, a->qn) ||
154
+ } else {
186
+ !fn) {
155
return 0;
187
+ return false;
188
+ }
156
+ }
189
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
157
}
190
+ return true;
158
159
#define ADD16(a, b, n) RESULT(add16_usat(a, b), n, 16);
160
@@ -XXX,XX +XXX,XX @@ static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
161
162
static inline uint8_t do_usad(uint8_t a, uint8_t b)
163
{
164
- if (a > b)
165
+ if (a > b) {
166
return a - b;
167
- else
168
+ } else {
169
return b - a;
191
+ }
170
+ }
192
+
171
}
193
+ qn = mve_qreg_ptr(a->qn);
172
194
+ qm = mve_qreg_ptr(a->qm);
173
/* Unsigned sum of absolute byte differences. */
195
+
174
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sel_flags)(uint32_t flags, uint32_t a, uint32_t b)
196
+ /*
175
uint32_t mask;
197
+ * This insn is subject to beat-wise execution. Partial execution
176
198
+ * of an A=0 (no-accumulate) insn which does not execute the first
177
mask = 0;
199
+ * beat must start with the current rda value, not 0.
178
- if (flags & 1)
200
+ */
179
+ if (flags & 1) {
201
+ if (a->a || mve_skip_first_beat(s)) {
180
mask |= 0xff;
202
+ rda = load_reg(s, a->rda);
181
- if (flags & 2)
203
+ } else {
204
+ rda = tcg_const_i32(0);
205
+ }
182
+ }
206
+
183
+ if (flags & 2) {
207
+ fn(rda, cpu_env, qn, qm, rda);
184
mask |= 0xff00;
208
+ store_reg(s, a->rda, rda);
185
- if (flags & 4)
209
+ tcg_temp_free_ptr(qn);
210
+ tcg_temp_free_ptr(qm);
211
+
212
+ mve_update_eci(s);
213
+ return true;
214
+}
215
+
216
+#define DO_DUAL_ACC(INSN, FN) \
217
+ static bool trans_##INSN(DisasContext *s, arg_vmladav *a) \
218
+ { \
219
+ static MVEGenDualAccOpFn * const fns[4][2] = { \
220
+ { gen_helper_mve_##FN##b, gen_helper_mve_##FN##xb }, \
221
+ { gen_helper_mve_##FN##h, gen_helper_mve_##FN##xh }, \
222
+ { gen_helper_mve_##FN##w, gen_helper_mve_##FN##xw }, \
223
+ { NULL, NULL }, \
224
+ }; \
225
+ return do_dual_acc(s, a, fns[a->size][a->x]); \
226
+ }
186
+ }
227
+
187
+ if (flags & 4) {
228
+DO_DUAL_ACC(VMLADAV_S, vmladavs)
188
mask |= 0xff0000;
229
+DO_DUAL_ACC(VMLSDAV, vmlsdav)
189
- if (flags & 8)
230
+
190
+ }
231
+static bool trans_VMLADAV_U(DisasContext *s, arg_vmladav *a)
191
+ if (flags & 8) {
232
+{
192
mask |= 0xff000000;
233
+ static MVEGenDualAccOpFn * const fns[4][2] = {
193
+ }
234
+ { gen_helper_mve_vmladavub, NULL },
194
return (a & mask) | (b & ~mask);
235
+ { gen_helper_mve_vmladavuh, NULL },
195
}
236
+ { gen_helper_mve_vmladavuw, NULL },
196
237
+ { NULL, NULL },
238
+ };
239
+ return do_dual_acc(s, a, fns[a->size][a->x]);
240
+}
241
+
242
static void gen_vpst(DisasContext *s, uint32_t mask)
243
{
244
/*
245
--
197
--
246
2.20.1
198
2.25.1
247
248
diff view generated by jsdifflib
1
Implement the MVE 1-operand saturating operations VQABS and VQNEG.
1
From: Fabiano Rosas <farosas@suse.de>
2
2
3
Signed-off-by: Fabiano Rosas <farosas@suse.de>
4
Reviewed-by: Claudio Fontana <cfontana@suse.de>
5
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
6
Message-id: 20221213190537.511-5-farosas@suse.de
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
---
8
---
6
target/arm/helper-mve.h | 8 ++++++++
9
target/arm/m_helper.c | 16 ----------------
7
target/arm/mve.decode | 3 +++
10
1 file changed, 16 deletions(-)
8
target/arm/mve_helper.c | 37 +++++++++++++++++++++++++++++++++++++
9
target/arm/translate-mve.c | 2 ++
10
4 files changed, 50 insertions(+)
11
11
12
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
13
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/helper-mve.h
14
--- a/target/arm/m_helper.c
15
+++ b/target/arm/helper-mve.h
15
+++ b/target/arm/m_helper.c
16
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@
17
DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
17
*/
18
DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr)
18
19
19
#include "qemu/osdep.h"
20
+DEF_HELPER_FLAGS_3(mve_vqabsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
20
-#include "qemu/units.h"
21
+DEF_HELPER_FLAGS_3(mve_vqabsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
-#include "target/arm/idau.h"
22
+DEF_HELPER_FLAGS_3(mve_vqabsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
-#include "trace.h"
23
+
23
#include "cpu.h"
24
+DEF_HELPER_FLAGS_3(mve_vqnegb, TCG_CALL_NO_WG, void, env, ptr, ptr)
24
#include "internals.h"
25
+DEF_HELPER_FLAGS_3(mve_vqnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
-#include "exec/gdbstub.h"
26
+DEF_HELPER_FLAGS_3(mve_vqnegw, TCG_CALL_NO_WG, void, env, ptr, ptr)
26
#include "exec/helper-proto.h"
27
+
27
-#include "qemu/host-utils.h"
28
DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr)
28
#include "qemu/main-loop.h"
29
DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr)
29
#include "qemu/bitops.h"
30
DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr)
30
-#include "qemu/crc32c.h"
31
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
31
-#include "qemu/qemu-print.h"
32
index XXXXXXX..XXXXXXX 100644
32
#include "qemu/log.h"
33
--- a/target/arm/mve.decode
33
#include "exec/exec-all.h"
34
+++ b/target/arm/mve.decode
34
-#include <zlib.h> /* For crc32 */
35
@@ -XXX,XX +XXX,XX @@ VABS_fp 1111 1111 1 . 11 .. 01 ... 0 0111 01 . 0 ... 0 @1op
35
-#include "semihosting/semihost.h"
36
VNEG 1111 1111 1 . 11 .. 01 ... 0 0011 11 . 0 ... 0 @1op
36
-#include "sysemu/cpus.h"
37
VNEG_fp 1111 1111 1 . 11 .. 01 ... 0 0111 11 . 0 ... 0 @1op
37
-#include "sysemu/kvm.h"
38
38
-#include "qemu/range.h"
39
+VQABS 1111 1111 1 . 11 .. 00 ... 0 0111 01 . 0 ... 0 @1op
39
-#include "qapi/qapi-commands-machine-target.h"
40
+VQNEG 1111 1111 1 . 11 .. 00 ... 0 0111 11 . 0 ... 0 @1op
40
-#include "qapi/error.h"
41
+
41
-#include "qemu/guest-random.h"
42
&vdup qd rt size
42
#ifdef CONFIG_TCG
43
# Qd is in the fields usually named Qn
43
-#include "arm_ldst.h"
44
@vdup .... .... . . .. ... . rt:4 .... . . . . .... qd=%qn &vdup
44
#include "exec/cpu_ldst.h"
45
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
45
#include "semihosting/common-semi.h"
46
index XXXXXXX..XXXXXXX 100644
46
#endif
47
--- a/target/arm/mve_helper.c
48
+++ b/target/arm/mve_helper.c
49
@@ -XXX,XX +XXX,XX @@ void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm)
50
}
51
mve_advance_vpt(env);
52
}
53
+
54
+#define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \
55
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
56
+ { \
57
+ TYPE *d = vd, *m = vm; \
58
+ uint16_t mask = mve_element_mask(env); \
59
+ unsigned e; \
60
+ bool qc = false; \
61
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
62
+ bool sat = false; \
63
+ mergemask(&d[H##ESIZE(e)], FN(m[H##ESIZE(e)], &sat), mask); \
64
+ qc |= sat & mask & 1; \
65
+ } \
66
+ if (qc) { \
67
+ env->vfp.qc[0] = qc; \
68
+ } \
69
+ mve_advance_vpt(env); \
70
+ }
71
+
72
+#define DO_VQABS_B(N, SATP) \
73
+ do_sat_bhs(DO_ABS((int64_t)N), INT8_MIN, INT8_MAX, SATP)
74
+#define DO_VQABS_H(N, SATP) \
75
+ do_sat_bhs(DO_ABS((int64_t)N), INT16_MIN, INT16_MAX, SATP)
76
+#define DO_VQABS_W(N, SATP) \
77
+ do_sat_bhs(DO_ABS((int64_t)N), INT32_MIN, INT32_MAX, SATP)
78
+
79
+#define DO_VQNEG_B(N, SATP) do_sat_bhs(-(int64_t)N, INT8_MIN, INT8_MAX, SATP)
80
+#define DO_VQNEG_H(N, SATP) do_sat_bhs(-(int64_t)N, INT16_MIN, INT16_MAX, SATP)
81
+#define DO_VQNEG_W(N, SATP) do_sat_bhs(-(int64_t)N, INT32_MIN, INT32_MAX, SATP)
82
+
83
+DO_1OP_SAT(vqabsb, 1, int8_t, DO_VQABS_B)
84
+DO_1OP_SAT(vqabsh, 2, int16_t, DO_VQABS_H)
85
+DO_1OP_SAT(vqabsw, 4, int32_t, DO_VQABS_W)
86
+
87
+DO_1OP_SAT(vqnegb, 1, int8_t, DO_VQNEG_B)
88
+DO_1OP_SAT(vqnegh, 2, int16_t, DO_VQNEG_H)
89
+DO_1OP_SAT(vqnegw, 4, int32_t, DO_VQNEG_W)
90
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/translate-mve.c
93
+++ b/target/arm/translate-mve.c
94
@@ -XXX,XX +XXX,XX @@ DO_1OP(VCLZ, vclz)
95
DO_1OP(VCLS, vcls)
96
DO_1OP(VABS, vabs)
97
DO_1OP(VNEG, vneg)
98
+DO_1OP(VQABS, vqabs)
99
+DO_1OP(VQNEG, vqneg)
100
101
/* Narrowing moves: only size 0 and 1 are valid */
102
#define DO_VMOVN(INSN, FN) \
103
--
47
--
104
2.20.1
48
2.25.1
105
106
diff view generated by jsdifflib
1
We're about to make a code change to the sdiv and udiv helper
1
From: Fabiano Rosas <farosas@suse.de>
2
functions, so first fix their indentation and coding style.
3
2
3
Signed-off-by: Fabiano Rosas <farosas@suse.de>
4
Reviewed-by: Claudio Fontana <cfontana@suse.de>
5
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
6
Message-id: 20221213190537.511-6-farosas@suse.de
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210730151636.17254-2-peter.maydell@linaro.org
7
---
8
---
8
target/arm/helper.c | 15 +++++++++------
9
target/arm/helper.c | 7 -------
9
1 file changed, 9 insertions(+), 6 deletions(-)
10
1 file changed, 7 deletions(-)
10
11
11
diff --git a/target/arm/helper.c b/target/arm/helper.c
12
diff --git a/target/arm/helper.c b/target/arm/helper.c
12
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/helper.c
14
--- a/target/arm/helper.c
14
+++ b/target/arm/helper.c
15
+++ b/target/arm/helper.c
15
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(uxtb16)(uint32_t x)
16
@@ -XXX,XX +XXX,XX @@
16
17
*/
17
int32_t HELPER(sdiv)(int32_t num, int32_t den)
18
18
{
19
#include "qemu/osdep.h"
19
- if (den == 0)
20
-#include "qemu/units.h"
20
- return 0;
21
#include "qemu/log.h"
21
- if (num == INT_MIN && den == -1)
22
#include "trace.h"
22
- return INT_MIN;
23
#include "cpu.h"
23
+ if (den == 0) {
24
#include "internals.h"
24
+ return 0;
25
#include "exec/helper-proto.h"
25
+ }
26
-#include "qemu/host-utils.h"
26
+ if (num == INT_MIN && den == -1) {
27
#include "qemu/main-loop.h"
27
+ return INT_MIN;
28
#include "qemu/timer.h"
28
+ }
29
#include "qemu/bitops.h"
29
return num / den;
30
@@ -XXX,XX +XXX,XX @@
30
}
31
#include "exec/exec-all.h"
31
32
#include <zlib.h> /* For crc32 */
32
uint32_t HELPER(udiv)(uint32_t num, uint32_t den)
33
#include "hw/irq.h"
33
{
34
-#include "semihosting/semihost.h"
34
- if (den == 0)
35
-#include "sysemu/cpus.h"
35
- return 0;
36
#include "sysemu/cpu-timers.h"
36
+ if (den == 0) {
37
#include "sysemu/kvm.h"
37
+ return 0;
38
-#include "qemu/range.h"
38
+ }
39
#include "qapi/qapi-commands-machine-target.h"
39
return num / den;
40
#include "qapi/error.h"
40
}
41
#include "qemu/guest-random.h"
41
42
#ifdef CONFIG_TCG
43
-#include "arm_ldst.h"
44
-#include "exec/cpu_ldst.h"
45
#include "semihosting/common-semi.h"
46
#endif
47
#include "cpregs.h"
42
--
48
--
43
2.20.1
49
2.25.1
44
45
diff view generated by jsdifflib
1
Implement the MVE VMLA insn, which multiplies a vector by a scalar
1
From: Claudio Fontana <cfontana@suse.de>
2
and accumulates into another vector.
3
2
3
Remove some unused headers.
4
5
Signed-off-by: Claudio Fontana <cfontana@suse.de>
6
Acked-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Claudio Fontana <cfontana@suse.de>
8
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
9
Signed-off-by: Fabiano Rosas <farosas@suse.de>
10
Message-id: 20221213190537.511-7-farosas@suse.de
11
[added back some includes that are still needed at this point]
12
Signed-off-by: Fabiano Rosas <farosas@suse.de>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
---
14
---
7
target/arm/helper-mve.h | 4 ++++
15
target/arm/cpu.c | 1 -
8
target/arm/mve.decode | 1 +
16
target/arm/cpu64.c | 6 ------
9
target/arm/mve_helper.c | 5 +++++
17
2 files changed, 7 deletions(-)
10
target/arm/translate-mve.c | 1 +
11
4 files changed, 11 insertions(+)
12
18
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
19
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
14
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
21
--- a/target/arm/cpu.c
16
+++ b/target/arm/helper-mve.h
22
+++ b/target/arm/cpu.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i3
23
@@ -XXX,XX +XXX,XX @@
18
DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
#include "target/arm/idau.h"
19
DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
#include "qemu/module.h"
20
26
#include "qapi/error.h"
21
+DEF_HELPER_FLAGS_4(mve_vmlab, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
-#include "qapi/visitor.h"
22
+DEF_HELPER_FLAGS_4(mve_vmlah, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
#include "cpu.h"
23
+DEF_HELPER_FLAGS_4(mve_vmlaw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
#ifdef CONFIG_TCG
24
+
30
#include "hw/core/tcg-cpu-ops.h"
25
DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
26
DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
29
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
33
--- a/target/arm/cpu64.c
31
+++ b/target/arm/mve.decode
34
+++ b/target/arm/cpu64.c
32
@@ -XXX,XX +XXX,XX @@ VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
35
@@ -XXX,XX +XXX,XX @@
33
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
36
#include "qemu/osdep.h"
34
37
#include "qapi/error.h"
35
# The U bit (28) is don't-care because it does not affect the result
38
#include "cpu.h"
36
+VMLA 111- 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar
39
-#ifdef CONFIG_TCG
37
VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar
40
-#include "hw/core/tcg-cpu-ops.h"
38
41
-#endif /* CONFIG_TCG */
39
# Vector add across vector
42
#include "qemu/module.h"
40
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
43
-#if !defined(CONFIG_USER_ONLY)
41
index XXXXXXX..XXXXXXX 100644
44
-#include "hw/loader.h"
42
--- a/target/arm/mve_helper.c
45
-#endif
43
+++ b/target/arm/mve_helper.c
46
#include "sysemu/kvm.h"
44
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B)
47
#include "sysemu/hvf.h"
45
DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H)
48
#include "kvm_arm.h"
46
DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W)
47
48
+/* Vector by scalar plus vector */
49
+#define DO_VMLA(D, N, M) ((N) * (M) + (D))
50
+
51
+DO_2OP_ACC_SCALAR_U(vmla, DO_VMLA)
52
+
53
/* Vector by vector plus scalar */
54
#define DO_VMLAS(D, N, M) ((N) * (D) + (M))
55
56
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/translate-mve.c
59
+++ b/target/arm/translate-mve.c
60
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar)
61
DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar)
62
DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar)
63
DO_2OP_SCALAR(VBRSR, vbrsr)
64
+DO_2OP_SCALAR(VMLA, vmla)
65
DO_2OP_SCALAR(VMLAS, vmlas)
66
67
static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a)
68
--
49
--
69
2.20.1
50
2.25.1
70
71
diff view generated by jsdifflib
1
The MVEGenDualAccOpFn is a bit misnamed, since it is used for
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
the "long dual accumulate" operations that use a 64-bit
3
accumulator. Rename it to MVEGenLongDualAccOpFn so we can
4
use the former name for the 32-bit accumulator insns.
5
2
3
The pointed MouseTransformInfo structure is accessed read-only.
4
5
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20221220142520.24094-2-philmd@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
---
9
target/arm/translate-mve.c | 16 ++++++++--------
10
include/hw/input/tsc2xxx.h | 4 ++--
10
1 file changed, 8 insertions(+), 8 deletions(-)
11
hw/input/tsc2005.c | 2 +-
12
hw/input/tsc210x.c | 3 +--
13
3 files changed, 4 insertions(+), 5 deletions(-)
11
14
12
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
15
diff --git a/include/hw/input/tsc2xxx.h b/include/hw/input/tsc2xxx.h
13
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-mve.c
17
--- a/include/hw/input/tsc2xxx.h
15
+++ b/target/arm/translate-mve.c
18
+++ b/include/hw/input/tsc2xxx.h
16
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
19
@@ -XXX,XX +XXX,XX @@ uWireSlave *tsc2102_init(qemu_irq pint);
17
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
20
uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
18
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
21
I2SCodec *tsc210x_codec(uWireSlave *chip);
19
typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
22
uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
20
-typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
23
-void tsc210x_set_transform(uWireSlave *chip, MouseTransformInfo *info);
21
+typedef void MVEGenLongDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
24
+void tsc210x_set_transform(uWireSlave *chip, const MouseTransformInfo *info);
22
typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
25
void tsc210x_key_event(uWireSlave *chip, int key, int down);
23
typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
26
24
typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32);
27
/* tsc2005.c */
25
@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMULLT_scalar(DisasContext *s, arg_2scalar *a)
28
void *tsc2005_init(qemu_irq pintdav);
26
}
29
uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
27
30
-void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
28
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
31
+void tsc2005_set_transform(void *opaque, const MouseTransformInfo *info);
29
- MVEGenDualAccOpFn *fn)
32
30
+ MVEGenLongDualAccOpFn *fn)
33
#endif
34
diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
35
index XXXXXXX..XXXXXXX 100644
36
--- a/hw/input/tsc2005.c
37
+++ b/hw/input/tsc2005.c
38
@@ -XXX,XX +XXX,XX @@ void *tsc2005_init(qemu_irq pintdav)
39
* from the touchscreen. Assuming 12-bit precision was used during
40
* tslib calibration.
41
*/
42
-void tsc2005_set_transform(void *opaque, MouseTransformInfo *info)
43
+void tsc2005_set_transform(void *opaque, const MouseTransformInfo *info)
31
{
44
{
32
TCGv_ptr qn, qm;
45
TSC2005State *s = (TSC2005State *) opaque;
33
TCGv_i64 rda;
46
34
@@ -XXX,XX +XXX,XX @@ static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
47
diff --git a/hw/input/tsc210x.c b/hw/input/tsc210x.c
35
48
index XXXXXXX..XXXXXXX 100644
36
static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a)
49
--- a/hw/input/tsc210x.c
50
+++ b/hw/input/tsc210x.c
51
@@ -XXX,XX +XXX,XX @@ I2SCodec *tsc210x_codec(uWireSlave *chip)
52
* from the touchscreen. Assuming 12-bit precision was used during
53
* tslib calibration.
54
*/
55
-void tsc210x_set_transform(uWireSlave *chip,
56
- MouseTransformInfo *info)
57
+void tsc210x_set_transform(uWireSlave *chip, const MouseTransformInfo *info)
37
{
58
{
38
- static MVEGenDualAccOpFn * const fns[4][2] = {
59
TSC210xState *s = (TSC210xState *) chip->opaque;
39
+ static MVEGenLongDualAccOpFn * const fns[4][2] = {
60
#if 0
40
{ NULL, NULL },
41
{ gen_helper_mve_vmlaldavsh, gen_helper_mve_vmlaldavxsh },
42
{ gen_helper_mve_vmlaldavsw, gen_helper_mve_vmlaldavxsw },
43
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a)
44
45
static bool trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a)
46
{
47
- static MVEGenDualAccOpFn * const fns[4][2] = {
48
+ static MVEGenLongDualAccOpFn * const fns[4][2] = {
49
{ NULL, NULL },
50
{ gen_helper_mve_vmlaldavuh, NULL },
51
{ gen_helper_mve_vmlaldavuw, NULL },
52
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a)
53
54
static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a)
55
{
56
- static MVEGenDualAccOpFn * const fns[4][2] = {
57
+ static MVEGenLongDualAccOpFn * const fns[4][2] = {
58
{ NULL, NULL },
59
{ gen_helper_mve_vmlsldavsh, gen_helper_mve_vmlsldavxsh },
60
{ gen_helper_mve_vmlsldavsw, gen_helper_mve_vmlsldavxsw },
61
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a)
62
63
static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a)
64
{
65
- static MVEGenDualAccOpFn * const fns[] = {
66
+ static MVEGenLongDualAccOpFn * const fns[] = {
67
gen_helper_mve_vrmlaldavhsw, gen_helper_mve_vrmlaldavhxsw,
68
};
69
return do_long_dual_acc(s, a, fns[a->x]);
70
@@ -XXX,XX +XXX,XX @@ static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a)
71
72
static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a)
73
{
74
- static MVEGenDualAccOpFn * const fns[] = {
75
+ static MVEGenLongDualAccOpFn * const fns[] = {
76
gen_helper_mve_vrmlaldavhuw, NULL,
77
};
78
return do_long_dual_acc(s, a, fns[a->x]);
79
@@ -XXX,XX +XXX,XX @@ static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a)
80
81
static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a)
82
{
83
- static MVEGenDualAccOpFn * const fns[] = {
84
+ static MVEGenLongDualAccOpFn * const fns[] = {
85
gen_helper_mve_vrmlsldavhsw, gen_helper_mve_vrmlsldavhxsw,
86
};
87
return do_long_dual_acc(s, a, fns[a->x]);
88
--
61
--
89
2.20.1
62
2.25.1
90
63
91
64
diff view generated by jsdifflib
1
In do_sqrshl48_d() and do_uqrshl48_d() we got some of the edge
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
cases wrong and failed to saturate correctly:
3
2
4
(1) In do_sqrshl48_d() we used the same code that do_shrshl_bhs()
3
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
does to obtain the saturated most-negative and most-positive 48-bit
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
signed values for the large-shift-left case. This gives (1 << 47)
5
Message-id: 20221220142520.24094-3-philmd@linaro.org
7
for saturate-to-most-negative, but we weren't sign-extending this
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
value to the 64-bit output as the pseudocode requires.
7
---
8
hw/arm/nseries.c | 18 +++++++++---------
9
1 file changed, 9 insertions(+), 9 deletions(-)
9
10
10
(2) For left shifts by less than 48, we copied the "8/16 bit" code
11
diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
11
from do_sqrshl_bhs() and do_uqrshl_bhs(). This doesn't do the right
12
thing because it assumes the C type we're working with is at least
13
twice the number of bits we're saturating to (so that a shift left by
14
bits-1 can't shift anything off the top of the value). This isn't
15
true for bits == 48, so we would incorrectly return 0 rather than the
16
most-positive value for situations like "shift (1 << 44) right by
17
20". Instead check for saturation by doing the shift and signextend
18
and then testing whether shifting back left again gives the original
19
value.
20
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
23
---
24
target/arm/mve_helper.c | 12 +++++-------
25
1 file changed, 5 insertions(+), 7 deletions(-)
26
27
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
28
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/mve_helper.c
13
--- a/hw/arm/nseries.c
30
+++ b/target/arm/mve_helper.c
14
+++ b/hw/arm/nseries.c
31
@@ -XXX,XX +XXX,XX @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift,
15
@@ -XXX,XX +XXX,XX @@ static void n8x0_i2c_setup(struct n800_s *s)
32
}
33
return src >> -shift;
34
} else if (shift < 48) {
35
- int64_t val = src << shift;
36
- int64_t extval = sextract64(val, 0, 48);
37
- if (!sat || val == extval) {
38
+ int64_t extval = sextract64(src << shift, 0, 48);
39
+ if (!sat || src == (extval >> shift)) {
40
return extval;
41
}
42
} else if (!sat || src == 0) {
43
@@ -XXX,XX +XXX,XX @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift,
44
}
45
46
*sat = 1;
47
- return (1ULL << 47) - (src >= 0);
48
+ return src >= 0 ? MAKE_64BIT_MASK(0, 47) : MAKE_64BIT_MASK(47, 17);
49
}
16
}
50
17
51
/* Operate on 64-bit values, but saturate at 48 bits */
18
/* Touchscreen and keypad controller */
52
@@ -XXX,XX +XXX,XX @@ static inline uint64_t do_uqrshl48_d(uint64_t src, int64_t shift,
19
-static MouseTransformInfo n800_pointercal = {
53
return extval;
20
+static const MouseTransformInfo n800_pointercal = {
54
}
21
.x = 800,
55
} else if (shift < 48) {
22
.y = 480,
56
- uint64_t val = src << shift;
23
.a = { 14560, -68, -3455208, -39, -9621, 35152972, 65536 },
57
- uint64_t extval = extract64(val, 0, 48);
24
};
58
- if (!sat || val == extval) {
25
59
+ uint64_t extval = extract64(src << shift, 0, 48);
26
-static MouseTransformInfo n810_pointercal = {
60
+ if (!sat || src == (extval >> shift)) {
27
+static const MouseTransformInfo n810_pointercal = {
61
return extval;
28
.x = 800,
62
}
29
.y = 480,
63
} else if (!sat || src == 0) {
30
.a = { 15041, 148, -4731056, 171, -10238, 35933380, 65536 },
31
@@ -XXX,XX +XXX,XX @@ static void n810_key_event(void *opaque, int keycode)
32
33
#define M    0
34
35
-static int n810_keys[0x80] = {
36
+static const int n810_keys[0x80] = {
37
[0x01] = 16,    /* Q */
38
[0x02] = 37,    /* K */
39
[0x03] = 24,    /* O */
40
@@ -XXX,XX +XXX,XX @@ static void n8x0_usb_setup(struct n800_s *s)
41
/* Setup done before the main bootloader starts by some early setup code
42
* - used when we want to run the main bootloader in emulation. This
43
* isn't documented. */
44
-static uint32_t n800_pinout[104] = {
45
+static const uint32_t n800_pinout[104] = {
46
0x080f00d8, 0x00d40808, 0x03080808, 0x080800d0,
47
0x00dc0808, 0x0b0f0f00, 0x080800b4, 0x00c00808,
48
0x08080808, 0x180800c4, 0x00b80000, 0x08080808,
49
@@ -XXX,XX +XXX,XX @@ static void n8x0_boot_init(void *opaque)
50
#define OMAP_TAG_CBUS        0x4e03
51
#define OMAP_TAG_EM_ASIC_BB5    0x4e04
52
53
-static struct omap_gpiosw_info_s {
54
+static const struct omap_gpiosw_info_s {
55
const char *name;
56
int line;
57
int type;
58
@@ -XXX,XX +XXX,XX @@ static struct omap_gpiosw_info_s {
59
{ NULL }
60
};
61
62
-static struct omap_partition_info_s {
63
+static const struct omap_partition_info_s {
64
uint32_t offset;
65
uint32_t size;
66
int mask;
67
@@ -XXX,XX +XXX,XX @@ static struct omap_partition_info_s {
68
{ 0, 0, 0, NULL }
69
};
70
71
-static uint8_t n8x0_bd_addr[6] = { N8X0_BD_ADDR };
72
+static const uint8_t n8x0_bd_addr[6] = { N8X0_BD_ADDR };
73
74
static int n8x0_atag_setup(void *p, int model)
75
{
76
uint8_t *b;
77
uint16_t *w;
78
uint32_t *l;
79
- struct omap_gpiosw_info_s *gpiosw;
80
- struct omap_partition_info_s *partition;
81
+ const struct omap_gpiosw_info_s *gpiosw;
82
+ const struct omap_partition_info_s *partition;
83
const char *tag;
84
85
w = p;
64
--
86
--
65
2.20.1
87
2.25.1
66
88
67
89
diff view generated by jsdifflib
Deleted patch
1
We got an edge case wrong in the 48-bit SQRSHRL implementation: if
2
the shift is to the right, although it always makes the result
3
smaller than the input value it might not be within the 48-bit range
4
the result is supposed to be if the input had some bits in [63..48]
5
set and the shift didn't bring all of those within the [47..0] range.
6
1
7
Handle this similarly to the way we already do for this case in
8
do_uqrshl48_d(): extend the calculated result from 48 bits,
9
and return that if not saturating or if it doesn't change the
10
result; otherwise fall through to return a saturated value.
11
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
---
15
target/arm/mve_helper.c | 11 +++++++++--
16
1 file changed, 9 insertions(+), 2 deletions(-)
17
18
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/mve_helper.c
21
+++ b/target/arm/mve_helper.c
22
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(mve_uqrshll)(CPUARMState *env, uint64_t n, uint32_t shift)
23
static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift,
24
bool round, uint32_t *sat)
25
{
26
+ int64_t val, extval;
27
+
28
if (shift <= -48) {
29
/* Rounding the sign bit always produces 0. */
30
if (round) {
31
@@ -XXX,XX +XXX,XX @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift,
32
} else if (shift < 0) {
33
if (round) {
34
src >>= -shift - 1;
35
- return (src >> 1) + (src & 1);
36
+ val = (src >> 1) + (src & 1);
37
+ } else {
38
+ val = src >> -shift;
39
+ }
40
+ extval = sextract64(val, 0, 48);
41
+ if (!sat || val == extval) {
42
+ return extval;
43
}
44
- return src >> -shift;
45
} else if (shift < 48) {
46
int64_t extval = sextract64(src << shift, 0, 48);
47
if (!sat || src == (extval >> shift)) {
48
--
49
2.20.1
50
51
diff view generated by jsdifflib
Deleted patch
1
In mve_element_mask(), we calculate a mask for tail predication which
2
should have a number of 1 bits based on the value of LR. However,
3
our MAKE_64BIT_MASK() macro has undefined behaviour when passed a
4
zero length. Special case this to give the all-zeroes mask we
5
require.
6
1
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
---
10
target/arm/mve_helper.c | 3 ++-
11
1 file changed, 2 insertions(+), 1 deletion(-)
12
13
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/mve_helper.c
16
+++ b/target/arm/mve_helper.c
17
@@ -XXX,XX +XXX,XX @@ static uint16_t mve_element_mask(CPUARMState *env)
18
*/
19
int masklen = env->regs[14] << env->v7m.ltpsize;
20
assert(masklen <= 16);
21
- mask &= MAKE_64BIT_MASK(0, masklen);
22
+ uint16_t ltpmask = masklen ? MAKE_64BIT_MASK(0, masklen) : 0;
23
+ mask &= ltpmask;
24
}
25
26
if ((env->condexec_bits & 0xf) == 0) {
27
--
28
2.20.1
29
30
diff view generated by jsdifflib
Deleted patch
1
For vector loads, predicated elements are zeroed, instead of
2
retaining their previous values (as happens for most data
3
processing operations). This means we need to distinguish
4
"beat not executed due to ECI" (don't touch destination
5
element) from "beat executed but predicated out" (zero
6
destination element).
7
1
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
---
11
target/arm/mve_helper.c | 8 +++++---
12
1 file changed, 5 insertions(+), 3 deletions(-)
13
14
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/mve_helper.c
17
+++ b/target/arm/mve_helper.c
18
@@ -XXX,XX +XXX,XX @@ static void mve_advance_vpt(CPUARMState *env)
19
env->v7m.vpr = vpr;
20
}
21
22
-
23
+/* For loads, predicated lanes are zeroed instead of keeping their old values */
24
#define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \
25
void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
26
{ \
27
TYPE *d = vd; \
28
uint16_t mask = mve_element_mask(env); \
29
+ uint16_t eci_mask = mve_eci_mask(env); \
30
unsigned b, e; \
31
/* \
32
* R_SXTM allows the dest reg to become UNKNOWN for abandoned \
33
@@ -XXX,XX +XXX,XX @@ static void mve_advance_vpt(CPUARMState *env)
34
* then take an exception. \
35
*/ \
36
for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
37
- if (mask & (1 << b)) { \
38
- d[H##ESIZE(e)] = cpu_##LDTYPE##_data_ra(env, addr, GETPC()); \
39
+ if (eci_mask & (1 << b)) { \
40
+ d[H##ESIZE(e)] = (mask & (1 << b)) ? \
41
+ cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \
42
} \
43
addr += MSIZE; \
44
} \
45
--
46
2.20.1
47
48
diff view generated by jsdifflib
1
From: Eduardo Habkost <ehabkost@redhat.com>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
The SBSA_GWDT enum value conflicts with the SBSA_GWDT() QOM type
3
Silent when compiling with -Wextra:
4
checking helper, preventing us from using a OBJECT_DEFINE* or
5
DEFINE_INSTANCE_CHECKER macro for the SBSA_GWDT() wrapper.
6
4
7
If I understand the SBSA 6.0 specification correctly, the signal
5
../hw/arm/nseries.c:1081:12: warning: missing field 'line' initializer [-Wmissing-field-initializers]
8
being connected to IRQ 16 is the WS0 output signal from the
6
{ NULL }
9
Generic Watchdog. Rename the enum value to SBSA_GWDT_WS0 to be
7
^
10
more explicit and avoid the name conflict.
11
8
12
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
9
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
13
Message-id: 20210806023119.431680-1-ehabkost@redhat.com
10
Message-id: 20221220142520.24094-4-philmd@linaro.org
14
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
13
---
17
hw/arm/sbsa-ref.c | 6 +++---
14
hw/arm/nseries.c | 10 ++++------
18
1 file changed, 3 insertions(+), 3 deletions(-)
15
1 file changed, 4 insertions(+), 6 deletions(-)
19
16
20
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
17
diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
21
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
22
--- a/hw/arm/sbsa-ref.c
19
--- a/hw/arm/nseries.c
23
+++ b/hw/arm/sbsa-ref.c
20
+++ b/hw/arm/nseries.c
24
@@ -XXX,XX +XXX,XX @@ enum {
21
@@ -XXX,XX +XXX,XX @@ static const struct omap_gpiosw_info_s {
25
SBSA_GIC_DIST,
22
"headphone", N8X0_HEADPHONE_GPIO,
26
SBSA_GIC_REDIST,
23
OMAP_GPIOSW_TYPE_CONNECTION | OMAP_GPIOSW_INVERTED,
27
SBSA_SECURE_EC,
24
},
28
- SBSA_GWDT,
25
- { NULL }
29
+ SBSA_GWDT_WS0,
26
+ { /* end of list */ }
30
SBSA_GWDT_REFRESH,
27
}, n810_gpiosw_info[] = {
31
SBSA_GWDT_CONTROL,
28
{
32
SBSA_SMMU,
29
"gps_reset", N810_GPS_RESET_GPIO,
33
@@ -XXX,XX +XXX,XX @@ static const int sbsa_ref_irqmap[] = {
30
@@ -XXX,XX +XXX,XX @@ static const struct omap_gpiosw_info_s {
34
[SBSA_AHCI] = 10,
31
"slide", N810_SLIDE_GPIO,
35
[SBSA_EHCI] = 11,
32
OMAP_GPIOSW_TYPE_COVER | OMAP_GPIOSW_INVERTED,
36
[SBSA_SMMU] = 12, /* ... to 15 */
33
},
37
- [SBSA_GWDT] = 16,
34
- { NULL }
38
+ [SBSA_GWDT_WS0] = 16,
35
+ { /* end of list */ }
39
};
36
};
40
37
41
static const char * const valid_cpus[] = {
38
static const struct omap_partition_info_s {
42
@@ -XXX,XX +XXX,XX @@ static void create_wdt(const SBSAMachineState *sms)
39
@@ -XXX,XX +XXX,XX @@ static const struct omap_partition_info_s {
43
hwaddr cbase = sbsa_ref_memmap[SBSA_GWDT_CONTROL].base;
40
{ 0x00080000, 0x00200000, 0x0, "kernel" },
44
DeviceState *dev = qdev_new(TYPE_WDT_SBSA);
41
{ 0x00280000, 0x00200000, 0x3, "initfs" },
45
SysBusDevice *s = SYS_BUS_DEVICE(dev);
42
{ 0x00480000, 0x0fb80000, 0x3, "rootfs" },
46
- int irq = sbsa_ref_irqmap[SBSA_GWDT];
43
-
47
+ int irq = sbsa_ref_irqmap[SBSA_GWDT_WS0];
44
- { 0, 0, 0, NULL }
48
45
+ { /* end of list */ }
49
sysbus_realize_and_unref(s, &error_fatal);
46
}, n810_part_info[] = {
50
sysbus_mmio_map(s, 0, rbase);
47
{ 0x00000000, 0x00020000, 0x3, "bootloader" },
48
{ 0x00020000, 0x00060000, 0x0, "config" },
49
{ 0x00080000, 0x00220000, 0x0, "kernel" },
50
{ 0x002a0000, 0x00400000, 0x0, "initfs" },
51
{ 0x006a0000, 0x0f960000, 0x0, "rootfs" },
52
-
53
- { 0, 0, 0, NULL }
54
+ { /* end of list */ }
55
};
56
57
static const uint8_t n8x0_bd_addr[6] = { N8X0_BD_ADDR };
51
--
58
--
52
2.20.1
59
2.25.1
53
60
54
61
diff view generated by jsdifflib
1
Implement the MVE VMULL (polynomial) insn. Unlike Neon, this comes
1
From: Zhuojia Shen <chaosdefinition@hotmail.com>
2
in two flavours: 8x8->16 and a 16x16->32. Also unlike Neon, the
2
3
inputs are in either the low or the high half of each double-width
3
In CPUID registers exposed to userspace, some registers were missing
4
element.
4
and some fields were not exposed. This patch aligns exposed ID
5
5
registers and their fields with what the upstream kernel currently
6
The assembler for this insn indicates the size with "P8" or "P16",
6
exposes.
7
encoded into bit 28 as size = 0 or 1. We choose to follow the
7
8
same encoding as VQDMULL and decode this into a->size as MO_16
8
Specifically, the following new ID registers/fields are exposed to
9
or MO_32 indicating the size of the result elements. This then
9
userspace:
10
carries through to the helper function names where it then
10
11
matches up with the existing pmull_h() which does an 8x8->16
11
ID_AA64PFR1_EL1.BT: bits 3-0
12
operation and a new pmull_w() which does the 16x16->32.
12
ID_AA64PFR1_EL1.MTE: bits 11-8
13
13
ID_AA64PFR1_EL1.SME: bits 27-24
14
15
ID_AA64ZFR0_EL1.SVEver: bits 3-0
16
ID_AA64ZFR0_EL1.AES: bits 7-4
17
ID_AA64ZFR0_EL1.BitPerm: bits 19-16
18
ID_AA64ZFR0_EL1.BF16: bits 23-20
19
ID_AA64ZFR0_EL1.SHA3: bits 35-32
20
ID_AA64ZFR0_EL1.SM4: bits 43-40
21
ID_AA64ZFR0_EL1.I8MM: bits 47-44
22
ID_AA64ZFR0_EL1.F32MM: bits 55-52
23
ID_AA64ZFR0_EL1.F64MM: bits 59-56
24
25
ID_AA64SMFR0_EL1.F32F32: bit 32
26
ID_AA64SMFR0_EL1.B16F32: bit 34
27
ID_AA64SMFR0_EL1.F16F32: bit 35
28
ID_AA64SMFR0_EL1.I8I32: bits 39-36
29
ID_AA64SMFR0_EL1.F64F64: bit 48
30
ID_AA64SMFR0_EL1.I16I64: bits 55-52
31
ID_AA64SMFR0_EL1.FA64: bit 63
32
33
ID_AA64MMFR0_EL1.ECV: bits 63-60
34
35
ID_AA64MMFR1_EL1.AFP: bits 47-44
36
37
ID_AA64MMFR2_EL1.AT: bits 35-32
38
39
ID_AA64ISAR0_EL1.RNDR: bits 63-60
40
41
ID_AA64ISAR1_EL1.FRINTTS: bits 35-32
42
ID_AA64ISAR1_EL1.BF16: bits 47-44
43
ID_AA64ISAR1_EL1.DGH: bits 51-48
44
ID_AA64ISAR1_EL1.I8MM: bits 55-52
45
46
ID_AA64ISAR2_EL1.WFxT: bits 3-0
47
ID_AA64ISAR2_EL1.RPRES: bits 7-4
48
ID_AA64ISAR2_EL1.GPA3: bits 11-8
49
ID_AA64ISAR2_EL1.APA3: bits 15-12
50
51
The code is also refactored to use symbolic names for ID register fields
52
for better readability and maintainability.
53
54
The test case in tests/tcg/aarch64/sysregs.c is also updated to match
55
the intended behavior.
56
57
Signed-off-by: Zhuojia Shen <chaosdefinition@hotmail.com>
58
Message-id: DS7PR12MB6309FB585E10772928F14271ACE79@DS7PR12MB6309.namprd12.prod.outlook.com
59
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
60
[PMM: use Sn_n_Cn_Cn_n syntax to work with older assemblers
61
that don't recognize id_aa64isar2_el1 and id_aa64mmfr2_el1]
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
62
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
16
---
63
---
17
target/arm/helper-mve.h | 5 +++++
64
target/arm/helper.c | 96 +++++++++++++++++++++++++------
18
target/arm/vec_internal.h | 11 +++++++++++
65
tests/tcg/aarch64/sysregs.c | 24 ++++++--
19
target/arm/mve.decode | 14 ++++++++++----
66
tests/tcg/aarch64/Makefile.target | 7 ++-
20
target/arm/mve_helper.c | 16 ++++++++++++++++
67
3 files changed, 103 insertions(+), 24 deletions(-)
21
target/arm/translate-mve.c | 28 ++++++++++++++++++++++++++++
68
22
target/arm/vec_helper.c | 14 +++++++++++++-
69
diff --git a/target/arm/helper.c b/target/arm/helper.c
23
6 files changed, 83 insertions(+), 5 deletions(-)
24
25
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
26
index XXXXXXX..XXXXXXX 100644
70
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/helper-mve.h
71
--- a/target/arm/helper.c
28
+++ b/target/arm/helper-mve.h
72
+++ b/target/arm/helper.c
29
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
73
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
30
DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
74
#ifdef CONFIG_USER_ONLY
31
DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
75
static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
32
76
{ .name = "ID_AA64PFR0_EL1",
33
+DEF_HELPER_FLAGS_4(mve_vmullpbh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
77
- .exported_bits = 0x000f000f00ff0000,
34
+DEF_HELPER_FLAGS_4(mve_vmullpth, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
78
- .fixed_bits = 0x0000000000000011 },
35
+DEF_HELPER_FLAGS_4(mve_vmullpbw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
79
+ .exported_bits = R_ID_AA64PFR0_FP_MASK |
36
+DEF_HELPER_FLAGS_4(mve_vmullptw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
80
+ R_ID_AA64PFR0_ADVSIMD_MASK |
81
+ R_ID_AA64PFR0_SVE_MASK |
82
+ R_ID_AA64PFR0_DIT_MASK,
83
+ .fixed_bits = (0x1u << R_ID_AA64PFR0_EL0_SHIFT) |
84
+ (0x1u << R_ID_AA64PFR0_EL1_SHIFT) },
85
{ .name = "ID_AA64PFR1_EL1",
86
- .exported_bits = 0x00000000000000f0 },
87
+ .exported_bits = R_ID_AA64PFR1_BT_MASK |
88
+ R_ID_AA64PFR1_SSBS_MASK |
89
+ R_ID_AA64PFR1_MTE_MASK |
90
+ R_ID_AA64PFR1_SME_MASK },
91
{ .name = "ID_AA64PFR*_EL1_RESERVED",
92
- .is_glob = true },
93
- { .name = "ID_AA64ZFR0_EL1" },
94
+ .is_glob = true },
95
+ { .name = "ID_AA64ZFR0_EL1",
96
+ .exported_bits = R_ID_AA64ZFR0_SVEVER_MASK |
97
+ R_ID_AA64ZFR0_AES_MASK |
98
+ R_ID_AA64ZFR0_BITPERM_MASK |
99
+ R_ID_AA64ZFR0_BFLOAT16_MASK |
100
+ R_ID_AA64ZFR0_SHA3_MASK |
101
+ R_ID_AA64ZFR0_SM4_MASK |
102
+ R_ID_AA64ZFR0_I8MM_MASK |
103
+ R_ID_AA64ZFR0_F32MM_MASK |
104
+ R_ID_AA64ZFR0_F64MM_MASK },
105
+ { .name = "ID_AA64SMFR0_EL1",
106
+ .exported_bits = R_ID_AA64SMFR0_F32F32_MASK |
107
+ R_ID_AA64SMFR0_B16F32_MASK |
108
+ R_ID_AA64SMFR0_F16F32_MASK |
109
+ R_ID_AA64SMFR0_I8I32_MASK |
110
+ R_ID_AA64SMFR0_F64F64_MASK |
111
+ R_ID_AA64SMFR0_I16I64_MASK |
112
+ R_ID_AA64SMFR0_FA64_MASK },
113
{ .name = "ID_AA64MMFR0_EL1",
114
- .fixed_bits = 0x00000000ff000000 },
115
- { .name = "ID_AA64MMFR1_EL1" },
116
+ .exported_bits = R_ID_AA64MMFR0_ECV_MASK,
117
+ .fixed_bits = (0xfu << R_ID_AA64MMFR0_TGRAN64_SHIFT) |
118
+ (0xfu << R_ID_AA64MMFR0_TGRAN4_SHIFT) },
119
+ { .name = "ID_AA64MMFR1_EL1",
120
+ .exported_bits = R_ID_AA64MMFR1_AFP_MASK },
121
+ { .name = "ID_AA64MMFR2_EL1",
122
+ .exported_bits = R_ID_AA64MMFR2_AT_MASK },
123
{ .name = "ID_AA64MMFR*_EL1_RESERVED",
124
- .is_glob = true },
125
+ .is_glob = true },
126
{ .name = "ID_AA64DFR0_EL1",
127
- .fixed_bits = 0x0000000000000006 },
128
- { .name = "ID_AA64DFR1_EL1" },
129
+ .fixed_bits = (0x6u << R_ID_AA64DFR0_DEBUGVER_SHIFT) },
130
+ { .name = "ID_AA64DFR1_EL1" },
131
{ .name = "ID_AA64DFR*_EL1_RESERVED",
132
- .is_glob = true },
133
+ .is_glob = true },
134
{ .name = "ID_AA64AFR*",
135
- .is_glob = true },
136
+ .is_glob = true },
137
{ .name = "ID_AA64ISAR0_EL1",
138
- .exported_bits = 0x00fffffff0fffff0 },
139
+ .exported_bits = R_ID_AA64ISAR0_AES_MASK |
140
+ R_ID_AA64ISAR0_SHA1_MASK |
141
+ R_ID_AA64ISAR0_SHA2_MASK |
142
+ R_ID_AA64ISAR0_CRC32_MASK |
143
+ R_ID_AA64ISAR0_ATOMIC_MASK |
144
+ R_ID_AA64ISAR0_RDM_MASK |
145
+ R_ID_AA64ISAR0_SHA3_MASK |
146
+ R_ID_AA64ISAR0_SM3_MASK |
147
+ R_ID_AA64ISAR0_SM4_MASK |
148
+ R_ID_AA64ISAR0_DP_MASK |
149
+ R_ID_AA64ISAR0_FHM_MASK |
150
+ R_ID_AA64ISAR0_TS_MASK |
151
+ R_ID_AA64ISAR0_RNDR_MASK },
152
{ .name = "ID_AA64ISAR1_EL1",
153
- .exported_bits = 0x000000f0ffffffff },
154
+ .exported_bits = R_ID_AA64ISAR1_DPB_MASK |
155
+ R_ID_AA64ISAR1_APA_MASK |
156
+ R_ID_AA64ISAR1_API_MASK |
157
+ R_ID_AA64ISAR1_JSCVT_MASK |
158
+ R_ID_AA64ISAR1_FCMA_MASK |
159
+ R_ID_AA64ISAR1_LRCPC_MASK |
160
+ R_ID_AA64ISAR1_GPA_MASK |
161
+ R_ID_AA64ISAR1_GPI_MASK |
162
+ R_ID_AA64ISAR1_FRINTTS_MASK |
163
+ R_ID_AA64ISAR1_SB_MASK |
164
+ R_ID_AA64ISAR1_BF16_MASK |
165
+ R_ID_AA64ISAR1_DGH_MASK |
166
+ R_ID_AA64ISAR1_I8MM_MASK },
167
+ { .name = "ID_AA64ISAR2_EL1",
168
+ .exported_bits = R_ID_AA64ISAR2_WFXT_MASK |
169
+ R_ID_AA64ISAR2_RPRES_MASK |
170
+ R_ID_AA64ISAR2_GPA3_MASK |
171
+ R_ID_AA64ISAR2_APA3_MASK },
172
{ .name = "ID_AA64ISAR*_EL1_RESERVED",
173
- .is_glob = true },
174
+ .is_glob = true },
175
};
176
modify_arm_cp_regs(v8_idregs, v8_user_idregs);
177
#endif
178
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
179
#ifdef CONFIG_USER_ONLY
180
static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
181
{ .name = "MIDR_EL1",
182
- .exported_bits = 0x00000000ffffffff },
183
- { .name = "REVIDR_EL1" },
184
+ .exported_bits = R_MIDR_EL1_REVISION_MASK |
185
+ R_MIDR_EL1_PARTNUM_MASK |
186
+ R_MIDR_EL1_ARCHITECTURE_MASK |
187
+ R_MIDR_EL1_VARIANT_MASK |
188
+ R_MIDR_EL1_IMPLEMENTER_MASK },
189
+ { .name = "REVIDR_EL1" },
190
};
191
modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
192
#endif
193
diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
194
index XXXXXXX..XXXXXXX 100644
195
--- a/tests/tcg/aarch64/sysregs.c
196
+++ b/tests/tcg/aarch64/sysregs.c
197
@@ -XXX,XX +XXX,XX @@
198
#define HWCAP_CPUID (1 << 11)
199
#endif
200
201
+/*
202
+ * Older assemblers don't recognize newer system register names,
203
+ * but we can still access them by the Sn_n_Cn_Cn_n syntax.
204
+ */
205
+#define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
206
+#define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
37
+
207
+
38
DEF_HELPER_FLAGS_4(mve_vqdmulhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
208
int failed_bit_count;
39
DEF_HELPER_FLAGS_4(mve_vqdmulhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
209
40
DEF_HELPER_FLAGS_4(mve_vqdmulhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
210
/* Read and print system register `id' value */
41
diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h
211
@@ -XXX,XX +XXX,XX @@ int main(void)
212
* minimum valid fields - for the purposes of this check allowed
213
* to have non-zero values.
214
*/
215
- get_cpu_reg_check_mask(id_aa64isar0_el1, _m(00ff,ffff,f0ff,fff0));
216
- get_cpu_reg_check_mask(id_aa64isar1_el1, _m(0000,00f0,ffff,ffff));
217
+ get_cpu_reg_check_mask(id_aa64isar0_el1, _m(f0ff,ffff,f0ff,fff0));
218
+ get_cpu_reg_check_mask(id_aa64isar1_el1, _m(00ff,f0ff,ffff,ffff));
219
+ get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(0000,0000,0000,ffff));
220
/* TGran4 & TGran64 as pegged to -1 */
221
- get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(0000,0000,ff00,0000));
222
- get_cpu_reg_check_zero(id_aa64mmfr1_el1);
223
+ get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(f000,0000,ff00,0000));
224
+ get_cpu_reg_check_mask(id_aa64mmfr1_el1, _m(0000,f000,0000,0000));
225
+ get_cpu_reg_check_mask(SYS_ID_AA64MMFR2_EL1, _m(0000,000f,0000,0000));
226
/* EL1/EL0 reported as AA64 only */
227
get_cpu_reg_check_mask(id_aa64pfr0_el1, _m(000f,000f,00ff,0011));
228
- get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0000,00f0));
229
+ get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0f00,0fff));
230
/* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
231
get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006));
232
get_cpu_reg_check_zero(id_aa64dfr1_el1);
233
- get_cpu_reg_check_zero(id_aa64zfr0_el1);
234
+ get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff));
235
+#ifdef HAS_ARMV9_SME
236
+ get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
237
+#endif
238
239
get_cpu_reg_check_zero(id_aa64afr0_el1);
240
get_cpu_reg_check_zero(id_aa64afr1_el1);
241
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
42
index XXXXXXX..XXXXXXX 100644
242
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/vec_internal.h
243
--- a/tests/tcg/aarch64/Makefile.target
44
+++ b/target/arm/vec_internal.h
244
+++ b/tests/tcg/aarch64/Makefile.target
45
@@ -XXX,XX +XXX,XX @@ int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *);
245
@@ -XXX,XX +XXX,XX @@ config-cc.mak: Makefile
46
int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *);
246
     $(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); \
47
int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool);
247
     $(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \
48
248
     $(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \
49
+/*
249
-     $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE)) 3> config-cc.mak
50
+ * 8 x 8 -> 16 vector polynomial multiply where the inputs are
250
+     $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \
51
+ * in the low 8 bits of each 16-bit element
251
+     $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
52
+*/
252
-include config-cc.mak
53
+uint64_t pmull_h(uint64_t op1, uint64_t op2);
253
54
+/*
254
# Pauth Tests
55
+ * 16 x 16 -> 32 vector polynomial multiply where the inputs are
255
@@ -XXX,XX +XXX,XX @@ endif
56
+ * in the low 16 bits of each 32-bit element
256
ifneq ($(CROSS_CC_HAS_SVE),)
57
+ */
257
# System Registers Tests
58
+uint64_t pmull_w(uint64_t op1, uint64_t op2);
258
AARCH64_TESTS += sysregs
59
+
259
+ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
60
#endif /* TARGET_ARM_VEC_INTERNALS_H */
260
+sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
61
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
261
+else
62
index XXXXXXX..XXXXXXX 100644
262
sysregs: CFLAGS+=-march=armv8.1-a+sve
63
--- a/target/arm/mve.decode
263
+endif
64
+++ b/target/arm/mve.decode
264
65
@@ -XXX,XX +XXX,XX @@ VHADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 0 ... 0 @2op
265
# SVE ioctl test
66
VHSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op
266
AARCH64_TESTS += sve-ioctls
67
VHSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op
68
69
-VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
70
-VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
71
-VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
72
-VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
73
+{
74
+ VMULLP_B 111 . 1110 0 . 11 ... 1 ... 0 1110 . 0 . 0 ... 0 @2op_sz28
75
+ VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
76
+ VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
77
+}
78
+{
79
+ VMULLP_T 111 . 1110 0 . 11 ... 1 ... 1 1110 . 0 . 0 ... 0 @2op_sz28
80
+ VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
81
+ VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
82
+}
83
84
VQDMULH 1110 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op
85
VQRDMULH 1111 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op
86
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/mve_helper.c
89
+++ b/target/arm/mve_helper.c
90
@@ -XXX,XX +XXX,XX @@ DO_2OP_L(vmulltub, 1, 1, uint8_t, 2, uint16_t, DO_MUL)
91
DO_2OP_L(vmulltuh, 1, 2, uint16_t, 4, uint32_t, DO_MUL)
92
DO_2OP_L(vmulltuw, 1, 4, uint32_t, 8, uint64_t, DO_MUL)
93
94
+/*
95
+ * Polynomial multiply. We can always do this generating 64 bits
96
+ * of the result at a time, so we don't need to use DO_2OP_L.
97
+ */
98
+#define VMULLPH_MASK 0x00ff00ff00ff00ffULL
99
+#define VMULLPW_MASK 0x0000ffff0000ffffULL
100
+#define DO_VMULLPBH(N, M) pmull_h((N) & VMULLPH_MASK, (M) & VMULLPH_MASK)
101
+#define DO_VMULLPTH(N, M) DO_VMULLPBH((N) >> 8, (M) >> 8)
102
+#define DO_VMULLPBW(N, M) pmull_w((N) & VMULLPW_MASK, (M) & VMULLPW_MASK)
103
+#define DO_VMULLPTW(N, M) DO_VMULLPBW((N) >> 16, (M) >> 16)
104
+
105
+DO_2OP(vmullpbh, 8, uint64_t, DO_VMULLPBH)
106
+DO_2OP(vmullpth, 8, uint64_t, DO_VMULLPTH)
107
+DO_2OP(vmullpbw, 8, uint64_t, DO_VMULLPBW)
108
+DO_2OP(vmullptw, 8, uint64_t, DO_VMULLPTW)
109
+
110
/*
111
* Because the computation type is at least twice as large as required,
112
* these work for both signed and unsigned source types.
113
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
114
index XXXXXXX..XXXXXXX 100644
115
--- a/target/arm/translate-mve.c
116
+++ b/target/arm/translate-mve.c
117
@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMULLT(DisasContext *s, arg_2op *a)
118
return do_2op(s, a, fns[a->size]);
119
}
120
121
+static bool trans_VMULLP_B(DisasContext *s, arg_2op *a)
122
+{
123
+ /*
124
+ * Note that a->size indicates the output size, ie VMULL.P8
125
+ * is the 8x8->16 operation and a->size is MO_16; VMULL.P16
126
+ * is the 16x16->32 operation and a->size is MO_32.
127
+ */
128
+ static MVEGenTwoOpFn * const fns[] = {
129
+ NULL,
130
+ gen_helper_mve_vmullpbh,
131
+ gen_helper_mve_vmullpbw,
132
+ NULL,
133
+ };
134
+ return do_2op(s, a, fns[a->size]);
135
+}
136
+
137
+static bool trans_VMULLP_T(DisasContext *s, arg_2op *a)
138
+{
139
+ /* a->size is as for trans_VMULLP_B */
140
+ static MVEGenTwoOpFn * const fns[] = {
141
+ NULL,
142
+ gen_helper_mve_vmullpth,
143
+ gen_helper_mve_vmullptw,
144
+ NULL,
145
+ };
146
+ return do_2op(s, a, fns[a->size]);
147
+}
148
+
149
/*
150
* VADC and VSBC: these perform an add-with-carry or subtract-with-carry
151
* of the 32-bit elements in each lane of the input vectors, where the
152
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
153
index XXXXXXX..XXXXXXX 100644
154
--- a/target/arm/vec_helper.c
155
+++ b/target/arm/vec_helper.c
156
@@ -XXX,XX +XXX,XX @@ static uint64_t expand_byte_to_half(uint64_t x)
157
| ((x & 0xff000000) << 24);
158
}
159
160
-static uint64_t pmull_h(uint64_t op1, uint64_t op2)
161
+uint64_t pmull_w(uint64_t op1, uint64_t op2)
162
{
163
uint64_t result = 0;
164
int i;
165
+ for (i = 0; i < 16; ++i) {
166
+ uint64_t mask = (op1 & 0x0000000100000001ull) * 0xffffffff;
167
+ result ^= op2 & mask;
168
+ op1 >>= 1;
169
+ op2 <<= 1;
170
+ }
171
+ return result;
172
+}
173
174
+uint64_t pmull_h(uint64_t op1, uint64_t op2)
175
+{
176
+ uint64_t result = 0;
177
+ int i;
178
for (i = 0; i < 8; ++i) {
179
uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff;
180
result ^= op2 & mask;
181
--
267
--
182
2.20.1
268
2.25.1
183
184
diff view generated by jsdifflib
1
From: "Wen, Jianxian" <Jianxian.Wen@verisilicon.com>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
Add property memory region which can connect with IOMMU region to support SMMU translate.
3
This function is not used anywhere outside this file,
4
so we can make the function "static void".
4
5
5
Signed-off-by: Jianxian Wen <jianxian.wen@verisilicon.com>
6
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 4C23C17B8E87E74E906A25A3254A03F4FA1FEC31@SHASXM03.verisilicon.com
8
Reviewed-by: Eric Auger <eric.auger@redhat.com>
9
Message-id: 20221216214924.4711-2-philmd@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
11
---
10
hw/arm/exynos4210.c | 3 +++
12
include/hw/arm/smmu-common.h | 3 ---
11
hw/arm/xilinx_zynq.c | 3 +++
13
hw/arm/smmu-common.c | 2 +-
12
hw/dma/pl330.c | 26 ++++++++++++++++++++++----
14
2 files changed, 1 insertion(+), 4 deletions(-)
13
3 files changed, 28 insertions(+), 4 deletions(-)
14
15
15
diff --git a/hw/arm/exynos4210.c b/hw/arm/exynos4210.c
16
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/arm/exynos4210.c
18
--- a/include/hw/arm/smmu-common.h
18
+++ b/hw/arm/exynos4210.c
19
+++ b/include/hw/arm/smmu-common.h
19
@@ -XXX,XX +XXX,XX @@ static DeviceState *pl330_create(uint32_t base, qemu_or_irq *orgate,
20
@@ -XXX,XX +XXX,XX @@ void smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
20
int i;
21
/* Unmap the range of all the notifiers registered to any IOMMU mr */
21
22
void smmu_inv_notifiers_all(SMMUState *s);
22
dev = qdev_new("pl330");
23
23
+ object_property_set_link(OBJECT(dev), "memory",
24
-/* Unmap the range of all the notifiers registered to @mr */
24
+ OBJECT(get_system_memory()),
25
-void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr);
25
+ &error_fatal);
26
-
26
qdev_prop_set_uint8(dev, "num_events", nevents);
27
#endif /* HW_ARM_SMMU_COMMON_H */
27
qdev_prop_set_uint8(dev, "num_chnls", 8);
28
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
28
qdev_prop_set_uint8(dev, "num_periph_req", nreq);
29
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
30
index XXXXXXX..XXXXXXX 100644
29
index XXXXXXX..XXXXXXX 100644
31
--- a/hw/arm/xilinx_zynq.c
30
--- a/hw/arm/smmu-common.c
32
+++ b/hw/arm/xilinx_zynq.c
31
+++ b/hw/arm/smmu-common.c
33
@@ -XXX,XX +XXX,XX @@ static void zynq_init(MachineState *machine)
32
@@ -XXX,XX +XXX,XX @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
34
sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[39-IRQ_OFFSET]);
35
36
dev = qdev_new("pl330");
37
+ object_property_set_link(OBJECT(dev), "memory",
38
+ OBJECT(address_space_mem),
39
+ &error_fatal);
40
qdev_prop_set_uint8(dev, "num_chnls", 8);
41
qdev_prop_set_uint8(dev, "num_periph_req", 4);
42
qdev_prop_set_uint8(dev, "num_events", 16);
43
diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/hw/dma/pl330.c
46
+++ b/hw/dma/pl330.c
47
@@ -XXX,XX +XXX,XX @@ struct PL330State {
48
uint8_t num_faulting;
49
uint8_t periph_busy[PL330_PERIPH_NUM];
50
51
+ /* Memory region that DMA operation access */
52
+ MemoryRegion *mem_mr;
53
+ AddressSpace *mem_as;
54
};
55
56
#define TYPE_PL330 "pl330"
57
@@ -XXX,XX +XXX,XX @@ static inline const PL330InsnDesc *pl330_fetch_insn(PL330Chan *ch)
58
uint8_t opcode;
59
int i;
60
61
- dma_memory_read(&address_space_memory, ch->pc, &opcode, 1);
62
+ dma_memory_read(ch->parent->mem_as, ch->pc, &opcode, 1);
63
for (i = 0; insn_desc[i].size; i++) {
64
if ((opcode & insn_desc[i].opmask) == insn_desc[i].opcode) {
65
return &insn_desc[i];
66
@@ -XXX,XX +XXX,XX @@ static inline void pl330_exec_insn(PL330Chan *ch, const PL330InsnDesc *insn)
67
uint8_t buf[PL330_INSN_MAXSIZE];
68
69
assert(insn->size <= PL330_INSN_MAXSIZE);
70
- dma_memory_read(&address_space_memory, ch->pc, buf, insn->size);
71
+ dma_memory_read(ch->parent->mem_as, ch->pc, buf, insn->size);
72
insn->exec(ch, buf[0], &buf[1], insn->size - 1);
73
}
33
}
74
34
75
@@ -XXX,XX +XXX,XX @@ static int pl330_exec_cycle(PL330Chan *channel)
35
/* Unmap all notifiers attached to @mr */
76
if (q != NULL && q->len <= pl330_fifo_num_free(&s->fifo)) {
36
-inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
77
int len = q->len - (q->addr & (q->len - 1));
37
+static void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
78
38
{
79
- dma_memory_read(&address_space_memory, q->addr, buf, len);
39
IOMMUNotifier *n;
80
+ dma_memory_read(s->mem_as, q->addr, buf, len);
81
trace_pl330_exec_cycle(q->addr, len);
82
if (trace_event_get_state_backends(TRACE_PL330_HEXDUMP)) {
83
pl330_hexdump(buf, len);
84
@@ -XXX,XX +XXX,XX @@ static int pl330_exec_cycle(PL330Chan *channel)
85
fifo_res = pl330_fifo_get(&s->fifo, buf, len, q->tag);
86
}
87
if (fifo_res == PL330_FIFO_OK || q->z) {
88
- dma_memory_write(&address_space_memory, q->addr, buf, len);
89
+ dma_memory_write(s->mem_as, q->addr, buf, len);
90
trace_pl330_exec_cycle(q->addr, len);
91
if (trace_event_get_state_backends(TRACE_PL330_HEXDUMP)) {
92
pl330_hexdump(buf, len);
93
@@ -XXX,XX +XXX,XX @@ static void pl330_realize(DeviceState *dev, Error **errp)
94
"dma", PL330_IOMEM_SIZE);
95
sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->iomem);
96
97
+ if (!s->mem_mr) {
98
+ error_setg(errp, "'memory' link is not set");
99
+ return;
100
+ } else if (s->mem_mr == get_system_memory()) {
101
+ /* Avoid creating new AS for system memory. */
102
+ s->mem_as = &address_space_memory;
103
+ } else {
104
+ s->mem_as = g_new0(AddressSpace, 1);
105
+ address_space_init(s->mem_as, s->mem_mr,
106
+ memory_region_name(s->mem_mr));
107
+ }
108
+
109
s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, pl330_exec_cycle_timer, s);
110
111
s->cfg[0] = (s->mgr_ns_at_rst ? 0x4 : 0) |
112
@@ -XXX,XX +XXX,XX @@ static Property pl330_properties[] = {
113
DEFINE_PROP_UINT8("rd_q_dep", PL330State, rd_q_dep, 16),
114
DEFINE_PROP_UINT16("data_buffer_dep", PL330State, data_buffer_dep, 256),
115
116
+ DEFINE_PROP_LINK("memory", PL330State, mem_mr,
117
+ TYPE_MEMORY_REGION, MemoryRegion *),
118
+
119
DEFINE_PROP_END_OF_LIST(),
120
};
121
40
122
--
41
--
123
2.20.1
42
2.25.1
124
43
125
44
diff view generated by jsdifflib
1
Implement the MVE narrowing move insns VMOVN, VQMOVN and VQMOVUN.
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
These take a double-width input, narrow it (possibly saturating) and
3
store the result to either the top or bottom half of the output
4
element.
5
2
3
When using Clang ("Apple clang version 14.0.0 (clang-1400.0.29.202)")
4
and building with -Wall we get:
5
6
hw/arm/smmu-common.c:173:33: warning: static function 'smmu_hash_remove_by_asid_iova' is used in an inline function with external linkage [-Wstatic-in-inline]
7
hw/arm/smmu-common.h:170:1: note: use 'static' to give inline function 'smmu_iotlb_inv_iova' internal linkage
8
void smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
9
^
10
static
11
12
None of our code base require / use inlined functions with external
13
linkage. Some places use internal inlining in the hot path. These
14
two functions are certainly not in any hot path and don't justify
15
any inlining, so these are likely oversights rather than intentional.
16
17
Reported-by: Stefan Weil <sw@weilnetz.de>
18
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
19
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
20
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
21
Reviewed-by: Eric Auger <eric.auger@redhat.com>
22
Message-id: 20221216214924.4711-3-philmd@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
---
24
---
9
target/arm/helper-mve.h | 20 ++++++++++
25
hw/arm/smmu-common.c | 13 ++++++-------
10
target/arm/mve.decode | 12 ++++++
26
1 file changed, 6 insertions(+), 7 deletions(-)
11
target/arm/mve_helper.c | 78 ++++++++++++++++++++++++++++++++++++++
12
target/arm/translate-mve.c | 22 +++++++++++
13
4 files changed, 132 insertions(+)
14
27
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
28
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
16
index XXXXXXX..XXXXXXX 100644
29
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
30
--- a/hw/arm/smmu-common.c
18
+++ b/target/arm/helper-mve.h
31
+++ b/hw/arm/smmu-common.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr)
32
@@ -XXX,XX +XXX,XX @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, SMMUTLBEntry *new)
20
DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
33
g_hash_table_insert(bs->iotlb, key, new);
21
DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
23
+DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr)
24
+DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
+DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr)
26
+DEF_HELPER_FLAGS_3(mve_vmovnth, TCG_CALL_NO_WG, void, env, ptr, ptr)
27
+
28
+DEF_HELPER_FLAGS_3(mve_vqmovunbb, TCG_CALL_NO_WG, void, env, ptr, ptr)
29
+DEF_HELPER_FLAGS_3(mve_vqmovunbh, TCG_CALL_NO_WG, void, env, ptr, ptr)
30
+DEF_HELPER_FLAGS_3(mve_vqmovuntb, TCG_CALL_NO_WG, void, env, ptr, ptr)
31
+DEF_HELPER_FLAGS_3(mve_vqmovunth, TCG_CALL_NO_WG, void, env, ptr, ptr)
32
+
33
+DEF_HELPER_FLAGS_3(mve_vqmovnbsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
34
+DEF_HELPER_FLAGS_3(mve_vqmovnbsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
35
+DEF_HELPER_FLAGS_3(mve_vqmovntsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
36
+DEF_HELPER_FLAGS_3(mve_vqmovntsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
37
+
38
+DEF_HELPER_FLAGS_3(mve_vqmovnbub, TCG_CALL_NO_WG, void, env, ptr, ptr)
39
+DEF_HELPER_FLAGS_3(mve_vqmovnbuh, TCG_CALL_NO_WG, void, env, ptr, ptr)
40
+DEF_HELPER_FLAGS_3(mve_vqmovntub, TCG_CALL_NO_WG, void, env, ptr, ptr)
41
+DEF_HELPER_FLAGS_3(mve_vqmovntuh, TCG_CALL_NO_WG, void, env, ptr, ptr)
42
+
43
DEF_HELPER_FLAGS_4(mve_vand, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
44
DEF_HELPER_FLAGS_4(mve_vbic, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
45
DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
46
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/mve.decode
49
+++ b/target/arm/mve.decode
50
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
51
VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b
52
VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h
53
54
+ VQMOVUNB 111 0 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op
55
+ VQMOVN_BS 111 0 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op
56
+
57
VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
58
}
34
}
59
35
60
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
36
-inline void smmu_iotlb_inv_all(SMMUState *s)
61
VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b
37
+void smmu_iotlb_inv_all(SMMUState *s)
62
VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h
38
{
63
39
trace_smmu_iotlb_inv_all();
64
+ VMOVNB 111 1 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op
40
g_hash_table_remove_all(s->iotlb);
65
+ VQMOVN_BU 111 1 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op
41
@@ -XXX,XX +XXX,XX @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, gpointer value,
66
+
42
((entry->iova & ~info->mask) == info->iova);
67
VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
68
}
43
}
69
44
70
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
45
-inline void
71
VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b
46
-smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
72
VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h
47
- uint8_t tg, uint64_t num_pages, uint8_t ttl)
73
48
+void smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
74
+ VQMOVUNT 111 0 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op
49
+ uint8_t tg, uint64_t num_pages, uint8_t ttl)
75
+ VQMOVN_TS 111 0 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op
50
{
76
+
51
/* if tg is not set we use 4KB range invalidation */
77
VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
52
uint8_t granule = tg ? tg * 2 + 10 : 12;
53
@@ -XXX,XX +XXX,XX @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
54
&info);
78
}
55
}
79
56
80
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
57
-inline void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
81
VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b
58
+void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
82
VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h
83
84
+ VMOVNT 111 1 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op
85
+ VQMOVN_TU 111 1 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op
86
+
87
VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
88
}
89
90
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/mve_helper.c
93
+++ b/target/arm/mve_helper.c
94
@@ -XXX,XX +XXX,XX @@ DO_VSHRN_SAT_UH(vqrshrnb_uh, vqrshrnt_uh, DO_RSHRN_UH)
95
DO_VSHRN_SAT_SB(vqrshrunbb, vqrshruntb, DO_RSHRUN_B)
96
DO_VSHRN_SAT_SH(vqrshrunbh, vqrshrunth, DO_RSHRUN_H)
97
98
+#define DO_VMOVN(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE) \
99
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
100
+ { \
101
+ LTYPE *m = vm; \
102
+ TYPE *d = vd; \
103
+ uint16_t mask = mve_element_mask(env); \
104
+ unsigned le; \
105
+ mask >>= ESIZE * TOP; \
106
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
107
+ mergemask(&d[H##ESIZE(le * 2 + TOP)], \
108
+ m[H##LESIZE(le)], mask); \
109
+ } \
110
+ mve_advance_vpt(env); \
111
+ }
112
+
113
+DO_VMOVN(vmovnbb, false, 1, uint8_t, 2, uint16_t)
114
+DO_VMOVN(vmovnbh, false, 2, uint16_t, 4, uint32_t)
115
+DO_VMOVN(vmovntb, true, 1, uint8_t, 2, uint16_t)
116
+DO_VMOVN(vmovnth, true, 2, uint16_t, 4, uint32_t)
117
+
118
+#define DO_VMOVN_SAT(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \
119
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
120
+ { \
121
+ LTYPE *m = vm; \
122
+ TYPE *d = vd; \
123
+ uint16_t mask = mve_element_mask(env); \
124
+ bool qc = false; \
125
+ unsigned le; \
126
+ mask >>= ESIZE * TOP; \
127
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
128
+ bool sat = false; \
129
+ TYPE r = FN(m[H##LESIZE(le)], &sat); \
130
+ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \
131
+ qc |= sat & mask & 1; \
132
+ } \
133
+ if (qc) { \
134
+ env->vfp.qc[0] = qc; \
135
+ } \
136
+ mve_advance_vpt(env); \
137
+ }
138
+
139
+#define DO_VMOVN_SAT_UB(BOP, TOP, FN) \
140
+ DO_VMOVN_SAT(BOP, false, 1, uint8_t, 2, uint16_t, FN) \
141
+ DO_VMOVN_SAT(TOP, true, 1, uint8_t, 2, uint16_t, FN)
142
+
143
+#define DO_VMOVN_SAT_UH(BOP, TOP, FN) \
144
+ DO_VMOVN_SAT(BOP, false, 2, uint16_t, 4, uint32_t, FN) \
145
+ DO_VMOVN_SAT(TOP, true, 2, uint16_t, 4, uint32_t, FN)
146
+
147
+#define DO_VMOVN_SAT_SB(BOP, TOP, FN) \
148
+ DO_VMOVN_SAT(BOP, false, 1, int8_t, 2, int16_t, FN) \
149
+ DO_VMOVN_SAT(TOP, true, 1, int8_t, 2, int16_t, FN)
150
+
151
+#define DO_VMOVN_SAT_SH(BOP, TOP, FN) \
152
+ DO_VMOVN_SAT(BOP, false, 2, int16_t, 4, int32_t, FN) \
153
+ DO_VMOVN_SAT(TOP, true, 2, int16_t, 4, int32_t, FN)
154
+
155
+#define DO_VQMOVN_SB(N, SATP) \
156
+ do_sat_bhs((int64_t)(N), INT8_MIN, INT8_MAX, SATP)
157
+#define DO_VQMOVN_UB(N, SATP) \
158
+ do_sat_bhs((uint64_t)(N), 0, UINT8_MAX, SATP)
159
+#define DO_VQMOVUN_B(N, SATP) \
160
+ do_sat_bhs((int64_t)(N), 0, UINT8_MAX, SATP)
161
+
162
+#define DO_VQMOVN_SH(N, SATP) \
163
+ do_sat_bhs((int64_t)(N), INT16_MIN, INT16_MAX, SATP)
164
+#define DO_VQMOVN_UH(N, SATP) \
165
+ do_sat_bhs((uint64_t)(N), 0, UINT16_MAX, SATP)
166
+#define DO_VQMOVUN_H(N, SATP) \
167
+ do_sat_bhs((int64_t)(N), 0, UINT16_MAX, SATP)
168
+
169
+DO_VMOVN_SAT_SB(vqmovnbsb, vqmovntsb, DO_VQMOVN_SB)
170
+DO_VMOVN_SAT_SH(vqmovnbsh, vqmovntsh, DO_VQMOVN_SH)
171
+DO_VMOVN_SAT_UB(vqmovnbub, vqmovntub, DO_VQMOVN_UB)
172
+DO_VMOVN_SAT_UH(vqmovnbuh, vqmovntuh, DO_VQMOVN_UH)
173
+DO_VMOVN_SAT_SB(vqmovunbb, vqmovuntb, DO_VQMOVUN_B)
174
+DO_VMOVN_SAT_SH(vqmovunbh, vqmovunth, DO_VQMOVUN_H)
175
+
176
uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
177
uint32_t shift)
178
{
59
{
179
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
60
trace_smmu_iotlb_inv_asid(asid);
180
index XXXXXXX..XXXXXXX 100644
61
g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid);
181
--- a/target/arm/translate-mve.c
62
@@ -XXX,XX +XXX,XX @@ error:
182
+++ b/target/arm/translate-mve.c
63
*
183
@@ -XXX,XX +XXX,XX @@ DO_1OP(VCLS, vcls)
64
* return 0 on success
184
DO_1OP(VABS, vabs)
65
*/
185
DO_1OP(VNEG, vneg)
66
-inline int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
186
67
- SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
187
+/* Narrowing moves: only size 0 and 1 are valid */
68
+int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
188
+#define DO_VMOVN(INSN, FN) \
69
+ SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
189
+ static bool trans_##INSN(DisasContext *s, arg_1op *a) \
190
+ { \
191
+ static MVEGenOneOpFn * const fns[] = { \
192
+ gen_helper_mve_##FN##b, \
193
+ gen_helper_mve_##FN##h, \
194
+ NULL, \
195
+ NULL, \
196
+ }; \
197
+ return do_1op(s, a, fns[a->size]); \
198
+ }
199
+
200
+DO_VMOVN(VMOVNB, vmovnb)
201
+DO_VMOVN(VMOVNT, vmovnt)
202
+DO_VMOVN(VQMOVUNB, vqmovunb)
203
+DO_VMOVN(VQMOVUNT, vqmovunt)
204
+DO_VMOVN(VQMOVN_BS, vqmovnbs)
205
+DO_VMOVN(VQMOVN_TS, vqmovnts)
206
+DO_VMOVN(VQMOVN_BU, vqmovnbu)
207
+DO_VMOVN(VQMOVN_TU, vqmovntu)
208
+
209
static bool trans_VREV16(DisasContext *s, arg_1op *a)
210
{
70
{
211
static MVEGenOneOpFn * const fns[] = {
71
if (!cfg->aa64) {
72
/*
212
--
73
--
213
2.20.1
74
2.25.1
214
75
215
76
diff view generated by jsdifflib
1
From: Guenter Roeck <linux@roeck-us.net>
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
2
2
3
Instantiate SAI1/2/3 as unimplemented devices to avoid Linux kernel crashes
3
So far the GPT timers were unable to raise IRQs to the processor.
4
such as the following.
5
4
6
Unhandled fault: external abort on non-linefetch (0x808) at 0xd19b0000
5
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
7
pgd = (ptrval)
8
[d19b0000] *pgd=82711811, *pte=308a0653, *ppte=308a0453
9
Internal error: : 808 [#1] SMP ARM
10
Modules linked in:
11
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc5 #1
12
...
13
[<c095e974>] (regmap_mmio_write32le) from [<c095eb48>] (regmap_mmio_write+0x3c/0x54)
14
[<c095eb48>] (regmap_mmio_write) from [<c09580f4>] (_regmap_write+0x4c/0x1f0)
15
[<c09580f4>] (_regmap_write) from [<c0959b28>] (regmap_write+0x3c/0x60)
16
[<c0959b28>] (regmap_write) from [<c0d41130>] (fsl_sai_runtime_resume+0x9c/0x1ec)
17
[<c0d41130>] (fsl_sai_runtime_resume) from [<c0942464>] (__rpm_callback+0x3c/0x108)
18
[<c0942464>] (__rpm_callback) from [<c0942590>] (rpm_callback+0x60/0x64)
19
[<c0942590>] (rpm_callback) from [<c0942b60>] (rpm_resume+0x5cc/0x808)
20
[<c0942b60>] (rpm_resume) from [<c0942dfc>] (__pm_runtime_resume+0x60/0xa0)
21
[<c0942dfc>] (__pm_runtime_resume) from [<c0d4231c>] (fsl_sai_probe+0x2b8/0x65c)
22
[<c0d4231c>] (fsl_sai_probe) from [<c0935b08>] (platform_probe+0x58/0xb8)
23
[<c0935b08>] (platform_probe) from [<c0933264>] (really_probe.part.0+0x9c/0x334)
24
[<c0933264>] (really_probe.part.0) from [<c093359c>] (__driver_probe_device+0xa0/0x138)
25
[<c093359c>] (__driver_probe_device) from [<c0933664>] (driver_probe_device+0x30/0xc8)
26
[<c0933664>] (driver_probe_device) from [<c0933c88>] (__driver_attach+0x90/0x130)
27
[<c0933c88>] (__driver_attach) from [<c0931060>] (bus_for_each_dev+0x78/0xb8)
28
[<c0931060>] (bus_for_each_dev) from [<c093254c>] (bus_add_driver+0xf0/0x1d8)
29
[<c093254c>] (bus_add_driver) from [<c0934a30>] (driver_register+0x88/0x118)
30
[<c0934a30>] (driver_register) from [<c01022c0>] (do_one_initcall+0x7c/0x3a4)
31
[<c01022c0>] (do_one_initcall) from [<c1601204>] (kernel_init_freeable+0x198/0x22c)
32
[<c1601204>] (kernel_init_freeable) from [<c0f5ff2c>] (kernel_init+0x10/0x128)
33
[<c0f5ff2c>] (kernel_init) from [<c010013c>] (ret_from_fork+0x14/0x38)
34
35
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
36
Message-id: 20210810175607.538090-1-linux@roeck-us.net
37
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
38
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
39
---
8
---
40
include/hw/arm/fsl-imx7.h | 5 +++++
9
include/hw/arm/fsl-imx7.h | 5 +++++
41
hw/arm/fsl-imx7.c | 7 +++++++
10
hw/arm/fsl-imx7.c | 10 ++++++++++
42
2 files changed, 12 insertions(+)
11
2 files changed, 15 insertions(+)
43
12
44
diff --git a/include/hw/arm/fsl-imx7.h b/include/hw/arm/fsl-imx7.h
13
diff --git a/include/hw/arm/fsl-imx7.h b/include/hw/arm/fsl-imx7.h
45
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
46
--- a/include/hw/arm/fsl-imx7.h
15
--- a/include/hw/arm/fsl-imx7.h
47
+++ b/include/hw/arm/fsl-imx7.h
16
+++ b/include/hw/arm/fsl-imx7.h
48
@@ -XXX,XX +XXX,XX @@ enum FslIMX7MemoryMap {
17
@@ -XXX,XX +XXX,XX @@ enum FslIMX7IRQs {
49
FSL_IMX7_UART6_ADDR = 0x30A80000,
18
FSL_IMX7_USB2_IRQ = 42,
50
FSL_IMX7_UART7_ADDR = 0x30A90000,
19
FSL_IMX7_USB3_IRQ = 40,
51
20
52
+ FSL_IMX7_SAI1_ADDR = 0x308A0000,
21
+ FSL_IMX7_GPT1_IRQ = 55,
53
+ FSL_IMX7_SAI2_ADDR = 0x308B0000,
22
+ FSL_IMX7_GPT2_IRQ = 54,
54
+ FSL_IMX7_SAI3_ADDR = 0x308C0000,
23
+ FSL_IMX7_GPT3_IRQ = 53,
55
+ FSL_IMX7_SAIn_SIZE = 0x10000,
24
+ FSL_IMX7_GPT4_IRQ = 52,
56
+
25
+
57
FSL_IMX7_ENET1_ADDR = 0x30BE0000,
26
FSL_IMX7_WDOG1_IRQ = 78,
58
FSL_IMX7_ENET2_ADDR = 0x30BF0000,
27
FSL_IMX7_WDOG2_IRQ = 79,
59
28
FSL_IMX7_WDOG3_IRQ = 10,
60
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
29
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
61
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
62
--- a/hw/arm/fsl-imx7.c
31
--- a/hw/arm/fsl-imx7.c
63
+++ b/hw/arm/fsl-imx7.c
32
+++ b/hw/arm/fsl-imx7.c
64
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
33
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
65
create_unimplemented_device("can1", FSL_IMX7_CAN1_ADDR, FSL_IMX7_CANn_SIZE);
34
FSL_IMX7_GPT4_ADDR,
66
create_unimplemented_device("can2", FSL_IMX7_CAN2_ADDR, FSL_IMX7_CANn_SIZE);
35
};
67
36
68
+ /*
37
+ static const int FSL_IMX7_GPTn_IRQ[FSL_IMX7_NUM_GPTS] = {
69
+ * SAI (Audio SSI (Synchronous Serial Interface))
38
+ FSL_IMX7_GPT1_IRQ,
70
+ */
39
+ FSL_IMX7_GPT2_IRQ,
71
+ create_unimplemented_device("sai1", FSL_IMX7_SAI1_ADDR, FSL_IMX7_SAIn_SIZE);
40
+ FSL_IMX7_GPT3_IRQ,
72
+ create_unimplemented_device("sai2", FSL_IMX7_SAI2_ADDR, FSL_IMX7_SAIn_SIZE);
41
+ FSL_IMX7_GPT4_IRQ,
73
+ create_unimplemented_device("sai2", FSL_IMX7_SAI3_ADDR, FSL_IMX7_SAIn_SIZE);
42
+ };
74
+
43
+
75
/*
44
s->gpt[i].ccm = IMX_CCM(&s->ccm);
76
* OCOTP
45
sysbus_realize(SYS_BUS_DEVICE(&s->gpt[i]), &error_abort);
77
*/
46
sysbus_mmio_map(SYS_BUS_DEVICE(&s->gpt[i]), 0, FSL_IMX7_GPTn_ADDR[i]);
47
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->gpt[i]), 0,
48
+ qdev_get_gpio_in(DEVICE(&s->a7mpcore),
49
+ FSL_IMX7_GPTn_IRQ[i]));
50
}
51
52
for (i = 0; i < FSL_IMX7_NUM_GPIOS; i++) {
78
--
53
--
79
2.20.1
54
2.25.1
80
81
diff view generated by jsdifflib
1
Implement the MVE VMOV forms that move data between 2 general-purpose
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
2
registers and 2 32-bit lanes in a vector register.
3
2
3
CCM derived clocks will have to be added later.
4
5
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
---
8
---
7
target/arm/translate-a32.h | 1 +
9
hw/misc/imx7_ccm.c | 49 +++++++++++++++++++++++++++++++++++++---------
8
target/arm/mve.decode | 4 ++
10
1 file changed, 40 insertions(+), 9 deletions(-)
9
target/arm/translate-mve.c | 85 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate-vfp.c | 2 +-
11
4 files changed, 91 insertions(+), 1 deletion(-)
12
11
13
diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
12
diff --git a/hw/misc/imx7_ccm.c b/hw/misc/imx7_ccm.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate-a32.h
14
--- a/hw/misc/imx7_ccm.c
16
+++ b/target/arm/translate-a32.h
15
+++ b/hw/misc/imx7_ccm.c
17
@@ -XXX,XX +XXX,XX @@ void gen_rev16(TCGv_i32 dest, TCGv_i32 var);
16
@@ -XXX,XX +XXX,XX @@
18
void clear_eci_state(DisasContext *s);
17
#include "hw/misc/imx7_ccm.h"
19
bool mve_eci_check(DisasContext *s);
18
#include "migration/vmstate.h"
20
void mve_update_and_store_eci(DisasContext *s);
19
21
+bool mve_skip_vmov(DisasContext *s, int vn, int index, int size);
20
+#include "trace.h"
22
21
+
23
static inline TCGv_i32 load_cpu_offset(int offset)
22
+#define CKIH_FREQ 24000000 /* 24MHz crystal input */
23
+
24
static void imx7_analog_reset(DeviceState *dev)
24
{
25
{
25
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
26
IMX7AnalogState *s = IMX7_ANALOG(dev);
26
index XXXXXXX..XXXXXXX 100644
27
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx7_ccm = {
27
--- a/target/arm/mve.decode
28
static uint32_t imx7_ccm_get_clock_frequency(IMXCCMState *dev, IMXClk clock)
28
+++ b/target/arm/mve.decode
29
{
29
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \
30
/*
30
VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
31
- * This function is "consumed" by GPT emulation code, however on
31
size=2 p=1
32
- * i.MX7 each GPT block can have their own clock root. This means
32
33
- * that this functions needs somehow to know requester's identity
33
+# Moves between 2 32-bit vector lanes and 2 general purpose registers
34
- * and the way to pass it: be it via additional IMXClk constants
34
+VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
35
- * or by adding another argument to this method needs to be
35
+VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd
36
- * figured out
37
+ * This function is "consumed" by GPT emulation code. Some clocks
38
+ * have fixed frequencies and we can provide requested frequency
39
+ * easily. However for CCM provided clocks (like IPG) each GPT
40
+ * timer can have its own clock root.
41
+ * This means we need additionnal information when calling this
42
+ * function to know the requester's identity.
43
*/
44
- qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Not implemented\n",
45
- TYPE_IMX7_CCM, __func__);
46
- return 0;
47
+ uint32_t freq = 0;
36
+
48
+
37
# Vector 2-op
49
+ switch (clock) {
38
VAND 1110 1111 0 . 00 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
50
+ case CLK_NONE:
39
VBIC 1110 1111 0 . 01 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
51
+ break;
40
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
52
+ case CLK_32k:
41
index XXXXXXX..XXXXXXX 100644
53
+ freq = CKIL_FREQ;
42
--- a/target/arm/translate-mve.c
54
+ break;
43
+++ b/target/arm/translate-mve.c
55
+ case CLK_HIGH:
44
@@ -XXX,XX +XXX,XX @@ static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn)
56
+ freq = CKIH_FREQ;
45
57
+ break;
46
DO_VABAV(VABAV_S, vabavs)
58
+ case CLK_IPG:
47
DO_VABAV(VABAV_U, vabavu)
59
+ case CLK_IPG_HIGH:
48
+
60
+ /*
49
+static bool trans_VMOV_to_2gp(DisasContext *s, arg_VMOV_to_2gp *a)
61
+ * For now we don't have a way to figure out the device this
50
+{
62
+ * function is called for. Until then the IPG derived clocks
51
+ /*
63
+ * are left unimplemented.
52
+ * VMOV two 32-bit vector lanes to two general-purpose registers.
64
+ */
53
+ * This insn is not predicated but it is subject to beat-wise
65
+ qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Clock %d Not implemented\n",
54
+ * execution if it is not in an IT block. For us this means
66
+ TYPE_IMX7_CCM, __func__, clock);
55
+ * only that if PSR.ECI says we should not be executing the beat
67
+ break;
56
+ * corresponding to the lane of the vector register being accessed
68
+ default:
57
+ * then we should skip perfoming the move, and that we need to do
69
+ qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: unsupported clock %d\n",
58
+ * the usual check for bad ECI state and advance of ECI state.
70
+ TYPE_IMX7_CCM, __func__, clock);
59
+ * (If PSR.ECI is non-zero then we cannot be in an IT block.)
71
+ break;
60
+ */
61
+ TCGv_i32 tmp;
62
+ int vd;
63
+
64
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd) ||
65
+ a->rt == 13 || a->rt == 15 || a->rt2 == 13 || a->rt2 == 15 ||
66
+ a->rt == a->rt2) {
67
+ /* Rt/Rt2 cases are UNPREDICTABLE */
68
+ return false;
69
+ }
70
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
71
+ return true;
72
+ }
72
+ }
73
+
73
+
74
+ /* Convert Qreg index to Dreg for read_neon_element32() etc */
74
+ trace_ccm_clock_freq(clock, freq);
75
+ vd = a->qd * 2;
76
+
75
+
77
+ if (!mve_skip_vmov(s, vd, a->idx, MO_32)) {
76
+ return freq;
78
+ tmp = tcg_temp_new_i32();
79
+ read_neon_element32(tmp, vd, a->idx, MO_32);
80
+ store_reg(s, a->rt, tmp);
81
+ }
82
+ if (!mve_skip_vmov(s, vd + 1, a->idx, MO_32)) {
83
+ tmp = tcg_temp_new_i32();
84
+ read_neon_element32(tmp, vd + 1, a->idx, MO_32);
85
+ store_reg(s, a->rt2, tmp);
86
+ }
87
+
88
+ mve_update_and_store_eci(s);
89
+ return true;
90
+}
91
+
92
+static bool trans_VMOV_from_2gp(DisasContext *s, arg_VMOV_to_2gp *a)
93
+{
94
+ /*
95
+ * VMOV two general-purpose registers to two 32-bit vector lanes.
96
+ * This insn is not predicated but it is subject to beat-wise
97
+ * execution if it is not in an IT block. For us this means
98
+ * only that if PSR.ECI says we should not be executing the beat
99
+ * corresponding to the lane of the vector register being accessed
100
+ * then we should skip perfoming the move, and that we need to do
101
+ * the usual check for bad ECI state and advance of ECI state.
102
+ * (If PSR.ECI is non-zero then we cannot be in an IT block.)
103
+ */
104
+ TCGv_i32 tmp;
105
+ int vd;
106
+
107
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd) ||
108
+ a->rt == 13 || a->rt == 15 || a->rt2 == 13 || a->rt2 == 15) {
109
+ /* Rt/Rt2 cases are UNPREDICTABLE */
110
+ return false;
111
+ }
112
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
113
+ return true;
114
+ }
115
+
116
+ /* Convert Qreg idx to Dreg for read_neon_element32() etc */
117
+ vd = a->qd * 2;
118
+
119
+ if (!mve_skip_vmov(s, vd, a->idx, MO_32)) {
120
+ tmp = load_reg(s, a->rt);
121
+ write_neon_element32(tmp, vd, a->idx, MO_32);
122
+ tcg_temp_free_i32(tmp);
123
+ }
124
+ if (!mve_skip_vmov(s, vd + 1, a->idx, MO_32)) {
125
+ tmp = load_reg(s, a->rt2);
126
+ write_neon_element32(tmp, vd + 1, a->idx, MO_32);
127
+ tcg_temp_free_i32(tmp);
128
+ }
129
+
130
+ mve_update_and_store_eci(s);
131
+ return true;
132
+}
133
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/target/arm/translate-vfp.c
136
+++ b/target/arm/translate-vfp.c
137
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
138
return true;
139
}
77
}
140
78
141
-static bool mve_skip_vmov(DisasContext *s, int vn, int index, int size)
79
static void imx7_ccm_class_init(ObjectClass *klass, void *data)
142
+bool mve_skip_vmov(DisasContext *s, int vn, int index, int size)
143
{
144
/*
145
* In a CPU with MVE, the VMOV (vector lane to general-purpose register)
146
--
80
--
147
2.20.1
81
2.25.1
148
149
diff view generated by jsdifflib
1
From: Guenter Roeck <linux@roeck-us.net>
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
2
2
3
Instantiate SAI1/2/3 and ASRC as unimplemented devices to avoid random
3
The i.MX6UL doesn't support CLK_HIGH ou CLK_HIGH_DIV clock source.
4
Linux kernel crashes, such as
5
4
6
Unhandled fault: external abort on non-linefetch (0x808) at 0xd1580010
5
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
7
pgd = (ptrval)
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
[d1580010] *pgd=8231b811, *pte=02034653, *ppte=02034453
9
Internal error: : 808 [#1] SMP ARM
10
...
11
[<c095e974>] (regmap_mmio_write32le) from [<c095eb48>] (regmap_mmio_write+0x3c/0x54)
12
[<c095eb48>] (regmap_mmio_write) from [<c09580f4>] (_regmap_write+0x4c/0x1f0)
13
[<c09580f4>] (_regmap_write) from [<c095837c>] (_regmap_update_bits+0xe4/0xec)
14
[<c095837c>] (_regmap_update_bits) from [<c09599b4>] (regmap_update_bits_base+0x50/0x74)
15
[<c09599b4>] (regmap_update_bits_base) from [<c0d3e9e4>] (fsl_asrc_runtime_resume+0x1e4/0x21c)
16
[<c0d3e9e4>] (fsl_asrc_runtime_resume) from [<c0942464>] (__rpm_callback+0x3c/0x108)
17
[<c0942464>] (__rpm_callback) from [<c0942590>] (rpm_callback+0x60/0x64)
18
[<c0942590>] (rpm_callback) from [<c0942b60>] (rpm_resume+0x5cc/0x808)
19
[<c0942b60>] (rpm_resume) from [<c0942dfc>] (__pm_runtime_resume+0x60/0xa0)
20
[<c0942dfc>] (__pm_runtime_resume) from [<c0d3ecc4>] (fsl_asrc_probe+0x2a8/0x708)
21
[<c0d3ecc4>] (fsl_asrc_probe) from [<c0935b08>] (platform_probe+0x58/0xb8)
22
[<c0935b08>] (platform_probe) from [<c0933264>] (really_probe.part.0+0x9c/0x334)
23
[<c0933264>] (really_probe.part.0) from [<c093359c>] (__driver_probe_device+0xa0/0x138)
24
[<c093359c>] (__driver_probe_device) from [<c0933664>] (driver_probe_device+0x30/0xc8)
25
[<c0933664>] (driver_probe_device) from [<c0933c88>] (__driver_attach+0x90/0x130)
26
[<c0933c88>] (__driver_attach) from [<c0931060>] (bus_for_each_dev+0x78/0xb8)
27
[<c0931060>] (bus_for_each_dev) from [<c093254c>] (bus_add_driver+0xf0/0x1d8)
28
[<c093254c>] (bus_add_driver) from [<c0934a30>] (driver_register+0x88/0x118)
29
[<c0934a30>] (driver_register) from [<c01022c0>] (do_one_initcall+0x7c/0x3a4)
30
[<c01022c0>] (do_one_initcall) from [<c1601204>] (kernel_init_freeable+0x198/0x22c)
31
[<c1601204>] (kernel_init_freeable) from [<c0f5ff2c>] (kernel_init+0x10/0x128)
32
[<c0f5ff2c>] (kernel_init) from [<c010013c>] (ret_from_fork+0x14/0x38)
33
34
or
35
36
Unhandled fault: external abort on non-linefetch (0x808) at 0xd19b0000
37
pgd = (ptrval)
38
[d19b0000] *pgd=82711811, *pte=308a0653, *ppte=308a0453
39
Internal error: : 808 [#1] SMP ARM
40
...
41
[<c095e974>] (regmap_mmio_write32le) from [<c095eb48>] (regmap_mmio_write+0x3c/0x54)
42
[<c095eb48>] (regmap_mmio_write) from [<c09580f4>] (_regmap_write+0x4c/0x1f0)
43
[<c09580f4>] (_regmap_write) from [<c0959b28>] (regmap_write+0x3c/0x60)
44
[<c0959b28>] (regmap_write) from [<c0d41130>] (fsl_sai_runtime_resume+0x9c/0x1ec)
45
[<c0d41130>] (fsl_sai_runtime_resume) from [<c0942464>] (__rpm_callback+0x3c/0x108)
46
[<c0942464>] (__rpm_callback) from [<c0942590>] (rpm_callback+0x60/0x64)
47
[<c0942590>] (rpm_callback) from [<c0942b60>] (rpm_resume+0x5cc/0x808)
48
[<c0942b60>] (rpm_resume) from [<c0942dfc>] (__pm_runtime_resume+0x60/0xa0)
49
[<c0942dfc>] (__pm_runtime_resume) from [<c0d4231c>] (fsl_sai_probe+0x2b8/0x65c)
50
[<c0d4231c>] (fsl_sai_probe) from [<c0935b08>] (platform_probe+0x58/0xb8)
51
[<c0935b08>] (platform_probe) from [<c0933264>] (really_probe.part.0+0x9c/0x334)
52
[<c0933264>] (really_probe.part.0) from [<c093359c>] (__driver_probe_device+0xa0/0x138)
53
[<c093359c>] (__driver_probe_device) from [<c0933664>] (driver_probe_device+0x30/0xc8)
54
[<c0933664>] (driver_probe_device) from [<c0933c88>] (__driver_attach+0x90/0x130)
55
[<c0933c88>] (__driver_attach) from [<c0931060>] (bus_for_each_dev+0x78/0xb8)
56
[<c0931060>] (bus_for_each_dev) from [<c093254c>] (bus_add_driver+0xf0/0x1d8)
57
[<c093254c>] (bus_add_driver) from [<c0934a30>] (driver_register+0x88/0x118)
58
[<c0934a30>] (driver_register) from [<c01022c0>] (do_one_initcall+0x7c/0x3a4)
59
[<c01022c0>] (do_one_initcall) from [<c1601204>] (kernel_init_freeable+0x198/0x22c)
60
[<c1601204>] (kernel_init_freeable) from [<c0f5ff2c>] (kernel_init+0x10/0x128)
61
[<c0f5ff2c>] (kernel_init) from [<c010013c>] (ret_from_fork+0x14/0x38)
62
63
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
64
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
65
Message-id: 20210810160318.87376-1-linux@roeck-us.net
66
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
67
---
8
---
68
hw/arm/fsl-imx6ul.c | 12 ++++++++++++
9
include/hw/timer/imx_gpt.h | 1 +
69
1 file changed, 12 insertions(+)
10
hw/arm/fsl-imx6ul.c | 2 +-
11
hw/misc/imx6ul_ccm.c | 6 ------
12
hw/timer/imx_gpt.c | 25 +++++++++++++++++++++++++
13
4 files changed, 27 insertions(+), 7 deletions(-)
70
14
15
diff --git a/include/hw/timer/imx_gpt.h b/include/hw/timer/imx_gpt.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/include/hw/timer/imx_gpt.h
18
+++ b/include/hw/timer/imx_gpt.h
19
@@ -XXX,XX +XXX,XX @@
20
#define TYPE_IMX25_GPT "imx25.gpt"
21
#define TYPE_IMX31_GPT "imx31.gpt"
22
#define TYPE_IMX6_GPT "imx6.gpt"
23
+#define TYPE_IMX6UL_GPT "imx6ul.gpt"
24
#define TYPE_IMX7_GPT "imx7.gpt"
25
26
#define TYPE_IMX_GPT TYPE_IMX25_GPT
71
diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
27
diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
72
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
73
--- a/hw/arm/fsl-imx6ul.c
29
--- a/hw/arm/fsl-imx6ul.c
74
+++ b/hw/arm/fsl-imx6ul.c
30
+++ b/hw/arm/fsl-imx6ul.c
75
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
31
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6ul_init(Object *obj)
76
*/
32
*/
77
create_unimplemented_device("sdma", FSL_IMX6UL_SDMA_ADDR, 0x4000);
33
for (i = 0; i < FSL_IMX6UL_NUM_GPTS; i++) {
78
34
snprintf(name, NAME_SIZE, "gpt%d", i);
79
+ /*
35
- object_initialize_child(obj, name, &s->gpt[i], TYPE_IMX7_GPT);
80
+ * SAI (Audio SSI (Synchronous Serial Interface))
36
+ object_initialize_child(obj, name, &s->gpt[i], TYPE_IMX6UL_GPT);
81
+ */
37
}
82
+ create_unimplemented_device("sai1", FSL_IMX6UL_SAI1_ADDR, 0x4000);
38
83
+ create_unimplemented_device("sai2", FSL_IMX6UL_SAI2_ADDR, 0x4000);
39
/*
84
+ create_unimplemented_device("sai3", FSL_IMX6UL_SAI3_ADDR, 0x4000);
40
diff --git a/hw/misc/imx6ul_ccm.c b/hw/misc/imx6ul_ccm.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/hw/misc/imx6ul_ccm.c
43
+++ b/hw/misc/imx6ul_ccm.c
44
@@ -XXX,XX +XXX,XX @@ static uint32_t imx6ul_ccm_get_clock_frequency(IMXCCMState *dev, IMXClk clock)
45
case CLK_32k:
46
freq = CKIL_FREQ;
47
break;
48
- case CLK_HIGH:
49
- freq = CKIH_FREQ;
50
- break;
51
- case CLK_HIGH_DIV:
52
- freq = CKIH_FREQ / 8;
53
- break;
54
default:
55
qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: unsupported clock %d\n",
56
TYPE_IMX6UL_CCM, __func__, clock);
57
diff --git a/hw/timer/imx_gpt.c b/hw/timer/imx_gpt.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/hw/timer/imx_gpt.c
60
+++ b/hw/timer/imx_gpt.c
61
@@ -XXX,XX +XXX,XX @@ static const IMXClk imx6_gpt_clocks[] = {
62
CLK_HIGH, /* 111 reference clock */
63
};
64
65
+static const IMXClk imx6ul_gpt_clocks[] = {
66
+ CLK_NONE, /* 000 No clock source */
67
+ CLK_IPG, /* 001 ipg_clk, 532MHz*/
68
+ CLK_IPG_HIGH, /* 010 ipg_clk_highfreq */
69
+ CLK_EXT, /* 011 External clock */
70
+ CLK_32k, /* 100 ipg_clk_32k */
71
+ CLK_NONE, /* 101 not defined */
72
+ CLK_NONE, /* 110 not defined */
73
+ CLK_NONE, /* 111 not defined */
74
+};
85
+
75
+
86
/*
76
static const IMXClk imx7_gpt_clocks[] = {
87
* PWM
77
CLK_NONE, /* 000 No clock source */
88
*/
78
CLK_IPG, /* 001 ipg_clk, 532MHz*/
89
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
79
@@ -XXX,XX +XXX,XX @@ static void imx6_gpt_init(Object *obj)
90
create_unimplemented_device("pwm3", FSL_IMX6UL_PWM3_ADDR, 0x4000);
80
s->clocks = imx6_gpt_clocks;
91
create_unimplemented_device("pwm4", FSL_IMX6UL_PWM4_ADDR, 0x4000);
81
}
92
82
93
+ /*
83
+static void imx6ul_gpt_init(Object *obj)
94
+ * Audio ASRC (asynchronous sample rate converter)
84
+{
95
+ */
85
+ IMXGPTState *s = IMX_GPT(obj);
96
+ create_unimplemented_device("asrc", FSL_IMX6UL_ASRC_ADDR, 0x4000);
97
+
86
+
98
/*
87
+ s->clocks = imx6ul_gpt_clocks;
99
* CAN
88
+}
100
*/
89
+
90
static void imx7_gpt_init(Object *obj)
91
{
92
IMXGPTState *s = IMX_GPT(obj);
93
@@ -XXX,XX +XXX,XX @@ static const TypeInfo imx6_gpt_info = {
94
.instance_init = imx6_gpt_init,
95
};
96
97
+static const TypeInfo imx6ul_gpt_info = {
98
+ .name = TYPE_IMX6UL_GPT,
99
+ .parent = TYPE_IMX25_GPT,
100
+ .instance_init = imx6ul_gpt_init,
101
+};
102
+
103
static const TypeInfo imx7_gpt_info = {
104
.name = TYPE_IMX7_GPT,
105
.parent = TYPE_IMX25_GPT,
106
@@ -XXX,XX +XXX,XX @@ static void imx_gpt_register_types(void)
107
type_register_static(&imx25_gpt_info);
108
type_register_static(&imx31_gpt_info);
109
type_register_static(&imx6_gpt_info);
110
+ type_register_static(&imx6ul_gpt_info);
111
type_register_static(&imx7_gpt_info);
112
}
113
101
--
114
--
102
2.20.1
115
2.25.1
103
104
diff view generated by jsdifflib
1
Implement the MVE VMLAS insn, which multiplies a vector by a vector
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
2
and adds a scalar.
3
2
3
IRQs were not associated to the various GPIO devices inside i.MX7D.
4
This patch brings the i.MX7D on par with i.MX6.
5
6
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
7
Message-id: 20221226101418.415170-1-jcd@tribudubois.net
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
---
10
---
7
target/arm/helper-mve.h | 4 ++++
11
include/hw/arm/fsl-imx7.h | 15 +++++++++++++++
8
target/arm/mve.decode | 3 +++
12
hw/arm/fsl-imx7.c | 31 ++++++++++++++++++++++++++++++-
9
target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++
13
2 files changed, 45 insertions(+), 1 deletion(-)
10
target/arm/translate-mve.c | 1 +
11
4 files changed, 34 insertions(+)
12
14
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
15
diff --git a/include/hw/arm/fsl-imx7.h b/include/hw/arm/fsl-imx7.h
14
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
17
--- a/include/hw/arm/fsl-imx7.h
16
+++ b/target/arm/helper-mve.h
18
+++ b/include/hw/arm/fsl-imx7.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i3
19
@@ -XXX,XX +XXX,XX @@ enum FslIMX7IRQs {
18
DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
FSL_IMX7_GPT3_IRQ = 53,
19
DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
FSL_IMX7_GPT4_IRQ = 52,
20
22
21
+DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
+ FSL_IMX7_GPIO1_LOW_IRQ = 64,
22
+DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
+ FSL_IMX7_GPIO1_HIGH_IRQ = 65,
23
+DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
+ FSL_IMX7_GPIO2_LOW_IRQ = 66,
26
+ FSL_IMX7_GPIO2_HIGH_IRQ = 67,
27
+ FSL_IMX7_GPIO3_LOW_IRQ = 68,
28
+ FSL_IMX7_GPIO3_HIGH_IRQ = 69,
29
+ FSL_IMX7_GPIO4_LOW_IRQ = 70,
30
+ FSL_IMX7_GPIO4_HIGH_IRQ = 71,
31
+ FSL_IMX7_GPIO5_LOW_IRQ = 72,
32
+ FSL_IMX7_GPIO5_HIGH_IRQ = 73,
33
+ FSL_IMX7_GPIO6_LOW_IRQ = 74,
34
+ FSL_IMX7_GPIO6_HIGH_IRQ = 75,
35
+ FSL_IMX7_GPIO7_LOW_IRQ = 76,
36
+ FSL_IMX7_GPIO7_HIGH_IRQ = 77,
24
+
37
+
25
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
38
FSL_IMX7_WDOG1_IRQ = 78,
26
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
39
FSL_IMX7_WDOG2_IRQ = 79,
27
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
40
FSL_IMX7_WDOG3_IRQ = 10,
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
41
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
29
index XXXXXXX..XXXXXXX 100644
42
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
43
--- a/hw/arm/fsl-imx7.c
31
+++ b/target/arm/mve.decode
44
+++ b/hw/arm/fsl-imx7.c
32
@@ -XXX,XX +XXX,XX @@ VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
45
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
33
VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
46
FSL_IMX7_GPIO7_ADDR,
34
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
47
};
35
48
36
+# The U bit (28) is don't-care because it does not affect the result
49
+ static const int FSL_IMX7_GPIOn_LOW_IRQ[FSL_IMX7_NUM_GPIOS] = {
37
+VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar
50
+ FSL_IMX7_GPIO1_LOW_IRQ,
51
+ FSL_IMX7_GPIO2_LOW_IRQ,
52
+ FSL_IMX7_GPIO3_LOW_IRQ,
53
+ FSL_IMX7_GPIO4_LOW_IRQ,
54
+ FSL_IMX7_GPIO5_LOW_IRQ,
55
+ FSL_IMX7_GPIO6_LOW_IRQ,
56
+ FSL_IMX7_GPIO7_LOW_IRQ,
57
+ };
38
+
58
+
39
# Vector add across vector
59
+ static const int FSL_IMX7_GPIOn_HIGH_IRQ[FSL_IMX7_NUM_GPIOS] = {
40
{
60
+ FSL_IMX7_GPIO1_HIGH_IRQ,
41
VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
61
+ FSL_IMX7_GPIO2_HIGH_IRQ,
42
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
62
+ FSL_IMX7_GPIO3_HIGH_IRQ,
43
index XXXXXXX..XXXXXXX 100644
63
+ FSL_IMX7_GPIO4_HIGH_IRQ,
44
--- a/target/arm/mve_helper.c
64
+ FSL_IMX7_GPIO5_HIGH_IRQ,
45
+++ b/target/arm/mve_helper.c
65
+ FSL_IMX7_GPIO6_HIGH_IRQ,
46
@@ -XXX,XX +XXX,XX @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w)
66
+ FSL_IMX7_GPIO7_HIGH_IRQ,
47
mve_advance_vpt(env); \
67
+ };
68
+
69
sysbus_realize(SYS_BUS_DEVICE(&s->gpio[i]), &error_abort);
70
- sysbus_mmio_map(SYS_BUS_DEVICE(&s->gpio[i]), 0, FSL_IMX7_GPIOn_ADDR[i]);
71
+ sysbus_mmio_map(SYS_BUS_DEVICE(&s->gpio[i]), 0,
72
+ FSL_IMX7_GPIOn_ADDR[i]);
73
+
74
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->gpio[i]), 0,
75
+ qdev_get_gpio_in(DEVICE(&s->a7mpcore),
76
+ FSL_IMX7_GPIOn_LOW_IRQ[i]));
77
+
78
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->gpio[i]), 1,
79
+ qdev_get_gpio_in(DEVICE(&s->a7mpcore),
80
+ FSL_IMX7_GPIOn_HIGH_IRQ[i]));
48
}
81
}
49
82
50
+/* "accumulating" version where FN takes d as well as n and m */
83
/*
51
+#define DO_2OP_ACC_SCALAR(OP, ESIZE, TYPE, FN) \
52
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
53
+ uint32_t rm) \
54
+ { \
55
+ TYPE *d = vd, *n = vn; \
56
+ TYPE m = rm; \
57
+ uint16_t mask = mve_element_mask(env); \
58
+ unsigned e; \
59
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
60
+ mergemask(&d[H##ESIZE(e)], \
61
+ FN(d[H##ESIZE(e)], n[H##ESIZE(e)], m), mask); \
62
+ } \
63
+ mve_advance_vpt(env); \
64
+ }
65
+
66
/* provide unsigned 2-op scalar helpers for all sizes */
67
#define DO_2OP_SCALAR_U(OP, FN) \
68
DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \
69
@@ -XXX,XX +XXX,XX @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w)
70
DO_2OP_SCALAR(OP##h, 2, int16_t, FN) \
71
DO_2OP_SCALAR(OP##w, 4, int32_t, FN)
72
73
+#define DO_2OP_ACC_SCALAR_U(OP, FN) \
74
+ DO_2OP_ACC_SCALAR(OP##b, 1, uint8_t, FN) \
75
+ DO_2OP_ACC_SCALAR(OP##h, 2, uint16_t, FN) \
76
+ DO_2OP_ACC_SCALAR(OP##w, 4, uint32_t, FN)
77
+
78
DO_2OP_SCALAR_U(vadd_scalar, DO_ADD)
79
DO_2OP_SCALAR_U(vsub_scalar, DO_SUB)
80
DO_2OP_SCALAR_U(vmul_scalar, DO_MUL)
81
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B)
82
DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H)
83
DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W)
84
85
+/* Vector by vector plus scalar */
86
+#define DO_VMLAS(D, N, M) ((N) * (D) + (M))
87
+
88
+DO_2OP_ACC_SCALAR_U(vmlas, DO_VMLAS)
89
+
90
/*
91
* Long saturating scalar ops. As with DO_2OP_L, TYPE and H are for the
92
* input (smaller) type and LESIZE, LTYPE, LH for the output (long) type.
93
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/translate-mve.c
96
+++ b/target/arm/translate-mve.c
97
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar)
98
DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar)
99
DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar)
100
DO_2OP_SCALAR(VBRSR, vbrsr)
101
+DO_2OP_SCALAR(VMLAS, vmlas)
102
103
static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a)
104
{
105
--
84
--
106
2.20.1
85
2.25.1
107
108
diff view generated by jsdifflib
1
Implement the MVE integer vector comparison instructions that compare
1
From: Stephen Longfield <slongfield@google.com>
2
each element against a scalar from a general purpose register. These
3
are "VCMP (vector)" encodings T4, T5 and T6 and "VPT (vector)"
4
encodings T4, T5 and T6.
5
2
6
We have to move the decodetree pattern for VPST, because it
3
Size is used at lines 1088/1188 for the loop, which reads the last 4
7
overlaps with VCMP T4 with size = 0b11.
4
bytes from the crc_ptr so it does need to get increased, however it
5
shouldn't be increased before the buffer is passed to CRC computation,
6
or the crc32 function will access uninitialized memory.
8
7
8
This was pointed out to me by clg@kaod.org during the code review of
9
a similar patch to hw/net/ftgmac100.c
10
11
Change-Id: Ib0464303b191af1e28abeb2f5105eb25aadb5e9b
12
Signed-off-by: Stephen Longfield <slongfield@google.com>
13
Reviewed-by: Patrick Venture <venture@google.com>
14
Message-id: 20221221183202.3788132-1-slongfield@google.com
15
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
---
17
---
12
target/arm/helper-mve.h | 32 +++++++++++++++++++++++++++
18
hw/net/imx_fec.c | 8 ++++----
13
target/arm/mve.decode | 18 +++++++++++++---
19
1 file changed, 4 insertions(+), 4 deletions(-)
14
target/arm/mve_helper.c | 44 +++++++++++++++++++++++++++++++-------
15
target/arm/translate-mve.c | 43 +++++++++++++++++++++++++++++++++++++
16
4 files changed, 126 insertions(+), 11 deletions(-)
17
20
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
21
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
19
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-mve.h
23
--- a/hw/net/imx_fec.c
21
+++ b/target/arm/helper-mve.h
24
+++ b/hw/net/imx_fec.c
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vcmpgtw, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
23
DEF_HELPER_FLAGS_3(mve_vcmpleb, TCG_CALL_NO_WG, void, env, ptr, ptr)
26
return 0;
24
DEF_HELPER_FLAGS_3(mve_vcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
DEF_HELPER_FLAGS_3(mve_vcmplew, TCG_CALL_NO_WG, void, env, ptr, ptr)
26
+
27
+DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
28
+DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
29
+DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
30
+
31
+DEF_HELPER_FLAGS_3(mve_vcmpne_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
32
+DEF_HELPER_FLAGS_3(mve_vcmpne_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
33
+DEF_HELPER_FLAGS_3(mve_vcmpne_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
34
+
35
+DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
36
+DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
37
+DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
38
+
39
+DEF_HELPER_FLAGS_3(mve_vcmphi_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
40
+DEF_HELPER_FLAGS_3(mve_vcmphi_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
41
+DEF_HELPER_FLAGS_3(mve_vcmphi_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
42
+
43
+DEF_HELPER_FLAGS_3(mve_vcmpge_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
44
+DEF_HELPER_FLAGS_3(mve_vcmpge_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
45
+DEF_HELPER_FLAGS_3(mve_vcmpge_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
46
+
47
+DEF_HELPER_FLAGS_3(mve_vcmplt_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
48
+DEF_HELPER_FLAGS_3(mve_vcmplt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
49
+DEF_HELPER_FLAGS_3(mve_vcmplt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
50
+
51
+DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
52
+DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
53
+DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
54
+
55
+DEF_HELPER_FLAGS_3(mve_vcmple_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
56
+DEF_HELPER_FLAGS_3(mve_vcmple_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
57
+DEF_HELPER_FLAGS_3(mve_vcmple_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
58
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/mve.decode
61
+++ b/target/arm/mve.decode
62
@@ -XXX,XX +XXX,XX @@
63
&vidup qd rn size imm
64
&viwdup qd rn rm size imm
65
&vcmp qm qn size mask
66
+&vcmp_scalar qn rm size mask
67
68
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
69
# Note that both Rn and Qd are 3 bits only (no D bit)
70
@@ -XXX,XX +XXX,XX @@
71
# Vector comparison; 4-bit Qm but 3-bit Qn
72
%mask_22_13 22:1 13:3
73
@vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13
74
+@vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \
75
+ mask=%mask_22_13
76
77
# Vector loads and stores
78
79
@@ -XXX,XX +XXX,XX @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
80
rdahi=%rdahi rdalo=%rdalo
81
}
82
83
-# Predicate operations
84
-VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
85
-
86
# Logical immediate operations (1 reg and modified-immediate)
87
88
# The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but
89
@@ -XXX,XX +XXX,XX @@ VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp
90
VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp
91
VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp
92
VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp
93
+
94
+{
95
+ VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
96
+ VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar
97
+}
98
+VCMPNE_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 0 0 .... @vcmp_scalar
99
+VCMPCS_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 1 0 .... @vcmp_scalar
100
+VCMPHI_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 1 0 .... @vcmp_scalar
101
+VCMPGE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 0 0 .... @vcmp_scalar
102
+VCMPLT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 0 0 .... @vcmp_scalar
103
+VCMPGT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 1 0 .... @vcmp_scalar
104
+VCMPLE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 1 0 .... @vcmp_scalar
105
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
106
index XXXXXXX..XXXXXXX 100644
107
--- a/target/arm/mve_helper.c
108
+++ b/target/arm/mve_helper.c
109
@@ -XXX,XX +XXX,XX @@ DO_VIWDUP_ALL(vdwdup, do_sub_wrap)
110
mve_advance_vpt(env); \
111
}
27
}
112
28
113
-#define DO_VCMP_S(OP, FN) \
29
- /* 4 bytes for the CRC. */
114
- DO_VCMP(OP##b, 1, int8_t, FN) \
30
- size += 4;
115
- DO_VCMP(OP##h, 2, int16_t, FN) \
31
crc = cpu_to_be32(crc32(~0, buf, size));
116
- DO_VCMP(OP##w, 4, int32_t, FN)
32
+ /* Increase size by 4, loop below reads the last 4 bytes from crc_ptr. */
117
+#define DO_VCMP_SCALAR(OP, ESIZE, TYPE, FN) \
33
+ size += 4;
118
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
34
crc_ptr = (uint8_t *) &crc;
119
+ uint32_t rm) \
35
120
+ { \
36
/* Huge frames are truncated. */
121
+ TYPE *n = vn; \
37
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
122
+ uint16_t mask = mve_element_mask(env); \
38
return 0;
123
+ uint16_t eci_mask = mve_eci_mask(env); \
124
+ uint16_t beatpred = 0; \
125
+ uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \
126
+ unsigned e; \
127
+ for (e = 0; e < 16 / ESIZE; e++) { \
128
+ bool r = FN(n[H##ESIZE(e)], (TYPE)rm); \
129
+ /* Comparison sets 0/1 bits for each byte in the element */ \
130
+ beatpred |= r * emask; \
131
+ emask <<= ESIZE; \
132
+ } \
133
+ beatpred &= mask; \
134
+ env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \
135
+ (beatpred & eci_mask); \
136
+ mve_advance_vpt(env); \
137
+ }
138
139
-#define DO_VCMP_U(OP, FN) \
140
- DO_VCMP(OP##b, 1, uint8_t, FN) \
141
- DO_VCMP(OP##h, 2, uint16_t, FN) \
142
- DO_VCMP(OP##w, 4, uint32_t, FN)
143
+#define DO_VCMP_S(OP, FN) \
144
+ DO_VCMP(OP##b, 1, int8_t, FN) \
145
+ DO_VCMP(OP##h, 2, int16_t, FN) \
146
+ DO_VCMP(OP##w, 4, int32_t, FN) \
147
+ DO_VCMP_SCALAR(OP##_scalarb, 1, int8_t, FN) \
148
+ DO_VCMP_SCALAR(OP##_scalarh, 2, int16_t, FN) \
149
+ DO_VCMP_SCALAR(OP##_scalarw, 4, int32_t, FN)
150
+
151
+#define DO_VCMP_U(OP, FN) \
152
+ DO_VCMP(OP##b, 1, uint8_t, FN) \
153
+ DO_VCMP(OP##h, 2, uint16_t, FN) \
154
+ DO_VCMP(OP##w, 4, uint32_t, FN) \
155
+ DO_VCMP_SCALAR(OP##_scalarb, 1, uint8_t, FN) \
156
+ DO_VCMP_SCALAR(OP##_scalarh, 2, uint16_t, FN) \
157
+ DO_VCMP_SCALAR(OP##_scalarw, 4, uint32_t, FN)
158
159
#define DO_EQ(N, M) ((N) == (M))
160
#define DO_NE(N, M) ((N) != (M))
161
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
162
index XXXXXXX..XXXXXXX 100644
163
--- a/target/arm/translate-mve.c
164
+++ b/target/arm/translate-mve.c
165
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
166
typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32);
167
typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32);
168
typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
169
+typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
170
171
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
172
static inline long mve_qreg_offset(unsigned reg)
173
@@ -XXX,XX +XXX,XX @@ static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn)
174
return true;
175
}
176
177
+static bool do_vcmp_scalar(DisasContext *s, arg_vcmp_scalar *a,
178
+ MVEGenScalarCmpFn *fn)
179
+{
180
+ TCGv_ptr qn;
181
+ TCGv_i32 rm;
182
+
183
+ if (!dc_isar_feature(aa32_mve, s) || !fn || a->rm == 13) {
184
+ return false;
185
+ }
186
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
187
+ return true;
188
+ }
189
+
190
+ qn = mve_qreg_ptr(a->qn);
191
+ if (a->rm == 15) {
192
+ /* Encoding Rm=0b1111 means "constant zero" */
193
+ rm = tcg_constant_i32(0);
194
+ } else {
195
+ rm = load_reg(s, a->rm);
196
+ }
197
+ fn(cpu_env, qn, rm);
198
+ tcg_temp_free_ptr(qn);
199
+ tcg_temp_free_i32(rm);
200
+ if (a->mask) {
201
+ /* VPT */
202
+ gen_vpst(s, a->mask);
203
+ }
204
+ mve_update_eci(s);
205
+ return true;
206
+}
207
+
208
#define DO_VCMP(INSN, FN) \
209
static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \
210
{ \
211
@@ -XXX,XX +XXX,XX @@ static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn)
212
NULL, \
213
}; \
214
return do_vcmp(s, a, fns[a->size]); \
215
+ } \
216
+ static bool trans_##INSN##_scalar(DisasContext *s, \
217
+ arg_vcmp_scalar *a) \
218
+ { \
219
+ static MVEGenScalarCmpFn * const fns[] = { \
220
+ gen_helper_mve_##FN##_scalarb, \
221
+ gen_helper_mve_##FN##_scalarh, \
222
+ gen_helper_mve_##FN##_scalarw, \
223
+ NULL, \
224
+ }; \
225
+ return do_vcmp_scalar(s, a, fns[a->size]); \
226
}
39
}
227
40
228
DO_VCMP(VCMPEQ, vcmpeq)
41
- /* 4 bytes for the CRC. */
42
- size += 4;
43
crc = cpu_to_be32(crc32(~0, buf, size));
44
+ /* Increase size by 4, loop below reads the last 4 bytes from crc_ptr. */
45
+ size += 4;
46
crc_ptr = (uint8_t *) &crc;
47
48
if (shift16) {
229
--
49
--
230
2.20.1
50
2.25.1
231
232
diff view generated by jsdifflib
Deleted patch
1
Implement the MVE VPSEL insn, which sets each byte of the destination
2
vector Qd to the byte from either Qn or Qm depending on the value of
3
the corresponding bit in VPR.P0.
4
1
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
target/arm/helper-mve.h | 2 ++
9
target/arm/mve.decode | 7 +++++--
10
target/arm/mve_helper.c | 19 +++++++++++++++++++
11
target/arm/translate-mve.c | 2 ++
12
4 files changed, 28 insertions(+), 2 deletions(-)
13
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
17
+++ b/target/arm/helper-mve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
22
+DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
+
24
DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/mve.decode
30
+++ b/target/arm/mve.decode
31
@@ -XXX,XX +XXX,XX @@ VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd
32
# effectively "VCMP then VPST". A plain "VCMP" has a mask field of zero.
33
VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp
34
VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp
35
-VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp
36
-VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp
37
+{
38
+ VPSEL 1111 1110 0 . 11 ... 1 ... 0 1111 . 0 . 0 ... 1 @2op_nosz
39
+ VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp
40
+ VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp
41
+}
42
VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp
43
VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp
44
VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp
45
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/mve_helper.c
48
+++ b/target/arm/mve_helper.c
49
@@ -XXX,XX +XXX,XX @@ DO_VCMP_S(vcmpge, DO_GE)
50
DO_VCMP_S(vcmplt, DO_LT)
51
DO_VCMP_S(vcmpgt, DO_GT)
52
DO_VCMP_S(vcmple, DO_LE)
53
+
54
+void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm)
55
+{
56
+ /*
57
+ * Qd[n] = VPR.P0[n] ? Qn[n] : Qm[n]
58
+ * but note that whether bytes are written to Qd is still subject
59
+ * to (all forms of) predication in the usual way.
60
+ */
61
+ uint64_t *d = vd, *n = vn, *m = vm;
62
+ uint16_t mask = mve_element_mask(env);
63
+ uint16_t p0 = FIELD_EX32(env->v7m.vpr, V7M_VPR, P0);
64
+ unsigned e;
65
+ for (e = 0; e < 16 / 8; e++, mask >>= 8, p0 >>= 8) {
66
+ uint64_t r = m[H8(e)];
67
+ mergemask(&r, n[H8(e)], p0);
68
+ mergemask(&d[H8(e)], r, mask);
69
+ }
70
+ mve_advance_vpt(env);
71
+}
72
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
73
index XXXXXXX..XXXXXXX 100644
74
--- a/target/arm/translate-mve.c
75
+++ b/target/arm/translate-mve.c
76
@@ -XXX,XX +XXX,XX @@ DO_LOGIC(VORR, gen_helper_mve_vorr)
77
DO_LOGIC(VORN, gen_helper_mve_vorn)
78
DO_LOGIC(VEOR, gen_helper_mve_veor)
79
80
+DO_LOGIC(VPSEL, gen_helper_mve_vpsel)
81
+
82
#define DO_2OP(INSN, FN) \
83
static bool trans_##INSN(DisasContext *s, arg_2op *a) \
84
{ \
85
--
86
2.20.1
87
88
diff view generated by jsdifflib
Deleted patch
1
All the users of the vmlaldav formats have an 'x bit in bit 12 and an
2
'a' bit in bit 5; move these to the format rather than specifying them
3
in each insn pattern.
4
1
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
target/arm/mve.decode | 16 ++++++++--------
9
1 file changed, 8 insertions(+), 8 deletions(-)
10
11
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/mve.decode
14
+++ b/target/arm/mve.decode
15
@@ -XXX,XX +XXX,XX @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
16
17
&vmlaldav rdahi rdalo size qn qm x a
18
19
-@vmlaldav .... .... . ... ... . ... . .... .... qm:3 . \
20
+@vmlaldav .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \
21
qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav
22
-@vmlaldav_nosz .... .... . ... ... . ... . .... .... qm:3 . \
23
+@vmlaldav_nosz .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \
24
qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav
25
-VMLALDAV_S 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
26
-VMLALDAV_U 1111 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
27
+VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
28
+VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
29
30
-VMLSLDAV 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav
31
+VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav
32
33
-VRMLALDAVH_S 1110 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz
34
-VRMLALDAVH_U 1111 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz
35
+VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
36
+VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
37
38
-VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_nosz
39
+VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz
40
41
# Scalar operations
42
43
--
44
2.20.1
45
46
diff view generated by jsdifflib
Deleted patch
1
Implement the MVE integer min/max across vector insns
2
VMAXV, VMINV, VMAXAV and VMINAV, which find the maximum
3
from the vector elements and a general purpose register,
4
and store the maximum back into the general purpose
5
register.
6
1
7
These insns overlap with VRMLALDAVH (they use what would
8
be RdaHi=0b110).
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
---
13
target/arm/helper-mve.h | 20 ++++++++++++
14
target/arm/mve.decode | 18 +++++++++--
15
target/arm/mve_helper.c | 66 ++++++++++++++++++++++++++++++++++++++
16
target/arm/translate-mve.c | 48 +++++++++++++++++++++++++++
17
4 files changed, 150 insertions(+), 2 deletions(-)
18
19
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper-mve.h
22
+++ b/target/arm/helper-mve.h
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
24
DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
25
DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
26
27
+DEF_HELPER_FLAGS_3(mve_vmaxvsb, TCG_CALL_NO_WG, i32, env, ptr, i32)
28
+DEF_HELPER_FLAGS_3(mve_vmaxvsh, TCG_CALL_NO_WG, i32, env, ptr, i32)
29
+DEF_HELPER_FLAGS_3(mve_vmaxvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
30
+DEF_HELPER_FLAGS_3(mve_vmaxvub, TCG_CALL_NO_WG, i32, env, ptr, i32)
31
+DEF_HELPER_FLAGS_3(mve_vmaxvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
32
+DEF_HELPER_FLAGS_3(mve_vmaxvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
33
+DEF_HELPER_FLAGS_3(mve_vmaxavb, TCG_CALL_NO_WG, i32, env, ptr, i32)
34
+DEF_HELPER_FLAGS_3(mve_vmaxavh, TCG_CALL_NO_WG, i32, env, ptr, i32)
35
+DEF_HELPER_FLAGS_3(mve_vmaxavw, TCG_CALL_NO_WG, i32, env, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_3(mve_vminvsb, TCG_CALL_NO_WG, i32, env, ptr, i32)
38
+DEF_HELPER_FLAGS_3(mve_vminvsh, TCG_CALL_NO_WG, i32, env, ptr, i32)
39
+DEF_HELPER_FLAGS_3(mve_vminvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
40
+DEF_HELPER_FLAGS_3(mve_vminvub, TCG_CALL_NO_WG, i32, env, ptr, i32)
41
+DEF_HELPER_FLAGS_3(mve_vminvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
42
+DEF_HELPER_FLAGS_3(mve_vminvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
43
+DEF_HELPER_FLAGS_3(mve_vminavb, TCG_CALL_NO_WG, i32, env, ptr, i32)
44
+DEF_HELPER_FLAGS_3(mve_vminavh, TCG_CALL_NO_WG, i32, env, ptr, i32)
45
+DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32)
46
+
47
DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64)
48
DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64)
49
50
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
51
index XXXXXXX..XXXXXXX 100644
52
--- a/target/arm/mve.decode
53
+++ b/target/arm/mve.decode
54
@@ -XXX,XX +XXX,XX @@
55
&vcmp qm qn size mask
56
&vcmp_scalar qn rm size mask
57
&shl_scalar qda rm size
58
+&vmaxv qm rda size
59
60
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
61
# Note that both Rn and Qd are 3 bits only (no D bit)
62
@@ -XXX,XX +XXX,XX @@
63
@vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \
64
mask=%mask_22_13
65
66
+@vmaxv .... .... .... size:2 .. rda:4 .... .... .... &vmaxv qm=%qm
67
+
68
# Vector loads and stores
69
70
# Widening loads and narrowing stores:
71
@@ -XXX,XX +XXX,XX @@ VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav
72
73
VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav
74
75
-VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
76
-VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
77
+{
78
+ VMAXV_S 1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv
79
+ VMINV_S 1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv
80
+ VMAXAV 1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv
81
+ VMINAV 1110 1110 1110 .. 00 .... 1111 1 0 . 0 ... 0 @vmaxv
82
+ VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
83
+}
84
+
85
+{
86
+ VMAXV_U 1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv
87
+ VMINV_U 1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv
88
+ VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz
89
+}
90
91
VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz
92
93
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/mve_helper.c
96
+++ b/target/arm/mve_helper.c
97
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvub, 1, uint8_t)
98
DO_VADDV(vaddvuh, 2, uint16_t)
99
DO_VADDV(vaddvuw, 4, uint32_t)
100
101
+/*
102
+ * Vector max/min across vector. Unlike VADDV, we must
103
+ * read ra as the element size, not its full width.
104
+ * We work with int64_t internally for simplicity.
105
+ */
106
+#define DO_VMAXMINV(OP, ESIZE, TYPE, RATYPE, FN) \
107
+ uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
108
+ uint32_t ra_in) \
109
+ { \
110
+ uint16_t mask = mve_element_mask(env); \
111
+ unsigned e; \
112
+ TYPE *m = vm; \
113
+ int64_t ra = (RATYPE)ra_in; \
114
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
115
+ if (mask & 1) { \
116
+ ra = FN(ra, m[H##ESIZE(e)]); \
117
+ } \
118
+ } \
119
+ mve_advance_vpt(env); \
120
+ return ra; \
121
+ } \
122
+
123
+#define DO_VMAXMINV_U(INSN, FN) \
124
+ DO_VMAXMINV(INSN##b, 1, uint8_t, uint8_t, FN) \
125
+ DO_VMAXMINV(INSN##h, 2, uint16_t, uint16_t, FN) \
126
+ DO_VMAXMINV(INSN##w, 4, uint32_t, uint32_t, FN)
127
+#define DO_VMAXMINV_S(INSN, FN) \
128
+ DO_VMAXMINV(INSN##b, 1, int8_t, int8_t, FN) \
129
+ DO_VMAXMINV(INSN##h, 2, int16_t, int16_t, FN) \
130
+ DO_VMAXMINV(INSN##w, 4, int32_t, int32_t, FN)
131
+
132
+/*
133
+ * Helpers for max and min of absolute values across vector:
134
+ * note that we only take the absolute value of 'm', not 'n'
135
+ */
136
+static int64_t do_maxa(int64_t n, int64_t m)
137
+{
138
+ if (m < 0) {
139
+ m = -m;
140
+ }
141
+ return MAX(n, m);
142
+}
143
+
144
+static int64_t do_mina(int64_t n, int64_t m)
145
+{
146
+ if (m < 0) {
147
+ m = -m;
148
+ }
149
+ return MIN(n, m);
150
+}
151
+
152
+DO_VMAXMINV_S(vmaxvs, DO_MAX)
153
+DO_VMAXMINV_U(vmaxvu, DO_MAX)
154
+DO_VMAXMINV_S(vminvs, DO_MIN)
155
+DO_VMAXMINV_U(vminvu, DO_MIN)
156
+/*
157
+ * VMAXAV, VMINAV treat the general purpose input as unsigned
158
+ * and the vector elements as signed.
159
+ */
160
+DO_VMAXMINV(vmaxavb, 1, int8_t, uint8_t, do_maxa)
161
+DO_VMAXMINV(vmaxavh, 2, int16_t, uint16_t, do_maxa)
162
+DO_VMAXMINV(vmaxavw, 4, int32_t, uint32_t, do_maxa)
163
+DO_VMAXMINV(vminavb, 1, int8_t, uint8_t, do_mina)
164
+DO_VMAXMINV(vminavh, 2, int16_t, uint16_t, do_mina)
165
+DO_VMAXMINV(vminavw, 4, int32_t, uint32_t, do_mina)
166
+
167
#define DO_VADDLV(OP, TYPE, LTYPE) \
168
uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
169
uint64_t ra) \
170
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
171
index XXXXXXX..XXXXXXX 100644
172
--- a/target/arm/translate-mve.c
173
+++ b/target/arm/translate-mve.c
174
@@ -XXX,XX +XXX,XX @@ DO_VCMP(VCMPGE, vcmpge)
175
DO_VCMP(VCMPLT, vcmplt)
176
DO_VCMP(VCMPGT, vcmpgt)
177
DO_VCMP(VCMPLE, vcmple)
178
+
179
+static bool do_vmaxv(DisasContext *s, arg_vmaxv *a, MVEGenVADDVFn fn)
180
+{
181
+ /*
182
+ * MIN/MAX operations across a vector: compute the min or
183
+ * max of the initial value in a general purpose register
184
+ * and all the elements in the vector, and store it back
185
+ * into the general purpose register.
186
+ */
187
+ TCGv_ptr qm;
188
+ TCGv_i32 rda;
189
+
190
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qm) ||
191
+ !fn || a->rda == 13 || a->rda == 15) {
192
+ /* Rda cases are UNPREDICTABLE */
193
+ return false;
194
+ }
195
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
196
+ return true;
197
+ }
198
+
199
+ qm = mve_qreg_ptr(a->qm);
200
+ rda = load_reg(s, a->rda);
201
+ fn(rda, cpu_env, qm, rda);
202
+ store_reg(s, a->rda, rda);
203
+ tcg_temp_free_ptr(qm);
204
+ mve_update_eci(s);
205
+ return true;
206
+}
207
+
208
+#define DO_VMAXV(INSN, FN) \
209
+ static bool trans_##INSN(DisasContext *s, arg_vmaxv *a) \
210
+ { \
211
+ static MVEGenVADDVFn * const fns[] = { \
212
+ gen_helper_mve_##FN##b, \
213
+ gen_helper_mve_##FN##h, \
214
+ gen_helper_mve_##FN##w, \
215
+ NULL, \
216
+ }; \
217
+ return do_vmaxv(s, a, fns[a->size]); \
218
+ }
219
+
220
+DO_VMAXV(VMAXV_S, vmaxvs)
221
+DO_VMAXV(VMAXV_U, vmaxvu)
222
+DO_VMAXV(VMAXAV, vmaxav)
223
+DO_VMAXV(VMINV_S, vminvs)
224
+DO_VMAXV(VMINV_U, vminvu)
225
+DO_VMAXV(VMINAV, vminav)
226
--
227
2.20.1
228
229
diff view generated by jsdifflib