First pullreq for 6.0: mostly my v8.1M work, plus some other
bits and pieces. (I still have a lot of stuff in my to-review
folder, which I may or may not get to before the Christmas break...)

thanks
-- PMM

The following changes since commit 5e7b204dbfae9a562fc73684986f936b97f63877:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2020-12-09 20:08:54 +0000)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20201210

for you to fetch changes up to 71f916be1c7e9ede0e37d9cabc781b5a9e8638ff:

  hw/arm/armv7m: Correct typo in QOM object name (2020-12-10 11:44:56 +0000)

----------------------------------------------------------------
target-arm queue:
 * hw/arm/smmuv3: Fix up L1STD_SPAN decoding
 * xlnx-zynqmp: Support Xilinx ZynqMP CAN controllers
 * sbsa-ref: allow to use Cortex-A53/57/72 cpus
 * Various minor code cleanups
 * hw/intc/armv7m_nvic: Make all of system PPB range be RAZWI/BusFault
 * Implement more pieces of ARMv8.1M support

----------------------------------------------------------------
Alex Chen (4):
      i.MX25: Fix bad printf format specifiers
      i.MX31: Fix bad printf format specifiers
      i.MX6: Fix bad printf format specifiers
      i.MX6ul: Fix bad printf format specifiers

Havard Skinnemoen (1):
      tests/qtest/npcm7xx_rng-test: dump random data on failure

Kunkun Jiang (1):
      hw/arm/smmuv3: Fix up L1STD_SPAN decoding

Marcin Juszkiewicz (1):
      sbsa-ref: allow to use Cortex-A53/57/72 cpus

Peter Maydell (25):
      hw/intc/armv7m_nvic: Make all of system PPB range be RAZWI/BusFault
      target/arm: Implement v8.1M PXN extension
      target/arm: Don't clobber ID_PFR1.Security on M-profile cores
      target/arm: Implement VSCCLRM insn
      target/arm: Implement CLRM instruction
      target/arm: Enforce M-profile VMRS/VMSR register restrictions
      target/arm: Refactor M-profile VMSR/VMRS handling
      target/arm: Move general-use constant expanders up in translate.c
      target/arm: Implement VLDR/VSTR system register
      target/arm: Implement M-profile FPSCR_nzcvqc
      target/arm: Use new FPCR_NZCV_MASK constant
      target/arm: Factor out preserve-fp-state from full_vfp_access_check()
      target/arm: Implement FPCXT_S fp system register
      hw/intc/armv7m_nvic: Update FPDSCR masking for v8.1M
      target/arm: For v8.1M, always clear R0-R3, R12, APSR, EPSR on exception entry
      target/arm: In v8.1M, don't set HFSR.FORCED on vector table fetch failures
      target/arm: Implement v8.1M REVIDR register
      target/arm: Implement new v8.1M NOCP check for exception return
      target/arm: Implement new v8.1M VLLDM and VLSTM encodings
      hw/intc/armv7m_nvic: Support v8.1M CCR.TRD bit
      target/arm: Implement CCR_S.TRD behaviour for SG insns
      hw/intc/armv7m_nvic: Fix "return from inactive handler" check
      target/arm: Implement M-profile "minimal RAS implementation"
      hw/intc/armv7m_nvic: Implement read/write for RAS register block
      hw/arm/armv7m: Correct typo in QOM object name

Vikram Garhwal (4):
      hw/net/can: Introduce Xilinx ZynqMP CAN controller
      xlnx-zynqmp: Connect Xilinx ZynqMP CAN controllers
      tests/qtest: Introduce tests for Xilinx ZynqMP CAN controller
      MAINTAINERS: Add maintainer entry for Xilinx ZynqMP CAN controller

 meson.build | 1 +
 hw/arm/smmuv3-internal.h | 2 +-
 hw/net/can/trace.h | 1 +
 include/hw/arm/xlnx-zynqmp.h | 8 +
 include/hw/intc/armv7m_nvic.h | 2 +
 include/hw/net/xlnx-zynqmp-can.h | 78 +++
 target/arm/cpu.h | 46 ++
 target/arm/m-nocp.decode | 10 +-
 target/arm/t32.decode | 10 +-
 target/arm/vfp.decode | 14 +
 hw/arm/armv7m.c | 4 +-
 hw/arm/sbsa-ref.c | 23 +-
 hw/arm/xlnx-zcu102.c | 20 +
 hw/arm/xlnx-zynqmp.c | 34 ++
 hw/intc/armv7m_nvic.c | 246 ++++++--
 hw/misc/imx25_ccm.c | 12 +-
 hw/misc/imx31_ccm.c | 14 +-
 hw/misc/imx6_ccm.c | 20 +-
 hw/misc/imx6_src.c | 2 +-
 hw/misc/imx6ul_ccm.c | 4 +-
 hw/misc/imx_ccm.c | 4 +-
 hw/net/can/xlnx-zynqmp-can.c | 1161 ++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c | 5 +-
 target/arm/helper.c | 7 +-
 target/arm/m_helper.c | 130 ++++-
 target/arm/translate.c | 105 +++-
 tests/qtest/npcm7xx_rng-test.c | 12 +
 tests/qtest/xlnx-can-test.c | 360 ++++++++++++
 MAINTAINERS | 8 +
 hw/Kconfig | 1 +
 hw/net/can/meson.build | 1 +
 hw/net/can/trace-events | 9 +
 target/arm/translate-vfp.c.inc | 511 ++++++++++++++++-
 tests/qtest/meson.build | 1 +
 34 files changed, 2713 insertions(+), 153 deletions(-)
 create mode 100644 hw/net/can/trace.h
 create mode 100644 include/hw/net/xlnx-zynqmp-can.h
 create mode 100644 hw/net/can/xlnx-zynqmp-can.c
 create mode 100644 tests/qtest/xlnx-can-test.c
 create mode 100644 hw/net/can/trace-events
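For reference, a pull request in this format is consumed by fetching the signed tag from the listed repository and merging it; roughly (assuming a checkout of the target QEMU tree):

    git fetch https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20201210
    git merge FETCH_HEAD

The individual patches from the queue follow.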
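The first patch below widens the L1STD_SPAN() decode from 4 bits to 5. As a reading aid (not part of the patch), here is what that means for QEMU's extract32() helper from qemu/bitops.h: extract32(value, start, length) returns the length-bit field of value starting at bit start, so a 4-bit extract silently loses bit 4 and any SPAN value of 16 or more was previously decoded modulo 16.

    #include <assert.h>
    #include "qemu/bitops.h"   /* extract32() */

    static void l1std_span_demo(void)
    {
        uint32_t word0 = 0x10;                 /* L1STD word[0] with SPAN = 16 */

        assert(extract32(word0, 0, 4) == 0);   /* old 4-bit decode: top bit of SPAN lost */
        assert(extract32(word0, 0, 5) == 16);  /* fixed 5-bit decode: full [4:0] field */
    }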
From: Kunkun Jiang <jiangkunkun@huawei.com>

According to the SMMUv3 spec, the SPAN field of Level1 Stream Table
Descriptor is 5 bits ([4:0]).

Fixes: 9bde7f0674f(hw/arm/smmuv3: Implement translate callback)
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Message-id: 20201124023711.1184-1-jiangkunkun@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -XXX,XX +XXX,XX @@ static inline uint64_t l1std_l2ptr(STEDesc *desc)
     return hi << 32 | lo;
 }
 
-#define L1STD_SPAN(stm)            (extract32((stm)->word[0], 0, 4))
+#define L1STD_SPAN(stm)            (extract32((stm)->word[0], 0, 5))
 
 #endif
-- 
2.20.1
From: Vikram Garhwal <fnu.vikram@xilinx.com>

The Xilinx ZynqMP CAN controller is developed based on SocketCAN, QEMU CAN bus
implementation. Bus connection and socketCAN connection for each CAN module
can be set through command lines.

Example for using single CAN:
    -object can-bus,id=canbus0 \
    -machine xlnx-zcu102.canbus0=canbus0 \
    -object can-host-socketcan,id=socketcan0,if=vcan0,canbus=canbus0

Example for connecting both CAN to same virtual CAN on host machine:
    -object can-bus,id=canbus0 -object can-bus,id=canbus1 \
    -machine xlnx-zcu102.canbus0=canbus0 \
    -machine xlnx-zcu102.canbus1=canbus1 \
    -object can-host-socketcan,id=socketcan0,if=vcan0,canbus=canbus0 \
    -object can-host-socketcan,id=socketcan1,if=vcan0,canbus=canbus1

To create virtual CAN on the host machine, please check the QEMU CAN docs:
https://github.com/qemu/qemu/blob/master/docs/can.txt

Signed-off-by: Vikram Garhwal <fnu.vikram@xilinx.com>
Message-id: 1605728926-352690-2-git-send-email-fnu.vikram@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 meson.build | 1 +
 hw/net/can/trace.h | 1 +
 include/hw/net/xlnx-zynqmp-can.h | 78 ++
 hw/net/can/xlnx-zynqmp-can.c | 1161 ++++++++++++++++++++++++++++++
 hw/Kconfig | 1 +
 hw/net/can/meson.build | 1 +
 hw/net/can/trace-events | 9 +
 7 files changed, 1252 insertions(+)
 create mode 100644 hw/net/can/trace.h
 create mode 100644 include/hw/net/xlnx-zynqmp-can.h
 create mode 100644 hw/net/can/xlnx-zynqmp-can.c
 create mode 100644 hw/net/can/trace-events

diff --git a/meson.build b/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/meson.build
+++ b/meson.build
@@ -XXX,XX +XXX,XX @@ if have_system
   'hw/misc',
   'hw/misc/macio',
   'hw/net',
+  'hw/net/can',
   'hw/nvram',
   'hw/pci',
   'hw/pci-host',
diff --git a/hw/net/can/trace.h b/hw/net/can/trace.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/net/can/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-hw_net_can.h"
diff --git a/include/hw/net/xlnx-zynqmp-can.h b/include/hw/net/xlnx-zynqmp-can.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/net/xlnx-zynqmp-can.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * QEMU model of the Xilinx ZynqMP CAN controller.
+ *
+ * Copyright (c) 2020 Xilinx Inc.
+ *
+ * Written-by: Vikram Garhwal<fnu.vikram@xilinx.com>
+ *
+ * Based on QEMU CAN Device emulation implemented by Jin Yang, Deniz Eren and
+ * Pavel Pisa.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef XLNX_ZYNQMP_CAN_H
+#define XLNX_ZYNQMP_CAN_H
+
+#include "hw/register.h"
+#include "net/can_emu.h"
+#include "net/can_host.h"
+#include "qemu/fifo32.h"
+#include "hw/ptimer.h"
+#include "hw/qdev-clock.h"
+
+#define TYPE_XLNX_ZYNQMP_CAN "xlnx.zynqmp-can"
+
+#define XLNX_ZYNQMP_CAN(obj) \
+    OBJECT_CHECK(XlnxZynqMPCANState, (obj), TYPE_XLNX_ZYNQMP_CAN)
+
+#define MAX_CAN_CTRLS 2
+#define XLNX_ZYNQMP_CAN_R_MAX (0x84 / 4)
+#define MAILBOX_CAPACITY 64
+#define CAN_TIMER_MAX 0XFFFFUL
+#define CAN_DEFAULT_CLOCK (24 * 1000 * 1000)
+
+/* Each CAN_FRAME will have 4 * 32bit size. */
+#define CAN_FRAME_SIZE 4
+#define RXFIFO_SIZE (MAILBOX_CAPACITY * CAN_FRAME_SIZE)
+
+typedef struct XlnxZynqMPCANState {
+    SysBusDevice parent_obj;
+    MemoryRegion iomem;
+
+    qemu_irq irq;
+
+    CanBusClientState bus_client;
+    CanBusState *canbus;
+
+    struct {
+        uint32_t ext_clk_freq;
+    } cfg;
+
+    RegisterInfo reg_info[XLNX_ZYNQMP_CAN_R_MAX];
+    uint32_t regs[XLNX_ZYNQMP_CAN_R_MAX];
+
+    Fifo32 rx_fifo;
+    Fifo32 tx_fifo;
+    Fifo32 txhpb_fifo;
+
+    ptimer_state *can_timer;
+} XlnxZynqMPCANState;
+
+#endif
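A short note on the sizing constants in the new header (a reading aid, not part of the patch): each CAN frame is stored as CAN_FRAME_SIZE (four) 32-bit words, laid out the same way as the TXFIFO/RXFIFO registers (ID word, DLC word with the receive timestamp, then two data words), so the Fifo32 sized to RXFIFO_SIZE holds MAILBOX_CAPACITY (64) frames. Expressed with QEMU's compile-time assert:

    /* Illustrative only: relation between the constants defined above. */
    QEMU_BUILD_BUG_ON(RXFIFO_SIZE != MAILBOX_CAPACITY * CAN_FRAME_SIZE); /* 64 frames x 4 words */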
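The usage examples in the commit message above assume a SocketCAN interface named vcan0 on the host; QEMU does not create it. On a typical Linux host it is set up with the vcan module and iproute2, along the lines of:

    modprobe vcan
    ip link add dev vcan0 type vcan
    ip link set up vcan0

See the docs/can.txt link in the commit message for the full details. The device model itself follows.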
diff --git a/hw/net/can/xlnx-zynqmp-can.c b/hw/net/can/xlnx-zynqmp-can.c
144
new file mode 100644
145
index XXXXXXX..XXXXXXX
146
--- /dev/null
147
+++ b/hw/net/can/xlnx-zynqmp-can.c
148
@@ -XXX,XX +XXX,XX @@
149
+/*
150
+ * QEMU model of the Xilinx ZynqMP CAN controller.
151
+ * This implementation is based on the following datasheet:
152
+ * https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
153
+ *
154
+ * Copyright (c) 2020 Xilinx Inc.
155
+ *
156
+ * Written-by: Vikram Garhwal<fnu.vikram@xilinx.com>
157
+ *
158
+ * Based on QEMU CAN Device emulation implemented by Jin Yang, Deniz Eren and
159
+ * Pavel Pisa
160
+ *
161
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
162
+ * of this software and associated documentation files (the "Software"), to deal
163
+ * in the Software without restriction, including without limitation the rights
164
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
165
+ * copies of the Software, and to permit persons to whom the Software is
166
+ * furnished to do so, subject to the following conditions:
167
+ *
168
+ * The above copyright notice and this permission notice shall be included in
169
+ * all copies or substantial portions of the Software.
170
+ *
171
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
172
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
173
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
174
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
175
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
176
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
177
+ * THE SOFTWARE.
178
+ */
179
+
41
+#include "qemu/osdep.h"
180
+#include "qemu/osdep.h"
42
+#include "qemu-common.h"
181
+#include "hw/sysbus.h"
182
+#include "hw/register.h"
183
+#include "hw/irq.h"
184
+#include "qapi/error.h"
185
+#include "qemu/bitops.h"
43
+#include "qemu/log.h"
186
+#include "qemu/log.h"
44
+#include "hw/sysbus.h"
187
+#include "qemu/cutils.h"
45
+#include "sysemu/runstate.h"
188
+#include "sysemu/sysemu.h"
46
+
189
+#include "migration/vmstate.h"
47
+typedef struct {
190
+#include "hw/qdev-properties.h"
48
+ SysBusDevice parent_obj;
191
+#include "net/can_emu.h"
49
+ MemoryRegion iomem;
192
+#include "net/can_host.h"
50
+} SECUREECState;
193
+#include "qemu/event_notifier.h"
51
+
194
+#include "qom/object_interfaces.h"
52
+#define TYPE_SBSA_EC "sbsa-ec"
195
+#include "hw/net/xlnx-zynqmp-can.h"
53
+#define SECURE_EC(obj) OBJECT_CHECK(SECUREECState, (obj), TYPE_SBSA_EC)
196
+#include "trace.h"
54
+
197
+
55
+enum sbsa_ec_powerstates {
198
+#ifndef XLNX_ZYNQMP_CAN_ERR_DEBUG
56
+ SBSA_EC_CMD_POWEROFF = 0x01,
199
+#define XLNX_ZYNQMP_CAN_ERR_DEBUG 0
57
+ SBSA_EC_CMD_REBOOT = 0x02,
200
+#endif
201
+
202
+#define MAX_DLC 8
203
+#undef ERROR
204
+
205
+REG32(SOFTWARE_RESET_REGISTER, 0x0)
206
+ FIELD(SOFTWARE_RESET_REGISTER, CEN, 1, 1)
207
+ FIELD(SOFTWARE_RESET_REGISTER, SRST, 0, 1)
208
+REG32(MODE_SELECT_REGISTER, 0x4)
209
+ FIELD(MODE_SELECT_REGISTER, SNOOP, 2, 1)
210
+ FIELD(MODE_SELECT_REGISTER, LBACK, 1, 1)
211
+ FIELD(MODE_SELECT_REGISTER, SLEEP, 0, 1)
212
+REG32(ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER, 0x8)
213
+ FIELD(ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER, BRP, 0, 8)
214
+REG32(ARBITRATION_PHASE_BIT_TIMING_REGISTER, 0xc)
215
+ FIELD(ARBITRATION_PHASE_BIT_TIMING_REGISTER, SJW, 7, 2)
216
+ FIELD(ARBITRATION_PHASE_BIT_TIMING_REGISTER, TS2, 4, 3)
217
+ FIELD(ARBITRATION_PHASE_BIT_TIMING_REGISTER, TS1, 0, 4)
218
+REG32(ERROR_COUNTER_REGISTER, 0x10)
219
+ FIELD(ERROR_COUNTER_REGISTER, REC, 8, 8)
220
+ FIELD(ERROR_COUNTER_REGISTER, TEC, 0, 8)
221
+REG32(ERROR_STATUS_REGISTER, 0x14)
222
+ FIELD(ERROR_STATUS_REGISTER, ACKER, 4, 1)
223
+ FIELD(ERROR_STATUS_REGISTER, BERR, 3, 1)
224
+ FIELD(ERROR_STATUS_REGISTER, STER, 2, 1)
225
+ FIELD(ERROR_STATUS_REGISTER, FMER, 1, 1)
226
+ FIELD(ERROR_STATUS_REGISTER, CRCER, 0, 1)
227
+REG32(STATUS_REGISTER, 0x18)
228
+ FIELD(STATUS_REGISTER, SNOOP, 12, 1)
229
+ FIELD(STATUS_REGISTER, ACFBSY, 11, 1)
230
+ FIELD(STATUS_REGISTER, TXFLL, 10, 1)
231
+ FIELD(STATUS_REGISTER, TXBFLL, 9, 1)
232
+ FIELD(STATUS_REGISTER, ESTAT, 7, 2)
233
+ FIELD(STATUS_REGISTER, ERRWRN, 6, 1)
234
+ FIELD(STATUS_REGISTER, BBSY, 5, 1)
235
+ FIELD(STATUS_REGISTER, BIDLE, 4, 1)
236
+ FIELD(STATUS_REGISTER, NORMAL, 3, 1)
237
+ FIELD(STATUS_REGISTER, SLEEP, 2, 1)
238
+ FIELD(STATUS_REGISTER, LBACK, 1, 1)
239
+ FIELD(STATUS_REGISTER, CONFIG, 0, 1)
240
+REG32(INTERRUPT_STATUS_REGISTER, 0x1c)
241
+ FIELD(INTERRUPT_STATUS_REGISTER, TXFEMP, 14, 1)
242
+ FIELD(INTERRUPT_STATUS_REGISTER, TXFWMEMP, 13, 1)
243
+ FIELD(INTERRUPT_STATUS_REGISTER, RXFWMFLL, 12, 1)
244
+ FIELD(INTERRUPT_STATUS_REGISTER, WKUP, 11, 1)
245
+ FIELD(INTERRUPT_STATUS_REGISTER, SLP, 10, 1)
246
+ FIELD(INTERRUPT_STATUS_REGISTER, BSOFF, 9, 1)
247
+ FIELD(INTERRUPT_STATUS_REGISTER, ERROR, 8, 1)
248
+ FIELD(INTERRUPT_STATUS_REGISTER, RXNEMP, 7, 1)
249
+ FIELD(INTERRUPT_STATUS_REGISTER, RXOFLW, 6, 1)
250
+ FIELD(INTERRUPT_STATUS_REGISTER, RXUFLW, 5, 1)
251
+ FIELD(INTERRUPT_STATUS_REGISTER, RXOK, 4, 1)
252
+ FIELD(INTERRUPT_STATUS_REGISTER, TXBFLL, 3, 1)
253
+ FIELD(INTERRUPT_STATUS_REGISTER, TXFLL, 2, 1)
254
+ FIELD(INTERRUPT_STATUS_REGISTER, TXOK, 1, 1)
255
+ FIELD(INTERRUPT_STATUS_REGISTER, ARBLST, 0, 1)
256
+REG32(INTERRUPT_ENABLE_REGISTER, 0x20)
257
+ FIELD(INTERRUPT_ENABLE_REGISTER, ETXFEMP, 14, 1)
258
+ FIELD(INTERRUPT_ENABLE_REGISTER, ETXFWMEMP, 13, 1)
259
+ FIELD(INTERRUPT_ENABLE_REGISTER, ERXFWMFLL, 12, 1)
260
+ FIELD(INTERRUPT_ENABLE_REGISTER, EWKUP, 11, 1)
261
+ FIELD(INTERRUPT_ENABLE_REGISTER, ESLP, 10, 1)
262
+ FIELD(INTERRUPT_ENABLE_REGISTER, EBSOFF, 9, 1)
263
+ FIELD(INTERRUPT_ENABLE_REGISTER, EERROR, 8, 1)
264
+ FIELD(INTERRUPT_ENABLE_REGISTER, ERXNEMP, 7, 1)
265
+ FIELD(INTERRUPT_ENABLE_REGISTER, ERXOFLW, 6, 1)
266
+ FIELD(INTERRUPT_ENABLE_REGISTER, ERXUFLW, 5, 1)
267
+ FIELD(INTERRUPT_ENABLE_REGISTER, ERXOK, 4, 1)
268
+ FIELD(INTERRUPT_ENABLE_REGISTER, ETXBFLL, 3, 1)
269
+ FIELD(INTERRUPT_ENABLE_REGISTER, ETXFLL, 2, 1)
270
+ FIELD(INTERRUPT_ENABLE_REGISTER, ETXOK, 1, 1)
271
+ FIELD(INTERRUPT_ENABLE_REGISTER, EARBLST, 0, 1)
272
+REG32(INTERRUPT_CLEAR_REGISTER, 0x24)
273
+ FIELD(INTERRUPT_CLEAR_REGISTER, CTXFEMP, 14, 1)
274
+ FIELD(INTERRUPT_CLEAR_REGISTER, CTXFWMEMP, 13, 1)
275
+ FIELD(INTERRUPT_CLEAR_REGISTER, CRXFWMFLL, 12, 1)
276
+ FIELD(INTERRUPT_CLEAR_REGISTER, CWKUP, 11, 1)
277
+ FIELD(INTERRUPT_CLEAR_REGISTER, CSLP, 10, 1)
278
+ FIELD(INTERRUPT_CLEAR_REGISTER, CBSOFF, 9, 1)
279
+ FIELD(INTERRUPT_CLEAR_REGISTER, CERROR, 8, 1)
280
+ FIELD(INTERRUPT_CLEAR_REGISTER, CRXNEMP, 7, 1)
281
+ FIELD(INTERRUPT_CLEAR_REGISTER, CRXOFLW, 6, 1)
282
+ FIELD(INTERRUPT_CLEAR_REGISTER, CRXUFLW, 5, 1)
283
+ FIELD(INTERRUPT_CLEAR_REGISTER, CRXOK, 4, 1)
284
+ FIELD(INTERRUPT_CLEAR_REGISTER, CTXBFLL, 3, 1)
285
+ FIELD(INTERRUPT_CLEAR_REGISTER, CTXFLL, 2, 1)
286
+ FIELD(INTERRUPT_CLEAR_REGISTER, CTXOK, 1, 1)
287
+ FIELD(INTERRUPT_CLEAR_REGISTER, CARBLST, 0, 1)
288
+REG32(TIMESTAMP_REGISTER, 0x28)
289
+ FIELD(TIMESTAMP_REGISTER, CTS, 0, 1)
290
+REG32(WIR, 0x2c)
291
+ FIELD(WIR, EW, 8, 8)
292
+ FIELD(WIR, FW, 0, 8)
293
+REG32(TXFIFO_ID, 0x30)
294
+ FIELD(TXFIFO_ID, IDH, 21, 11)
295
+ FIELD(TXFIFO_ID, SRRRTR, 20, 1)
296
+ FIELD(TXFIFO_ID, IDE, 19, 1)
297
+ FIELD(TXFIFO_ID, IDL, 1, 18)
298
+ FIELD(TXFIFO_ID, RTR, 0, 1)
299
+REG32(TXFIFO_DLC, 0x34)
300
+ FIELD(TXFIFO_DLC, DLC, 28, 4)
301
+REG32(TXFIFO_DATA1, 0x38)
302
+ FIELD(TXFIFO_DATA1, DB0, 24, 8)
303
+ FIELD(TXFIFO_DATA1, DB1, 16, 8)
304
+ FIELD(TXFIFO_DATA1, DB2, 8, 8)
305
+ FIELD(TXFIFO_DATA1, DB3, 0, 8)
306
+REG32(TXFIFO_DATA2, 0x3c)
307
+ FIELD(TXFIFO_DATA2, DB4, 24, 8)
308
+ FIELD(TXFIFO_DATA2, DB5, 16, 8)
309
+ FIELD(TXFIFO_DATA2, DB6, 8, 8)
310
+ FIELD(TXFIFO_DATA2, DB7, 0, 8)
311
+REG32(TXHPB_ID, 0x40)
312
+ FIELD(TXHPB_ID, IDH, 21, 11)
313
+ FIELD(TXHPB_ID, SRRRTR, 20, 1)
314
+ FIELD(TXHPB_ID, IDE, 19, 1)
315
+ FIELD(TXHPB_ID, IDL, 1, 18)
316
+ FIELD(TXHPB_ID, RTR, 0, 1)
317
+REG32(TXHPB_DLC, 0x44)
318
+ FIELD(TXHPB_DLC, DLC, 28, 4)
319
+REG32(TXHPB_DATA1, 0x48)
320
+ FIELD(TXHPB_DATA1, DB0, 24, 8)
321
+ FIELD(TXHPB_DATA1, DB1, 16, 8)
322
+ FIELD(TXHPB_DATA1, DB2, 8, 8)
323
+ FIELD(TXHPB_DATA1, DB3, 0, 8)
324
+REG32(TXHPB_DATA2, 0x4c)
325
+ FIELD(TXHPB_DATA2, DB4, 24, 8)
326
+ FIELD(TXHPB_DATA2, DB5, 16, 8)
327
+ FIELD(TXHPB_DATA2, DB6, 8, 8)
328
+ FIELD(TXHPB_DATA2, DB7, 0, 8)
329
+REG32(RXFIFO_ID, 0x50)
330
+ FIELD(RXFIFO_ID, IDH, 21, 11)
331
+ FIELD(RXFIFO_ID, SRRRTR, 20, 1)
332
+ FIELD(RXFIFO_ID, IDE, 19, 1)
333
+ FIELD(RXFIFO_ID, IDL, 1, 18)
334
+ FIELD(RXFIFO_ID, RTR, 0, 1)
335
+REG32(RXFIFO_DLC, 0x54)
336
+ FIELD(RXFIFO_DLC, DLC, 28, 4)
337
+ FIELD(RXFIFO_DLC, RXT, 0, 16)
338
+REG32(RXFIFO_DATA1, 0x58)
339
+ FIELD(RXFIFO_DATA1, DB0, 24, 8)
340
+ FIELD(RXFIFO_DATA1, DB1, 16, 8)
341
+ FIELD(RXFIFO_DATA1, DB2, 8, 8)
342
+ FIELD(RXFIFO_DATA1, DB3, 0, 8)
343
+REG32(RXFIFO_DATA2, 0x5c)
344
+ FIELD(RXFIFO_DATA2, DB4, 24, 8)
345
+ FIELD(RXFIFO_DATA2, DB5, 16, 8)
346
+ FIELD(RXFIFO_DATA2, DB6, 8, 8)
347
+ FIELD(RXFIFO_DATA2, DB7, 0, 8)
348
+REG32(AFR, 0x60)
349
+ FIELD(AFR, UAF4, 3, 1)
350
+ FIELD(AFR, UAF3, 2, 1)
351
+ FIELD(AFR, UAF2, 1, 1)
352
+ FIELD(AFR, UAF1, 0, 1)
353
+REG32(AFMR1, 0x64)
354
+ FIELD(AFMR1, AMIDH, 21, 11)
355
+ FIELD(AFMR1, AMSRR, 20, 1)
356
+ FIELD(AFMR1, AMIDE, 19, 1)
357
+ FIELD(AFMR1, AMIDL, 1, 18)
358
+ FIELD(AFMR1, AMRTR, 0, 1)
359
+REG32(AFIR1, 0x68)
360
+ FIELD(AFIR1, AIIDH, 21, 11)
361
+ FIELD(AFIR1, AISRR, 20, 1)
362
+ FIELD(AFIR1, AIIDE, 19, 1)
363
+ FIELD(AFIR1, AIIDL, 1, 18)
364
+ FIELD(AFIR1, AIRTR, 0, 1)
365
+REG32(AFMR2, 0x6c)
366
+ FIELD(AFMR2, AMIDH, 21, 11)
367
+ FIELD(AFMR2, AMSRR, 20, 1)
368
+ FIELD(AFMR2, AMIDE, 19, 1)
369
+ FIELD(AFMR2, AMIDL, 1, 18)
370
+ FIELD(AFMR2, AMRTR, 0, 1)
371
+REG32(AFIR2, 0x70)
372
+ FIELD(AFIR2, AIIDH, 21, 11)
373
+ FIELD(AFIR2, AISRR, 20, 1)
374
+ FIELD(AFIR2, AIIDE, 19, 1)
375
+ FIELD(AFIR2, AIIDL, 1, 18)
376
+ FIELD(AFIR2, AIRTR, 0, 1)
377
+REG32(AFMR3, 0x74)
378
+ FIELD(AFMR3, AMIDH, 21, 11)
379
+ FIELD(AFMR3, AMSRR, 20, 1)
380
+ FIELD(AFMR3, AMIDE, 19, 1)
381
+ FIELD(AFMR3, AMIDL, 1, 18)
382
+ FIELD(AFMR3, AMRTR, 0, 1)
383
+REG32(AFIR3, 0x78)
384
+ FIELD(AFIR3, AIIDH, 21, 11)
385
+ FIELD(AFIR3, AISRR, 20, 1)
386
+ FIELD(AFIR3, AIIDE, 19, 1)
387
+ FIELD(AFIR3, AIIDL, 1, 18)
388
+ FIELD(AFIR3, AIRTR, 0, 1)
389
+REG32(AFMR4, 0x7c)
390
+ FIELD(AFMR4, AMIDH, 21, 11)
391
+ FIELD(AFMR4, AMSRR, 20, 1)
392
+ FIELD(AFMR4, AMIDE, 19, 1)
393
+ FIELD(AFMR4, AMIDL, 1, 18)
394
+ FIELD(AFMR4, AMRTR, 0, 1)
395
+REG32(AFIR4, 0x80)
396
+ FIELD(AFIR4, AIIDH, 21, 11)
397
+ FIELD(AFIR4, AISRR, 20, 1)
398
+ FIELD(AFIR4, AIIDE, 19, 1)
399
+ FIELD(AFIR4, AIIDL, 1, 18)
400
+ FIELD(AFIR4, AIRTR, 0, 1)
401
+
402
+static void can_update_irq(XlnxZynqMPCANState *s)
403
+{
404
+ uint32_t irq;
405
+
406
+ /* Watermark register interrupts. */
407
+ if ((fifo32_num_free(&s->tx_fifo) / CAN_FRAME_SIZE) >
408
+ ARRAY_FIELD_EX32(s->regs, WIR, EW)) {
409
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, TXFWMEMP, 1);
410
+ }
411
+
412
+ if ((fifo32_num_used(&s->rx_fifo) / CAN_FRAME_SIZE) >
413
+ ARRAY_FIELD_EX32(s->regs, WIR, FW)) {
414
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXFWMFLL, 1);
415
+ }
416
+
417
+ /* RX Interrupts. */
418
+ if (fifo32_num_used(&s->rx_fifo) >= CAN_FRAME_SIZE) {
419
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXNEMP, 1);
420
+ }
421
+
422
+ /* TX interrupts. */
423
+ if (fifo32_is_empty(&s->tx_fifo)) {
424
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, TXFEMP, 1);
425
+ }
426
+
427
+ if (fifo32_is_full(&s->tx_fifo)) {
428
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, TXFLL, 1);
429
+ }
430
+
431
+ if (fifo32_is_full(&s->txhpb_fifo)) {
432
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, TXBFLL, 1);
433
+ }
434
+
435
+ irq = s->regs[R_INTERRUPT_STATUS_REGISTER];
436
+ irq &= s->regs[R_INTERRUPT_ENABLE_REGISTER];
437
+
438
+ trace_xlnx_can_update_irq(s->regs[R_INTERRUPT_STATUS_REGISTER],
439
+ s->regs[R_INTERRUPT_ENABLE_REGISTER], irq);
440
+ qemu_set_irq(s->irq, irq);
441
+}
442
+
443
+static void can_ier_post_write(RegisterInfo *reg, uint64_t val)
444
+{
445
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
446
+
447
+ can_update_irq(s);
448
+}
449
+
450
+static uint64_t can_icr_pre_write(RegisterInfo *reg, uint64_t val)
451
+{
452
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
453
+
454
+ s->regs[R_INTERRUPT_STATUS_REGISTER] &= ~val;
455
+ can_update_irq(s);
456
+
457
+ return 0;
458
+}
459
+
460
+static void can_config_reset(XlnxZynqMPCANState *s)
461
+{
462
+ /* Reset all the configuration registers. */
463
+ register_reset(&s->reg_info[R_SOFTWARE_RESET_REGISTER]);
464
+ register_reset(&s->reg_info[R_MODE_SELECT_REGISTER]);
465
+ register_reset(
466
+ &s->reg_info[R_ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER]);
467
+ register_reset(&s->reg_info[R_ARBITRATION_PHASE_BIT_TIMING_REGISTER]);
468
+ register_reset(&s->reg_info[R_STATUS_REGISTER]);
469
+ register_reset(&s->reg_info[R_INTERRUPT_STATUS_REGISTER]);
470
+ register_reset(&s->reg_info[R_INTERRUPT_ENABLE_REGISTER]);
471
+ register_reset(&s->reg_info[R_INTERRUPT_CLEAR_REGISTER]);
472
+ register_reset(&s->reg_info[R_WIR]);
473
+}
474
+
475
+static void can_config_mode(XlnxZynqMPCANState *s)
476
+{
477
+ register_reset(&s->reg_info[R_ERROR_COUNTER_REGISTER]);
478
+ register_reset(&s->reg_info[R_ERROR_STATUS_REGISTER]);
479
+
480
+ /* Put XlnxZynqMPCAN in configuration mode. */
481
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, CONFIG, 1);
482
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, WKUP, 0);
483
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, SLP, 0);
484
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, BSOFF, 0);
485
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, ERROR, 0);
486
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXOFLW, 0);
487
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXOK, 0);
488
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, TXOK, 0);
489
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, ARBLST, 0);
490
+
491
+ can_update_irq(s);
492
+}
493
+
494
+static void update_status_register_mode_bits(XlnxZynqMPCANState *s)
495
+{
496
+ bool sleep_status = ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, SLEEP);
497
+ bool sleep_mode = ARRAY_FIELD_EX32(s->regs, MODE_SELECT_REGISTER, SLEEP);
498
+ /* Wake up interrupt bit. */
499
+ bool wakeup_irq_val = sleep_status && (sleep_mode == 0);
500
+ /* Sleep interrupt bit. */
501
+ bool sleep_irq_val = sleep_mode && (sleep_status == 0);
502
+
503
+ /* Clear previous core mode status bits. */
504
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, LBACK, 0);
505
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, SLEEP, 0);
506
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, SNOOP, 0);
507
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, NORMAL, 0);
508
+
509
+ /* set current mode bit and generate irqs accordingly. */
510
+ if (ARRAY_FIELD_EX32(s->regs, MODE_SELECT_REGISTER, LBACK)) {
511
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, LBACK, 1);
512
+ } else if (ARRAY_FIELD_EX32(s->regs, MODE_SELECT_REGISTER, SLEEP)) {
513
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, SLEEP, 1);
514
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, SLP,
515
+ sleep_irq_val);
516
+ } else if (ARRAY_FIELD_EX32(s->regs, MODE_SELECT_REGISTER, SNOOP)) {
517
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, SNOOP, 1);
518
+ } else {
519
+ /*
520
+ * If all bits are zero then XlnxZynqMPCAN is set in normal mode.
521
+ */
522
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, NORMAL, 1);
523
+ /* Set wakeup interrupt bit. */
524
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, WKUP,
525
+ wakeup_irq_val);
526
+ }
527
+
528
+ can_update_irq(s);
529
+}
530
+
531
+static void can_exit_sleep_mode(XlnxZynqMPCANState *s)
532
+{
533
+ ARRAY_FIELD_DP32(s->regs, MODE_SELECT_REGISTER, SLEEP, 0);
534
+ update_status_register_mode_bits(s);
535
+}
536
+
537
+static void generate_frame(qemu_can_frame *frame, uint32_t *data)
538
+{
539
+ frame->can_id = data[0];
540
+ frame->can_dlc = FIELD_EX32(data[1], TXFIFO_DLC, DLC);
541
+
542
+ frame->data[0] = FIELD_EX32(data[2], TXFIFO_DATA1, DB3);
543
+ frame->data[1] = FIELD_EX32(data[2], TXFIFO_DATA1, DB2);
544
+ frame->data[2] = FIELD_EX32(data[2], TXFIFO_DATA1, DB1);
545
+ frame->data[3] = FIELD_EX32(data[2], TXFIFO_DATA1, DB0);
546
+
547
+ frame->data[4] = FIELD_EX32(data[3], TXFIFO_DATA2, DB7);
548
+ frame->data[5] = FIELD_EX32(data[3], TXFIFO_DATA2, DB6);
549
+ frame->data[6] = FIELD_EX32(data[3], TXFIFO_DATA2, DB5);
550
+ frame->data[7] = FIELD_EX32(data[3], TXFIFO_DATA2, DB4);
551
+}
552
+
553
+static bool tx_ready_check(XlnxZynqMPCANState *s)
554
+{
555
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, SRST)) {
556
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
557
+
558
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Attempting to transfer data while"
559
+ " data while controller is in reset mode.\n",
560
+ path);
561
+ return false;
562
+ }
563
+
564
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN) == 0) {
565
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
566
+
567
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Attempting to transfer"
568
+ " data while controller is in configuration mode. Reset"
569
+ " the core so operations can start fresh.\n",
570
+ path);
571
+ return false;
572
+ }
573
+
574
+ if (ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, SNOOP)) {
575
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
576
+
577
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Attempting to transfer"
578
+ " data while controller is in SNOOP MODE.\n",
579
+ path);
580
+ return false;
581
+ }
582
+
583
+ return true;
584
+}
585
+
586
+static void transfer_fifo(XlnxZynqMPCANState *s, Fifo32 *fifo)
587
+{
588
+ qemu_can_frame frame;
589
+ uint32_t data[CAN_FRAME_SIZE];
590
+ int i;
591
+ bool can_tx = tx_ready_check(s);
592
+
593
+ if (!can_tx) {
594
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
595
+
596
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Controller is not enabled for data"
597
+ " transfer.\n", path);
598
+ can_update_irq(s);
599
+ return;
600
+ }
601
+
602
+ while (!fifo32_is_empty(fifo)) {
603
+ for (i = 0; i < CAN_FRAME_SIZE; i++) {
604
+ data[i] = fifo32_pop(fifo);
605
+ }
606
+
607
+ if (ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, LBACK)) {
608
+ /*
609
+ * Controller is in loopback. In Loopback mode, the CAN core
610
+ * transmits a recessive bitstream on to the XlnxZynqMPCAN Bus.
611
+ * Any message transmitted is looped back to the RX line and
612
+ * acknowledged. The XlnxZynqMPCAN core receives any message
613
+ * that it transmits.
614
+ */
615
+ if (fifo32_is_full(&s->rx_fifo)) {
616
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXOFLW, 1);
617
+ } else {
618
+ for (i = 0; i < CAN_FRAME_SIZE; i++) {
619
+ fifo32_push(&s->rx_fifo, data[i]);
620
+ }
621
+
622
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXOK, 1);
623
+ }
624
+ } else {
625
+ /* Normal mode Tx. */
626
+ generate_frame(&frame, data);
627
+
628
+ trace_xlnx_can_tx_data(frame.can_id, frame.can_dlc,
629
+ frame.data[0], frame.data[1],
630
+ frame.data[2], frame.data[3],
631
+ frame.data[4], frame.data[5],
632
+ frame.data[6], frame.data[7]);
633
+ can_bus_client_send(&s->bus_client, &frame, 1);
634
+ }
635
+ }
636
+
637
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, TXOK, 1);
638
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, TXBFLL, 0);
639
+
640
+ if (ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, SLEEP)) {
641
+ can_exit_sleep_mode(s);
642
+ }
643
+
644
+ can_update_irq(s);
645
+}
646
+
647
+static uint64_t can_srr_pre_write(RegisterInfo *reg, uint64_t val)
648
+{
649
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
650
+
651
+ ARRAY_FIELD_DP32(s->regs, SOFTWARE_RESET_REGISTER, CEN,
652
+ FIELD_EX32(val, SOFTWARE_RESET_REGISTER, CEN));
653
+
654
+ if (FIELD_EX32(val, SOFTWARE_RESET_REGISTER, SRST)) {
655
+ trace_xlnx_can_reset(val);
656
+
657
+ /* First, core will do software reset then will enter in config mode. */
658
+ can_config_reset(s);
659
+ }
660
+
661
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN) == 0) {
662
+ can_config_mode(s);
663
+ } else {
664
+ /*
665
+ * Leave config mode. Now XlnxZynqMPCAN core will enter normal,
666
+ * sleep, snoop or loopback mode depending upon LBACK, SLEEP, SNOOP
667
+ * register states.
668
+ */
669
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, CONFIG, 0);
670
+
671
+ ptimer_transaction_begin(s->can_timer);
672
+ ptimer_set_count(s->can_timer, 0);
673
+ ptimer_transaction_commit(s->can_timer);
674
+
675
+ /* XlnxZynqMPCAN is out of config mode. It will send pending data. */
676
+ transfer_fifo(s, &s->txhpb_fifo);
677
+ transfer_fifo(s, &s->tx_fifo);
678
+ }
679
+
680
+ update_status_register_mode_bits(s);
681
+
682
+ return s->regs[R_SOFTWARE_RESET_REGISTER];
683
+}
684
+
685
+static uint64_t can_msr_pre_write(RegisterInfo *reg, uint64_t val)
686
+{
687
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
688
+ uint8_t multi_mode;
689
+
690
+ /*
691
+ * Multiple mode set check. This is done to make sure user doesn't set
692
+ * multiple modes.
693
+ */
694
+ multi_mode = FIELD_EX32(val, MODE_SELECT_REGISTER, LBACK) +
695
+ FIELD_EX32(val, MODE_SELECT_REGISTER, SLEEP) +
696
+ FIELD_EX32(val, MODE_SELECT_REGISTER, SNOOP);
697
+
698
+ if (multi_mode > 1) {
699
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
700
+
701
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Attempting to config"
702
+ " several modes simultaneously. One mode will be selected"
703
+ " according to their priority: LBACK > SLEEP > SNOOP.\n",
704
+ path);
705
+ }
706
+
707
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN) == 0) {
708
+ /* We are in configuration mode, any mode can be selected. */
709
+ s->regs[R_MODE_SELECT_REGISTER] = val;
710
+ } else {
711
+ bool sleep_mode_bit = FIELD_EX32(val, MODE_SELECT_REGISTER, SLEEP);
712
+
713
+ ARRAY_FIELD_DP32(s->regs, MODE_SELECT_REGISTER, SLEEP, sleep_mode_bit);
714
+
715
+ if (FIELD_EX32(val, MODE_SELECT_REGISTER, LBACK)) {
716
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
717
+
718
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Attempting to set"
719
+ " LBACK mode without setting CEN bit as 0.\n",
720
+ path);
721
+ } else if (FIELD_EX32(val, MODE_SELECT_REGISTER, SNOOP)) {
722
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
723
+
724
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Attempting to set"
725
+ " SNOOP mode without setting CEN bit as 0.\n",
726
+ path);
727
+ }
728
+
729
+ update_status_register_mode_bits(s);
730
+ }
731
+
732
+ return s->regs[R_MODE_SELECT_REGISTER];
733
+}
734
+
735
+static uint64_t can_brpr_pre_write(RegisterInfo *reg, uint64_t val)
736
+{
737
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
738
+
739
+ /* Only allow writes when in config mode. */
740
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN)) {
741
+ return s->regs[R_ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER];
742
+ }
743
+
744
+ return val;
745
+}
746
+
747
+static uint64_t can_btr_pre_write(RegisterInfo *reg, uint64_t val)
748
+{
749
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
750
+
751
+ /* Only allow writes when in config mode. */
752
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN)) {
753
+ return s->regs[R_ARBITRATION_PHASE_BIT_TIMING_REGISTER];
754
+ }
755
+
756
+ return val;
757
+}
758
+
759
+static uint64_t can_tcr_pre_write(RegisterInfo *reg, uint64_t val)
760
+{
761
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
762
+
763
+ if (FIELD_EX32(val, TIMESTAMP_REGISTER, CTS)) {
764
+ ptimer_transaction_begin(s->can_timer);
765
+ ptimer_set_count(s->can_timer, 0);
766
+ ptimer_transaction_commit(s->can_timer);
767
+ }
768
+
769
+ return 0;
770
+}
771
+
772
+static void update_rx_fifo(XlnxZynqMPCANState *s, const qemu_can_frame *frame)
773
+{
774
+ bool filter_pass = false;
775
+ uint16_t timestamp = 0;
776
+
777
+ /* If no filter is enabled. Message will be stored in FIFO. */
778
+ if (!((ARRAY_FIELD_EX32(s->regs, AFR, UAF1)) |
779
+ (ARRAY_FIELD_EX32(s->regs, AFR, UAF2)) |
780
+ (ARRAY_FIELD_EX32(s->regs, AFR, UAF3)) |
781
+ (ARRAY_FIELD_EX32(s->regs, AFR, UAF4)))) {
782
+ filter_pass = true;
783
+ }
784
+
785
+ /*
786
+ * Messages that pass any of the acceptance filters will be stored in
787
+ * the RX FIFO.
788
+ */
789
+ if (ARRAY_FIELD_EX32(s->regs, AFR, UAF1)) {
790
+ uint32_t id_masked = s->regs[R_AFMR1] & frame->can_id;
791
+ uint32_t filter_id_masked = s->regs[R_AFMR1] & s->regs[R_AFIR1];
792
+
793
+ if (filter_id_masked == id_masked) {
794
+ filter_pass = true;
795
+ }
796
+ }
797
+
798
+ if (ARRAY_FIELD_EX32(s->regs, AFR, UAF2)) {
799
+ uint32_t id_masked = s->regs[R_AFMR2] & frame->can_id;
800
+ uint32_t filter_id_masked = s->regs[R_AFMR2] & s->regs[R_AFIR2];
801
+
802
+ if (filter_id_masked == id_masked) {
803
+ filter_pass = true;
804
+ }
805
+ }
806
+
807
+ if (ARRAY_FIELD_EX32(s->regs, AFR, UAF3)) {
808
+ uint32_t id_masked = s->regs[R_AFMR3] & frame->can_id;
809
+ uint32_t filter_id_masked = s->regs[R_AFMR3] & s->regs[R_AFIR3];
810
+
811
+ if (filter_id_masked == id_masked) {
812
+ filter_pass = true;
813
+ }
814
+ }
815
+
816
+ if (ARRAY_FIELD_EX32(s->regs, AFR, UAF4)) {
817
+ uint32_t id_masked = s->regs[R_AFMR4] & frame->can_id;
818
+ uint32_t filter_id_masked = s->regs[R_AFMR4] & s->regs[R_AFIR4];
819
+
820
+ if (filter_id_masked == id_masked) {
821
+ filter_pass = true;
822
+ }
823
+ }
824
+
825
+ if (!filter_pass) {
826
+ trace_xlnx_can_rx_fifo_filter_reject(frame->can_id, frame->can_dlc);
827
+ return;
828
+ }
829
+
830
+ /* Store the message in fifo if it passed through any of the filters. */
831
+ if (filter_pass && frame->can_dlc <= MAX_DLC) {
832
+
833
+ if (fifo32_is_full(&s->rx_fifo)) {
834
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXOFLW, 1);
835
+ } else {
836
+ timestamp = CAN_TIMER_MAX - ptimer_get_count(s->can_timer);
837
+
838
+ fifo32_push(&s->rx_fifo, frame->can_id);
839
+
840
+ fifo32_push(&s->rx_fifo, deposit32(0, R_RXFIFO_DLC_DLC_SHIFT,
841
+ R_RXFIFO_DLC_DLC_LENGTH,
842
+ frame->can_dlc) |
843
+ deposit32(0, R_RXFIFO_DLC_RXT_SHIFT,
844
+ R_RXFIFO_DLC_RXT_LENGTH,
845
+ timestamp));
846
+
847
+ /* First 32 bit of the data. */
848
+ fifo32_push(&s->rx_fifo, deposit32(0, R_TXFIFO_DATA1_DB3_SHIFT,
849
+ R_TXFIFO_DATA1_DB3_LENGTH,
850
+ frame->data[0]) |
851
+ deposit32(0, R_TXFIFO_DATA1_DB2_SHIFT,
852
+ R_TXFIFO_DATA1_DB2_LENGTH,
853
+ frame->data[1]) |
854
+ deposit32(0, R_TXFIFO_DATA1_DB1_SHIFT,
855
+ R_TXFIFO_DATA1_DB1_LENGTH,
856
+ frame->data[2]) |
857
+ deposit32(0, R_TXFIFO_DATA1_DB0_SHIFT,
858
+ R_TXFIFO_DATA1_DB0_LENGTH,
859
+ frame->data[3]));
860
+ /* Last 32 bit of the data. */
861
+ fifo32_push(&s->rx_fifo, deposit32(0, R_TXFIFO_DATA2_DB7_SHIFT,
862
+ R_TXFIFO_DATA2_DB7_LENGTH,
863
+ frame->data[4]) |
864
+ deposit32(0, R_TXFIFO_DATA2_DB6_SHIFT,
865
+ R_TXFIFO_DATA2_DB6_LENGTH,
866
+ frame->data[5]) |
867
+ deposit32(0, R_TXFIFO_DATA2_DB5_SHIFT,
868
+ R_TXFIFO_DATA2_DB5_LENGTH,
869
+ frame->data[6]) |
870
+ deposit32(0, R_TXFIFO_DATA2_DB4_SHIFT,
871
+ R_TXFIFO_DATA2_DB4_LENGTH,
872
+ frame->data[7]));
873
+
874
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXOK, 1);
875
+ trace_xlnx_can_rx_data(frame->can_id, frame->can_dlc,
876
+ frame->data[0], frame->data[1],
877
+ frame->data[2], frame->data[3],
878
+ frame->data[4], frame->data[5],
879
+ frame->data[6], frame->data[7]);
880
+ }
881
+
882
+ can_update_irq(s);
883
+ }
884
+}
885
+
886
+static uint64_t can_rxfifo_pre_read(RegisterInfo *reg, uint64_t val)
887
+{
888
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
889
+
890
+ if (!fifo32_is_empty(&s->rx_fifo)) {
891
+ val = fifo32_pop(&s->rx_fifo);
892
+ } else {
893
+ ARRAY_FIELD_DP32(s->regs, INTERRUPT_STATUS_REGISTER, RXUFLW, 1);
894
+ }
895
+
896
+ can_update_irq(s);
897
+ return val;
898
+}
899
+
900
+static void can_filter_enable_post_write(RegisterInfo *reg, uint64_t val)
901
+{
902
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
903
+
904
+ if (ARRAY_FIELD_EX32(s->regs, AFR, UAF1) &&
905
+ ARRAY_FIELD_EX32(s->regs, AFR, UAF2) &&
906
+ ARRAY_FIELD_EX32(s->regs, AFR, UAF3) &&
907
+ ARRAY_FIELD_EX32(s->regs, AFR, UAF4)) {
908
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, ACFBSY, 1);
909
+ } else {
910
+ ARRAY_FIELD_DP32(s->regs, STATUS_REGISTER, ACFBSY, 0);
911
+ }
912
+}
913
+
914
+static uint64_t can_filter_mask_pre_write(RegisterInfo *reg, uint64_t val)
915
+{
916
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
917
+ uint32_t reg_idx = (reg->access->addr) / 4;
918
+ uint32_t filter_number = (reg_idx - R_AFMR1) / 2;
919
+
920
+ /* modify an acceptance filter, the corresponding UAF bit should be '0'. */
921
+ if (!(s->regs[R_AFR] & (1 << filter_number))) {
922
+ s->regs[reg_idx] = val;
923
+
924
+ trace_xlnx_can_filter_mask_pre_write(filter_number, s->regs[reg_idx]);
925
+ } else {
926
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
927
+
928
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Acceptance filter %d"
929
+ " mask is not set as corresponding UAF bit is not 0.\n",
930
+ path, filter_number + 1);
931
+ }
932
+
933
+ return s->regs[reg_idx];
934
+}
935
+
936
+static uint64_t can_filter_id_pre_write(RegisterInfo *reg, uint64_t val)
937
+{
938
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
939
+ uint32_t reg_idx = (reg->access->addr) / 4;
940
+ uint32_t filter_number = (reg_idx - R_AFIR1) / 2;
941
+
942
+ if (!(s->regs[R_AFR] & (1 << filter_number))) {
943
+ s->regs[reg_idx] = val;
944
+
945
+ trace_xlnx_can_filter_id_pre_write(filter_number, s->regs[reg_idx]);
946
+ } else {
947
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
948
+
949
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Acceptance filter %d"
950
+ " id is not set as corresponding UAF bit is not 0.\n",
951
+ path, filter_number + 1);
952
+ }
953
+
954
+ return s->regs[reg_idx];
955
+}
956
+
957
+static void can_tx_post_write(RegisterInfo *reg, uint64_t val)
958
+{
959
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(reg->opaque);
960
+
961
+ bool is_txhpb = reg->access->addr > A_TXFIFO_DATA2;
962
+
963
+ bool initiate_transfer = (reg->access->addr == A_TXFIFO_DATA2) ||
964
+ (reg->access->addr == A_TXHPB_DATA2);
965
+
966
+ Fifo32 *f = is_txhpb ? &s->txhpb_fifo : &s->tx_fifo;
967
+
968
+ if (!fifo32_is_full(f)) {
969
+ fifo32_push(f, val);
970
+ } else {
971
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
972
+
973
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: TX FIFO is full.\n", path);
974
+ }
975
+
976
+ /* Initiate the message send if TX register is written. */
977
+ if (initiate_transfer &&
978
+ ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN)) {
979
+ transfer_fifo(s, f);
980
+ }
981
+
982
+ can_update_irq(s);
983
+}
984
+
985
+static const RegisterAccessInfo can_regs_info[] = {
986
+ { .name = "SOFTWARE_RESET_REGISTER",
987
+ .addr = A_SOFTWARE_RESET_REGISTER,
988
+ .rsvd = 0xfffffffc,
989
+ .pre_write = can_srr_pre_write,
990
+ },{ .name = "MODE_SELECT_REGISTER",
991
+ .addr = A_MODE_SELECT_REGISTER,
992
+ .rsvd = 0xfffffff8,
993
+ .pre_write = can_msr_pre_write,
994
+ },{ .name = "ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER",
995
+ .addr = A_ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER,
996
+ .rsvd = 0xffffff00,
997
+ .pre_write = can_brpr_pre_write,
998
+ },{ .name = "ARBITRATION_PHASE_BIT_TIMING_REGISTER",
999
+ .addr = A_ARBITRATION_PHASE_BIT_TIMING_REGISTER,
1000
+ .rsvd = 0xfffffe00,
1001
+ .pre_write = can_btr_pre_write,
1002
+ },{ .name = "ERROR_COUNTER_REGISTER",
1003
+ .addr = A_ERROR_COUNTER_REGISTER,
1004
+ .rsvd = 0xffff0000,
1005
+ .ro = 0xffffffff,
1006
+ },{ .name = "ERROR_STATUS_REGISTER",
1007
+ .addr = A_ERROR_STATUS_REGISTER,
1008
+ .rsvd = 0xffffffe0,
1009
+ .w1c = 0x1f,
1010
+ },{ .name = "STATUS_REGISTER", .addr = A_STATUS_REGISTER,
1011
+ .reset = 0x1,
1012
+ .rsvd = 0xffffe000,
1013
+ .ro = 0x1fff,
1014
+ },{ .name = "INTERRUPT_STATUS_REGISTER",
1015
+ .addr = A_INTERRUPT_STATUS_REGISTER,
1016
+ .reset = 0x6000,
1017
+ .rsvd = 0xffff8000,
1018
+ .ro = 0x7fff,
1019
+ },{ .name = "INTERRUPT_ENABLE_REGISTER",
1020
+ .addr = A_INTERRUPT_ENABLE_REGISTER,
1021
+ .rsvd = 0xffff8000,
1022
+ .post_write = can_ier_post_write,
1023
+ },{ .name = "INTERRUPT_CLEAR_REGISTER",
1024
+ .addr = A_INTERRUPT_CLEAR_REGISTER,
1025
+ .rsvd = 0xffff8000,
1026
+ .pre_write = can_icr_pre_write,
1027
+ },{ .name = "TIMESTAMP_REGISTER",
1028
+ .addr = A_TIMESTAMP_REGISTER,
1029
+ .rsvd = 0xfffffffe,
1030
+ .pre_write = can_tcr_pre_write,
1031
+ },{ .name = "WIR", .addr = A_WIR,
1032
+ .reset = 0x3f3f,
1033
+ .rsvd = 0xffff0000,
1034
+ },{ .name = "TXFIFO_ID", .addr = A_TXFIFO_ID,
1035
+ .post_write = can_tx_post_write,
1036
+ },{ .name = "TXFIFO_DLC", .addr = A_TXFIFO_DLC,
1037
+ .rsvd = 0xfffffff,
1038
+ .post_write = can_tx_post_write,
1039
+ },{ .name = "TXFIFO_DATA1", .addr = A_TXFIFO_DATA1,
1040
+ .post_write = can_tx_post_write,
1041
+ },{ .name = "TXFIFO_DATA2", .addr = A_TXFIFO_DATA2,
1042
+ .post_write = can_tx_post_write,
1043
+ },{ .name = "TXHPB_ID", .addr = A_TXHPB_ID,
1044
+ .post_write = can_tx_post_write,
1045
+ },{ .name = "TXHPB_DLC", .addr = A_TXHPB_DLC,
1046
+ .rsvd = 0xfffffff,
1047
+ .post_write = can_tx_post_write,
1048
+ },{ .name = "TXHPB_DATA1", .addr = A_TXHPB_DATA1,
1049
+ .post_write = can_tx_post_write,
1050
+ },{ .name = "TXHPB_DATA2", .addr = A_TXHPB_DATA2,
1051
+ .post_write = can_tx_post_write,
1052
+ },{ .name = "RXFIFO_ID", .addr = A_RXFIFO_ID,
1053
+ .ro = 0xffffffff,
1054
+ .post_read = can_rxfifo_pre_read,
1055
+ },{ .name = "RXFIFO_DLC", .addr = A_RXFIFO_DLC,
1056
+ .rsvd = 0xfff0000,
1057
+ .post_read = can_rxfifo_pre_read,
1058
+ },{ .name = "RXFIFO_DATA1", .addr = A_RXFIFO_DATA1,
1059
+ .post_read = can_rxfifo_pre_read,
1060
+ },{ .name = "RXFIFO_DATA2", .addr = A_RXFIFO_DATA2,
1061
+ .post_read = can_rxfifo_pre_read,
1062
+ },{ .name = "AFR", .addr = A_AFR,
1063
+ .rsvd = 0xfffffff0,
1064
+ .post_write = can_filter_enable_post_write,
1065
+ },{ .name = "AFMR1", .addr = A_AFMR1,
1066
+ .pre_write = can_filter_mask_pre_write,
1067
+ },{ .name = "AFIR1", .addr = A_AFIR1,
1068
+ .pre_write = can_filter_id_pre_write,
1069
+ },{ .name = "AFMR2", .addr = A_AFMR2,
1070
+ .pre_write = can_filter_mask_pre_write,
1071
+ },{ .name = "AFIR2", .addr = A_AFIR2,
1072
+ .pre_write = can_filter_id_pre_write,
1073
+ },{ .name = "AFMR3", .addr = A_AFMR3,
1074
+ .pre_write = can_filter_mask_pre_write,
1075
+ },{ .name = "AFIR3", .addr = A_AFIR3,
1076
+ .pre_write = can_filter_id_pre_write,
1077
+ },{ .name = "AFMR4", .addr = A_AFMR4,
1078
+ .pre_write = can_filter_mask_pre_write,
1079
+ },{ .name = "AFIR4", .addr = A_AFIR4,
1080
+ .pre_write = can_filter_id_pre_write,
1081
+ }
58
+};
1082
+};
59
+
1083
+
60
+static uint64_t sbsa_ec_read(void *opaque, hwaddr offset, unsigned size)
1084
+static void xlnx_zynqmp_can_ptimer_cb(void *opaque)
61
+{
1085
+{
62
+ /* No use for this currently */
1086
+ /* No action required on the timer rollover. */
63
+ qemu_log_mask(LOG_GUEST_ERROR, "sbsa-ec: no readable registers");
1087
+}
1088
+
1089
+static const MemoryRegionOps can_ops = {
1090
+ .read = register_read_memory,
1091
+ .write = register_write_memory,
1092
+ .endianness = DEVICE_LITTLE_ENDIAN,
1093
+ .valid = {
1094
+ .min_access_size = 4,
1095
+ .max_access_size = 4,
1096
+ },
1097
+};
1098
+
1099
+static void xlnx_zynqmp_can_reset_init(Object *obj, ResetType type)
1100
+{
1101
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(obj);
1102
+ unsigned int i;
1103
+
1104
+ for (i = R_RXFIFO_ID; i < ARRAY_SIZE(s->reg_info); ++i) {
1105
+ register_reset(&s->reg_info[i]);
1106
+ }
1107
+
1108
+ ptimer_transaction_begin(s->can_timer);
1109
+ ptimer_set_count(s->can_timer, 0);
1110
+ ptimer_transaction_commit(s->can_timer);
1111
+}
1112
+
1113
+static void xlnx_zynqmp_can_reset_hold(Object *obj)
1114
+{
1115
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(obj);
1116
+ unsigned int i;
1117
+
1118
+ for (i = 0; i < R_RXFIFO_ID; ++i) {
1119
+ register_reset(&s->reg_info[i]);
1120
+ }
1121
+
1122
+ /*
1123
+ * Reset the FIFOs when the CAN model is reset. This clears any FIFO writes
1124
+ * done by post_write, which gets called from the register_reset function;
1125
+ * the post_write handler cannot trigger a transmission, because the CAN is
1126
+ * already disabled when software_reset_register is cleared first.
1127
+ */
1128
+ fifo32_reset(&s->rx_fifo);
1129
+ fifo32_reset(&s->tx_fifo);
1130
+ fifo32_reset(&s->txhpb_fifo);
1131
+}
1132
+
1133
+static bool xlnx_zynqmp_can_can_receive(CanBusClientState *client)
1134
+{
1135
+ XlnxZynqMPCANState *s = container_of(client, XlnxZynqMPCANState,
1136
+ bus_client);
1137
+
1138
+ if (ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, SRST)) {
1139
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
1140
+
1141
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Controller is in reset state.\n",
1142
+ path);
1143
+ return false;
1144
+ }
1145
+
1146
+ if ((ARRAY_FIELD_EX32(s->regs, SOFTWARE_RESET_REGISTER, CEN)) == 0) {
1147
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
1148
+
1149
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Controller is disabled. Incoming"
1150
+ " messages will be discarded.\n", path);
1151
+ return false;
1152
+ }
1153
+
1154
+ return true;
1155
+}
1156
+
1157
+static ssize_t xlnx_zynqmp_can_receive(CanBusClientState *client,
1158
+ const qemu_can_frame *buf, size_t buf_size) {
1159
+ XlnxZynqMPCANState *s = container_of(client, XlnxZynqMPCANState,
1160
+ bus_client);
1161
+ const qemu_can_frame *frame = buf;
1162
+
1163
+ if (buf_size <= 0) {
1164
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
1165
+
1166
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Error in the data received.\n",
1167
+ path);
1168
+ return 0;
1169
+ }
1170
+
1171
+ if (ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, SNOOP)) {
1172
+ /* Snoop mode: just keep the data; no response is sent back. */
1173
+ update_rx_fifo(s, frame);
1174
+ } else if ((ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, SLEEP))) {
1175
+ /*
1176
+ * XlnxZynqMPCAN is in sleep mode. Any data on the bus will bring it to the wake
1177
+ * up state.
1178
+ */
1179
+ can_exit_sleep_mode(s);
1180
+ update_rx_fifo(s, frame);
1181
+ } else if ((ARRAY_FIELD_EX32(s->regs, STATUS_REGISTER, SLEEP)) == 0) {
1182
+ update_rx_fifo(s, frame);
1183
+ } else {
1184
+ /*
1185
+ * XlnxZynqMPCAN will not participate in normal bus communication
1186
+ * and will not receive any messages transmitted by other CAN nodes.
1187
+ */
1188
+ trace_xlnx_can_rx_discard(s->regs[R_STATUS_REGISTER]);
1189
+ }
1190
+
1191
+ return 1;
1192
+}
1193
+
1194
+static CanBusClientInfo can_xilinx_bus_client_info = {
1195
+ .can_receive = xlnx_zynqmp_can_can_receive,
1196
+ .receive = xlnx_zynqmp_can_receive,
1197
+};
1198
+
1199
+static int xlnx_zynqmp_can_connect_to_bus(XlnxZynqMPCANState *s,
1200
+ CanBusState *bus)
1201
+{
1202
+ s->bus_client.info = &can_xilinx_bus_client_info;
1203
+
1204
+ if (can_bus_insert_client(bus, &s->bus_client) < 0) {
1205
+ return -1;
1206
+ }
64
+ return 0;
1207
+ return 0;
65
+}
1208
+}
66
+
1209
+
67
+static void sbsa_ec_write(void *opaque, hwaddr offset,
1210
+static void xlnx_zynqmp_can_realize(DeviceState *dev, Error **errp)
68
+ uint64_t value, unsigned size)
1211
+{
69
+{
1212
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(dev);
70
+ if (offset == 0) { /* PSCI machine power command register */
1213
+
71
+ switch (value) {
1214
+ if (s->canbus) {
72
+ case SBSA_EC_CMD_POWEROFF:
1215
+ if (xlnx_zynqmp_can_connect_to_bus(s, s->canbus) < 0) {
73
+ qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
1216
+ g_autofree char *path = object_get_canonical_path(OBJECT(s));
74
+ break;
1217
+
75
+ case SBSA_EC_CMD_REBOOT:
1218
+ error_setg(errp, "%s: xlnx_zynqmp_can_connect_to_bus"
76
+ qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
1219
+ " failed.", path);
77
+ break;
1220
+ return;
78
+ default:
79
+ qemu_log_mask(LOG_GUEST_ERROR,
80
+ "sbsa-ec: unknown power command");
81
+ }
1221
+ }
82
+ } else {
1222
+ }
83
+ qemu_log_mask(LOG_GUEST_ERROR, "sbsa-ec: unknown EC register");
1223
+
84
+ }
1224
+ /* Create RX FIFO, TXFIFO, TXHPB storage. */
85
+}
1225
+ fifo32_create(&s->rx_fifo, RXFIFO_SIZE);
86
+
1226
+ fifo32_create(&s->tx_fifo, RXFIFO_SIZE);
87
+static const MemoryRegionOps sbsa_ec_ops = {
1227
+ fifo32_create(&s->txhpb_fifo, CAN_FRAME_SIZE);
88
+ .read = sbsa_ec_read,
1228
+
89
+ .write = sbsa_ec_write,
1229
+ /* Allocate a new timer. */
90
+ .endianness = DEVICE_NATIVE_ENDIAN,
1230
+ s->can_timer = ptimer_init(xlnx_zynqmp_can_ptimer_cb, s,
91
+ .valid.min_access_size = 4,
1231
+ PTIMER_POLICY_DEFAULT);
92
+ .valid.max_access_size = 4,
1232
+
1233
+ ptimer_transaction_begin(s->can_timer);
1234
+
1235
+ ptimer_set_freq(s->can_timer, s->cfg.ext_clk_freq);
1236
+ ptimer_set_limit(s->can_timer, CAN_TIMER_MAX, 1);
1237
+ ptimer_run(s->can_timer, 0);
1238
+ ptimer_transaction_commit(s->can_timer);
1239
+}
1240
+
1241
+static void xlnx_zynqmp_can_init(Object *obj)
1242
+{
1243
+ XlnxZynqMPCANState *s = XLNX_ZYNQMP_CAN(obj);
1244
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
1245
+
1246
+ RegisterInfoArray *reg_array;
1247
+
1248
+ memory_region_init(&s->iomem, obj, TYPE_XLNX_ZYNQMP_CAN,
1249
+ XLNX_ZYNQMP_CAN_R_MAX * 4);
1250
+ reg_array = register_init_block32(DEVICE(obj), can_regs_info,
1251
+ ARRAY_SIZE(can_regs_info),
1252
+ s->reg_info, s->regs,
1253
+ &can_ops,
1254
+ XLNX_ZYNQMP_CAN_ERR_DEBUG,
1255
+ XLNX_ZYNQMP_CAN_R_MAX * 4);
1256
+
1257
+ memory_region_add_subregion(&s->iomem, 0x00, &reg_array->mem);
1258
+ sysbus_init_mmio(sbd, &s->iomem);
1259
+ sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->irq);
1260
+}
1261
+
1262
+static const VMStateDescription vmstate_can = {
1263
+ .name = TYPE_XLNX_ZYNQMP_CAN,
1264
+ .version_id = 1,
1265
+ .minimum_version_id = 1,
1266
+ .fields = (VMStateField[]) {
1267
+ VMSTATE_FIFO32(rx_fifo, XlnxZynqMPCANState),
1268
+ VMSTATE_FIFO32(tx_fifo, XlnxZynqMPCANState),
1269
+ VMSTATE_FIFO32(txhpb_fifo, XlnxZynqMPCANState),
1270
+ VMSTATE_UINT32_ARRAY(regs, XlnxZynqMPCANState, XLNX_ZYNQMP_CAN_R_MAX),
1271
+ VMSTATE_PTIMER(can_timer, XlnxZynqMPCANState),
1272
+ VMSTATE_END_OF_LIST(),
1273
+ }
93
+};
1274
+};
94
+
1275
+
95
+static void sbsa_ec_init(Object *obj)
1276
+static Property xlnx_zynqmp_can_properties[] = {
96
+{
1277
+ DEFINE_PROP_UINT32("ext_clk_freq", XlnxZynqMPCANState, cfg.ext_clk_freq,
97
+ SECUREECState *s = SECURE_EC(obj);
1278
+ CAN_DEFAULT_CLOCK),
98
+ SysBusDevice *dev = SYS_BUS_DEVICE(obj);
1279
+ DEFINE_PROP_LINK("canbus", XlnxZynqMPCANState, canbus, TYPE_CAN_BUS,
99
+
1280
+ CanBusState *),
100
+ memory_region_init_io(&s->iomem, obj, &sbsa_ec_ops, s, "sbsa-ec",
1281
+ DEFINE_PROP_END_OF_LIST(),
101
+ 0x1000);
1282
+};
102
+ sysbus_init_mmio(dev, &s->iomem);
1283
+
103
+}
1284
+static void xlnx_zynqmp_can_class_init(ObjectClass *klass, void *data)
104
+
105
+static void sbsa_ec_class_init(ObjectClass *klass, void *data)
106
+{
1285
+{
107
+ DeviceClass *dc = DEVICE_CLASS(klass);
1286
+ DeviceClass *dc = DEVICE_CLASS(klass);
108
+
1287
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
109
+ /* No vmstate or reset required: device has no internal state */
1288
+
110
+ dc->user_creatable = false;
1289
+ rc->phases.enter = xlnx_zynqmp_can_reset_init;
111
+}
1290
+ rc->phases.hold = xlnx_zynqmp_can_reset_hold;
112
+
1291
+ dc->realize = xlnx_zynqmp_can_realize;
113
+static const TypeInfo sbsa_ec_info = {
1292
+ device_class_set_props(dc, xlnx_zynqmp_can_properties);
114
+ .name = TYPE_SBSA_EC,
1293
+ dc->vmsd = &vmstate_can;
1294
+}
1295
+
1296
+static const TypeInfo can_info = {
1297
+ .name = TYPE_XLNX_ZYNQMP_CAN,
115
+ .parent = TYPE_SYS_BUS_DEVICE,
1298
+ .parent = TYPE_SYS_BUS_DEVICE,
116
+ .instance_size = sizeof(SECUREECState),
1299
+ .instance_size = sizeof(XlnxZynqMPCANState),
117
+ .instance_init = sbsa_ec_init,
1300
+ .class_init = xlnx_zynqmp_can_class_init,
118
+ .class_init = sbsa_ec_class_init,
1301
+ .instance_init = xlnx_zynqmp_can_init,
119
+};
1302
+};
120
+
1303
+
121
+static void sbsa_ec_register_type(void)
1304
+static void can_register_types(void)
122
+{
1305
+{
123
+ type_register_static(&sbsa_ec_info);
1306
+ type_register_static(&can_info);
124
+}
1307
+}
125
+
1308
+
126
+type_init(sbsa_ec_register_type);
1309
+type_init(can_register_types)
127
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
1310
diff --git a/hw/Kconfig b/hw/Kconfig
128
index XXXXXXX..XXXXXXX 100644
1311
index XXXXXXX..XXXXXXX 100644
129
--- a/hw/misc/meson.build
1312
--- a/hw/Kconfig
130
+++ b/hw/misc/meson.build
1313
+++ b/hw/Kconfig
131
@@ -XXX,XX +XXX,XX @@ specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
1314
@@ -XXX,XX +XXX,XX @@ config XILINX_AXI
132
1315
config XLNX_ZYNQMP
133
specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
1316
bool
134
specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
1317
select REGISTER
135
+
1318
+ select CAN_BUS
136
+specific_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
1319
diff --git a/hw/net/can/meson.build b/hw/net/can/meson.build
1320
index XXXXXXX..XXXXXXX 100644
1321
--- a/hw/net/can/meson.build
1322
+++ b/hw/net/can/meson.build
1323
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_CAN_PCI', if_true: files('can_pcm3680_pci.c'))
1324
softmmu_ss.add(when: 'CONFIG_CAN_PCI', if_true: files('can_mioe3680_pci.c'))
1325
softmmu_ss.add(when: 'CONFIG_CAN_CTUCANFD', if_true: files('ctucan_core.c'))
1326
softmmu_ss.add(when: 'CONFIG_CAN_CTUCANFD_PCI', if_true: files('ctucan_pci.c'))
1327
+softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP', if_true: files('xlnx-zynqmp-can.c'))
1328
diff --git a/hw/net/can/trace-events b/hw/net/can/trace-events
1329
new file mode 100644
1330
index XXXXXXX..XXXXXXX
1331
--- /dev/null
1332
+++ b/hw/net/can/trace-events
1333
@@ -XXX,XX +XXX,XX @@
1334
+# xlnx-zynqmp-can.c
1335
+xlnx_can_update_irq(uint32_t isr, uint32_t ier, uint32_t irq) "ISR: 0x%08x IER: 0x%08x IRQ: 0x%08x"
1336
+xlnx_can_reset(uint32_t val) "Resetting controller with value = 0x%08x"
1337
+xlnx_can_rx_fifo_filter_reject(uint32_t id, uint8_t dlc) "Frame: ID: 0x%08x DLC: 0x%02x"
1338
+xlnx_can_filter_id_pre_write(uint8_t filter_num, uint32_t value) "Filter%d ID: 0x%08x"
1339
+xlnx_can_filter_mask_pre_write(uint8_t filter_num, uint32_t value) "Filter%d MASK: 0x%08x"
1340
+xlnx_can_tx_data(uint32_t id, uint8_t dlc, uint8_t db0, uint8_t db1, uint8_t db2, uint8_t db3, uint8_t db4, uint8_t db5, uint8_t db6, uint8_t db7) "Frame: ID: 0x%08x DLC: 0x%02x DATA: 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x"
1341
+xlnx_can_rx_data(uint32_t id, uint32_t dlc, uint8_t db0, uint8_t db1, uint8_t db2, uint8_t db3, uint8_t db4, uint8_t db5, uint8_t db6, uint8_t db7) "Frame: ID: 0x%08x DLC: 0x%02x DATA: 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x 0x%02x"
1342
+xlnx_can_rx_discard(uint32_t status) "Controller is not enabled for bus communication. Status Register: 0x%08x"
137
--
1343
--
138
2.20.1
1344
2.20.1
139
1345
140
1346
1
Convert the Neon VCVT float<->fixed-point insns to a
1
From: Vikram Garhwal <fnu.vikram@xilinx.com>
2
gvec style, in preparation for adding fp16 support.
3
2
3
Connect CAN0 and CAN1 on the ZynqMP.
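As an aside (not part of this patch): with the "canbus0"/"canbus1" link
properties added below, a single CAN bus backend can be created and both
controllers attached to it. The options are the ones the qtest later in this
series uses, shown here only as a sketch:

    QTestState *qts = qtest_init("-machine xlnx-zcu102"
                                 " -object can-bus,id=canbus0"
                                 " -machine xlnx-zcu102.canbus0=canbus0"
                                 " -machine xlnx-zcu102.canbus1=canbus0");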
4
5
Reviewed-by: Francisco Iglesias <francisco.iglesias@xilinx.com>
6
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
7
Signed-off-by: Vikram Garhwal <fnu.vikram@xilinx.com>
8
Message-id: 1605728926-352690-3-git-send-email-fnu.vikram@xilinx.com
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-38-peter.maydell@linaro.org
7
---
10
---
8
target/arm/helper.h | 5 +++++
11
include/hw/arm/xlnx-zynqmp.h | 8 ++++++++
9
target/arm/vec_helper.c | 20 +++++++++++++++++++
12
hw/arm/xlnx-zcu102.c | 20 ++++++++++++++++++++
10
target/arm/translate-neon.c.inc | 35 +++++++++++++++++----------------
13
hw/arm/xlnx-zynqmp.c | 34 ++++++++++++++++++++++++++++++++++
11
3 files changed, 43 insertions(+), 17 deletions(-)
14
3 files changed, 62 insertions(+)
12
15
13
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
14
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
18
--- a/include/hw/arm/xlnx-zynqmp.h
16
+++ b/target/arm/helper.h
19
+++ b/include/hw/arm/xlnx-zynqmp.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_tosizs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@
18
DEF_HELPER_FLAGS_4(gvec_touszh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
#include "hw/intc/arm_gic.h"
19
DEF_HELPER_FLAGS_4(gvec_touizs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
#include "hw/net/cadence_gem.h"
20
23
#include "hw/char/cadence_uart.h"
21
+DEF_HELPER_FLAGS_4(gvec_vcvt_sf, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+#include "hw/net/xlnx-zynqmp-can.h"
22
+DEF_HELPER_FLAGS_4(gvec_vcvt_uf, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
#include "hw/ide/ahci.h"
23
+DEF_HELPER_FLAGS_4(gvec_vcvt_fs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
#include "hw/sd/sdhci.h"
24
+DEF_HELPER_FLAGS_4(gvec_vcvt_fu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
#include "hw/ssi/xilinx_spips.h"
28
@@ -XXX,XX +XXX,XX @@
29
#include "hw/cpu/cluster.h"
30
#include "target/arm/cpu.h"
31
#include "qom/object.h"
32
+#include "net/can_emu.h"
33
34
#define TYPE_XLNX_ZYNQMP "xlnx,zynqmp"
35
OBJECT_DECLARE_SIMPLE_TYPE(XlnxZynqMPState, XLNX_ZYNQMP)
36
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(XlnxZynqMPState, XLNX_ZYNQMP)
37
#define XLNX_ZYNQMP_NUM_RPU_CPUS 2
38
#define XLNX_ZYNQMP_NUM_GEMS 4
39
#define XLNX_ZYNQMP_NUM_UARTS 2
40
+#define XLNX_ZYNQMP_NUM_CAN 2
41
+#define XLNX_ZYNQMP_CAN_REF_CLK (24 * 1000 * 1000)
42
#define XLNX_ZYNQMP_NUM_SDHCI 2
43
#define XLNX_ZYNQMP_NUM_SPIS 2
44
#define XLNX_ZYNQMP_NUM_GDMA_CH 8
45
@@ -XXX,XX +XXX,XX @@ struct XlnxZynqMPState {
46
47
CadenceGEMState gem[XLNX_ZYNQMP_NUM_GEMS];
48
CadenceUARTState uart[XLNX_ZYNQMP_NUM_UARTS];
49
+ XlnxZynqMPCANState can[XLNX_ZYNQMP_NUM_CAN];
50
SysbusAHCIState sata;
51
SDHCIState sdhci[XLNX_ZYNQMP_NUM_SDHCI];
52
XilinxSPIPS spi[XLNX_ZYNQMP_NUM_SPIS];
53
@@ -XXX,XX +XXX,XX @@ struct XlnxZynqMPState {
54
bool virt;
55
/* Has the RPU subsystem? */
56
bool has_rpu;
25
+
57
+
26
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
58
+ /* CAN bus. */
27
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
59
+ CanBusState *canbus[XLNX_ZYNQMP_NUM_CAN];
28
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
60
};
29
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
61
62
#endif
63
diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c
30
index XXXXXXX..XXXXXXX 100644
64
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/vec_helper.c
65
--- a/hw/arm/xlnx-zcu102.c
32
+++ b/target/arm/vec_helper.c
66
+++ b/hw/arm/xlnx-zcu102.c
33
@@ -XXX,XX +XXX,XX @@ DO_NEON_PAIRWISE(neon_pmax, max)
67
@@ -XXX,XX +XXX,XX @@
34
DO_NEON_PAIRWISE(neon_pmin, min)
68
#include "sysemu/qtest.h"
35
69
#include "sysemu/device_tree.h"
36
#undef DO_NEON_PAIRWISE
70
#include "qom/object.h"
71
+#include "net/can_emu.h"
72
73
struct XlnxZCU102 {
74
MachineState parent_obj;
75
@@ -XXX,XX +XXX,XX @@ struct XlnxZCU102 {
76
bool secure;
77
bool virt;
78
79
+ CanBusState *canbus[XLNX_ZYNQMP_NUM_CAN];
37
+
80
+
38
+#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
81
struct arm_boot_info binfo;
39
+ void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
82
};
40
+ { \
83
41
+ intptr_t i, oprsz = simd_oprsz(desc); \
84
@@ -XXX,XX +XXX,XX @@ static void xlnx_zcu102_init(MachineState *machine)
42
+ int shift = simd_data(desc); \
85
object_property_set_bool(OBJECT(&s->soc), "virtualization", s->virt,
43
+ TYPE *d = vd, *n = vn; \
86
&error_fatal);
44
+ float_status *fpst = stat; \
87
45
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
88
+ for (i = 0; i < XLNX_ZYNQMP_NUM_CAN; i++) {
46
+ d[i] = FUNC(n[i], shift, fpst); \
89
+ gchar *bus_name = g_strdup_printf("canbus%d", i);
47
+ } \
90
+
48
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
91
+ object_property_set_link(OBJECT(&s->soc), bus_name,
92
+ OBJECT(s->canbus[i]), &error_fatal);
93
+ g_free(bus_name);
49
+ }
94
+ }
50
+
95
+
51
+DO_VCVT_FIXED(gvec_vcvt_sf, helper_vfp_sltos, uint32_t)
96
qdev_realize(DEVICE(&s->soc), NULL, &error_fatal);
52
+DO_VCVT_FIXED(gvec_vcvt_uf, helper_vfp_ultos, uint32_t)
97
53
+DO_VCVT_FIXED(gvec_vcvt_fs, helper_vfp_tosls_round_to_zero, uint32_t)
98
/* Create and plug in the SD cards */
54
+DO_VCVT_FIXED(gvec_vcvt_fu, helper_vfp_touls_round_to_zero, uint32_t)
99
@@ -XXX,XX +XXX,XX @@ static void xlnx_zcu102_machine_instance_init(Object *obj)
100
s->secure = false;
101
/* Default to virt (EL2) being disabled */
102
s->virt = false;
103
+ object_property_add_link(obj, "xlnx-zcu102.canbus0", TYPE_CAN_BUS,
104
+ (Object **)&s->canbus[0],
105
+ object_property_allow_set_link,
106
+ 0);
55
+
107
+
56
+#undef DO_VCVT_FIXED
108
+ object_property_add_link(obj, "xlnx-zcu102.canbus1", TYPE_CAN_BUS,
57
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
109
+ (Object **)&s->canbus[1],
110
+ object_property_allow_set_link,
111
+ 0);
112
}
113
114
static void xlnx_zcu102_machine_class_init(ObjectClass *oc, void *data)
115
diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
58
index XXXXXXX..XXXXXXX 100644
116
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/translate-neon.c.inc
117
--- a/hw/arm/xlnx-zynqmp.c
60
+++ b/target/arm/translate-neon.c.inc
118
+++ b/hw/arm/xlnx-zynqmp.c
61
@@ -XXX,XX +XXX,XX @@ static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
119
@@ -XXX,XX +XXX,XX @@ static const int uart_intr[XLNX_ZYNQMP_NUM_UARTS] = {
62
}
120
21, 22,
63
121
};
64
static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
122
65
- NeonGenTwoSingleOpFn *fn)
123
+static const uint64_t can_addr[XLNX_ZYNQMP_NUM_CAN] = {
66
+ gen_helper_gvec_2_ptr *fn)
124
+ 0xFF060000, 0xFF070000,
67
{
125
+};
68
/* FP operations in 2-reg-and-shift group */
126
+
69
- TCGv_i32 tmp, shiftv;
127
+static const int can_intr[XLNX_ZYNQMP_NUM_CAN] = {
70
- TCGv_ptr fpstatus;
128
+ 23, 24,
71
- int pass;
129
+};
72
+ int vec_size = a->q ? 16 : 8;
130
+
73
+ int rd_ofs = neon_reg_offset(a->vd, 0);
131
static const uint64_t sdhci_addr[XLNX_ZYNQMP_NUM_SDHCI] = {
74
+ int rm_ofs = neon_reg_offset(a->vm, 0);
132
0xFF160000, 0xFF170000,
75
+ TCGv_ptr fpst;
133
};
76
134
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_init(Object *obj)
77
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
135
TYPE_CADENCE_UART);
78
return false;
79
}
136
}
80
137
81
+ if (a->size != 0) {
138
+ for (i = 0; i < XLNX_ZYNQMP_NUM_CAN; i++) {
82
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
139
+ object_initialize_child(obj, "can[*]", &s->can[i],
83
+ return false;
140
+ TYPE_XLNX_ZYNQMP_CAN);
84
+ }
85
+ }
141
+ }
86
+
142
+
87
/* UNDEF accesses to D16-D31 if they don't exist. */
143
object_initialize_child(obj, "sata", &s->sata, TYPE_SYSBUS_AHCI);
88
if (!dc_isar_feature(aa32_simd_r32, s) &&
144
89
((a->vd | a->vm) & 0x10)) {
145
for (i = 0; i < XLNX_ZYNQMP_NUM_SDHCI; i++) {
90
@@ -XXX,XX +XXX,XX @@ static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
146
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
91
return true;
147
gic_spi[uart_intr[i]]);
92
}
148
}
93
149
94
- fpstatus = fpstatus_ptr(FPST_STD);
150
+ for (i = 0; i < XLNX_ZYNQMP_NUM_CAN; i++) {
95
- shiftv = tcg_const_i32(a->shift);
151
+ object_property_set_int(OBJECT(&s->can[i]), "ext_clk_freq",
96
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
152
+ XLNX_ZYNQMP_CAN_REF_CLK, &error_abort);
97
- tmp = neon_load_reg(a->vm, pass);
153
+
98
- fn(tmp, tmp, shiftv, fpstatus);
154
+ object_property_set_link(OBJECT(&s->can[i]), "canbus",
99
- neon_store_reg(a->vd, pass, tmp);
155
+ OBJECT(s->canbus[i]), &error_fatal);
100
- }
156
+
101
- tcg_temp_free_ptr(fpstatus);
157
+ sysbus_realize(SYS_BUS_DEVICE(&s->can[i]), &err);
102
- tcg_temp_free_i32(shiftv);
158
+ if (err) {
103
+ fpst = fpstatus_ptr(a->size ? FPST_STD_F16 : FPST_STD);
159
+ error_propagate(errp, err);
104
+ tcg_gen_gvec_2_ptr(rd_ofs, rm_ofs, fpst, vec_size, vec_size, a->shift, fn);
160
+ return;
105
+ tcg_temp_free_ptr(fpst);
161
+ }
106
return true;
162
+ sysbus_mmio_map(SYS_BUS_DEVICE(&s->can[i]), 0, can_addr[i]);
107
}
163
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->can[i]), 0,
108
164
+ gic_spi[can_intr[i]]);
109
@@ -XXX,XX +XXX,XX @@ static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
165
+ }
110
return do_fp_2sh(s, a, FUNC); \
166
+
111
}
167
object_property_set_int(OBJECT(&s->sata), "num-ports", SATA_NUM_PORTS,
112
168
&error_abort);
113
-DO_FP_2SH(VCVT_SF, gen_helper_vfp_sltos)
169
if (!sysbus_realize(SYS_BUS_DEVICE(&s->sata), errp)) {
114
-DO_FP_2SH(VCVT_UF, gen_helper_vfp_ultos)
170
@@ -XXX,XX +XXX,XX @@ static Property xlnx_zynqmp_props[] = {
115
-DO_FP_2SH(VCVT_FS, gen_helper_vfp_tosls_round_to_zero)
171
DEFINE_PROP_BOOL("has_rpu", XlnxZynqMPState, has_rpu, false),
116
-DO_FP_2SH(VCVT_FU, gen_helper_vfp_touls_round_to_zero)
172
DEFINE_PROP_LINK("ddr-ram", XlnxZynqMPState, ddr_ram, TYPE_MEMORY_REGION,
117
+DO_FP_2SH(VCVT_SF, gen_helper_gvec_vcvt_sf)
173
MemoryRegion *),
118
+DO_FP_2SH(VCVT_UF, gen_helper_gvec_vcvt_uf)
174
+ DEFINE_PROP_LINK("canbus0", XlnxZynqMPState, canbus[0], TYPE_CAN_BUS,
119
+DO_FP_2SH(VCVT_FS, gen_helper_gvec_vcvt_fs)
175
+ CanBusState *),
120
+DO_FP_2SH(VCVT_FU, gen_helper_gvec_vcvt_fu)
176
+ DEFINE_PROP_LINK("canbus1", XlnxZynqMPState, canbus[1], TYPE_CAN_BUS,
121
177
+ CanBusState *),
122
static uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
178
DEFINE_PROP_END_OF_LIST()
123
{
179
};
180
124
--
181
--
125
2.20.1
182
2.20.1
126
183
127
184
1
Add gvec helpers for doing Neon-style indexed non-fused fp
1
From: Vikram Garhwal <fnu.vikram@xilinx.com>
2
multiply-and-accumulate operations.
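(For clarity, "non-fused" means each product is rounded to the element
precision before being accumulated, and the accumulation reads and writes Vd.
Per element, in the float32 case, the new helpers below expand to roughly:

    d[i] = float32_add(d[i], float32_mul(n[i], m[idx], stat), stat);

which is the behaviour the Neon by-scalar VMLA/VMLS insns require.)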
2
3
3
The QTests perform five tests on the Xilinx ZynqMP CAN controller:
4
Tests the CAN controller in loopback, sleep and snoop mode.
5
Tests filtering of incoming CAN messages.
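The tests share a common setup step; roughly (a sketch using the register
offsets and mode values #defined in the new test file below):

    /* Enable the controller in normal mode and check it got there. */
    qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
    qtest_writel(qts, CAN0_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
    status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
    g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);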
6
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Reviewed-by: Francisco Iglesias <francisco.iglesias@xilinx.com>
9
Signed-off-by: Vikram Garhwal <fnu.vikram@xilinx.com>
10
Message-id: 1605728926-352690-4-git-send-email-fnu.vikram@xilinx.com
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Message-id: 20200828183354.27913-44-peter.maydell@linaro.org
6
---
12
---
7
target/arm/helper.h | 10 ++++++++++
13
tests/qtest/xlnx-can-test.c | 360 ++++++++++++++++++++++++++++++++++++
8
target/arm/vec_helper.c | 27 ++++++++++++++++++++++-----
14
tests/qtest/meson.build | 1 +
9
2 files changed, 32 insertions(+), 5 deletions(-)
15
2 files changed, 361 insertions(+)
10
16
create mode 100644 tests/qtest/xlnx-can-test.c
11
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
18
diff --git a/tests/qtest/xlnx-can-test.c b/tests/qtest/xlnx-can-test.c
19
new file mode 100644
20
index XXXXXXX..XXXXXXX
21
--- /dev/null
22
+++ b/tests/qtest/xlnx-can-test.c
23
@@ -XXX,XX +XXX,XX @@
24
+/*
25
+ * QTests for the Xilinx ZynqMP CAN controller.
26
+ *
27
+ * Copyright (c) 2020 Xilinx Inc.
28
+ *
29
+ * Written-by: Vikram Garhwal <fnu.vikram@xilinx.com>
30
+ *
31
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
32
+ * of this software and associated documentation files (the "Software"), to deal
33
+ * in the Software without restriction, including without limitation the rights
34
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
35
+ * copies of the Software, and to permit persons to whom the Software is
36
+ * furnished to do so, subject to the following conditions:
37
+ *
38
+ * The above copyright notice and this permission notice shall be included in
39
+ * all copies or substantial portions of the Software.
40
+ *
41
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
42
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
43
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
44
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
45
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
46
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
47
+ * THE SOFTWARE.
48
+ */
49
+
50
+#include "qemu/osdep.h"
51
+#include "libqos/libqtest.h"
52
+
53
+/* Base address. */
54
+#define CAN0_BASE_ADDR 0xFF060000
55
+#define CAN1_BASE_ADDR 0xFF070000
56
+
57
+/* Register addresses. */
58
+#define R_SRR_OFFSET 0x00
59
+#define R_MSR_OFFSET 0x04
60
+#define R_SR_OFFSET 0x18
61
+#define R_ISR_OFFSET 0x1C
62
+#define R_ICR_OFFSET 0x24
63
+#define R_TXID_OFFSET 0x30
64
+#define R_TXDLC_OFFSET 0x34
65
+#define R_TXDATA1_OFFSET 0x38
66
+#define R_TXDATA2_OFFSET 0x3C
67
+#define R_RXID_OFFSET 0x50
68
+#define R_RXDLC_OFFSET 0x54
69
+#define R_RXDATA1_OFFSET 0x58
70
+#define R_RXDATA2_OFFSET 0x5C
71
+#define R_AFR 0x60
72
+#define R_AFMR1 0x64
73
+#define R_AFIR1 0x68
74
+#define R_AFMR2 0x6C
75
+#define R_AFIR2 0x70
76
+#define R_AFMR3 0x74
77
+#define R_AFIR3 0x78
78
+#define R_AFMR4 0x7C
79
+#define R_AFIR4 0x80
80
+
81
+/* CAN modes. */
82
+#define CONFIG_MODE 0x00
83
+#define NORMAL_MODE 0x00
84
+#define LOOPBACK_MODE 0x02
85
+#define SNOOP_MODE 0x04
86
+#define SLEEP_MODE 0x01
87
+#define ENABLE_CAN (1 << 1)
88
+#define STATUS_NORMAL_MODE (1 << 3)
89
+#define STATUS_LOOPBACK_MODE (1 << 1)
90
+#define STATUS_SNOOP_MODE (1 << 12)
91
+#define STATUS_SLEEP_MODE (1 << 2)
92
+#define ISR_TXOK (1 << 1)
93
+#define ISR_RXOK (1 << 4)
94
+
95
+static void match_rx_tx_data(const uint32_t *buf_tx, const uint32_t *buf_rx,
96
+ uint8_t can_timestamp)
97
+{
98
+ uint16_t size = 0;
99
+ uint8_t len = 4;
100
+
101
+ while (size < len) {
102
+ if (R_RXID_OFFSET + 4 * size == R_RXDLC_OFFSET) {
103
+ g_assert_cmpint(buf_rx[size], ==, buf_tx[size] + can_timestamp);
104
+ } else {
105
+ g_assert_cmpint(buf_rx[size], ==, buf_tx[size]);
106
+ }
107
+
108
+ size++;
109
+ }
110
+}
111
+
112
+static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx)
113
+{
114
+ uint32_t int_status;
115
+
116
+ /* Read the interrupt on CAN rx. */
117
+ int_status = qtest_readl(qts, can_base_addr + R_ISR_OFFSET) & ISR_RXOK;
118
+
119
+ g_assert_cmpint(int_status, ==, ISR_RXOK);
120
+
121
+ /* Read the RX register data for CAN. */
122
+ buf_rx[0] = qtest_readl(qts, can_base_addr + R_RXID_OFFSET);
123
+ buf_rx[1] = qtest_readl(qts, can_base_addr + R_RXDLC_OFFSET);
124
+ buf_rx[2] = qtest_readl(qts, can_base_addr + R_RXDATA1_OFFSET);
125
+ buf_rx[3] = qtest_readl(qts, can_base_addr + R_RXDATA2_OFFSET);
126
+
127
+ /* Clear the RX interrupt. */
128
+ qtest_writel(qts, CAN1_BASE_ADDR + R_ICR_OFFSET, ISR_RXOK);
129
+}
130
+
131
+static void send_data(QTestState *qts, uint64_t can_base_addr,
132
+ const uint32_t *buf_tx)
133
+{
134
+ uint32_t int_status;
135
+
136
+ /* Write the TX register data for CAN. */
137
+ qtest_writel(qts, can_base_addr + R_TXID_OFFSET, buf_tx[0]);
138
+ qtest_writel(qts, can_base_addr + R_TXDLC_OFFSET, buf_tx[1]);
139
+ qtest_writel(qts, can_base_addr + R_TXDATA1_OFFSET, buf_tx[2]);
140
+ qtest_writel(qts, can_base_addr + R_TXDATA2_OFFSET, buf_tx[3]);
141
+
142
+ /* Read the interrupt on CAN for tx. */
143
+ int_status = qtest_readl(qts, can_base_addr + R_ISR_OFFSET) & ISR_TXOK;
144
+
145
+ g_assert_cmpint(int_status, ==, ISR_TXOK);
146
+
147
+ /* Clear the interrupt for tx. */
148
+ qtest_writel(qts, CAN0_BASE_ADDR + R_ICR_OFFSET, ISR_TXOK);
149
+}
150
+
151
+/*
152
+ * This test transfers data from CAN0 to CAN1 through the CAN bus. CAN0
153
+ * initiates the transfer on the bus and CAN1 receives the data. The test compares
154
+ * the data sent from CAN0 with the data received on CAN1.
155
+ */
156
+static void test_can_bus(void)
157
+{
158
+ const uint32_t buf_tx[4] = { 0xFF, 0x80000000, 0x12345678, 0x87654321 };
159
+ uint32_t buf_rx[4] = { 0x00, 0x00, 0x00, 0x00 };
160
+ uint32_t status = 0;
161
+ uint8_t can_timestamp = 1;
162
+
163
+ QTestState *qts = qtest_init("-machine xlnx-zcu102"
164
+ " -object can-bus,id=canbus0"
165
+ " -machine xlnx-zcu102.canbus0=canbus0"
166
+ " -machine xlnx-zcu102.canbus1=canbus0"
167
+ );
168
+
169
+ /* Configure the CAN0 and CAN1. */
170
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
171
+ qtest_writel(qts, CAN0_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
172
+ qtest_writel(qts, CAN1_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
173
+ qtest_writel(qts, CAN1_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
174
+
175
+ /* Check here if CAN0 and CAN1 are in normal mode. */
176
+ status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
177
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
178
+
179
+ status = qtest_readl(qts, CAN1_BASE_ADDR + R_SR_OFFSET);
180
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
181
+
182
+ send_data(qts, CAN0_BASE_ADDR, buf_tx);
183
+
184
+ read_data(qts, CAN1_BASE_ADDR, buf_rx);
185
+ match_rx_tx_data(buf_tx, buf_rx, can_timestamp);
186
+
187
+ qtest_quit(qts);
188
+}
189
+
190
+/*
191
+ * This test exercises loopback mode on CAN0 and CAN1. Data sent from the TX of
192
+ * each controller is compared with the RX register data of the same controller.
193
+ */
194
+static void test_can_loopback(void)
195
+{
196
+ uint32_t buf_tx[4] = { 0xFF, 0x80000000, 0x12345678, 0x87654321 };
197
+ uint32_t buf_rx[4] = { 0x00, 0x00, 0x00, 0x00 };
198
+ uint32_t status = 0;
199
+
200
+ QTestState *qts = qtest_init("-machine xlnx-zcu102"
201
+ " -object can-bus,id=canbus0"
202
+ " -machine xlnx-zcu102.canbus0=canbus0"
203
+ " -machine xlnx-zcu102.canbus1=canbus0"
204
+ );
205
+
206
+ /* Configure the CAN0 in loopback mode. */
207
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, CONFIG_MODE);
208
+ qtest_writel(qts, CAN0_BASE_ADDR + R_MSR_OFFSET, LOOPBACK_MODE);
209
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
210
+
211
+ /* Check here if CAN0 is set in loopback mode. */
212
+ status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
213
+
214
+ g_assert_cmpint(status, ==, STATUS_LOOPBACK_MODE);
215
+
216
+ send_data(qts, CAN0_BASE_ADDR, buf_tx);
217
+ read_data(qts, CAN0_BASE_ADDR, buf_rx);
218
+ match_rx_tx_data(buf_tx, buf_rx, 0);
219
+
220
+ /* Configure the CAN1 in loopback mode. */
221
+ qtest_writel(qts, CAN1_BASE_ADDR + R_SRR_OFFSET, CONFIG_MODE);
222
+ qtest_writel(qts, CAN1_BASE_ADDR + R_MSR_OFFSET, LOOPBACK_MODE);
223
+ qtest_writel(qts, CAN1_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
224
+
225
+ /* Check here if CAN1 is set in loopback mode. */
226
+ status = qtest_readl(qts, CAN1_BASE_ADDR + R_SR_OFFSET);
227
+
228
+ g_assert_cmpint(status, ==, STATUS_LOOPBACK_MODE);
229
+
230
+ send_data(qts, CAN1_BASE_ADDR, buf_tx);
231
+ read_data(qts, CAN1_BASE_ADDR, buf_rx);
232
+ match_rx_tx_data(buf_tx, buf_rx, 0);
233
+
234
+ qtest_quit(qts);
235
+}
236
+
237
+/*
238
+ * Enable filters for CAN1. Incoming messages are filtered by ID. In this
239
+ * test the message passes through filter 2.
240
+ */
241
+static void test_can_filter(void)
242
+{
243
+ uint32_t buf_tx[4] = { 0x14, 0x80000000, 0x12345678, 0x87654321 };
244
+ uint32_t buf_rx[4] = { 0x00, 0x00, 0x00, 0x00 };
245
+ uint32_t status = 0;
246
+ uint8_t can_timestamp = 1;
247
+
248
+ QTestState *qts = qtest_init("-machine xlnx-zcu102"
249
+ " -object can-bus,id=canbus0"
250
+ " -machine xlnx-zcu102.canbus0=canbus0"
251
+ " -machine xlnx-zcu102.canbus1=canbus0"
252
+ );
253
+
254
+ /* Configure the CAN0 and CAN1. */
255
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
256
+ qtest_writel(qts, CAN0_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
257
+ qtest_writel(qts, CAN1_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
258
+ qtest_writel(qts, CAN1_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
259
+
260
+ /* Check here if CAN0 and CAN1 are in normal mode. */
261
+ status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
262
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
263
+
264
+ status = qtest_readl(qts, CAN1_BASE_ADDR + R_SR_OFFSET);
265
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
266
+
267
+ /* Set filter for CAN1 for incoming messages. */
268
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFR, 0x0);
269
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFMR1, 0xF7);
270
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFIR1, 0x121F);
271
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFMR2, 0x5431);
272
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFIR2, 0x14);
273
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFMR3, 0x1234);
274
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFIR3, 0x5431);
275
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFMR4, 0xFFF);
276
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFIR4, 0x1234);
277
+
278
+ qtest_writel(qts, CAN1_BASE_ADDR + R_AFR, 0xF);
279
+
280
+ send_data(qts, CAN0_BASE_ADDR, buf_tx);
281
+
282
+ read_data(qts, CAN1_BASE_ADDR, buf_rx);
283
+ match_rx_tx_data(buf_tx, buf_rx, can_timestamp);
284
+
285
+ qtest_quit(qts);
286
+}
287
+
288
+/* Testing sleep mode on CAN0 while CAN1 is in normal mode. */
289
+static void test_can_sleepmode(void)
290
+{
291
+ uint32_t buf_tx[4] = { 0x14, 0x80000000, 0x12345678, 0x87654321 };
292
+ uint32_t buf_rx[4] = { 0x00, 0x00, 0x00, 0x00 };
293
+ uint32_t status = 0;
294
+ uint8_t can_timestamp = 1;
295
+
296
+ QTestState *qts = qtest_init("-machine xlnx-zcu102"
297
+ " -object can-bus,id=canbus0"
298
+ " -machine xlnx-zcu102.canbus0=canbus0"
299
+ " -machine xlnx-zcu102.canbus1=canbus0"
300
+ );
301
+
302
+ /* Configure the CAN0. */
303
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, CONFIG_MODE);
304
+ qtest_writel(qts, CAN0_BASE_ADDR + R_MSR_OFFSET, SLEEP_MODE);
305
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
306
+
307
+ qtest_writel(qts, CAN1_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
308
+ qtest_writel(qts, CAN1_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
309
+
310
+ /* Check here if CAN0 is in SLEEP mode and CAN1 in normal mode. */
311
+ status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
312
+ g_assert_cmpint(status, ==, STATUS_SLEEP_MODE);
313
+
314
+ status = qtest_readl(qts, CAN1_BASE_ADDR + R_SR_OFFSET);
315
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
316
+
317
+ send_data(qts, CAN1_BASE_ADDR, buf_tx);
318
+
319
+ /*
320
+ * Once CAN1 sends data on the bus, CAN0 should exit sleep mode.
321
+ * Check the CAN0 status now; it should have left sleep mode and received the
322
+ * incoming data.
323
+ */
324
+ status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
325
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
326
+
327
+ read_data(qts, CAN0_BASE_ADDR, buf_rx);
328
+
329
+ match_rx_tx_data(buf_tx, buf_rx, can_timestamp);
330
+
331
+ qtest_quit(qts);
332
+}
333
+
334
+/* Testing Snoop mode on CAN0 while CAN1 is in normal mode. */
335
+static void test_can_snoopmode(void)
336
+{
337
+ uint32_t buf_tx[4] = { 0x14, 0x80000000, 0x12345678, 0x87654321 };
338
+ uint32_t buf_rx[4] = { 0x00, 0x00, 0x00, 0x00 };
339
+ uint32_t status = 0;
340
+ uint8_t can_timestamp = 1;
341
+
342
+ QTestState *qts = qtest_init("-machine xlnx-zcu102"
343
+ " -object can-bus,id=canbus0"
344
+ " -machine xlnx-zcu102.canbus0=canbus0"
345
+ " -machine xlnx-zcu102.canbus1=canbus0"
346
+ );
347
+
348
+ /* Configure the CAN0. */
349
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, CONFIG_MODE);
350
+ qtest_writel(qts, CAN0_BASE_ADDR + R_MSR_OFFSET, SNOOP_MODE);
351
+ qtest_writel(qts, CAN0_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
352
+
353
+ qtest_writel(qts, CAN1_BASE_ADDR + R_SRR_OFFSET, ENABLE_CAN);
354
+ qtest_writel(qts, CAN1_BASE_ADDR + R_MSR_OFFSET, NORMAL_MODE);
355
+
356
+ /* Check here if CAN0 is in SNOOP mode and CAN1 in normal mode. */
357
+ status = qtest_readl(qts, CAN0_BASE_ADDR + R_SR_OFFSET);
358
+ g_assert_cmpint(status, ==, STATUS_SNOOP_MODE);
359
+
360
+ status = qtest_readl(qts, CAN1_BASE_ADDR + R_SR_OFFSET);
361
+ g_assert_cmpint(status, ==, STATUS_NORMAL_MODE);
362
+
363
+ send_data(qts, CAN1_BASE_ADDR, buf_tx);
364
+
365
+ read_data(qts, CAN0_BASE_ADDR, buf_rx);
366
+
367
+ match_rx_tx_data(buf_tx, buf_rx, can_timestamp);
368
+
369
+ qtest_quit(qts);
370
+}
371
+
372
+int main(int argc, char **argv)
373
+{
374
+ g_test_init(&argc, &argv, NULL);
375
+
376
+ qtest_add_func("/net/can/can_bus", test_can_bus);
377
+ qtest_add_func("/net/can/can_loopback", test_can_loopback);
378
+ qtest_add_func("/net/can/can_filter", test_can_filter);
379
+ qtest_add_func("/net/can/can_test_snoopmode", test_can_snoopmode);
380
+ qtest_add_func("/net/can/can_test_sleepmode", test_can_sleepmode);
381
+
382
+ return g_test_run();
383
+}
384
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
12
index XXXXXXX..XXXXXXX 100644
385
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/helper.h
386
--- a/tests/qtest/meson.build
14
+++ b/target/arm/helper.h
387
+++ b/tests/qtest/meson.build
15
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG,
388
@@ -XXX,XX +XXX,XX @@ qtests_aarch64 = \
16
DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG,
389
['arm-cpu-features',
17
void, ptr, ptr, ptr, ptr, i32)
390
'numa-test',
18
391
'boot-serial-test',
19
+DEF_HELPER_FLAGS_5(gvec_fmla_nf_idx_h, TCG_CALL_NO_RWG,
392
+ 'xlnx-can-test',
20
+ void, ptr, ptr, ptr, ptr, i32)
393
'migration-test']
21
+DEF_HELPER_FLAGS_5(gvec_fmla_nf_idx_s, TCG_CALL_NO_RWG,
394
22
+ void, ptr, ptr, ptr, ptr, i32)
395
qtests_s390x = \
23
+
24
+DEF_HELPER_FLAGS_5(gvec_fmls_nf_idx_h, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(gvec_fmls_nf_idx_s, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+
29
DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG,
30
void, ptr, ptr, ptr, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG,
32
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/vec_helper.c
35
+++ b/target/arm/vec_helper.c
36
@@ -XXX,XX +XXX,XX @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, )
37
38
#undef DO_MLA_IDX
39
40
-#define DO_FMUL_IDX(NAME, TYPE, H) \
41
+#define DO_FMUL_IDX(NAME, ADD, TYPE, H) \
42
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
43
{ \
44
intptr_t i, j, oprsz = simd_oprsz(desc); \
45
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
46
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
47
TYPE mm = m[H(i + idx)]; \
48
for (j = 0; j < segment; j++) { \
49
- d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
50
+ d[i + j] = TYPE##_##ADD(d[i + j], \
51
+ TYPE##_mul(n[i + j], mm, stat), stat); \
52
} \
53
} \
54
clear_tail(d, oprsz, simd_maxsz(desc)); \
55
}
56
57
-DO_FMUL_IDX(gvec_fmul_idx_h, float16, H2)
58
-DO_FMUL_IDX(gvec_fmul_idx_s, float32, H4)
59
-DO_FMUL_IDX(gvec_fmul_idx_d, float64, )
60
+#define float16_nop(N, M, S) (M)
61
+#define float32_nop(N, M, S) (M)
62
+#define float64_nop(N, M, S) (M)
63
64
+DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16, H2)
65
+DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32, H4)
66
+DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64, )
67
+
68
+/*
69
+ * Non-fused multiply-accumulate operations, for Neon. NB that unlike
70
+ * the fused ops below they assume accumulate both from and into Vd.
71
+ */
72
+DO_FMUL_IDX(gvec_fmla_nf_idx_h, add, float16, H2)
73
+DO_FMUL_IDX(gvec_fmla_nf_idx_s, add, float32, H4)
74
+DO_FMUL_IDX(gvec_fmls_nf_idx_h, sub, float16, H2)
75
+DO_FMUL_IDX(gvec_fmls_nf_idx_s, sub, float32, H4)
76
+
77
+#undef float16_nop
78
+#undef float32_nop
79
+#undef float64_nop
80
#undef DO_FMUL_IDX
81
82
#define DO_FMLA_IDX(NAME, TYPE, H) \
83
--
396
--
84
2.20.1
397
2.20.1
85
398
86
399
1
In the gvec helper functions for indexed operations, for AArch32
1
From: Vikram Garhwal <fnu.vikram@xilinx.com>
2
Neon the oprsz (total size of the vector) can be less than 16 bytes
3
if the operation is on a D reg. Since the inner loop in these
4
helpers always goes from 0 to segment, we must clamp it based
5
on oprsz to avoid processing a full 16 byte segment when asked to
6
handle an 8 byte wide vector.
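(Worked example of the clamp: a Neon op on a D register arrives with
oprsz = 8, so for float32 elements

    segment = 16 / sizeof(float32);             /* old code: 4 elements */
    segment = MIN(16, oprsz) / sizeof(float32); /* clamped: 8 / 4 = 2   */

and the inner loop no longer strays past the 8-byte vector.)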
7
2
3
Reviewed-by: Francisco Iglesias <francisco.iglesias@xilinx.com>
4
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5
Signed-off-by: Vikram Garhwal <fnu.vikram@xilinx.com>
6
Message-id: 1605728926-352690-5-git-send-email-fnu.vikram@xilinx.com
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200828183354.27913-43-peter.maydell@linaro.org
11
---
8
---
12
target/arm/vec_helper.c | 12 ++++++++----
9
MAINTAINERS | 8 ++++++++
13
1 file changed, 8 insertions(+), 4 deletions(-)
10
1 file changed, 8 insertions(+)
14
11
15
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
12
diff --git a/MAINTAINERS b/MAINTAINERS
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/vec_helper.c
14
--- a/MAINTAINERS
18
+++ b/target/arm/vec_helper.c
15
+++ b/MAINTAINERS
19
@@ -XXX,XX +XXX,XX @@ DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32)
16
@@ -XXX,XX +XXX,XX @@ F: hw/net/opencores_eth.c
20
#define DO_MUL_IDX(NAME, TYPE, H) \
17
21
void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
18
Devices
22
{ \
19
-------
23
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
20
+Xilinx CAN
24
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
21
+M: Vikram Garhwal <fnu.vikram@xilinx.com>
25
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
22
+M: Francisco Iglesias <francisco.iglesias@xilinx.com>
26
intptr_t idx = simd_data(desc); \
23
+S: Maintained
27
TYPE *d = vd, *n = vn, *m = vm; \
24
+F: hw/net/can/xlnx-*
28
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
25
+F: include/hw/net/xlnx-*
29
@@ -XXX,XX +XXX,XX @@ DO_MUL_IDX(gvec_mul_idx_d, uint64_t, )
26
+F: tests/qtest/xlnx-can-test*
30
#define DO_MLA_IDX(NAME, TYPE, OP, H) \
27
+
31
void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \
28
EDU
32
{ \
29
M: Jiri Slaby <jslaby@suse.cz>
33
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
30
S: Maintained
34
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
35
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
36
intptr_t idx = simd_data(desc); \
37
TYPE *d = vd, *n = vn, *m = vm, *a = va; \
38
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
39
@@ -XXX,XX +XXX,XX @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, )
40
#define DO_FMUL_IDX(NAME, TYPE, H) \
41
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
42
{ \
43
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
44
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
45
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
46
intptr_t idx = simd_data(desc); \
47
TYPE *d = vd, *n = vn, *m = vm; \
48
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
49
@@ -XXX,XX +XXX,XX @@ DO_FMUL_IDX(gvec_fmul_idx_d, float64, )
50
void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
51
void *stat, uint32_t desc) \
52
{ \
53
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
54
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
55
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
56
TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \
57
intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \
58
TYPE *d = vd, *n = vn, *m = vm, *a = va; \
59
--
31
--
60
2.20.1
32
2.20.1
61
33
62
34
1
From: Leif Lindholm <leif@nuviainc.com>
1
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
2
2
3
The sbsa-ref platform uses a minimal device tree to pass the amount of memory
3
Trusted Firmware now supports A72 on sbsa-ref by default [1] so enable
4
as well as the number of CPUs to the firmware. However, when dumping that
4
it for QEMU as well. A53 was already enabled there.
5
minimal dtb (with -M sbsa-ref,dumpdtb=<file>), the resulting blob
6
generates a warning when decompiled by dtc due to the lack of a reg property.
7
5
8
Add a simple reg property per cpu, representing a 64-bit MPIDR_EL1.
6
1. https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/7117
9
7
10
This also ends up being cleaner than having the firmware calculate its
8
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
11
own IDs for generating ACPI.
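(Illustration, assuming the default of 8 CPUs per cluster: for CPU index 9,
arm_cpu_mp_affinity(9, 8) gives Aff1 = 1, Aff0 = 1, so the code added below
effectively does

    qemu_fdt_setprop_u64(sms->fdt, "/cpus/cpu@9", "reg", 0x101);

which dtc decompiles as reg = <0x0 0x101> under the two-cell #address-cells
layout described in the comment.)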
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
10
Message-id: 20201120141705.246690-1-marcin.juszkiewicz@linaro.org
13
Signed-off-by: Leif Lindholm <leif@nuviainc.com>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
15
Message-id: 20200827124335.30586-1-leif@nuviainc.com
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
---
13
---
18
hw/arm/sbsa-ref.c | 29 +++++++++++++++++++++++------
14
hw/arm/sbsa-ref.c | 23 ++++++++++++++++++++---
19
1 file changed, 23 insertions(+), 6 deletions(-)
15
1 file changed, 20 insertions(+), 3 deletions(-)
20
16
21
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
17
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
22
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
23
--- a/hw/arm/sbsa-ref.c
19
--- a/hw/arm/sbsa-ref.c
24
+++ b/hw/arm/sbsa-ref.c
20
+++ b/hw/arm/sbsa-ref.c
25
@@ -XXX,XX +XXX,XX @@ static const int sbsa_ref_irqmap[] = {
21
@@ -XXX,XX +XXX,XX @@ static const int sbsa_ref_irqmap[] = {
26
[SBSA_EHCI] = 11,
22
[SBSA_GWDT] = 16,
27
};
23
};
28
24
29
+static uint64_t sbsa_ref_cpu_mp_affinity(SBSAMachineState *sms, int idx)
25
+static const char * const valid_cpus[] = {
26
+ ARM_CPU_TYPE_NAME("cortex-a53"),
27
+ ARM_CPU_TYPE_NAME("cortex-a57"),
28
+ ARM_CPU_TYPE_NAME("cortex-a72"),
29
+};
30
+
31
+static bool cpu_type_valid(const char *cpu)
30
+{
32
+{
31
+ uint8_t clustersz = ARM_DEFAULT_CPUS_PER_CLUSTER;
33
+ int i;
32
+ return arm_cpu_mp_affinity(idx, clustersz);
34
+
35
+ for (i = 0; i < ARRAY_SIZE(valid_cpus); i++) {
36
+ if (strcmp(cpu, valid_cpus[i]) == 0) {
37
+ return true;
38
+ }
39
+ }
40
+ return false;
33
+}
41
+}
34
+
42
+
35
/*
43
static uint64_t sbsa_ref_cpu_mp_affinity(SBSAMachineState *sms, int idx)
36
* Firmware on this machine only uses ACPI table to load OS, these limited
44
{
37
* device tree nodes are just to let firmware know the info which varies from
45
uint8_t clustersz = ARM_DEFAULT_CPUS_PER_CLUSTER;
38
@@ -XXX,XX +XXX,XX @@ static void create_fdt(SBSAMachineState *sms)
46
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
39
g_free(matrix);
47
const CPUArchIdList *possible_cpus;
48
int n, sbsa_max_cpus;
49
50
- if (strcmp(machine->cpu_type, ARM_CPU_TYPE_NAME("cortex-a57"))) {
51
- error_report("sbsa-ref: CPU type other than the built-in "
52
- "cortex-a57 not supported");
53
+ if (!cpu_type_valid(machine->cpu_type)) {
54
+ error_report("mach-virt: CPU type %s not supported", machine->cpu_type);
55
exit(1);
40
}
56
}
41
57
42
+ /*
43
+ * From Documentation/devicetree/bindings/arm/cpus.yaml
44
+ * On ARM v8 64-bit systems this property is required
45
+ * and matches the MPIDR_EL1 register affinity bits.
46
+ *
47
+ * * If cpus node's #address-cells property is set to 2
48
+ *
49
+ * The first reg cell bits [7:0] must be set to
50
+ * bits [39:32] of MPIDR_EL1.
51
+ *
52
+ * The second reg cell bits [23:0] must be set to
53
+ * bits [23:0] of MPIDR_EL1.
54
+ */
55
qemu_fdt_add_subnode(sms->fdt, "/cpus");
56
+ qemu_fdt_setprop_cell(sms->fdt, "/cpus", "#address-cells", 2);
57
+ qemu_fdt_setprop_cell(sms->fdt, "/cpus", "#size-cells", 0x0);
58
59
for (cpu = sms->smp_cpus - 1; cpu >= 0; cpu--) {
60
char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
61
ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
62
CPUState *cs = CPU(armcpu);
63
+ uint64_t mpidr = sbsa_ref_cpu_mp_affinity(sms, cpu);
64
65
qemu_fdt_add_subnode(sms->fdt, nodename);
66
+ qemu_fdt_setprop_u64(sms->fdt, nodename, "reg", mpidr);
67
68
if (ms->possible_cpus->cpus[cs->cpu_index].props.has_node_id) {
69
qemu_fdt_setprop_cell(sms->fdt, nodename, "numa-node-id",
70
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
71
arm_load_kernel(ARM_CPU(first_cpu), machine, &sms->bootinfo);
72
}
73
74
-static uint64_t sbsa_ref_cpu_mp_affinity(SBSAMachineState *sms, int idx)
75
-{
76
- uint8_t clustersz = ARM_DEFAULT_CPUS_PER_CLUSTER;
77
- return arm_cpu_mp_affinity(idx, clustersz);
78
-}
79
-
80
static const CPUArchIdList *sbsa_ref_possible_cpu_arch_ids(MachineState *ms)
81
{
82
unsigned int max_cpus = ms->smp.max_cpus;
83
--
58
--
84
2.20.1
59
2.20.1
85
60
86
61
1
Implement fp16 versions of the VFP VMLA, VMLS, VNMLS, VNMLA, VNMUL
1
From: Havard Skinnemoen <hskinnemoen@google.com>
2
instructions. (These are all the remaining ones which we implement
3
via do_vfp_3op_[hsd]p().)
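(For reference, the operations gaining fp16 forms compute, matching the
existing single/double precision helpers in the diff below:

    /* VMLA:  vd = vd + (vn * vm)     VMLS:  vd = vd + -(vn * vm)  */
    /* VNMLS: vd = -vd + (vn * vm)    VNMLA: vd = -vd + -(vn * vm) */
    /* VNMUL: vd = -(vn * vm)                                      */

with the negations done as separate steps so NaN propagation matches the Arm
pseudocode, as the comments in the patch note.)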
4
2
3
Dump the collected random data after a randomness test failure.
4
5
Note that this relies on the test having called
6
g_test_set_nonfatal_assertions() so we don't abort immediately on the
7
assertion failure.
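(Sketch of the mechanism, with illustrative test names; the test's main()
already does the equivalent of:

    int main(int argc, char **argv)
    {
        g_test_init(&argc, &argv, NULL);
        /* A failed g_assert_* marks the test failed instead of aborting,
         * so the dump helper added below still gets to run. */
        g_test_set_nonfatal_assertions();
        qtest_add_func("/npcm7xx_rng/monobit", test_monobit);
        return g_test_run();
    }

after which dump_buf_if_failed() can emit the collected bytes with
qemu_hexdump().)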
8
9
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
[PMM: minor commit message tweak]
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200828183354.27913-5-peter.maydell@linaro.org
8
---
13
---
9
target/arm/helper.h | 1 +
14
tests/qtest/npcm7xx_rng-test.c | 12 ++++++++++++
10
target/arm/vfp.decode | 5 ++
15
1 file changed, 12 insertions(+)
11
target/arm/vfp_helper.c | 5 ++
12
target/arm/translate-vfp.c.inc | 84 ++++++++++++++++++++++++++++++++++
13
4 files changed, 95 insertions(+)
14
16
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
diff --git a/tests/qtest/npcm7xx_rng-test.c b/tests/qtest/npcm7xx_rng-test.c
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.h
19
--- a/tests/qtest/npcm7xx_rng-test.c
18
+++ b/target/arm/helper.h
20
+++ b/tests/qtest/npcm7xx_rng-test.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, ptr)
21
@@ -XXX,XX +XXX,XX @@
20
DEF_HELPER_3(vfp_minnumh, f16, f16, f16, ptr)
22
21
DEF_HELPER_3(vfp_minnums, f32, f32, f32, ptr)
23
#include "libqtest-single.h"
22
DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr)
24
#include "qemu/bitops.h"
23
+DEF_HELPER_1(vfp_negh, f16, f16)
25
+#include "qemu-common.h"
24
DEF_HELPER_1(vfp_negs, f32, f32)
26
25
DEF_HELPER_1(vfp_negd, f64, f64)
27
#define RNG_BASE_ADDR 0xf000b000
26
DEF_HELPER_1(vfp_abss, f32, f32)
28
27
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
29
@@ -XXX,XX +XXX,XX @@
28
index XXXXXXX..XXXXXXX 100644
30
/* Number of bits to collect for randomness tests. */
29
--- a/target/arm/vfp.decode
31
#define TEST_INPUT_BITS (128)
30
+++ b/target/arm/vfp.decode
32
31
@@ -XXX,XX +XXX,XX @@ VLDM_VSTM_dp ---- 1101 0.1 l:1 rn:4 .... 1011 imm:8 \
33
+static void dump_buf_if_failed(const uint8_t *buf, size_t size)
32
vd=%vd_dp p=1 u=0 w=1
33
34
# 3-register VFP data-processing; bits [23,21:20,6] identify the operation.
35
+VMLA_hp ---- 1110 0.00 .... .... 1001 .0.0 .... @vfp_dnm_s
36
VMLA_sp ---- 1110 0.00 .... .... 1010 .0.0 .... @vfp_dnm_s
37
VMLA_dp ---- 1110 0.00 .... .... 1011 .0.0 .... @vfp_dnm_d
38
39
+VMLS_hp ---- 1110 0.00 .... .... 1001 .1.0 .... @vfp_dnm_s
40
VMLS_sp ---- 1110 0.00 .... .... 1010 .1.0 .... @vfp_dnm_s
41
VMLS_dp ---- 1110 0.00 .... .... 1011 .1.0 .... @vfp_dnm_d
42
43
+VNMLS_hp ---- 1110 0.01 .... .... 1001 .0.0 .... @vfp_dnm_s
44
VNMLS_sp ---- 1110 0.01 .... .... 1010 .0.0 .... @vfp_dnm_s
45
VNMLS_dp ---- 1110 0.01 .... .... 1011 .0.0 .... @vfp_dnm_d
46
47
+VNMLA_hp ---- 1110 0.01 .... .... 1001 .1.0 .... @vfp_dnm_s
48
VNMLA_sp ---- 1110 0.01 .... .... 1010 .1.0 .... @vfp_dnm_s
49
VNMLA_dp ---- 1110 0.01 .... .... 1011 .1.0 .... @vfp_dnm_d
50
51
@@ -XXX,XX +XXX,XX @@ VMUL_hp ---- 1110 0.10 .... .... 1001 .0.0 .... @vfp_dnm_s
52
VMUL_sp ---- 1110 0.10 .... .... 1010 .0.0 .... @vfp_dnm_s
53
VMUL_dp ---- 1110 0.10 .... .... 1011 .0.0 .... @vfp_dnm_d
54
55
+VNMUL_hp ---- 1110 0.10 .... .... 1001 .1.0 .... @vfp_dnm_s
56
VNMUL_sp ---- 1110 0.10 .... .... 1010 .1.0 .... @vfp_dnm_s
57
VNMUL_dp ---- 1110 0.10 .... .... 1011 .1.0 .... @vfp_dnm_d
58
59
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
60
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/vfp_helper.c
62
+++ b/target/arm/vfp_helper.c
63
@@ -XXX,XX +XXX,XX @@ VFP_BINOP(minnum)
64
VFP_BINOP(maxnum)
65
#undef VFP_BINOP
66
67
+dh_ctype_f16 VFP_HELPER(neg, h)(dh_ctype_f16 a)
68
+{
34
+{
69
+ return float16_chs(a);
35
+ if (g_test_failed()) {
36
+ qemu_hexdump(stderr, "", buf, size);
37
+ }
70
+}
38
+}
71
+
39
+
72
float32 VFP_HELPER(neg, s)(float32 a)
40
static void rng_writeb(unsigned int offset, uint8_t value)
73
{
41
{
74
return float32_chs(a);
42
writeb(RNG_BASE_ADDR + offset, value);
75
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
43
@@ -XXX,XX +XXX,XX @@ static void test_continuous_monobit(void)
76
index XXXXXXX..XXXXXXX 100644
44
}
77
--- a/target/arm/translate-vfp.c.inc
45
78
+++ b/target/arm/translate-vfp.c.inc
46
g_assert_cmpfloat(calc_monobit_p(buf, sizeof(buf)), >, 0.01);
79
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
47
+ dump_buf_if_failed(buf, sizeof(buf));
80
return true;
81
}
48
}
82
49
83
+static void gen_VMLA_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
50
/*
84
+{
51
@@ -XXX,XX +XXX,XX @@ static void test_continuous_runs(void)
85
+ /* Note that order of inputs to the add matters for NaNs */
52
}
86
+ TCGv_i32 tmp = tcg_temp_new_i32();
53
87
+
54
g_assert_cmpfloat(calc_runs_p(buf.l, sizeof(buf) * BITS_PER_BYTE), >, 0.01);
88
+ gen_helper_vfp_mulh(tmp, vn, vm, fpst);
55
+ dump_buf_if_failed(buf.c, sizeof(buf));
89
+ gen_helper_vfp_addh(vd, vd, tmp, fpst);
90
+ tcg_temp_free_i32(tmp);
91
+}
92
+
93
+static bool trans_VMLA_hp(DisasContext *s, arg_VMLA_sp *a)
94
+{
95
+ return do_vfp_3op_hp(s, gen_VMLA_hp, a->vd, a->vn, a->vm, true);
96
+}
97
+
98
static void gen_VMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
99
{
100
/* Note that order of inputs to the add matters for NaNs */
101
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLA_dp(DisasContext *s, arg_VMLA_dp *a)
102
return do_vfp_3op_dp(s, gen_VMLA_dp, a->vd, a->vn, a->vm, true);
103
}
56
}
104
57
105
+static void gen_VMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
58
/*
106
+{
59
@@ -XXX,XX +XXX,XX @@ static void test_first_byte_monobit(void)
107
+ /*
60
}
108
+ * VMLS: vd = vd + -(vn * vm)
61
109
+ * Note that order of inputs to the add matters for NaNs.
62
g_assert_cmpfloat(calc_monobit_p(buf, sizeof(buf)), >, 0.01);
110
+ */
63
+ dump_buf_if_failed(buf, sizeof(buf));
111
+ TCGv_i32 tmp = tcg_temp_new_i32();
112
+
113
+ gen_helper_vfp_mulh(tmp, vn, vm, fpst);
114
+ gen_helper_vfp_negh(tmp, tmp);
115
+ gen_helper_vfp_addh(vd, vd, tmp, fpst);
116
+ tcg_temp_free_i32(tmp);
117
+}
118
+
119
+static bool trans_VMLS_hp(DisasContext *s, arg_VMLS_sp *a)
120
+{
121
+ return do_vfp_3op_hp(s, gen_VMLS_hp, a->vd, a->vn, a->vm, true);
122
+}
123
+
124
static void gen_VMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
125
{
126
/*
127
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLS_dp(DisasContext *s, arg_VMLS_dp *a)
128
return do_vfp_3op_dp(s, gen_VMLS_dp, a->vd, a->vn, a->vm, true);
129
}
64
}
130
65
131
+static void gen_VNMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
66
/*
132
+{
67
@@ -XXX,XX +XXX,XX @@ static void test_first_byte_runs(void)
133
+ /*
68
}
134
+ * VNMLS: -fd + (fn * fm)
69
135
+ * Note that it isn't valid to replace (-A + B) with (B - A) or similar
70
g_assert_cmpfloat(calc_runs_p(buf.l, sizeof(buf) * BITS_PER_BYTE), >, 0.01);
136
+ * plausible looking simplifications because this will give wrong results
71
+ dump_buf_if_failed(buf.c, sizeof(buf));
137
+ * for NaNs.
138
+ */
139
+ TCGv_i32 tmp = tcg_temp_new_i32();
140
+
141
+ gen_helper_vfp_mulh(tmp, vn, vm, fpst);
142
+ gen_helper_vfp_negh(vd, vd);
143
+ gen_helper_vfp_addh(vd, vd, tmp, fpst);
144
+ tcg_temp_free_i32(tmp);
145
+}
146
+
147
+static bool trans_VNMLS_hp(DisasContext *s, arg_VNMLS_sp *a)
148
+{
149
+ return do_vfp_3op_hp(s, gen_VNMLS_hp, a->vd, a->vn, a->vm, true);
150
+}
151
+
152
static void gen_VNMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
153
{
154
/*
155
@@ -XXX,XX +XXX,XX @@ static bool trans_VNMLS_dp(DisasContext *s, arg_VNMLS_dp *a)
156
return do_vfp_3op_dp(s, gen_VNMLS_dp, a->vd, a->vn, a->vm, true);
157
}
72
}
158
73
159
+static void gen_VNMLA_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
74
int main(int argc, char **argv)
160
+{
161
+ /* VNMLA: -fd + -(fn * fm) */
162
+ TCGv_i32 tmp = tcg_temp_new_i32();
163
+
164
+ gen_helper_vfp_mulh(tmp, vn, vm, fpst);
165
+ gen_helper_vfp_negh(tmp, tmp);
166
+ gen_helper_vfp_negh(vd, vd);
167
+ gen_helper_vfp_addh(vd, vd, tmp, fpst);
168
+ tcg_temp_free_i32(tmp);
169
+}
170
+
171
+static bool trans_VNMLA_hp(DisasContext *s, arg_VNMLA_sp *a)
172
+{
173
+ return do_vfp_3op_hp(s, gen_VNMLA_hp, a->vd, a->vn, a->vm, true);
174
+}
175
+
176
static void gen_VNMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
177
{
178
/* VNMLA: -fd + -(fn * fm) */
179
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_dp(DisasContext *s, arg_VMUL_dp *a)
180
return do_vfp_3op_dp(s, gen_helper_vfp_muld, a->vd, a->vn, a->vm, false);
181
}
182
183
+static void gen_VNMUL_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
184
+{
185
+ /* VNMUL: -(fn * fm) */
186
+ gen_helper_vfp_mulh(vd, vn, vm, fpst);
187
+ gen_helper_vfp_negh(vd, vd);
188
+}
189
+
190
+static bool trans_VNMUL_hp(DisasContext *s, arg_VNMUL_sp *a)
191
+{
192
+ return do_vfp_3op_hp(s, gen_VNMUL_hp, a->vd, a->vn, a->vm, false);
193
+}
194
+
195
static void gen_VNMUL_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
196
{
197
/* VNMUL: -(fn * fm) */
198
--
75
--
199
2.20.1
76
2.20.1
200
77
201
78
1
Implement the fp16 versions of the VFP VCVT instruction forms
1
From: Alex Chen <alex.chen@huawei.com>
2
which convert between floating point and integer with a specified
3
rounding mode.
4
2
3
We should use printf format specifier "%u" instead of "%d" for
4
argument of type "unsigned int".
5
6
Reported-by: Euler Robot <euler.robot@huawei.com>
7
Signed-off-by: Alex Chen <alex.chen@huawei.com>
8
Message-id: 20201126111109.112238-2-alex.chen@huawei.com
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200828183354.27913-17-peter.maydell@linaro.org
8
---
11
---
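As a standalone illustration of why the specifier matters (the program and the value below are made up for this note, not taken from the patch):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t reg = 0x80000000;    /* unsigned value above INT_MAX */
        /* "%d" tells printf to reinterpret the argument as a signed int;
         * for values above INT_MAX the output is misleading (typically a
         * large negative number, and formally undefined behaviour). */
        printf("[%d ?]\n", reg);
        /* "%u" matches the unsigned type and prints 2147483648. */
        printf("[%u ?]\n", reg);
        return 0;
    }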
9
target/arm/vfp-uncond.decode | 6 ++++--
12
hw/misc/imx25_ccm.c | 12 ++++++------
10
target/arm/translate-vfp.c.inc | 32 ++++++++++++++++++++++++--------
13
1 file changed, 6 insertions(+), 6 deletions(-)
11
2 files changed, 28 insertions(+), 10 deletions(-)
12
14
13
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
15
diff --git a/hw/misc/imx25_ccm.c b/hw/misc/imx25_ccm.c
14
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/vfp-uncond.decode
17
--- a/hw/misc/imx25_ccm.c
16
+++ b/target/arm/vfp-uncond.decode
18
+++ b/hw/misc/imx25_ccm.c
17
@@ -XXX,XX +XXX,XX @@ VRINT 1111 1110 1.11 10 rm:2 .... 1011 01.0 .... \
19
@@ -XXX,XX +XXX,XX @@ static const char *imx25_ccm_reg_name(uint32_t reg)
18
vm=%vm_dp vd=%vd_dp dp=1
20
case IMX25_CCM_LPIMR1_REG:
19
21
return "lpimr1";
20
# VCVT float to int with specified rounding mode; Vd is always single-precision
22
default:
21
+VCVT 1111 1110 1.11 11 rm:2 .... 1001 op:1 1.0 .... \
23
- sprintf(unknown, "[%d ?]", reg);
22
+ vm=%vm_sp vd=%vd_sp sz=1
24
+ sprintf(unknown, "[%u ?]", reg);
23
VCVT 1111 1110 1.11 11 rm:2 .... 1010 op:1 1.0 .... \
25
return unknown;
24
- vm=%vm_sp vd=%vd_sp dp=0
25
+ vm=%vm_sp vd=%vd_sp sz=2
26
VCVT 1111 1110 1.11 11 rm:2 .... 1011 op:1 1.0 .... \
27
- vm=%vm_dp vd=%vd_sp dp=1
28
+ vm=%vm_dp vd=%vd_sp sz=3
29
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/translate-vfp.c.inc
32
+++ b/target/arm/translate-vfp.c.inc
33
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
34
static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
35
{
36
uint32_t rd, rm;
37
- bool dp = a->dp;
38
+ int sz = a->sz;
39
TCGv_ptr fpst;
40
TCGv_i32 tcg_rmode, tcg_shift;
41
int rounding = fp_decode_rm[a->rm];
42
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
43
return false;
44
}
26
}
45
27
}
46
- if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
28
@@ -XXX,XX +XXX,XX @@ static uint32_t imx25_ccm_get_mpll_clk(IMXCCMState *dev)
47
+ if (sz == 3 && !dc_isar_feature(aa32_fpdp_v2, s)) {
29
freq = imx_ccm_calc_pll(s->reg[IMX25_CCM_MPCTL_REG], CKIH_FREQ);
48
+ return false;
49
+ }
50
+
51
+ if (sz == 1 && !dc_isar_feature(aa32_fp16_arith, s)) {
52
return false;
53
}
30
}
54
31
55
/* UNDEF accesses to D16-D31 if they don't exist */
32
- DPRINTF("freq = %d\n", freq);
56
- if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
33
+ DPRINTF("freq = %u\n", freq);
57
+ if (sz == 3 && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
34
58
return false;
35
return freq;
36
}
37
@@ -XXX,XX +XXX,XX @@ static uint32_t imx25_ccm_get_mcu_clk(IMXCCMState *dev)
38
39
freq = freq / (1 + EXTRACT(s->reg[IMX25_CCM_CCTL_REG], ARM_CLK_DIV));
40
41
- DPRINTF("freq = %d\n", freq);
42
+ DPRINTF("freq = %u\n", freq);
43
44
return freq;
45
}
46
@@ -XXX,XX +XXX,XX @@ static uint32_t imx25_ccm_get_ahb_clk(IMXCCMState *dev)
47
freq = imx25_ccm_get_mcu_clk(dev)
48
/ (1 + EXTRACT(s->reg[IMX25_CCM_CCTL_REG], AHB_CLK_DIV));
49
50
- DPRINTF("freq = %d\n", freq);
51
+ DPRINTF("freq = %u\n", freq);
52
53
return freq;
54
}
55
@@ -XXX,XX +XXX,XX @@ static uint32_t imx25_ccm_get_ipg_clk(IMXCCMState *dev)
56
57
freq = imx25_ccm_get_ahb_clk(dev) / 2;
58
59
- DPRINTF("freq = %d\n", freq);
60
+ DPRINTF("freq = %u\n", freq);
61
62
return freq;
63
}
64
@@ -XXX,XX +XXX,XX @@ static uint32_t imx25_ccm_get_clock_frequency(IMXCCMState *dev, IMXClk clock)
65
break;
59
}
66
}
60
67
61
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
68
- DPRINTF("Clock = %d) = %d\n", clock, freq);
62
return true;
69
+ DPRINTF("Clock = %d) = %u\n", clock, freq);
63
}
70
64
71
return freq;
65
- fpst = fpstatus_ptr(FPST_FPCR);
72
}
66
+ if (sz == 1) {
67
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
68
+ } else {
69
+ fpst = fpstatus_ptr(FPST_FPCR);
70
+ }
71
72
tcg_shift = tcg_const_i32(0);
73
74
tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
75
gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
76
77
- if (dp) {
78
+ if (sz == 3) {
79
TCGv_i64 tcg_double, tcg_res;
80
TCGv_i32 tcg_tmp;
81
tcg_double = tcg_temp_new_i64();
82
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
83
tcg_single = tcg_temp_new_i32();
84
tcg_res = tcg_temp_new_i32();
85
neon_load_reg32(tcg_single, rm);
86
- if (is_signed) {
87
- gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst);
88
+ if (sz == 1) {
89
+ if (is_signed) {
90
+ gen_helper_vfp_toslh(tcg_res, tcg_single, tcg_shift, fpst);
91
+ } else {
92
+ gen_helper_vfp_toulh(tcg_res, tcg_single, tcg_shift, fpst);
93
+ }
94
} else {
95
- gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
96
+ if (is_signed) {
97
+ gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst);
98
+ } else {
99
+ gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
100
+ }
101
}
102
neon_store_reg32(tcg_res, rd);
103
tcg_temp_free_i32(tcg_res);
104
--
73
--
105
2.20.1
74
2.20.1
106
75
107
76
diff view generated by jsdifflib
1
Convert the Neon VCVT with-specified-rounding-mode instructions
1
From: Alex Chen <alex.chen@huawei.com>
2
to gvec, and use this to implement fp16 support for them.
3
2
3
We should use printf format specifier "%u" instead of "%d" for
4
argument of type "unsigned int".
5
6
Reported-by: Euler Robot <euler.robot@huawei.com>
7
Signed-off-by: Alex Chen <alex.chen@huawei.com>
8
Message-id: 20201126111109.112238-3-alex.chen@huawei.com
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-40-peter.maydell@linaro.org
7
---
11
---
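For reference, the effect of the rounding mode these insns carry (the numbers are just an illustration, not taken from the patch): converting 2.5 to an integer gives 2 under FPROUNDING_TIEEVEN (VCVTN), 3 under FPROUNDING_TIEAWAY (VCVTA), 2 under FPROUNDING_NEGINF (VCVTM) and 3 under FPROUNDING_POSINF (VCVTP); for -2.5 the results are -2, -3, -3 and -2 respectively.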
8
target/arm/helper.h | 5 ++
12
hw/misc/imx31_ccm.c | 14 +++++++-------
9
target/arm/vec_helper.c | 23 +++++++
13
hw/misc/imx_ccm.c | 4 ++--
10
target/arm/translate-neon.c.inc | 105 ++++++++++++--------------------
14
2 files changed, 9 insertions(+), 9 deletions(-)
11
3 files changed, 66 insertions(+), 67 deletions(-)
12
15
13
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
diff --git a/hw/misc/imx31_ccm.c b/hw/misc/imx31_ccm.c
14
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
18
--- a/hw/misc/imx31_ccm.c
16
+++ b/target/arm/helper.h
19
+++ b/hw/misc/imx31_ccm.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_vcvt_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@ static const char *imx31_ccm_reg_name(uint32_t reg)
18
DEF_HELPER_FLAGS_4(gvec_vcvt_hs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
case IMX31_CCM_PDR2_REG:
19
DEF_HELPER_FLAGS_4(gvec_vcvt_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
return "PDR2";
20
23
default:
21
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_ss, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
- sprintf(unknown, "[%d ?]", reg);
22
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_us, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+ sprintf(unknown, "[%u ?]", reg);
23
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
return unknown;
24
+DEF_HELPER_FLAGS_4(gvec_vcvt_rm_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
}
25
+
28
}
26
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
@@ -XXX,XX +XXX,XX @@ static uint32_t imx31_ccm_get_pll_ref_clk(IMXCCMState *dev)
27
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
freq = CKIH_FREQ;
28
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
}
29
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
32
33
- DPRINTF("freq = %d\n", freq);
34
+ DPRINTF("freq = %u\n", freq);
35
36
return freq;
37
}
38
@@ -XXX,XX +XXX,XX @@ static uint32_t imx31_ccm_get_mpll_clk(IMXCCMState *dev)
39
freq = imx_ccm_calc_pll(s->reg[IMX31_CCM_MPCTL_REG],
40
imx31_ccm_get_pll_ref_clk(dev));
41
42
- DPRINTF("freq = %d\n", freq);
43
+ DPRINTF("freq = %u\n", freq);
44
45
return freq;
46
}
47
@@ -XXX,XX +XXX,XX @@ static uint32_t imx31_ccm_get_mcu_main_clk(IMXCCMState *dev)
48
freq = imx31_ccm_get_mpll_clk(dev);
49
}
50
51
- DPRINTF("freq = %d\n", freq);
52
+ DPRINTF("freq = %u\n", freq);
53
54
return freq;
55
}
56
@@ -XXX,XX +XXX,XX @@ static uint32_t imx31_ccm_get_hclk_clk(IMXCCMState *dev)
57
freq = imx31_ccm_get_mcu_main_clk(dev)
58
/ (1 + EXTRACT(s->reg[IMX31_CCM_PDR0_REG], MAX));
59
60
- DPRINTF("freq = %d\n", freq);
61
+ DPRINTF("freq = %u\n", freq);
62
63
return freq;
64
}
65
@@ -XXX,XX +XXX,XX @@ static uint32_t imx31_ccm_get_ipg_clk(IMXCCMState *dev)
66
freq = imx31_ccm_get_hclk_clk(dev)
67
/ (1 + EXTRACT(s->reg[IMX31_CCM_PDR0_REG], IPG));
68
69
- DPRINTF("freq = %d\n", freq);
70
+ DPRINTF("freq = %u\n", freq);
71
72
return freq;
73
}
74
@@ -XXX,XX +XXX,XX @@ static uint32_t imx31_ccm_get_clock_frequency(IMXCCMState *dev, IMXClk clock)
75
break;
76
}
77
78
- DPRINTF("Clock = %d) = %d\n", clock, freq);
79
+ DPRINTF("Clock = %d) = %u\n", clock, freq);
80
81
return freq;
82
}
83
diff --git a/hw/misc/imx_ccm.c b/hw/misc/imx_ccm.c
30
index XXXXXXX..XXXXXXX 100644
84
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/vec_helper.c
85
--- a/hw/misc/imx_ccm.c
32
+++ b/target/arm/vec_helper.c
86
+++ b/hw/misc/imx_ccm.c
33
@@ -XXX,XX +XXX,XX @@ DO_VCVT_FIXED(gvec_vcvt_hs, helper_vfp_toshh_round_to_zero, uint16_t)
87
@@ -XXX,XX +XXX,XX @@ uint32_t imx_ccm_get_clock_frequency(IMXCCMState *dev, IMXClk clock)
34
DO_VCVT_FIXED(gvec_vcvt_hu, helper_vfp_touhh_round_to_zero, uint16_t)
88
freq = klass->get_clock_frequency(dev, clock);
35
36
#undef DO_VCVT_FIXED
37
+
38
+#define DO_VCVT_RMODE(NAME, FUNC, TYPE) \
39
+ void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
40
+ { \
41
+ float_status *fpst = stat; \
42
+ intptr_t i, oprsz = simd_oprsz(desc); \
43
+ uint32_t rmode = simd_data(desc); \
44
+ uint32_t prev_rmode = get_float_rounding_mode(fpst); \
45
+ TYPE *d = vd, *n = vn; \
46
+ set_float_rounding_mode(rmode, fpst); \
47
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
48
+ d[i] = FUNC(n[i], 0, fpst); \
49
+ } \
50
+ set_float_rounding_mode(prev_rmode, fpst); \
51
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
52
+ }
53
+
54
+DO_VCVT_RMODE(gvec_vcvt_rm_ss, helper_vfp_tosls, uint32_t)
55
+DO_VCVT_RMODE(gvec_vcvt_rm_us, helper_vfp_touls, uint32_t)
56
+DO_VCVT_RMODE(gvec_vcvt_rm_sh, helper_vfp_toshh, uint16_t)
57
+DO_VCVT_RMODE(gvec_vcvt_rm_uh, helper_vfp_touhh, uint16_t)
58
+
59
+#undef DO_VCVT_RMODE
60
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/translate-neon.c.inc
63
+++ b/target/arm/translate-neon.c.inc
64
@@ -XXX,XX +XXX,XX @@ DO_VRINT(VRINTZ, FPROUNDING_ZERO)
65
DO_VRINT(VRINTM, FPROUNDING_NEGINF)
66
DO_VRINT(VRINTP, FPROUNDING_POSINF)
67
68
-static bool do_vcvt(DisasContext *s, arg_2misc *a, int rmode, bool is_signed)
69
-{
70
- /*
71
- * Handle a VCVT* operation by iterating 32 bits at a time,
72
- * with a specified rounding mode in operation.
73
- */
74
- int pass;
75
- TCGv_ptr fpst;
76
- TCGv_i32 tcg_rmode, tcg_shift;
77
-
78
- if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
79
- !arm_dc_feature(s, ARM_FEATURE_V8)) {
80
- return false;
81
+#define DO_VEC_RMODE(INSN, RMODE, OP) \
82
+ static void gen_##INSN(unsigned vece, uint32_t rd_ofs, \
83
+ uint32_t rm_ofs, \
84
+ uint32_t oprsz, uint32_t maxsz) \
85
+ { \
86
+ static gen_helper_gvec_2_ptr * const fns[4] = { \
87
+ NULL, \
88
+ gen_helper_gvec_##OP##h, \
89
+ gen_helper_gvec_##OP##s, \
90
+ NULL, \
91
+ }; \
92
+ TCGv_ptr fpst; \
93
+ fpst = fpstatus_ptr(vece == 1 ? FPST_STD_F16 : FPST_STD); \
94
+ tcg_gen_gvec_2_ptr(rd_ofs, rm_ofs, fpst, oprsz, maxsz, \
95
+ arm_rmode_to_sf(RMODE), fns[vece]); \
96
+ tcg_temp_free_ptr(fpst); \
97
+ } \
98
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
99
+ { \
100
+ if (!arm_dc_feature(s, ARM_FEATURE_V8)) { \
101
+ return false; \
102
+ } \
103
+ if (a->size == MO_16) { \
104
+ if (!dc_isar_feature(aa32_fp16_arith, s)) { \
105
+ return false; \
106
+ } \
107
+ } else if (a->size != MO_32) { \
108
+ return false; \
109
+ } \
110
+ return do_2misc_vec(s, a, gen_##INSN); \
111
}
89
}
112
90
113
- /* UNDEF accesses to D16-D31 if they don't exist. */
91
- DPRINTF("(clock = %d) = %d\n", clock, freq);
114
- if (!dc_isar_feature(aa32_simd_r32, s) &&
92
+ DPRINTF("(clock = %d) = %u\n", clock, freq);
115
- ((a->vd | a->vm) & 0x10)) {
93
116
- return false;
94
return freq;
117
- }
95
}
118
-
96
@@ -XXX,XX +XXX,XX @@ uint32_t imx_ccm_calc_pll(uint32_t pllreg, uint32_t base_freq)
119
- if (a->size != 2) {
97
freq = ((2 * (base_freq >> 10) * (mfi * mfd + mfn)) /
120
- /* TODO: FP16 will be the size == 1 case */
98
(mfd * pd)) << 10;
121
- return false;
99
122
- }
100
- DPRINTF("(pllreg = 0x%08x, base_freq = %d) = %d\n", pllreg, base_freq,
123
-
101
+ DPRINTF("(pllreg = 0x%08x, base_freq = %u) = %d\n", pllreg, base_freq,
124
- if ((a->vd | a->vm) & a->q) {
102
freq);
125
- return false;
103
126
- }
104
return freq;
127
-
128
- if (!vfp_access_check(s)) {
129
- return true;
130
- }
131
-
132
- fpst = fpstatus_ptr(FPST_STD);
133
- tcg_shift = tcg_const_i32(0);
134
- tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
135
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
136
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
137
- TCGv_i32 tmp = neon_load_reg(a->vm, pass);
138
- if (is_signed) {
139
- gen_helper_vfp_tosls(tmp, tmp, tcg_shift, fpst);
140
- } else {
141
- gen_helper_vfp_touls(tmp, tmp, tcg_shift, fpst);
142
- }
143
- neon_store_reg(a->vd, pass, tmp);
144
- }
145
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
146
- tcg_temp_free_i32(tcg_rmode);
147
- tcg_temp_free_i32(tcg_shift);
148
- tcg_temp_free_ptr(fpst);
149
-
150
- return true;
151
-}
152
-
153
-#define DO_VCVT(INSN, RMODE, SIGNED) \
154
- static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
155
- { \
156
- return do_vcvt(s, a, RMODE, SIGNED); \
157
- }
158
-
159
-DO_VCVT(VCVTAU, FPROUNDING_TIEAWAY, false)
160
-DO_VCVT(VCVTAS, FPROUNDING_TIEAWAY, true)
161
-DO_VCVT(VCVTNU, FPROUNDING_TIEEVEN, false)
162
-DO_VCVT(VCVTNS, FPROUNDING_TIEEVEN, true)
163
-DO_VCVT(VCVTPU, FPROUNDING_POSINF, false)
164
-DO_VCVT(VCVTPS, FPROUNDING_POSINF, true)
165
-DO_VCVT(VCVTMU, FPROUNDING_NEGINF, false)
166
-DO_VCVT(VCVTMS, FPROUNDING_NEGINF, true)
167
+DO_VEC_RMODE(VCVTAU, FPROUNDING_TIEAWAY, vcvt_rm_u)
168
+DO_VEC_RMODE(VCVTAS, FPROUNDING_TIEAWAY, vcvt_rm_s)
169
+DO_VEC_RMODE(VCVTNU, FPROUNDING_TIEEVEN, vcvt_rm_u)
170
+DO_VEC_RMODE(VCVTNS, FPROUNDING_TIEEVEN, vcvt_rm_s)
171
+DO_VEC_RMODE(VCVTPU, FPROUNDING_POSINF, vcvt_rm_u)
172
+DO_VEC_RMODE(VCVTPS, FPROUNDING_POSINF, vcvt_rm_s)
173
+DO_VEC_RMODE(VCVTMU, FPROUNDING_NEGINF, vcvt_rm_u)
174
+DO_VEC_RMODE(VCVTMS, FPROUNDING_NEGINF, vcvt_rm_s)
175
176
static bool trans_VSWP(DisasContext *s, arg_2misc *a)
177
{
178
--
105
--
179
2.20.1
106
2.20.1
180
107
181
108
1
Convert the Neon VRINTX insn to use gvec, and use this to implement
1
From: Alex Chen <alex.chen@huawei.com>
2
fp16 support for it.
3
2
3
We should use printf format specifier "%u" instead of "%d" for
4
argument of type "unsigned int".
5
6
Reported-by: Euler Robot <euler.robot@huawei.com>
7
Signed-off-by: Alex Chen <alex.chen@huawei.com>
8
Message-id: 20201126111109.112238-4-alex.chen@huawei.com
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-42-peter.maydell@linaro.org
7
---
11
---
8
target/arm/helper.h | 3 +++
12
hw/misc/imx6_ccm.c | 20 ++++++++++----------
9
target/arm/vec_helper.c | 3 +++
13
hw/misc/imx6_src.c | 2 +-
10
target/arm/translate-neon.c.inc | 45 +++------------------------------
14
2 files changed, 11 insertions(+), 11 deletions(-)
11
3 files changed, 9 insertions(+), 42 deletions(-)
12
15
13
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
diff --git a/hw/misc/imx6_ccm.c b/hw/misc/imx6_ccm.c
14
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
18
--- a/hw/misc/imx6_ccm.c
16
+++ b/target/arm/helper.h
19
+++ b/hw/misc/imx6_ccm.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_vcvt_rm_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@ static const char *imx6_ccm_reg_name(uint32_t reg)
18
DEF_HELPER_FLAGS_4(gvec_vrint_rm_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
case CCM_CMEOR:
19
DEF_HELPER_FLAGS_4(gvec_vrint_rm_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
return "CMEOR";
20
23
default:
21
+DEF_HELPER_FLAGS_4(gvec_vrintx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
- sprintf(unknown, "%d ?", reg);
22
+DEF_HELPER_FLAGS_4(gvec_vrintx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+ sprintf(unknown, "%u ?", reg);
23
+
26
return unknown;
24
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
}
25
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
}
26
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
@@ -XXX,XX +XXX,XX @@ static const char *imx6_analog_reg_name(uint32_t reg)
27
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
30
case USB_ANALOG_DIGPROG:
31
return "USB_ANALOG_DIGPROG";
32
default:
33
- sprintf(unknown, "%d ?", reg);
34
+ sprintf(unknown, "%u ?", reg);
35
return unknown;
36
}
37
}
38
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_analog_get_pll2_clk(IMX6CCMState *dev)
39
freq *= 20;
40
}
41
42
- DPRINTF("freq = %d\n", (uint32_t)freq);
43
+ DPRINTF("freq = %u\n", (uint32_t)freq);
44
45
return freq;
46
}
47
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_analog_get_pll2_pfd0_clk(IMX6CCMState *dev)
48
freq = imx6_analog_get_pll2_clk(dev) * 18
49
/ EXTRACT(dev->analog[CCM_ANALOG_PFD_528], PFD0_FRAC);
50
51
- DPRINTF("freq = %d\n", (uint32_t)freq);
52
+ DPRINTF("freq = %u\n", (uint32_t)freq);
53
54
return freq;
55
}
56
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_analog_get_pll2_pfd2_clk(IMX6CCMState *dev)
57
freq = imx6_analog_get_pll2_clk(dev) * 18
58
/ EXTRACT(dev->analog[CCM_ANALOG_PFD_528], PFD2_FRAC);
59
60
- DPRINTF("freq = %d\n", (uint32_t)freq);
61
+ DPRINTF("freq = %u\n", (uint32_t)freq);
62
63
return freq;
64
}
65
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_analog_get_periph_clk(IMX6CCMState *dev)
66
break;
67
}
68
69
- DPRINTF("freq = %d\n", (uint32_t)freq);
70
+ DPRINTF("freq = %u\n", (uint32_t)freq);
71
72
return freq;
73
}
74
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_ccm_get_ahb_clk(IMX6CCMState *dev)
75
freq = imx6_analog_get_periph_clk(dev)
76
/ (1 + EXTRACT(dev->ccm[CCM_CBCDR], AHB_PODF));
77
78
- DPRINTF("freq = %d\n", (uint32_t)freq);
79
+ DPRINTF("freq = %u\n", (uint32_t)freq);
80
81
return freq;
82
}
83
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_ccm_get_ipg_clk(IMX6CCMState *dev)
84
freq = imx6_ccm_get_ahb_clk(dev)
85
/ (1 + EXTRACT(dev->ccm[CCM_CBCDR], IPG_PODF));
86
87
- DPRINTF("freq = %d\n", (uint32_t)freq);
88
+ DPRINTF("freq = %u\n", (uint32_t)freq);
89
90
return freq;
91
}
92
@@ -XXX,XX +XXX,XX @@ static uint64_t imx6_ccm_get_per_clk(IMX6CCMState *dev)
93
freq = imx6_ccm_get_ipg_clk(dev)
94
/ (1 + EXTRACT(dev->ccm[CCM_CSCMR1], PERCLK_PODF));
95
96
- DPRINTF("freq = %d\n", (uint32_t)freq);
97
+ DPRINTF("freq = %u\n", (uint32_t)freq);
98
99
return freq;
100
}
101
@@ -XXX,XX +XXX,XX @@ static uint32_t imx6_ccm_get_clock_frequency(IMXCCMState *dev, IMXClk clock)
102
break;
103
}
104
105
- DPRINTF("Clock = %d) = %d\n", clock, freq);
106
+ DPRINTF("Clock = %d) = %u\n", clock, freq);
107
108
return freq;
109
}
110
diff --git a/hw/misc/imx6_src.c b/hw/misc/imx6_src.c
28
index XXXXXXX..XXXXXXX 100644
111
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/vec_helper.c
112
--- a/hw/misc/imx6_src.c
30
+++ b/target/arm/vec_helper.c
113
+++ b/hw/misc/imx6_src.c
31
@@ -XXX,XX +XXX,XX @@ DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
114
@@ -XXX,XX +XXX,XX @@ static const char *imx6_src_reg_name(uint32_t reg)
32
DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
115
case SRC_GPR10:
33
DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
116
return "SRC_GPR10";
34
117
default:
35
+DO_2OP(gvec_vrintx_h, float16_round_to_int, float16)
118
- sprintf(unknown, "%d ?", reg);
36
+DO_2OP(gvec_vrintx_s, float32_round_to_int, float32)
119
+ sprintf(unknown, "%u ?", reg);
37
+
120
return unknown;
38
DO_2OP(gvec_sitos, helper_vfp_sitos, int32_t)
121
}
39
DO_2OP(gvec_uitos, helper_vfp_uitos, uint32_t)
40
DO_2OP(gvec_tosizs, helper_vfp_tosizs, float32)
41
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
42
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/translate-neon.c.inc
44
+++ b/target/arm/translate-neon.c.inc
45
@@ -XXX,XX +XXX,XX @@ static bool trans_VQNEG(DisasContext *s, arg_2misc *a)
46
return do_2misc(s, a, fn[a->size]);
47
}
122
}
48
49
-static bool do_2misc_fp(DisasContext *s, arg_2misc *a,
50
- NeonGenOneSingleOpFn *fn)
51
-{
52
- int pass;
53
- TCGv_ptr fpst;
54
-
55
- /* Handle a 2-reg-misc operation by iterating 32 bits at a time */
56
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
57
- return false;
58
- }
59
-
60
- /* UNDEF accesses to D16-D31 if they don't exist. */
61
- if (!dc_isar_feature(aa32_simd_r32, s) &&
62
- ((a->vd | a->vm) & 0x10)) {
63
- return false;
64
- }
65
-
66
- if (a->size != 2) {
67
- /* TODO: FP16 will be the size == 1 case */
68
- return false;
69
- }
70
-
71
- if ((a->vd | a->vm) & a->q) {
72
- return false;
73
- }
74
-
75
- if (!vfp_access_check(s)) {
76
- return true;
77
- }
78
-
79
- fpst = fpstatus_ptr(FPST_STD);
80
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
81
- TCGv_i32 tmp = neon_load_reg(a->vm, pass);
82
- fn(tmp, tmp, fpst);
83
- neon_store_reg(a->vd, pass, tmp);
84
- }
85
- tcg_temp_free_ptr(fpst);
86
-
87
- return true;
88
-}
89
-
90
#define DO_2MISC_FP_VEC(INSN, HFUNC, SFUNC) \
91
static void gen_##INSN(unsigned vece, uint32_t rd_ofs, \
92
uint32_t rm_ofs, \
93
@@ -XXX,XX +XXX,XX @@ DO_2MISC_FP_VEC(VCVT_FU, gen_helper_gvec_ustoh, gen_helper_gvec_uitos)
94
DO_2MISC_FP_VEC(VCVT_SF, gen_helper_gvec_tosszh, gen_helper_gvec_tosizs)
95
DO_2MISC_FP_VEC(VCVT_UF, gen_helper_gvec_touszh, gen_helper_gvec_touizs)
96
97
+DO_2MISC_FP_VEC(VRINTX_impl, gen_helper_gvec_vrintx_h, gen_helper_gvec_vrintx_s)
98
+
99
static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
100
{
101
if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
102
return false;
103
}
104
- return do_2misc_fp(s, a, gen_helper_rints_exact);
105
+ return trans_VRINTX_impl(s, a);
106
}
107
108
#define DO_VEC_RMODE(INSN, RMODE, OP) \
109
--
123
--
110
2.20.1
124
2.20.1
111
125
112
126
1
From: Graeme Gregory <graeme@nuviainc.com>
1
From: Alex Chen <alex.chen@huawei.com>
2
2
3
Add the previously created sbsa-ec device to the sbsa-ref machine in
3
We should use printf format specifier "%u" instead of "%d" for
4
secure memory so the PSCI implementation in ARM-TF can access it, but
4
argument of type "unsigned int".
5
not expose it to non secure firmware or OS except by via ARM-TF.
6
5
7
Signed-off-by: Graeme Gregory <graeme@nuviainc.com>
6
Reported-by: Euler Robot <euler.robot@huawei.com>
8
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
7
Signed-off-by: Alex Chen <alex.chen@huawei.com>
9
Tested-by: Leif Lindholm <leif@nuviainc.com>
8
Message-id: 20201126111109.112238-5-alex.chen@huawei.com
10
Message-id: 20200826141952.136164-3-graeme@nuviainc.com
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
11
---
14
hw/arm/sbsa-ref.c | 14 ++++++++++++++
12
hw/misc/imx6ul_ccm.c | 4 ++--
15
1 file changed, 14 insertions(+)
13
1 file changed, 2 insertions(+), 2 deletions(-)
16
14
17
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
15
diff --git a/hw/misc/imx6ul_ccm.c b/hw/misc/imx6ul_ccm.c
18
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
19
--- a/hw/arm/sbsa-ref.c
17
--- a/hw/misc/imx6ul_ccm.c
20
+++ b/hw/arm/sbsa-ref.c
18
+++ b/hw/misc/imx6ul_ccm.c
21
@@ -XXX,XX +XXX,XX @@ enum {
19
@@ -XXX,XX +XXX,XX @@ static const char *imx6ul_ccm_reg_name(uint32_t reg)
22
SBSA_CPUPERIPHS,
20
case CCM_CMEOR:
23
SBSA_GIC_DIST,
21
return "CMEOR";
24
SBSA_GIC_REDIST,
22
default:
25
+ SBSA_SECURE_EC,
23
- sprintf(unknown, "%d ?", reg);
26
SBSA_SMMU,
24
+ sprintf(unknown, "%u ?", reg);
27
SBSA_UART,
25
return unknown;
28
SBSA_RTC,
26
}
29
@@ -XXX,XX +XXX,XX @@ static const MemMapEntry sbsa_ref_memmap[] = {
30
[SBSA_CPUPERIPHS] = { 0x40000000, 0x00040000 },
31
[SBSA_GIC_DIST] = { 0x40060000, 0x00010000 },
32
[SBSA_GIC_REDIST] = { 0x40080000, 0x04000000 },
33
+ [SBSA_SECURE_EC] = { 0x50000000, 0x00001000 },
34
[SBSA_UART] = { 0x60000000, 0x00001000 },
35
[SBSA_RTC] = { 0x60010000, 0x00001000 },
36
[SBSA_GPIO] = { 0x60020000, 0x00001000 },
37
@@ -XXX,XX +XXX,XX @@ static void *sbsa_ref_dtb(const struct arm_boot_info *binfo, int *fdt_size)
38
return board->fdt;
39
}
27
}
40
28
@@ -XXX,XX +XXX,XX @@ static const char *imx6ul_analog_reg_name(uint32_t reg)
41
+static void create_secure_ec(MemoryRegion *mem)
29
case USB_ANALOG_DIGPROG:
42
+{
30
return "USB_ANALOG_DIGPROG";
43
+ hwaddr base = sbsa_ref_memmap[SBSA_SECURE_EC].base;
31
default:
44
+ DeviceState *dev = qdev_new("sbsa-ec");
32
- sprintf(unknown, "%d ?", reg);
45
+ SysBusDevice *s = SYS_BUS_DEVICE(dev);
33
+ sprintf(unknown, "%u ?", reg);
46
+
34
return unknown;
47
+ memory_region_add_subregion(mem, base,
35
}
48
+ sysbus_mmio_get_region(s, 0));
36
}
49
+}
50
+
51
static void sbsa_ref_init(MachineState *machine)
52
{
53
unsigned int smp_cpus = machine->smp.cpus;
54
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
55
56
create_pcie(sms);
57
58
+ create_secure_ec(secure_sysmem);
59
+
60
sms->bootinfo.ram_size = machine->ram_size;
61
sms->bootinfo.nb_cpus = smp_cpus;
62
sms->bootinfo.board_id = -1;
63
--
37
--
64
2.20.1
38
2.20.1
65
39
66
40
1
Convert the neon floating-point vector absolute comparison ops
1
For M-profile CPUs, the range from 0xe0000000 to 0xe00fffff is the
2
VACGE and VACGT over to using a gvec helper and use this to
2
Private Peripheral Bus range, which includes all of the memory mapped
3
implement the fp16 case.
3
devices and registers that are part of the CPU itself, including the
4
NVIC, systick timer, and debug and trace components like the Data
5
Watchpoint and Trace unit (DWT). Within this large region, the range
6
0xe000e000 to 0xe000efff is the System Control Space (NVIC, system
7
registers, systick) and 0xe002e000 to 0xe002efff is its Non-secure
8
alias.
9
10
The architecture is clear that within the SCS unimplemented registers
11
should be RES0 for privileged accesses and generate BusFault for
12
unprivileged accesses, and we currently implement this.
13
14
It is less clear about how to handle accesses to unimplemented
15
regions of the wider PPB. Unprivileged accesses should definitely
16
cause BusFaults (R_DQQS), but the behaviour of privileged accesses is
17
not given as a general rule. However, the register definitions of
18
individual registers for components like the DWT all state that they
19
are RES0 if the relevant component is not implemented, so the
20
simplest way to achieve that is to provide RAZ/WI for the whole range
21
for privileged accesses. (The v7M Arm ARM does say that reserved
22
registers should be UNK/SBZP.)
23
24
Expand the container MemoryRegion that the NVIC exposes so that
25
it covers the whole PPB space. This means:
26
* moving the address that the ARMV7M device maps it to down by
27
0xe000 bytes
28
* moving the offsets within the container of all the
29
subregions forward by 0xe000 bytes
30
* adding a new default MemoryRegion that covers the whole container
31
at a lower priority than anything else and which provides the
32
RAZWI/BusFault behaviour
4
33
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
34
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
35
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200828183354.27913-28-peter.maydell@linaro.org
36
Message-id: 20201119215617.29887-2-peter.maydell@linaro.org
8
---
37
---
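The memory region layering described above uses QEMU's usual "background region at negative priority" idiom. Pulling just the container/priority calls out of the hunk below so they can be read contiguously (offsets, sizes and priorities exactly as in the patch):

    /* container spans the whole 1MB System PPB */
    memory_region_init(&s->container, OBJECT(s), "nvic", 0x100000);
    /* priority -1: default region giving RAZ/WI to privileged accesses
     * and BusFault to unprivileged ones, for anything not otherwise mapped */
    memory_region_init_io(&s->defaultmem, OBJECT(s), &ppb_default_ops, s,
                          "nvic-default", 0x100000);
    memory_region_add_subregion_overlap(&s->container, 0, &s->defaultmem, -1);
    /* priority 0: the SCS system registers, now at +0xe000 in the container */
    memory_region_add_subregion(&s->container, 0xe000, &s->sysregmem);
    /* priority 1: systick, a small window overlaying part of the SCS */
    memory_region_add_subregion_overlap(&s->container, 0xe010, &s->systickmem, 1);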
9
target/arm/helper.h | 6 ++++++
38
include/hw/intc/armv7m_nvic.h | 1 +
10
target/arm/vec_helper.c | 26 ++++++++++++++++++++++++++
39
hw/arm/armv7m.c | 2 +-
11
target/arm/translate-neon.c.inc | 4 ++--
40
hw/intc/armv7m_nvic.c | 78 ++++++++++++++++++++++++++++++-----
12
3 files changed, 34 insertions(+), 2 deletions(-)
41
3 files changed, 69 insertions(+), 12 deletions(-)
13
42
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
43
diff --git a/include/hw/intc/armv7m_nvic.h b/include/hw/intc/armv7m_nvic.h
15
index XXXXXXX..XXXXXXX 100644
44
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
45
--- a/include/hw/intc/armv7m_nvic.h
17
+++ b/target/arm/helper.h
46
+++ b/include/hw/intc/armv7m_nvic.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
47
@@ -XXX,XX +XXX,XX @@ struct NVICState {
19
DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
MemoryRegion systickmem;
20
DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
49
MemoryRegion systick_ns_mem;
21
50
MemoryRegion container;
22
+DEF_HELPER_FLAGS_5(gvec_facge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
51
+ MemoryRegion defaultmem;
23
+DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
52
24
+
53
uint32_t num_irq;
25
+DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
54
qemu_irq excpout;
26
+DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
55
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
27
+
28
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
29
void, ptr, ptr, ptr, ptr, i32)
30
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
31
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
32
index XXXXXXX..XXXXXXX 100644
56
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/vec_helper.c
57
--- a/hw/arm/armv7m.c
34
+++ b/target/arm/vec_helper.c
58
+++ b/hw/arm/armv7m.c
35
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_cgt(float32 op1, float32 op2, float_status *stat)
59
@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
36
return -float32_lt(op2, op1, stat);
60
sysbus_connect_irq(sbd, 0,
37
}
61
qdev_get_gpio_in(DEVICE(s->cpu), ARM_CPU_IRQ));
38
62
39
+static uint16_t float16_acge(float16 op1, float16 op2, float_status *stat)
63
- memory_region_add_subregion(&s->container, 0xe000e000,
64
+ memory_region_add_subregion(&s->container, 0xe0000000,
65
sysbus_mmio_get_region(sbd, 0));
66
67
for (i = 0; i < ARRAY_SIZE(s->bitband); i++) {
68
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
69
index XXXXXXX..XXXXXXX 100644
70
--- a/hw/intc/armv7m_nvic.c
71
+++ b/hw/intc/armv7m_nvic.c
72
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps nvic_systick_ops = {
73
.endianness = DEVICE_NATIVE_ENDIAN,
74
};
75
76
+/*
77
+ * Unassigned portions of the PPB space are RAZ/WI for privileged
78
+ * accesses, and fault for non-privileged accesses.
79
+ */
80
+static MemTxResult ppb_default_read(void *opaque, hwaddr addr,
81
+ uint64_t *data, unsigned size,
82
+ MemTxAttrs attrs)
40
+{
83
+{
41
+ return -float16_le(float16_abs(op2), float16_abs(op1), stat);
84
+ qemu_log_mask(LOG_UNIMP, "Read of unassigned area of PPB: offset 0x%x\n",
85
+ (uint32_t)addr);
86
+ if (attrs.user) {
87
+ return MEMTX_ERROR;
88
+ }
89
+ *data = 0;
90
+ return MEMTX_OK;
42
+}
91
+}
43
+
92
+
44
+static uint32_t float32_acge(float32 op1, float32 op2, float_status *stat)
93
+static MemTxResult ppb_default_write(void *opaque, hwaddr addr,
94
+ uint64_t value, unsigned size,
95
+ MemTxAttrs attrs)
45
+{
96
+{
46
+ return -float32_le(float32_abs(op2), float32_abs(op1), stat);
97
+ qemu_log_mask(LOG_UNIMP, "Write of unassigned area of PPB: offset 0x%x\n",
98
+ (uint32_t)addr);
99
+ if (attrs.user) {
100
+ return MEMTX_ERROR;
101
+ }
102
+ return MEMTX_OK;
47
+}
103
+}
48
+
104
+
49
+static uint16_t float16_acgt(float16 op1, float16 op2, float_status *stat)
105
+static const MemoryRegionOps ppb_default_ops = {
50
+{
106
+ .read_with_attrs = ppb_default_read,
51
+ return -float16_lt(float16_abs(op2), float16_abs(op1), stat);
107
+ .write_with_attrs = ppb_default_write,
52
+}
108
+ .endianness = DEVICE_NATIVE_ENDIAN,
109
+ .valid.min_access_size = 1,
110
+ .valid.max_access_size = 8,
111
+};
53
+
112
+
54
+static uint32_t float32_acgt(float32 op1, float32 op2, float_status *stat)
113
static int nvic_post_load(void *opaque, int version_id)
55
+{
114
{
56
+ return -float32_lt(float32_abs(op2), float32_abs(op1), stat);
115
NVICState *s = opaque;
57
+}
116
@@ -XXX,XX +XXX,XX @@ static void nvic_systick_trigger(void *opaque, int n, int level)
58
+
117
static void armv7m_nvic_realize(DeviceState *dev, Error **errp)
59
#define DO_2OP(NAME, FUNC, TYPE) \
118
{
60
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
119
NVICState *s = NVIC(dev);
61
{ \
120
- int regionlen;
62
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_fcge_s, float32_cge, float32)
121
63
DO_3OP(gvec_fcgt_h, float16_cgt, float16)
122
/* The armv7m container object will have set our CPU pointer */
64
DO_3OP(gvec_fcgt_s, float32_cgt, float32)
123
if (!s->cpu || !arm_feature(&s->cpu->env, ARM_FEATURE_M)) {
65
124
@@ -XXX,XX +XXX,XX @@ static void armv7m_nvic_realize(DeviceState *dev, Error **errp)
66
+DO_3OP(gvec_facge_h, float16_acge, float16)
125
M_REG_S));
67
+DO_3OP(gvec_facge_s, float32_acge, float32)
68
+
69
+DO_3OP(gvec_facgt_h, float16_acgt, float16)
70
+DO_3OP(gvec_facgt_s, float32_acgt, float32)
71
+
72
#ifdef TARGET_AARCH64
73
74
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
75
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
76
index XXXXXXX..XXXXXXX 100644
77
--- a/target/arm/translate-neon.c.inc
78
+++ b/target/arm/translate-neon.c.inc
79
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s, gen_helper_gvec_fmul_h)
80
DO_3S_FP_GVEC(VCEQ, gen_helper_gvec_fceq_s, gen_helper_gvec_fceq_h)
81
DO_3S_FP_GVEC(VCGE, gen_helper_gvec_fcge_s, gen_helper_gvec_fcge_h)
82
DO_3S_FP_GVEC(VCGT, gen_helper_gvec_fcgt_s, gen_helper_gvec_fcgt_h)
83
+DO_3S_FP_GVEC(VACGE, gen_helper_gvec_facge_s, gen_helper_gvec_facge_h)
84
+DO_3S_FP_GVEC(VACGT, gen_helper_gvec_facgt_s, gen_helper_gvec_facgt_h)
85
86
/*
87
* For all the functions using this macro, size == 1 means fp16,
88
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VCGT, gen_helper_gvec_fcgt_s, gen_helper_gvec_fcgt_h)
89
return do_3same_fp(s, a, FUNC, READS_VD); \
90
}
126
}
91
127
92
-DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
128
- /* The NVIC and System Control Space (SCS) starts at 0xe000e000
93
-DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
129
+ /*
94
DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
130
+ * This device provides a single sysbus memory region which
95
DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
131
+ * represents the whole of the "System PPB" space. This is the
132
+ * range from 0xe0000000 to 0xe00fffff and includes the NVIC,
133
+ * the System Control Space (system registers), the systick timer,
134
+ * and for CPUs with the Security extension an NS banked version
135
+ * of all of these.
136
+ *
137
+ * The default behaviour for unimplemented registers/ranges
138
+ * (for instance the Data Watchpoint and Trace unit at 0xe0001000)
139
+ * is to RAZ/WI for privileged access and BusFault for non-privileged
140
+ * access.
141
+ *
142
+ * The NVIC and System Control Space (SCS) starts at 0xe000e000
143
* and looks like this:
144
* 0x004 - ICTR
145
* 0x010 - 0xff - systick
146
@@ -XXX,XX +XXX,XX @@ static void armv7m_nvic_realize(DeviceState *dev, Error **errp)
147
* generally code determining which banked register to use should
148
* use attrs.secure; code determining actual behaviour of the system
149
* should use env->v7m.secure.
150
+ *
151
+ * The container covers the whole PPB space. Within it the priority
152
+ * of overlapping regions is:
153
+ * - default region (for RAZ/WI and BusFault) : -1
154
+ * - system register regions : 0
155
+ * - systick : 1
156
+ * This is because the systick device is a small block of registers
157
+ * in the middle of the other system control registers.
158
*/
159
- regionlen = arm_feature(&s->cpu->env, ARM_FEATURE_V8) ? 0x21000 : 0x1000;
160
- memory_region_init(&s->container, OBJECT(s), "nvic", regionlen);
161
- /* The system register region goes at the bottom of the priority
162
- * stack as it covers the whole page.
163
- */
164
+ memory_region_init(&s->container, OBJECT(s), "nvic", 0x100000);
165
+ memory_region_init_io(&s->defaultmem, OBJECT(s), &ppb_default_ops, s,
166
+ "nvic-default", 0x100000);
167
+ memory_region_add_subregion_overlap(&s->container, 0, &s->defaultmem, -1);
168
memory_region_init_io(&s->sysregmem, OBJECT(s), &nvic_sysreg_ops, s,
169
"nvic_sysregs", 0x1000);
170
- memory_region_add_subregion(&s->container, 0, &s->sysregmem);
171
+ memory_region_add_subregion(&s->container, 0xe000, &s->sysregmem);
172
173
memory_region_init_io(&s->systickmem, OBJECT(s),
174
&nvic_systick_ops, s,
175
"nvic_systick", 0xe0);
176
177
- memory_region_add_subregion_overlap(&s->container, 0x10,
178
+ memory_region_add_subregion_overlap(&s->container, 0xe010,
179
&s->systickmem, 1);
180
181
if (arm_feature(&s->cpu->env, ARM_FEATURE_V8)) {
182
memory_region_init_io(&s->sysreg_ns_mem, OBJECT(s),
183
&nvic_sysreg_ns_ops, &s->sysregmem,
184
"nvic_sysregs_ns", 0x1000);
185
- memory_region_add_subregion(&s->container, 0x20000, &s->sysreg_ns_mem);
186
+ memory_region_add_subregion(&s->container, 0x2e000, &s->sysreg_ns_mem);
187
memory_region_init_io(&s->systick_ns_mem, OBJECT(s),
188
&nvic_sysreg_ns_ops, &s->systickmem,
189
"nvic_systick_ns", 0xe0);
190
- memory_region_add_subregion_overlap(&s->container, 0x20010,
191
+ memory_region_add_subregion_overlap(&s->container, 0x2e010,
192
&s->systick_ns_mem, 1);
193
}
96
194
97
--
195
--
98
2.20.1
196
2.20.1
99
197
100
198
1
Convert the Neon VRINT-with-specified-rounding-mode insns to gvec,
1
In v8.1M the PXN architecture extension adds a new PXN bit to the
2
and use this to implement the fp16 versions.
2
MPU_RLAR registers, which forbids execution of code in the region
3
from a privileged mode.
4
5
This is another feature which is just in the generic "in v8.1M" set
6
and has no ID register field indicating its presence.
3
7
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-41-peter.maydell@linaro.org
10
Message-id: 20201119215617.29887-3-peter.maydell@linaro.org
7
---
11
---
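Spelling out the executability decision that the pmsav8_mpu_lookup() hunk below ends up with (the same logic as the patch, just shown contiguously with the operands labelled):

    bool pxn = false;
    if (arm_feature(env, ARM_FEATURE_V8_1M)) {
        /* PXN is bit 4 of the MPU_RLAR value for the matching region */
        pxn = extract32(env->pmsav8.rlar[secure][matchregion], 4, 1);
    }

    if (*prot                      /* region grants some access at this privilege */
        && !xn                     /* not eXecute Never for all modes */
        && !(pxn && !is_user)) {   /* PXN additionally blocks privileged execution */
        *prot |= PAGE_EXEC;
    }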
8
target/arm/helper.h | 4 +-
12
target/arm/helper.c | 7 ++++++-
9
target/arm/vec_helper.c | 21 +++++++++++
13
1 file changed, 6 insertions(+), 1 deletion(-)
10
target/arm/vfp_helper.c | 17 ---------
11
target/arm/translate-neon.c.inc | 67 +++------------------------------
12
4 files changed, 30 insertions(+), 79 deletions(-)
13
14
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
diff --git a/target/arm/helper.c b/target/arm/helper.c
15
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
17
--- a/target/arm/helper.c
17
+++ b/target/arm/helper.h
18
+++ b/target/arm/helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
19
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
19
DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
20
} else {
20
21
uint32_t ap = extract32(env->pmsav8.rbar[secure][matchregion], 1, 2);
21
DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
22
uint32_t xn = extract32(env->pmsav8.rbar[secure][matchregion], 0, 1);
22
-DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
23
+ bool pxn = false;
23
24
DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f32, TCG_CALL_NO_RWG, f32, f16, ptr, i32)
25
DEF_HELPER_FLAGS_3(vfp_fcvt_f32_to_f16, TCG_CALL_NO_RWG, f16, f32, ptr, i32)
26
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_vcvt_rm_us, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
DEF_HELPER_FLAGS_4(gvec_vcvt_rm_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
DEF_HELPER_FLAGS_4(gvec_vcvt_rm_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
30
+DEF_HELPER_FLAGS_4(gvec_vrint_rm_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(gvec_vrint_rm_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
+
24
+
33
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+ if (arm_feature(env, ARM_FEATURE_V8_1M)) {
34
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+ pxn = extract32(env->pmsav8.rlar[secure][matchregion], 4, 1);
35
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
+ }
36
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
28
37
index XXXXXXX..XXXXXXX 100644
29
if (m_is_system_region(env, address)) {
38
--- a/target/arm/vec_helper.c
30
/* System space is always execute never */
39
+++ b/target/arm/vec_helper.c
31
@@ -XXX,XX +XXX,XX @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
40
@@ -XXX,XX +XXX,XX @@ DO_VCVT_RMODE(gvec_vcvt_rm_sh, helper_vfp_toshh, uint16_t)
32
}
41
DO_VCVT_RMODE(gvec_vcvt_rm_uh, helper_vfp_touhh, uint16_t)
33
42
34
*prot = simple_ap_to_rw_prot(env, mmu_idx, ap);
43
#undef DO_VCVT_RMODE
35
- if (*prot && !xn) {
44
+
36
+ if (*prot && !xn && !(pxn && !is_user)) {
45
+#define DO_VRINT_RMODE(NAME, FUNC, TYPE) \
37
*prot |= PAGE_EXEC;
46
+ void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
38
}
47
+ { \
39
/* We don't need to look the attribute up in the MAIR0/MAIR1
48
+ float_status *fpst = stat; \
49
+ intptr_t i, oprsz = simd_oprsz(desc); \
50
+ uint32_t rmode = simd_data(desc); \
51
+ uint32_t prev_rmode = get_float_rounding_mode(fpst); \
52
+ TYPE *d = vd, *n = vn; \
53
+ set_float_rounding_mode(rmode, fpst); \
54
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
55
+ d[i] = FUNC(n[i], fpst); \
56
+ } \
57
+ set_float_rounding_mode(prev_rmode, fpst); \
58
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
59
+ }
60
+
61
+DO_VRINT_RMODE(gvec_vrint_rm_h, helper_rinth, uint16_t)
62
+DO_VRINT_RMODE(gvec_vrint_rm_s, helper_rints, uint32_t)
63
+
64
+#undef DO_VRINT_RMODE
65
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
66
index XXXXXXX..XXXXXXX 100644
67
--- a/target/arm/vfp_helper.c
68
+++ b/target/arm/vfp_helper.c
69
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(set_rmode)(uint32_t rmode, void *fpstp)
70
return prev_rmode;
71
}
72
73
-/* Set the current fp rounding mode in the standard fp status and return
74
- * the old one. This is for NEON instructions that need to change the
75
- * rounding mode but wish to use the standard FPSCR values for everything
76
- * else. Always set the rounding mode back to the correct value after
77
- * modifying it.
78
- * The argument is a softfloat float_round_ value.
79
- */
80
-uint32_t HELPER(set_neon_rmode)(uint32_t rmode, CPUARMState *env)
81
-{
82
- float_status *fp_status = &env->vfp.standard_fp_status;
83
-
84
- uint32_t prev_rmode = get_float_rounding_mode(fp_status);
85
- set_float_rounding_mode(rmode, fp_status);
86
-
87
- return prev_rmode;
88
-}
89
-
90
/* Half precision conversions. */
91
float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, void *fpstp, uint32_t ahp_mode)
92
{
93
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/translate-neon.c.inc
96
+++ b/target/arm/translate-neon.c.inc
97
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
98
return do_2misc_fp(s, a, gen_helper_rints_exact);
99
}
100
101
-static bool do_vrint(DisasContext *s, arg_2misc *a, int rmode)
102
-{
103
- /*
104
- * Handle a VRINT* operation by iterating 32 bits at a time,
105
- * with a specified rounding mode in operation.
106
- */
107
- int pass;
108
- TCGv_ptr fpst;
109
- TCGv_i32 tcg_rmode;
110
-
111
- if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
112
- !arm_dc_feature(s, ARM_FEATURE_V8)) {
113
- return false;
114
- }
115
-
116
- /* UNDEF accesses to D16-D31 if they don't exist. */
117
- if (!dc_isar_feature(aa32_simd_r32, s) &&
118
- ((a->vd | a->vm) & 0x10)) {
119
- return false;
120
- }
121
-
122
- if (a->size != 2) {
123
- /* TODO: FP16 will be the size == 1 case */
124
- return false;
125
- }
126
-
127
- if ((a->vd | a->vm) & a->q) {
128
- return false;
129
- }
130
-
131
- if (!vfp_access_check(s)) {
132
- return true;
133
- }
134
-
135
- fpst = fpstatus_ptr(FPST_STD);
136
- tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
137
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
138
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
139
- TCGv_i32 tmp = neon_load_reg(a->vm, pass);
140
- gen_helper_rints(tmp, tmp, fpst);
141
- neon_store_reg(a->vd, pass, tmp);
142
- }
143
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
144
- tcg_temp_free_i32(tcg_rmode);
145
- tcg_temp_free_ptr(fpst);
146
-
147
- return true;
148
-}
149
-
150
-#define DO_VRINT(INSN, RMODE) \
151
- static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
152
- { \
153
- return do_vrint(s, a, RMODE); \
154
- }
155
-
156
-DO_VRINT(VRINTN, FPROUNDING_TIEEVEN)
157
-DO_VRINT(VRINTA, FPROUNDING_TIEAWAY)
158
-DO_VRINT(VRINTZ, FPROUNDING_ZERO)
159
-DO_VRINT(VRINTM, FPROUNDING_NEGINF)
160
-DO_VRINT(VRINTP, FPROUNDING_POSINF)
161
-
162
#define DO_VEC_RMODE(INSN, RMODE, OP) \
163
static void gen_##INSN(unsigned vece, uint32_t rd_ofs, \
164
uint32_t rm_ofs, \
165
@@ -XXX,XX +XXX,XX @@ DO_VEC_RMODE(VCVTPS, FPROUNDING_POSINF, vcvt_rm_s)
166
DO_VEC_RMODE(VCVTMU, FPROUNDING_NEGINF, vcvt_rm_u)
167
DO_VEC_RMODE(VCVTMS, FPROUNDING_NEGINF, vcvt_rm_s)
168
169
+DO_VEC_RMODE(VRINTN, FPROUNDING_TIEEVEN, vrint_rm_)
170
+DO_VEC_RMODE(VRINTA, FPROUNDING_TIEAWAY, vrint_rm_)
171
+DO_VEC_RMODE(VRINTZ, FPROUNDING_ZERO, vrint_rm_)
172
+DO_VEC_RMODE(VRINTM, FPROUNDING_NEGINF, vrint_rm_)
173
+DO_VEC_RMODE(VRINTP, FPROUNDING_POSINF, vrint_rm_)
174
+
175
static bool trans_VSWP(DisasContext *s, arg_2misc *a)
176
{
177
TCGv_i64 rm, rd;
178
--
40
--
179
2.20.1
41
2.20.1
180
42
181
43
1
Convert the Neon pairwise fp ops to use a single gvec-style
1
In arm_cpu_realizefn() we check whether the board code disabled EL3
2
helper to do the full operation instead of one helper call
2
via the has_el3 CPU object property, which we create if the CPU
3
for each 32-bit part. This allows us to use the same
3
starts with the ARM_FEATURE_EL3 feature bit. If it is disabled, then
4
framework to implement the fp16 versions.
4
we turn off ARM_FEATURE_EL3 and also zero out the relevant fields in
5
the ID_PFR1 and ID_AA64PFR0 registers.
6
7
This codepath was incorrectly being taken for M-profile CPUs, which
8
do not have an EL3 and don't set ARM_FEATURE_EL3, but which may have
9
the M-profile Security extension and so should have non-zero values
10
in the ID_PFR1.Security field.
11
12
Restrict the handling of the feature flag to A/R-profile cores.
5
13
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200828183354.27913-36-peter.maydell@linaro.org
16
Message-id: 20201119215617.29887-4-peter.maydell@linaro.org
9
---
17
---
10
target/arm/helper.h | 7 +++++
18
target/arm/cpu.c | 2 +-
11
target/arm/vec_helper.c | 45 +++++++++++++++++++++++++++++++++
19
1 file changed, 1 insertion(+), 1 deletion(-)
12
target/arm/translate-neon.c.inc | 42 ++++++++++++------------------
13
3 files changed, 68 insertions(+), 26 deletions(-)
14
20
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
21
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
16
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.h
23
--- a/target/arm/cpu.c
18
+++ b/target/arm/helper.h
24
+++ b/target/arm/cpu.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
25
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
20
DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
26
}
21
void, ptr, ptr, ptr, ptr, i32)
22
23
+DEF_HELPER_FLAGS_5(neon_paddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_5(neon_pmaxh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_5(neon_pminh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(neon_padds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_5(neon_pmaxs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_5(neon_pmins, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+
30
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/vec_helper.c
36
+++ b/target/arm/vec_helper.c
37
@@ -XXX,XX +XXX,XX @@ DO_ABA(gvec_uaba_s, uint32_t)
38
DO_ABA(gvec_uaba_d, uint64_t)
39
40
#undef DO_ABA
41
+
42
+#define DO_NEON_PAIRWISE(NAME, OP) \
43
+ void HELPER(NAME##s)(void *vd, void *vn, void *vm, \
44
+ void *stat, uint32_t oprsz) \
45
+ { \
46
+ float_status *fpst = stat; \
47
+ float32 *d = vd; \
48
+ float32 *n = vn; \
49
+ float32 *m = vm; \
50
+ float32 r0, r1; \
51
+ \
52
+ /* Read all inputs before writing outputs in case vm == vd */ \
53
+ r0 = float32_##OP(n[H4(0)], n[H4(1)], fpst); \
54
+ r1 = float32_##OP(m[H4(0)], m[H4(1)], fpst); \
55
+ \
56
+ d[H4(0)] = r0; \
57
+ d[H4(1)] = r1; \
58
+ } \
59
+ \
60
+ void HELPER(NAME##h)(void *vd, void *vn, void *vm, \
61
+ void *stat, uint32_t oprsz) \
62
+ { \
63
+ float_status *fpst = stat; \
64
+ float16 *d = vd; \
65
+ float16 *n = vn; \
66
+ float16 *m = vm; \
67
+ float16 r0, r1, r2, r3; \
68
+ \
69
+ /* Read all inputs before writing outputs in case vm == vd */ \
70
+ r0 = float16_##OP(n[H2(0)], n[H2(1)], fpst); \
71
+ r1 = float16_##OP(n[H2(2)], n[H2(3)], fpst); \
72
+ r2 = float16_##OP(m[H2(0)], m[H2(1)], fpst); \
73
+ r3 = float16_##OP(m[H2(2)], m[H2(3)], fpst); \
74
+ \
75
+ d[H4(0)] = r0; \
76
+ d[H4(1)] = r1; \
77
+ d[H4(2)] = r2; \
78
+ d[H4(3)] = r3; \
79
+ }
80
+
81
+DO_NEON_PAIRWISE(neon_padd, add)
82
+DO_NEON_PAIRWISE(neon_pmax, max)
83
+DO_NEON_PAIRWISE(neon_pmin, min)
84
+
85
+#undef DO_NEON_PAIRWISE
86
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/translate-neon.c.inc
89
+++ b/target/arm/translate-neon.c.inc
90
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
91
return do_3same(s, a, gen_VMINNM_fp32_3s);
92
}
93
94
-static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
95
+static bool do_3same_fp_pair(DisasContext *s, arg_3same *a,
96
+ gen_helper_gvec_3_ptr *fn)
97
{
98
- /* FP operations handled pairwise 32 bits at a time */
99
- TCGv_i32 tmp, tmp2, tmp3;
100
+ /* FP pairwise operations */
101
TCGv_ptr fpstatus;
102
103
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
104
@@ -XXX,XX +XXX,XX @@ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
105
106
assert(a->q == 0); /* enforced by decode patterns */
107
108
- /*
109
- * Note that we have to be careful not to clobber the source operands
110
- * in the "vm == vd" case by storing the result of the first pass too
111
- * early. Since Q is 0 there are always just two passes, so instead
112
- * of a complicated loop over each pass we just unroll.
113
- */
114
- fpstatus = fpstatus_ptr(FPST_STD);
115
- tmp = neon_load_reg(a->vn, 0);
116
- tmp2 = neon_load_reg(a->vn, 1);
117
- fn(tmp, tmp, tmp2, fpstatus);
118
- tcg_temp_free_i32(tmp2);
119
120
- tmp3 = neon_load_reg(a->vm, 0);
121
- tmp2 = neon_load_reg(a->vm, 1);
122
- fn(tmp3, tmp3, tmp2, fpstatus);
123
- tcg_temp_free_i32(tmp2);
124
+ fpstatus = fpstatus_ptr(a->size != 0 ? FPST_STD_F16 : FPST_STD);
125
+ tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
126
+ vfp_reg_offset(1, a->vn),
127
+ vfp_reg_offset(1, a->vm),
128
+ fpstatus, 8, 8, 0, fn);
129
tcg_temp_free_ptr(fpstatus);
130
131
- neon_store_reg(a->vd, 0, tmp);
132
- neon_store_reg(a->vd, 1, tmp3);
133
return true;
134
}
135
136
@@ -XXX,XX +XXX,XX @@ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
137
static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
138
{ \
139
if (a->size != 0) { \
140
- /* TODO fp16 support */ \
141
- return false; \
142
+ if (!dc_isar_feature(aa32_fp16_arith, s)) { \
143
+ return false; \
144
+ } \
145
+ return do_3same_fp_pair(s, a, FUNC##h); \
146
} \
147
- return do_3same_fp_pair(s, a, FUNC); \
148
+ return do_3same_fp_pair(s, a, FUNC##s); \
149
}
27
}
150
28
151
-DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
29
- if (!cpu->has_el3) {
152
-DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
30
+ if (!arm_feature(env, ARM_FEATURE_M) && !cpu->has_el3) {
153
-DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
31
/* If the has_el3 CPU property is disabled then we need to disable the
154
+DO_3S_FP_PAIR(VPADD, gen_helper_neon_padd)
32
* feature.
155
+DO_3S_FP_PAIR(VPMAX, gen_helper_neon_pmax)
33
*/
156
+DO_3S_FP_PAIR(VPMIN, gen_helper_neon_pmin)
157
158
static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
159
{
160
--
34
--
161
2.20.1
35
2.20.1
162
36
163
37
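As a companion to the pairwise-op conversion in the patch above: the defining property of the Neon floating-point pairwise instructions is that each output lane is built from an adjacent pair of input lanes, so every input must be read before any output is written in case Vd aliases Vm. A minimal standalone sketch in plain C (illustrative only, not QEMU code; float lanes stand in for the fp16/fp32 helpers):

#include <stdio.h>

/* Pairwise op over two 2-lane vectors: d = { op(n0,n1), op(m0,m1) }. */
static void pairwise_add_2lane(float *d, const float *n, const float *m)
{
    /* Read all inputs before writing outputs, in case m == d or n == d. */
    float r0 = n[0] + n[1];
    float r1 = m[0] + m[1];

    d[0] = r0;
    d[1] = r1;
}

int main(void)
{
    float v[2] = { 1.0f, 2.0f };
    float w[2] = { 10.0f, 20.0f };

    /* Aliased case: VPADD with Vd == Vm still sees the old contents of w. */
    pairwise_add_2lane(w, v, w);
    printf("%g %g\n", w[0], w[1]);   /* prints 3 30 */
    return 0;
}

With Q=0 there are only ever two passes, which is why both the old unrolled TCG sequence and the new helpers simply read r0/r1 up front before storing.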
1
Implement VFP fp16 for VABS, VNEG and VSQRT. This is all
1
Implement the v8.1M VSCCLRM insn, which zeros floating point
2
the fp16 insns that use the DO_VFP_2OP macro, because there
2
registers if there is an active floating point context.
3
is no fp16 version of VMOV_reg.
3
This requires support in write_neon_element32() for the MO_32
4
4
element size, so add it.
5
Notes:
5
6
* the gen_helper_vfp_negh already exists as we needed to create
6
Because we want to use arm_gen_condlabel(), we need to move
7
it for the fp16 multiply-add insns
7
the definition of that function up in translate.c so it is
8
* as usual we need to use the f16 version of the fp_status;
8
before the #include of translate-vfp.c.inc.
9
this is only relevant for VSQRT
10
9
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Message-id: 20200828183354.27913-9-peter.maydell@linaro.org
12
Message-id: 20201119215617.29887-5-peter.maydell@linaro.org
14
---
13
---
15
target/arm/helper.h | 2 ++
14
target/arm/cpu.h | 9 ++++
16
target/arm/vfp.decode | 3 +++
15
target/arm/m-nocp.decode | 8 +++-
17
target/arm/vfp_helper.c | 10 +++++++++
16
target/arm/translate.c | 21 +++++----
18
target/arm/translate-vfp.c.inc | 40 ++++++++++++++++++++++++++++++++++
17
target/arm/translate-vfp.c.inc | 84 ++++++++++++++++++++++++++++++++++
19
4 files changed, 55 insertions(+)
18
4 files changed, 111 insertions(+), 11 deletions(-)
20
19
21
diff --git a/target/arm/helper.h b/target/arm/helper.h
20
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
22
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/helper.h
22
--- a/target/arm/cpu.h
24
+++ b/target/arm/helper.h
23
+++ b/target/arm/cpu.h
25
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr)
24
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_mprofile(const ARMISARegisters *id)
26
DEF_HELPER_1(vfp_negh, f16, f16)
25
return FIELD_EX32(id->id_pfr1, ID_PFR1, MPROGMOD) != 0;
27
DEF_HELPER_1(vfp_negs, f32, f32)
26
}
28
DEF_HELPER_1(vfp_negd, f64, f64)
27
29
+DEF_HELPER_1(vfp_absh, f16, f16)
28
+static inline bool isar_feature_aa32_m_sec_state(const ARMISARegisters *id)
30
DEF_HELPER_1(vfp_abss, f32, f32)
31
DEF_HELPER_1(vfp_absd, f64, f64)
32
+DEF_HELPER_2(vfp_sqrth, f16, f16, env)
33
DEF_HELPER_2(vfp_sqrts, f32, f32, env)
34
DEF_HELPER_2(vfp_sqrtd, f64, f64, env)
35
DEF_HELPER_3(vfp_cmps, void, f32, f32, env)
36
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
37
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/vfp.decode
39
+++ b/target/arm/vfp.decode
40
@@ -XXX,XX +XXX,XX @@ VMOV_imm_dp ---- 1110 1.11 .... .... 1011 0000 .... \
41
VMOV_reg_sp ---- 1110 1.11 0000 .... 1010 01.0 .... @vfp_dm_ss
42
VMOV_reg_dp ---- 1110 1.11 0000 .... 1011 01.0 .... @vfp_dm_dd
43
44
+VABS_hp ---- 1110 1.11 0000 .... 1001 11.0 .... @vfp_dm_ss
45
VABS_sp ---- 1110 1.11 0000 .... 1010 11.0 .... @vfp_dm_ss
46
VABS_dp ---- 1110 1.11 0000 .... 1011 11.0 .... @vfp_dm_dd
47
48
+VNEG_hp ---- 1110 1.11 0001 .... 1001 01.0 .... @vfp_dm_ss
49
VNEG_sp ---- 1110 1.11 0001 .... 1010 01.0 .... @vfp_dm_ss
50
VNEG_dp ---- 1110 1.11 0001 .... 1011 01.0 .... @vfp_dm_dd
51
52
+VSQRT_hp ---- 1110 1.11 0001 .... 1001 11.0 .... @vfp_dm_ss
53
VSQRT_sp ---- 1110 1.11 0001 .... 1010 11.0 .... @vfp_dm_ss
54
VSQRT_dp ---- 1110 1.11 0001 .... 1011 11.0 .... @vfp_dm_dd
55
56
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/vfp_helper.c
59
+++ b/target/arm/vfp_helper.c
60
@@ -XXX,XX +XXX,XX @@ float64 VFP_HELPER(neg, d)(float64 a)
61
return float64_chs(a);
62
}
63
64
+dh_ctype_f16 VFP_HELPER(abs, h)(dh_ctype_f16 a)
65
+{
29
+{
66
+ return float16_abs(a);
30
+ /*
31
+ * Return true if M-profile state handling insns
32
+ * (VSCCLRM, CLRM, FPCTX access insns) are implemented
33
+ */
34
+ return FIELD_EX32(id->id_pfr1, ID_PFR1, SECURITY) >= 3;
67
+}
35
+}
68
+
36
+
69
float32 VFP_HELPER(abs, s)(float32 a)
37
static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
70
{
38
{
71
return float32_abs(a);
39
/* Sadly this is encoded differently for A-profile and M-profile */
72
@@ -XXX,XX +XXX,XX @@ float64 VFP_HELPER(abs, d)(float64 a)
40
diff --git a/target/arm/m-nocp.decode b/target/arm/m-nocp.decode
73
return float64_abs(a);
41
index XXXXXXX..XXXXXXX 100644
74
}
42
--- a/target/arm/m-nocp.decode
75
43
+++ b/target/arm/m-nocp.decode
76
+dh_ctype_f16 VFP_HELPER(sqrt, h)(dh_ctype_f16 a, CPUARMState *env)
44
@@ -XXX,XX +XXX,XX @@
45
# If the coprocessor is not present or disabled then we will generate
46
# the NOCP exception; otherwise we let the insn through to the main decode.
47
48
+%vd_dp 22:1 12:4
49
+%vd_sp 12:4 22:1
50
+
51
&nocp cp
52
53
{
54
# Special cases which do not take an early NOCP: VLLDM and VLSTM
55
VLLDM_VLSTM 1110 1100 001 l:1 rn:4 0000 1010 0000 0000
56
- # TODO: VSCCLRM (new in v8.1M) is similar:
57
- #VSCCLRM 1110 1100 1-01 1111 ---- 1011 ---- ---0
58
+ # VSCCLRM (new in v8.1M) is similar:
59
+ VSCCLRM 1110 1100 1.01 1111 .... 1011 imm:7 0 vd=%vd_dp size=3
60
+ VSCCLRM 1110 1100 1.01 1111 .... 1010 imm:8 vd=%vd_sp size=2
61
62
NOCP 111- 1110 ---- ---- ---- cp:4 ---- ---- &nocp
63
NOCP 111- 110- ---- ---- ---- cp:4 ---- ---- &nocp
64
diff --git a/target/arm/translate.c b/target/arm/translate.c
65
index XXXXXXX..XXXXXXX 100644
66
--- a/target/arm/translate.c
67
+++ b/target/arm/translate.c
68
@@ -XXX,XX +XXX,XX @@ void arm_translate_init(void)
69
a64_translate_init();
70
}
71
72
+/* Generate a label used for skipping this instruction */
73
+static void arm_gen_condlabel(DisasContext *s)
77
+{
74
+{
78
+ return float16_sqrt(a, &env->vfp.fp_status_f16);
75
+ if (!s->condjmp) {
76
+ s->condlabel = gen_new_label();
77
+ s->condjmp = 1;
78
+ }
79
+}
79
+}
80
+
80
+
81
float32 VFP_HELPER(sqrt, s)(float32 a, CPUARMState *env)
81
/* Flags for the disas_set_da_iss info argument:
82
{
82
* lower bits hold the Rt register number, higher bits are flags.
83
return float32_sqrt(a, &env->vfp.fp_status);
83
*/
84
@@ -XXX,XX +XXX,XX @@ static void write_neon_element64(TCGv_i64 src, int reg, int ele, MemOp memop)
85
long off = neon_element_offset(reg, ele, memop);
86
87
switch (memop) {
88
+ case MO_32:
89
+ tcg_gen_st32_i64(src, cpu_env, off);
90
+ break;
91
case MO_64:
92
tcg_gen_st_i64(src, cpu_env, off);
93
break;
94
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
95
s->base.is_jmp = DISAS_UPDATE_EXIT;
96
}
97
98
-/* Generate a label used for skipping this instruction */
99
-static void arm_gen_condlabel(DisasContext *s)
100
-{
101
- if (!s->condjmp) {
102
- s->condlabel = gen_new_label();
103
- s->condjmp = 1;
104
- }
105
-}
106
-
107
/* Skip this instruction if the ARM condition is false */
108
static void arm_skip_unless(DisasContext *s, uint32_t cond)
109
{
84
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
110
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
85
index XXXXXXX..XXXXXXX 100644
111
index XXXXXXX..XXXXXXX 100644
86
--- a/target/arm/translate-vfp.c.inc
112
--- a/target/arm/translate-vfp.c.inc
87
+++ b/target/arm/translate-vfp.c.inc
113
+++ b/target/arm/translate-vfp.c.inc
88
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
114
@@ -XXX,XX +XXX,XX @@ static bool trans_VLLDM_VLSTM(DisasContext *s, arg_VLLDM_VLSTM *a)
89
return true;
115
return true;
90
}
116
}
91
117
92
+static bool do_vfp_2op_hp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
118
+static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
93
+{
119
+{
120
+ int btmreg, topreg;
121
+ TCGv_i64 zero;
122
+ TCGv_i32 aspen, sfpa;
123
+
124
+ if (!dc_isar_feature(aa32_m_sec_state, s)) {
125
+ /* Before v8.1M, fall through in decode to NOCP check */
126
+ return false;
127
+ }
128
+
129
+ /* Explicitly UNDEF because this takes precedence over NOCP */
130
+ if (!arm_dc_feature(s, ARM_FEATURE_M_MAIN) || !s->v8m_secure) {
131
+ unallocated_encoding(s);
132
+ return true;
133
+ }
134
+
135
+ if (!dc_isar_feature(aa32_vfp_simd, s)) {
136
+ /* NOP if we have neither FP nor MVE */
137
+ return true;
138
+ }
139
+
94
+ /*
140
+ /*
95
+ * Do a half-precision operation. Functionally this is
141
+ * If FPCCR.ASPEN != 0 && CONTROL_S.SFPA == 0 then there is no
96
+ * the same as do_vfp_2op_sp(), except:
142
+ * active floating point context so we must NOP (without doing
97
+ * - it doesn't need the VFP vector handling (fp16 is a
143
+ * any lazy state preservation or the NOCP check).
98
+ * v8 feature, and in v8 VFP vectors don't exist)
99
+ * - it does the aa32_fp16_arith feature test
100
+ */
144
+ */
101
+ TCGv_i32 f0;
145
+ aspen = load_cpu_field(v7m.fpccr[M_REG_S]);
102
+
146
+ sfpa = load_cpu_field(v7m.control[M_REG_S]);
103
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
147
+ tcg_gen_andi_i32(aspen, aspen, R_V7M_FPCCR_ASPEN_MASK);
104
+ return false;
148
+ tcg_gen_xori_i32(aspen, aspen, R_V7M_FPCCR_ASPEN_MASK);
105
+ }
149
+ tcg_gen_andi_i32(sfpa, sfpa, R_V7M_CONTROL_SFPA_MASK);
106
+
150
+ tcg_gen_or_i32(sfpa, sfpa, aspen);
107
+ if (s->vec_len != 0 || s->vec_stride != 0) {
151
+ arm_gen_condlabel(s);
108
+ return false;
152
+ tcg_gen_brcondi_i32(TCG_COND_EQ, sfpa, 0, s->condlabel);
153
+
154
+ if (s->fp_excp_el != 0) {
155
+ gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
156
+ syn_uncategorized(), s->fp_excp_el);
157
+ return true;
158
+ }
159
+
160
+ topreg = a->vd + a->imm - 1;
161
+ btmreg = a->vd;
162
+
163
+ /* Convert to Sreg numbers if the insn specified Dregs */
164
+ if (a->size == 3) {
165
+ topreg = topreg * 2 + 1;
166
+ btmreg *= 2;
167
+ }
168
+
169
+ if (topreg > 63 || (topreg > 31 && !(topreg & 1))) {
170
+ /* UNPREDICTABLE: we choose to undef */
171
+ unallocated_encoding(s);
172
+ return true;
173
+ }
174
+
175
+ /* Silently ignore requests to clear D16-D31 if they don't exist */
176
+ if (topreg > 31 && !dc_isar_feature(aa32_simd_r32, s)) {
177
+ topreg = 31;
109
+ }
178
+ }
110
+
179
+
111
+ if (!vfp_access_check(s)) {
180
+ if (!vfp_access_check(s)) {
112
+ return true;
181
+ return true;
113
+ }
182
+ }
114
+
183
+
115
+ f0 = tcg_temp_new_i32();
184
+ /* Zero the Sregs from btmreg to topreg inclusive. */
116
+ neon_load_reg32(f0, vm);
185
+ zero = tcg_const_i64(0);
117
+ fn(f0, f0);
186
+ if (btmreg & 1) {
118
+ neon_store_reg32(f0, vd);
187
+ write_neon_element64(zero, btmreg >> 1, 1, MO_32);
119
+ tcg_temp_free_i32(f0);
188
+ btmreg++;
120
+
189
+ }
190
+ for (; btmreg + 1 <= topreg; btmreg += 2) {
191
+ write_neon_element64(zero, btmreg >> 1, 0, MO_64);
192
+ }
193
+ if (btmreg == topreg) {
194
+ write_neon_element64(zero, btmreg >> 1, 0, MO_32);
195
+ btmreg++;
196
+ }
197
+ assert(btmreg == topreg + 1);
198
+ /* TODO: when MVE is implemented, zero VPR here */
121
+ return true;
199
+ return true;
122
+}
200
+}
123
+
201
+
124
static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
202
static bool trans_NOCP(DisasContext *s, arg_nocp *a)
125
{
203
{
126
uint32_t delta_m = 0;
204
/*
127
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
128
DO_VFP_2OP(VMOV_reg, sp, tcg_gen_mov_i32)
129
DO_VFP_2OP(VMOV_reg, dp, tcg_gen_mov_i64)
130
131
+DO_VFP_2OP(VABS, hp, gen_helper_vfp_absh)
132
DO_VFP_2OP(VABS, sp, gen_helper_vfp_abss)
133
DO_VFP_2OP(VABS, dp, gen_helper_vfp_absd)
134
135
+DO_VFP_2OP(VNEG, hp, gen_helper_vfp_negh)
136
DO_VFP_2OP(VNEG, sp, gen_helper_vfp_negs)
137
DO_VFP_2OP(VNEG, dp, gen_helper_vfp_negd)
138
139
+static void gen_VSQRT_hp(TCGv_i32 vd, TCGv_i32 vm)
140
+{
141
+ gen_helper_vfp_sqrth(vd, vm, cpu_env);
142
+}
143
+
144
static void gen_VSQRT_sp(TCGv_i32 vd, TCGv_i32 vm)
145
{
146
gen_helper_vfp_sqrts(vd, vm, cpu_env);
147
@@ -XXX,XX +XXX,XX @@ static void gen_VSQRT_dp(TCGv_i64 vd, TCGv_i64 vm)
148
gen_helper_vfp_sqrtd(vd, vm, cpu_env);
149
}
150
151
+DO_VFP_2OP(VSQRT, hp, gen_VSQRT_hp)
152
DO_VFP_2OP(VSQRT, sp, gen_VSQRT_sp)
153
DO_VFP_2OP(VSQRT, dp, gen_VSQRT_dp)
154
155
--
205
--
156
2.20.1
206
2.20.1
157
207
158
208
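One detail worth calling out from the VABS/VNEG/VSQRT half of the patch above: on IEEE half-precision, absolute value and negation are pure sign-bit manipulations and need no float_status at all, while VSQRT actually rounds and therefore takes the fp16 fp_status. A standalone sketch on raw binary16 bit patterns (illustrative only, not the softfloat implementation):

#include <stdint.h>
#include <stdio.h>

typedef uint16_t f16bits;   /* raw IEEE 754 binary16 bit pattern */

static f16bits f16_abs(f16bits a) { return a & 0x7fff; } /* clear sign bit */
static f16bits f16_neg(f16bits a) { return a ^ 0x8000; } /* flip sign bit  */

int main(void)
{
    f16bits minus_two = 0xc000;   /* -2.0 in binary16 */

    printf("abs: 0x%04x\n", f16_abs(minus_two));  /* 0x4000 == +2.0 */
    printf("neg: 0x%04x\n", f16_neg(minus_two));  /* 0x4000 == +2.0 */
    return 0;
}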
1
Implement fp16 version of VCMP.
1
In v8.1M the new CLRM instruction allows zeroing an arbitrary set of
2
the general-purpose registers and APSR. Implement this.
3
4
The encoding is a subset of the LDMIA T2 encoding, using what would
5
be Rn=0b1111 (which UNDEFs for LDMIA).
2
6
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20200828183354.27913-11-peter.maydell@linaro.org
9
Message-id: 20201119215617.29887-6-peter.maydell@linaro.org
6
---
10
---
7
target/arm/helper.h | 2 ++
11
target/arm/t32.decode | 6 +++++-
8
target/arm/vfp.decode | 2 ++
12
target/arm/translate.c | 38 ++++++++++++++++++++++++++++++++++++++
9
target/arm/vfp_helper.c | 15 +++++++------
13
2 files changed, 43 insertions(+), 1 deletion(-)
10
target/arm/translate-vfp.c.inc | 39 ++++++++++++++++++++++++++++++++++
11
4 files changed, 51 insertions(+), 7 deletions(-)
12
14
13
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
14
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
17
--- a/target/arm/t32.decode
16
+++ b/target/arm/helper.h
18
+++ b/target/arm/t32.decode
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_1(vfp_absd, f64, f64)
19
@@ -XXX,XX +XXX,XX @@ UXTAB 1111 1010 0101 .... 1111 .... 10.. .... @rrr_rot
18
DEF_HELPER_2(vfp_sqrth, f16, f16, env)
20
19
DEF_HELPER_2(vfp_sqrts, f32, f32, env)
21
STM_t32 1110 1000 10.0 .... ................ @ldstm i=1 b=0
20
DEF_HELPER_2(vfp_sqrtd, f64, f64, env)
22
STM_t32 1110 1001 00.0 .... ................ @ldstm i=0 b=1
21
+DEF_HELPER_3(vfp_cmph, void, f16, f16, env)
23
-LDM_t32 1110 1000 10.1 .... ................ @ldstm i=1 b=0
22
DEF_HELPER_3(vfp_cmps, void, f32, f32, env)
24
+{
23
DEF_HELPER_3(vfp_cmpd, void, f64, f64, env)
25
+ # Rn=15 UNDEFs for LDM; M-profile CLRM uses that encoding
24
+DEF_HELPER_3(vfp_cmpeh, void, f16, f16, env)
26
+ CLRM 1110 1000 1001 1111 list:16
25
DEF_HELPER_3(vfp_cmpes, void, f32, f32, env)
27
+ LDM_t32 1110 1000 10.1 .... ................ @ldstm i=1 b=0
26
DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
28
+}
27
29
LDM_t32 1110 1001 00.1 .... ................ @ldstm i=0 b=1
28
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
30
31
&rfe !extern rn w pu
32
diff --git a/target/arm/translate.c b/target/arm/translate.c
29
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/vfp.decode
34
--- a/target/arm/translate.c
31
+++ b/target/arm/vfp.decode
35
+++ b/target/arm/translate.c
32
@@ -XXX,XX +XXX,XX @@ VSQRT_hp ---- 1110 1.11 0001 .... 1001 11.0 .... @vfp_dm_ss
36
@@ -XXX,XX +XXX,XX @@ static bool trans_LDM_t16(DisasContext *s, arg_ldst_block *a)
33
VSQRT_sp ---- 1110 1.11 0001 .... 1010 11.0 .... @vfp_dm_ss
37
return do_ldm(s, a, 1);
34
VSQRT_dp ---- 1110 1.11 0001 .... 1011 11.0 .... @vfp_dm_dd
35
36
+VCMP_hp ---- 1110 1.11 010 z:1 .... 1001 e:1 1.0 .... \
37
+ vd=%vd_sp vm=%vm_sp
38
VCMP_sp ---- 1110 1.11 010 z:1 .... 1010 e:1 1.0 .... \
39
vd=%vd_sp vm=%vm_sp
40
VCMP_dp ---- 1110 1.11 010 z:1 .... 1011 e:1 1.0 .... \
41
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
42
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/vfp_helper.c
44
+++ b/target/arm/vfp_helper.c
45
@@ -XXX,XX +XXX,XX @@ static void softfloat_to_vfp_compare(CPUARMState *env, FloatRelation cmp)
46
}
38
}
47
39
48
/* XXX: check quiet/signaling case */
40
+static bool trans_CLRM(DisasContext *s, arg_CLRM *a)
49
-#define DO_VFP_cmp(p, type) \
50
-void VFP_HELPER(cmp, p)(type a, type b, CPUARMState *env) \
51
+#define DO_VFP_cmp(P, FLOATTYPE, ARGTYPE, FPST) \
52
+void VFP_HELPER(cmp, P)(ARGTYPE a, ARGTYPE b, CPUARMState *env) \
53
{ \
54
softfloat_to_vfp_compare(env, \
55
- type ## _compare_quiet(a, b, &env->vfp.fp_status)); \
56
+ FLOATTYPE ## _compare_quiet(a, b, &env->vfp.FPST)); \
57
} \
58
-void VFP_HELPER(cmpe, p)(type a, type b, CPUARMState *env) \
59
+void VFP_HELPER(cmpe, P)(ARGTYPE a, ARGTYPE b, CPUARMState *env) \
60
{ \
61
softfloat_to_vfp_compare(env, \
62
- type ## _compare(a, b, &env->vfp.fp_status)); \
63
+ FLOATTYPE ## _compare(a, b, &env->vfp.FPST)); \
64
}
65
-DO_VFP_cmp(s, float32)
66
-DO_VFP_cmp(d, float64)
67
+DO_VFP_cmp(h, float16, dh_ctype_f16, fp_status_f16)
68
+DO_VFP_cmp(s, float32, float32, fp_status)
69
+DO_VFP_cmp(d, float64, float64, fp_status)
70
#undef DO_VFP_cmp
71
72
/* Integer to float and float to integer conversions */
73
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
74
index XXXXXXX..XXXXXXX 100644
75
--- a/target/arm/translate-vfp.c.inc
76
+++ b/target/arm/translate-vfp.c.inc
77
@@ -XXX,XX +XXX,XX @@ DO_VFP_2OP(VSQRT, hp, gen_VSQRT_hp)
78
DO_VFP_2OP(VSQRT, sp, gen_VSQRT_sp)
79
DO_VFP_2OP(VSQRT, dp, gen_VSQRT_dp)
80
81
+static bool trans_VCMP_hp(DisasContext *s, arg_VCMP_sp *a)
82
+{
41
+{
83
+ TCGv_i32 vd, vm;
42
+ int i;
43
+ TCGv_i32 zero;
84
+
44
+
85
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
45
+ if (!dc_isar_feature(aa32_m_sec_state, s)) {
86
+ return false;
46
+ return false;
87
+ }
47
+ }
88
+
48
+
89
+ /* Vm/M bits must be zero for the Z variant */
49
+ if (extract32(a->list, 13, 1)) {
90
+ if (a->z && a->vm != 0) {
91
+ return false;
50
+ return false;
92
+ }
51
+ }
93
+
52
+
94
+ if (!vfp_access_check(s)) {
53
+ if (!a->list) {
95
+ return true;
54
+ /* UNPREDICTABLE; we choose to UNDEF */
55
+ return false;
96
+ }
56
+ }
97
+
57
+
98
+ vd = tcg_temp_new_i32();
58
+ zero = tcg_const_i32(0);
99
+ vm = tcg_temp_new_i32();
59
+ for (i = 0; i < 15; i++) {
100
+
60
+ if (extract32(a->list, i, 1)) {
101
+ neon_load_reg32(vd, a->vd);
61
+ /* Clear R[i] */
102
+ if (a->z) {
62
+ tcg_gen_mov_i32(cpu_R[i], zero);
103
+ tcg_gen_movi_i32(vm, 0);
63
+ }
104
+ } else {
105
+ neon_load_reg32(vm, a->vm);
106
+ }
64
+ }
107
+
65
+ if (extract32(a->list, 15, 1)) {
108
+ if (a->e) {
66
+ /*
109
+ gen_helper_vfp_cmpeh(vd, vm, cpu_env);
67
+ * Clear APSR (by calling the MSR helper with the same argument
110
+ } else {
68
+ * as for "MSR APSR_nzcvqg, Rn": mask = 0b1100, SYSM=0)
111
+ gen_helper_vfp_cmph(vd, vm, cpu_env);
69
+ */
70
+ TCGv_i32 maskreg = tcg_const_i32(0xc << 8);
71
+ gen_helper_v7m_msr(cpu_env, maskreg, zero);
72
+ tcg_temp_free_i32(maskreg);
112
+ }
73
+ }
113
+
74
+ tcg_temp_free_i32(zero);
114
+ tcg_temp_free_i32(vd);
115
+ tcg_temp_free_i32(vm);
116
+
117
+ return true;
75
+ return true;
118
+}
76
+}
119
+
77
+
120
static bool trans_VCMP_sp(DisasContext *s, arg_VCMP_sp *a)
78
/*
121
{
79
* Branch, branch with link
122
TCGv_i32 vd, vm;
80
*/
123
--
81
--
124
2.20.1
82
2.20.1
125
83
126
84
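For the CLRM half of the patch above, the architectural effect is simple even though the TCG plumbing is verbose: each set bit 0..14 of the register list clears the corresponding general-purpose register, bit 13 (SP) is not a valid list entry, and bit 15 clears the APSR flags rather than a register. A plain-C sketch of that behaviour (illustrative only; the struct and field names are made up for the example):

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

struct cpu_state {
    uint32_t r[15];      /* R0..R14 */
    uint32_t apsr_nzcvq; /* flag bits only, for illustration */
};

/* Returns false for the cases the decoder treats as invalid/UNPREDICTABLE. */
static bool do_clrm(struct cpu_state *cpu, uint16_t list)
{
    if (list == 0 || (list & (1u << 13))) {
        return false;            /* empty list, or SP in the list */
    }
    for (int i = 0; i < 15; i++) {
        if (list & (1u << i)) {
            cpu->r[i] = 0;       /* clear R[i] */
        }
    }
    if (list & (1u << 15)) {
        cpu->apsr_nzcvq = 0;     /* clear the APSR flags */
    }
    return true;
}

int main(void)
{
    struct cpu_state cpu = { .r = { [0] = 0x1234, [14] = 0x5678 },
                             .apsr_nzcvq = 0xf0000000 };
    do_clrm(&cpu, 0x8001 | (1u << 14));  /* clear R0, R14 and APSR */
    printf("r0=%x r14=%x apsr=%x\n", cpu.r[0], cpu.r[14], cpu.apsr_nzcvq);
    return 0;
}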
1
Implement VFP fp16 support for the VMOV immediate insn.
1
For M-profile before v8.1M, the only valid register for VMSR/VMRS is
2
the FPSCR. We have a comment that states this, but the actual logic
3
to forbid accesses for any other register value is missing, so we
4
would end up with A-profile style behaviour. Add the missing check.
2
5
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20200828183354.27913-10-peter.maydell@linaro.org
8
Message-id: 20201119215617.29887-7-peter.maydell@linaro.org
6
---
9
---
7
target/arm/vfp.decode | 2 ++
10
target/arm/translate-vfp.c.inc | 5 ++++-
8
target/arm/translate-vfp.c.inc | 22 ++++++++++++++++++++++
11
1 file changed, 4 insertions(+), 1 deletion(-)
9
2 files changed, 24 insertions(+)
10
12
11
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/vfp.decode
14
+++ b/target/arm/vfp.decode
15
@@ -XXX,XX +XXX,XX @@ VFMS_dp ---- 1110 1.10 .... .... 1011 .1.0 .... @vfp_dnm_d
16
VFNMA_dp ---- 1110 1.01 .... .... 1011 .0.0 .... @vfp_dnm_d
17
VFNMS_dp ---- 1110 1.01 .... .... 1011 .1.0 .... @vfp_dnm_d
18
19
+VMOV_imm_hp ---- 1110 1.11 .... .... 1001 0000 .... \
20
+ vd=%vd_sp imm=%vmov_imm
21
VMOV_imm_sp ---- 1110 1.11 .... .... 1010 0000 .... \
22
vd=%vd_sp imm=%vmov_imm
23
VMOV_imm_dp ---- 1110 1.11 .... .... 1011 0000 .... \
24
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
13
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
25
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/translate-vfp.c.inc
15
--- a/target/arm/translate-vfp.c.inc
27
+++ b/target/arm/translate-vfp.c.inc
16
+++ b/target/arm/translate-vfp.c.inc
28
@@ -XXX,XX +XXX,XX @@ MAKE_VFM_TRANS_FNS(hp)
17
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
29
MAKE_VFM_TRANS_FNS(sp)
18
* Accesses to R15 are UNPREDICTABLE; we choose to undef.
30
MAKE_VFM_TRANS_FNS(dp)
19
* (FPSCR -> r15 is a special case which writes to the PSR flags.)
31
20
*/
32
+static bool trans_VMOV_imm_hp(DisasContext *s, arg_VMOV_imm_sp *a)
21
- if (a->rt == 15 && (!a->l || a->reg != ARM_VFP_FPSCR)) {
33
+{
22
+ if (a->reg != ARM_VFP_FPSCR) {
34
+ TCGv_i32 fd;
23
+ return false;
35
+
24
+ }
36
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
25
+ if (a->rt == 15 && !a->l) {
37
+ return false;
26
return false;
38
+ }
27
}
39
+
28
}
40
+ if (s->vec_len != 0 || s->vec_stride != 0) {
41
+ return false;
42
+ }
43
+
44
+ if (!vfp_access_check(s)) {
45
+ return true;
46
+ }
47
+
48
+ fd = tcg_const_i32(vfp_expand_imm(MO_16, a->imm));
49
+ neon_store_reg32(fd, a->vd);
50
+ tcg_temp_free_i32(fd);
51
+ return true;
52
+}
53
+
54
static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
55
{
56
uint32_t delta_d = 0;
57
--
29
--
58
2.20.1
30
2.20.1
59
31
60
32
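The fp16 VMOV-immediate form in the patch above reuses vfp_expand_imm() with MO_16. For reference, the 16-bit expansion follows the Arm ARM VFPExpandImm() pseudocode: imm8 = abcdefgh becomes sign a, exponent NOT(b):b:b:c:d and fraction efgh followed by six zero bits. A standalone sketch (illustrative only, not the QEMU helper):

#include <stdint.h>
#include <stdio.h>

/*
 * Expand the 8-bit VFP modified immediate abcdefgh to a binary16 pattern:
 *   sign = a, exponent = NOT(b):b:b:c:d, fraction = efgh followed by 6 zeros.
 */
static uint16_t vfp_expand_imm16(uint8_t imm8)
{
    uint16_t sign = (imm8 >> 7) & 1;
    uint16_t b    = (imm8 >> 6) & 1;
    uint16_t cd   = (imm8 >> 4) & 3;
    uint16_t frac = imm8 & 0xf;

    uint16_t exp = ((b ^ 1) << 4) | (b << 3) | (b << 2) | cd;

    return (uint16_t)((sign << 15) | (exp << 10) | (frac << 6));
}

int main(void)
{
    /* imm8 = 0x70 encodes 1.0: sign 0, b=1 -> exponent 01111, fraction 0 */
    printf("0x%04x\n", vfp_expand_imm16(0x70));   /* 0x3c00 == 1.0 */
    return 0;
}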
1
Implement the fp16 versions of the VFP VCVT instruction forms which
1
Currently M-profile borrows the A-profile code for VMSR and VMRS
2
convert between floating point and fixed-point.
2
(access to the FP system registers), because all it needs to support
3
is the FPSCR. In v8.1M things become significantly more complicated
4
in two ways:
5
6
* there are several new FP system registers; some have side effects
7
on read, and one (FPCXT_NS) needs to avoid the usual
8
vfp_access_check() and the "only if FPU implemented" check
9
10
* all sysregs are now accessible both by VMRS/VMSR (which
11
reads/writes a general purpose register) and also by VLDR/VSTR
12
(which reads/writes them directly to memory)
13
14
Refactor the structure of how we handle VMSR/VMRS to cope with this:
15
16
* keep the M-profile code entirely separate from the A-profile code
17
18
* abstract out the "read or write the general purpose register" part
19
of the code into a loadfn or storefn function pointer, so we can
20
reuse it for VLDR/VSTR.
3
21
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
23
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-16-peter.maydell@linaro.org
24
Message-id: 20201119215617.29887-8-peter.maydell@linaro.org
7
---
25
---
8
target/arm/vfp.decode | 2 ++
26
target/arm/cpu.h | 3 +
9
target/arm/translate-vfp.c.inc | 59 ++++++++++++++++++++++++++++++++++
27
target/arm/translate-vfp.c.inc | 182 ++++++++++++++++++++++++++++++---
10
2 files changed, 61 insertions(+)
28
2 files changed, 171 insertions(+), 14 deletions(-)
11
29
12
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
30
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
13
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/vfp.decode
32
--- a/target/arm/cpu.h
15
+++ b/target/arm/vfp.decode
33
+++ b/target/arm/cpu.h
16
@@ -XXX,XX +XXX,XX @@ VJCVT ---- 1110 1.11 1001 .... 1011 11.0 .... @vfp_dm_sd
34
@@ -XXX,XX +XXX,XX @@ enum arm_cpu_mode {
17
# We assemble bits 18 (op), 16 (u) and 7 (sx) into a single opc field
35
#define ARM_VFP_FPINST 9
18
# for the convenience of the trans_VCVT_fix functions.
36
#define ARM_VFP_FPINST2 10
19
%vcvt_fix_op 18:1 16:1 7:1
37
20
+VCVT_fix_hp ---- 1110 1.11 1.1. .... 1001 .1.0 .... \
38
+/* QEMU-internal value meaning "FPSCR, but we care only about NZCV" */
21
+ vd=%vd_sp imm=%vm_sp opc=%vcvt_fix_op
39
+#define QEMU_VFP_FPSCR_NZCV 0xffff
22
VCVT_fix_sp ---- 1110 1.11 1.1. .... 1010 .1.0 .... \
40
+
23
vd=%vd_sp imm=%vm_sp opc=%vcvt_fix_op
41
/* iwMMXt coprocessor control registers. */
24
VCVT_fix_dp ---- 1110 1.11 1.1. .... 1011 .1.0 .... \
42
#define ARM_IWMMXT_wCID 0
43
#define ARM_IWMMXT_wCon 1
25
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
44
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
26
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/translate-vfp.c.inc
46
--- a/target/arm/translate-vfp.c.inc
28
+++ b/target/arm/translate-vfp.c.inc
47
+++ b/target/arm/translate-vfp.c.inc
29
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
48
@@ -XXX,XX +XXX,XX @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
30
return true;
49
return true;
31
}
50
}
32
51
33
+static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a)
52
+/*
34
+{
53
+ * M-profile provides two different sets of instructions that can
35
+ TCGv_i32 vd, shift;
54
+ * access floating point system registers: VMSR/VMRS (which move
36
+ TCGv_ptr fpst;
55
+ * to/from a general purpose register) and VLDR/VSTR sysreg (which
37
+ int frac_bits;
56
+ * move directly to/from memory). In some cases there are also side
38
+
57
+ * effects which must happen after any write to memory (which could
39
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
58
+ * cause an exception). So we implement the common logic for the
59
+ * sysreg access in gen_M_fp_sysreg_write() and gen_M_fp_sysreg_read(),
60
+ * which take pointers to callback functions which will perform the
61
+ * actual "read/write general purpose register" and "read/write
62
+ * memory" operations.
63
+ */
64
+
65
+/*
66
+ * Emit code to store the sysreg to its final destination; frees the
67
+ * TCG temp 'value' it is passed.
68
+ */
69
+typedef void fp_sysreg_storefn(DisasContext *s, void *opaque, TCGv_i32 value);
70
+/*
71
+ * Emit code to load the value to be copied to the sysreg; returns
72
+ * a new TCG temporary
73
+ */
74
+typedef TCGv_i32 fp_sysreg_loadfn(DisasContext *s, void *opaque);
75
+
76
+/* Common decode/access checks for fp sysreg read/write */
77
+typedef enum FPSysRegCheckResult {
78
+ FPSysRegCheckFailed, /* caller should return false */
79
+ FPSysRegCheckDone, /* caller should return true */
80
+ FPSysRegCheckContinue, /* caller should continue generating code */
81
+} FPSysRegCheckResult;
82
+
83
+static FPSysRegCheckResult fp_sysreg_checks(DisasContext *s, int regno)
84
+{
85
+ if (!dc_isar_feature(aa32_fpsp_v2, s)) {
86
+ return FPSysRegCheckFailed;
87
+ }
88
+
89
+ switch (regno) {
90
+ case ARM_VFP_FPSCR:
91
+ case QEMU_VFP_FPSCR_NZCV:
92
+ break;
93
+ default:
94
+ return FPSysRegCheckFailed;
95
+ }
96
+
97
+ if (!vfp_access_check(s)) {
98
+ return FPSysRegCheckDone;
99
+ }
100
+
101
+ return FPSysRegCheckContinue;
102
+}
103
+
104
+static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
105
+
106
+ fp_sysreg_loadfn *loadfn,
107
+ void *opaque)
108
+{
109
+ /* Do a write to an M-profile floating point system register */
110
+ TCGv_i32 tmp;
111
+
112
+ switch (fp_sysreg_checks(s, regno)) {
113
+ case FPSysRegCheckFailed:
40
+ return false;
114
+ return false;
41
+ }
115
+ case FPSysRegCheckDone:
42
+
43
+ if (!vfp_access_check(s)) {
44
+ return true;
116
+ return true;
45
+ }
117
+ case FPSysRegCheckContinue:
46
+
118
+ break;
47
+ frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm);
119
+ }
48
+
120
+
49
+ vd = tcg_temp_new_i32();
121
+ switch (regno) {
50
+ neon_load_reg32(vd, a->vd);
122
+ case ARM_VFP_FPSCR:
51
+
123
+ tmp = loadfn(s, opaque);
52
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
124
+ gen_helper_vfp_set_fpscr(cpu_env, tmp);
53
+ shift = tcg_const_i32(frac_bits);
125
+ tcg_temp_free_i32(tmp);
54
+
126
+ gen_lookup_tb(s);
55
+ /* Switch on op:U:sx bits */
56
+ switch (a->opc) {
57
+ case 0:
58
+ gen_helper_vfp_shtoh(vd, vd, shift, fpst);
59
+ break;
60
+ case 1:
61
+ gen_helper_vfp_sltoh(vd, vd, shift, fpst);
62
+ break;
63
+ case 2:
64
+ gen_helper_vfp_uhtoh(vd, vd, shift, fpst);
65
+ break;
66
+ case 3:
67
+ gen_helper_vfp_ultoh(vd, vd, shift, fpst);
68
+ break;
69
+ case 4:
70
+ gen_helper_vfp_toshh_round_to_zero(vd, vd, shift, fpst);
71
+ break;
72
+ case 5:
73
+ gen_helper_vfp_toslh_round_to_zero(vd, vd, shift, fpst);
74
+ break;
75
+ case 6:
76
+ gen_helper_vfp_touhh_round_to_zero(vd, vd, shift, fpst);
77
+ break;
78
+ case 7:
79
+ gen_helper_vfp_toulh_round_to_zero(vd, vd, shift, fpst);
80
+ break;
127
+ break;
81
+ default:
128
+ default:
82
+ g_assert_not_reached();
129
+ g_assert_not_reached();
83
+ }
130
+ }
84
+
85
+ neon_store_reg32(vd, a->vd);
86
+ tcg_temp_free_i32(vd);
87
+ tcg_temp_free_i32(shift);
88
+ tcg_temp_free_ptr(fpst);
89
+ return true;
131
+ return true;
90
+}
132
+}
91
+
133
+
92
static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a)
134
+static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
135
+ fp_sysreg_storefn *storefn,
136
+ void *opaque)
137
+{
138
+ /* Do a read from an M-profile floating point system register */
139
+ TCGv_i32 tmp;
140
+
141
+ switch (fp_sysreg_checks(s, regno)) {
142
+ case FPSysRegCheckFailed:
143
+ return false;
144
+ case FPSysRegCheckDone:
145
+ return true;
146
+ case FPSysRegCheckContinue:
147
+ break;
148
+ }
149
+
150
+ switch (regno) {
151
+ case ARM_VFP_FPSCR:
152
+ tmp = tcg_temp_new_i32();
153
+ gen_helper_vfp_get_fpscr(tmp, cpu_env);
154
+ storefn(s, opaque, tmp);
155
+ break;
156
+ case QEMU_VFP_FPSCR_NZCV:
157
+ /*
158
+ * Read just NZCV; this is a special case to avoid the
159
+ * helper call for the "VMRS to CPSR.NZCV" insn.
160
+ */
161
+ tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
162
+ tcg_gen_andi_i32(tmp, tmp, 0xf0000000);
163
+ storefn(s, opaque, tmp);
164
+ break;
165
+ default:
166
+ g_assert_not_reached();
167
+ }
168
+ return true;
169
+}
170
+
171
+static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value)
172
+{
173
+ arg_VMSR_VMRS *a = opaque;
174
+
175
+ if (a->rt == 15) {
176
+ /* Set the 4 flag bits in the CPSR */
177
+ gen_set_nzcv(value);
178
+ tcg_temp_free_i32(value);
179
+ } else {
180
+ store_reg(s, a->rt, value);
181
+ }
182
+}
183
+
184
+static TCGv_i32 gpr_to_fp_sysreg(DisasContext *s, void *opaque)
185
+{
186
+ arg_VMSR_VMRS *a = opaque;
187
+
188
+ return load_reg(s, a->rt);
189
+}
190
+
191
+static bool gen_M_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
192
+{
193
+ /*
194
+ * Accesses to R15 are UNPREDICTABLE; we choose to undef.
195
+ * FPSCR -> r15 is a special case which writes to the PSR flags;
196
+ * set a->reg to a special value to tell gen_M_fp_sysreg_read()
197
+ * we only care about the top 4 bits of FPSCR there.
198
+ */
199
+ if (a->rt == 15) {
200
+ if (a->l && a->reg == ARM_VFP_FPSCR) {
201
+ a->reg = QEMU_VFP_FPSCR_NZCV;
202
+ } else {
203
+ return false;
204
+ }
205
+ }
206
+
207
+ if (a->l) {
208
+ /* VMRS, move FP system register to gp register */
209
+ return gen_M_fp_sysreg_read(s, a->reg, fp_sysreg_to_gpr, a);
210
+ } else {
211
+ /* VMSR, move gp register to FP system register */
212
+ return gen_M_fp_sysreg_write(s, a->reg, gpr_to_fp_sysreg, a);
213
+ }
214
+}
215
+
216
static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
93
{
217
{
94
TCGv_i32 vd, shift;
218
TCGv_i32 tmp;
219
bool ignore_vfp_enabled = false;
220
221
- if (!dc_isar_feature(aa32_fpsp_v2, s)) {
222
- return false;
223
+ if (arm_dc_feature(s, ARM_FEATURE_M)) {
224
+ return gen_M_VMSR_VMRS(s, a);
225
}
226
227
- if (arm_dc_feature(s, ARM_FEATURE_M)) {
228
- /*
229
- * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
230
- * Accesses to R15 are UNPREDICTABLE; we choose to undef.
231
- * (FPSCR -> r15 is a special case which writes to the PSR flags.)
232
- */
233
- if (a->reg != ARM_VFP_FPSCR) {
234
- return false;
235
- }
236
- if (a->rt == 15 && !a->l) {
237
- return false;
238
- }
239
+ if (!dc_isar_feature(aa32_fpsp_v2, s)) {
240
+ return false;
241
}
242
243
switch (a->reg) {
95
--
244
--
96
2.20.1
245
2.20.1
97
246
98
247
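Numerically, the fixed-point VCVT forms in the patch above are just a scaling by 2^-frac_bits (with round-to-zero for the float-to-fixed direction), where frac_bits is derived from the immediate exactly as in trans_VCVT_fix_hp(). A standalone sketch using single precision (illustrative only; the real helpers also saturate on overflow, which this sketch ignores):

#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Signed 16-bit fixed-point with frac_bits fraction bits -> float. */
static float fixed_to_float(int16_t fx, int frac_bits)
{
    return ldexpf((float)fx, -frac_bits);          /* fx * 2^-frac_bits */
}

/* float -> signed 16-bit fixed-point, round-to-zero as the VCVT forms use. */
static int16_t float_to_fixed(float f, int frac_bits)
{
    return (int16_t)truncf(ldexpf(f, frac_bits));  /* trunc(f * 2^frac_bits) */
}

int main(void)
{
    int16_t fx = 0x0280;                    /* 2.5 with 8 fraction bits */
    printf("%g\n", fixed_to_float(fx, 8));  /* 2.5 */
    printf("0x%04x\n", (uint16_t)float_to_fixed(2.5f, 8));  /* 0x0280 */
    return 0;
}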
1
Convert the neon floating-point vector operations VFMA and VFMS
1
The constant-expander functions like negate, plus_2, etc, are
2
to use a gvec helper, and use this to implement the fp16 case.
2
generally useful; move them up in translate.c so we can use them in
3
3
the VFP/Neon decoders as well as in the A32/T32/T16 decoders.
4
This is the last use of do_3same_fp() so we can now delete
5
that function.
6
4
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200828183354.27913-32-peter.maydell@linaro.org
7
Message-id: 20201119215617.29887-9-peter.maydell@linaro.org
10
---
8
---
11
target/arm/helper.h | 6 +++
9
target/arm/translate.c | 46 +++++++++++++++++++++++-------------------
12
target/arm/vec_helper.c | 33 +++++++++++-
10
1 file changed, 25 insertions(+), 21 deletions(-)
13
target/arm/translate-neon.c.inc | 92 +--------------------------------
14
3 files changed, 40 insertions(+), 91 deletions(-)
15
11
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
12
diff --git a/target/arm/translate.c b/target/arm/translate.c
17
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.h
14
--- a/target/arm/translate.c
19
+++ b/target/arm/helper.h
15
+++ b/target/arm/translate.c
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ static void arm_gen_condlabel(DisasContext *s)
21
DEF_HELPER_FLAGS_5(gvec_fmls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
17
}
22
DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
}
23
19
24
+DEF_HELPER_FLAGS_5(gvec_vfma_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
20
+/*
25
+DEF_HELPER_FLAGS_5(gvec_vfma_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
21
+ * Constant expanders for the decoders.
22
+ */
26
+
23
+
27
+DEF_HELPER_FLAGS_5(gvec_vfms_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
24
+static int negate(DisasContext *s, int x)
28
+DEF_HELPER_FLAGS_5(gvec_vfms_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+
30
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
31
void, ptr, ptr, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
33
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/vec_helper.c
36
+++ b/target/arm/vec_helper.c
37
@@ -XXX,XX +XXX,XX @@ static float32 float32_mulsub_nf(float32 dest, float32 op1, float32 op2,
38
return float32_sub(dest, float32_mul(op1, op2, stat), stat);
39
}
40
41
-#define DO_MULADD(NAME, FUNC, TYPE) \
42
+/* Fused versions; these have the semantics Neon VFMA/VFMS want */
43
+static float16 float16_muladd_f(float16 dest, float16 op1, float16 op2,
44
+ float_status *stat)
45
+{
25
+{
46
+ return float16_muladd(op1, op2, dest, 0, stat);
26
+ return -x;
47
+}
27
+}
48
+
28
+
49
+static float32 float32_muladd_f(float32 dest, float32 op1, float32 op2,
29
+static int plus_2(DisasContext *s, int x)
50
+ float_status *stat)
51
+{
30
+{
52
+ return float32_muladd(op1, op2, dest, 0, stat);
31
+ return x + 2;
53
+}
32
+}
54
+
33
+
55
+static float16 float16_mulsub_f(float16 dest, float16 op1, float16 op2,
34
+static int times_2(DisasContext *s, int x)
56
+ float_status *stat)
57
+{
35
+{
58
+ return float16_muladd(float16_chs(op1), op2, dest, 0, stat);
36
+ return x * 2;
59
+}
37
+}
60
+
38
+
61
+static float32 float32_mulsub_f(float32 dest, float32 op1, float32 op2,
39
+static int times_4(DisasContext *s, int x)
62
+ float_status *stat)
63
+{
40
+{
64
+ return float32_muladd(float32_chs(op1), op2, dest, 0, stat);
41
+ return x * 4;
65
+}
42
+}
66
+
43
+
67
+#define DO_MULADD(NAME, FUNC, TYPE) \
44
/* Flags for the disas_set_da_iss info argument:
68
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
45
* lower bits hold the Rt register number, higher bits are flags.
69
{ \
70
intptr_t i, oprsz = simd_oprsz(desc); \
71
@@ -XXX,XX +XXX,XX @@ DO_MULADD(gvec_fmla_s, float32_muladd_nf, float32)
72
DO_MULADD(gvec_fmls_h, float16_mulsub_nf, float16)
73
DO_MULADD(gvec_fmls_s, float32_mulsub_nf, float32)
74
75
+DO_MULADD(gvec_vfma_h, float16_muladd_f, float16)
76
+DO_MULADD(gvec_vfma_s, float32_muladd_f, float32)
77
+
78
+DO_MULADD(gvec_vfms_h, float16_mulsub_f, float16)
79
+DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32)
80
+
81
/* For the indexed ops, SVE applies the index per 128-bit vector segment.
82
* For AdvSIMD, there is of course only one such vector segment.
83
*/
46
*/
84
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
47
@@ -XXX,XX +XXX,XX @@ static void arm_skip_unless(DisasContext *s, uint32_t cond)
85
index XXXXXXX..XXXXXXX 100644
48
86
--- a/target/arm/translate-neon.c.inc
49
87
+++ b/target/arm/translate-neon.c.inc
50
/*
88
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
51
- * Constant expanders for the decoders.
89
DO_3SAME_VQDMULH(VQDMULH, qdmulh)
52
+ * Constant expanders used by T16/T32 decode
90
DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
53
*/
91
54
92
-static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
55
-static int negate(DisasContext *s, int x)
93
- bool reads_vd)
94
-{
56
-{
95
- /*
57
- return -x;
96
- * FP operations handled elementwise 32 bits at a time.
97
- * If reads_vd is true then the old value of Vd will be
98
- * loaded before calling the callback function. This is
99
- * used for multiply-accumulate type operations.
100
- */
101
- TCGv_i32 tmp, tmp2;
102
- int pass;
103
-
104
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
105
- return false;
106
- }
107
-
108
- /* UNDEF accesses to D16-D31 if they don't exist. */
109
- if (!dc_isar_feature(aa32_simd_r32, s) &&
110
- ((a->vd | a->vn | a->vm) & 0x10)) {
111
- return false;
112
- }
113
-
114
- if ((a->vn | a->vm | a->vd) & a->q) {
115
- return false;
116
- }
117
-
118
- if (!vfp_access_check(s)) {
119
- return true;
120
- }
121
-
122
- TCGv_ptr fpstatus = fpstatus_ptr(FPST_STD);
123
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
124
- tmp = neon_load_reg(a->vn, pass);
125
- tmp2 = neon_load_reg(a->vm, pass);
126
- if (reads_vd) {
127
- TCGv_i32 tmp_rd = neon_load_reg(a->vd, pass);
128
- fn(tmp_rd, tmp, tmp2, fpstatus);
129
- neon_store_reg(a->vd, pass, tmp_rd);
130
- tcg_temp_free_i32(tmp);
131
- } else {
132
- fn(tmp, tmp, tmp2, fpstatus);
133
- neon_store_reg(a->vd, pass, tmp);
134
- }
135
- tcg_temp_free_i32(tmp2);
136
- }
137
- tcg_temp_free_ptr(fpstatus);
138
- return true;
139
-}
58
-}
140
-
59
-
141
#define WRAP_FP_GVEC(WRAPNAME, FPST, FUNC) \
60
-static int plus_2(DisasContext *s, int x)
142
static void WRAPNAME(unsigned vece, uint32_t rd_ofs, \
143
uint32_t rn_ofs, uint32_t rm_ofs, \
144
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMAX, gen_helper_gvec_fmax_s, gen_helper_gvec_fmax_h)
145
DO_3S_FP_GVEC(VMIN, gen_helper_gvec_fmin_s, gen_helper_gvec_fmin_h)
146
DO_3S_FP_GVEC(VMLA, gen_helper_gvec_fmla_s, gen_helper_gvec_fmla_h)
147
DO_3S_FP_GVEC(VMLS, gen_helper_gvec_fmls_s, gen_helper_gvec_fmls_h)
148
+DO_3S_FP_GVEC(VFMA, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_h)
149
+DO_3S_FP_GVEC(VFMS, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_h)
150
151
WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
152
WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
153
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
154
return do_3same(s, a, gen_VRSQRTS_fp_3s);
155
}
156
157
-static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
158
- TCGv_ptr fpstatus)
159
-{
61
-{
160
- gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
62
- return x + 2;
161
-}
63
-}
162
-
64
-
163
-static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
65
-static int times_2(DisasContext *s, int x)
164
-{
66
-{
165
- if (!dc_isar_feature(aa32_simdfmac, s)) {
67
- return x * 2;
166
- return false;
167
- }
168
-
169
- if (a->size != 0) {
170
- /* TODO fp16 support */
171
- return false;
172
- }
173
-
174
- return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
175
-}
68
-}
176
-
69
-
177
-static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
70
-static int times_4(DisasContext *s, int x)
178
- TCGv_ptr fpstatus)
179
-{
71
-{
180
- gen_helper_vfp_negs(vn, vn);
72
- return x * 4;
181
- gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
182
-}
73
-}
183
-
74
-
184
-static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
75
/* Return only the rotation part of T32ExpandImm. */
185
-{
76
static int t32_expandimm_rot(DisasContext *s, int x)
186
- if (!dc_isar_feature(aa32_simdfmac, s)) {
187
- return false;
188
- }
189
-
190
- if (a->size != 0) {
191
- /* TODO fp16 support */
192
- return false;
193
- }
194
-
195
- return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
196
-}
197
-
198
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
199
{
77
{
200
/* FP operations handled pairwise 32 bits at a time */
201
--
78
--
202
2.20.1
79
2.20.1
203
80
204
81
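The new gvec helpers in the patch above give VFMA/VFMS their fused semantics: one rounding for the whole multiply-add, with VFMS negating the multiplicand before the fma. In standalone C this is just fmaf() (illustrative only; the helpers use softfloat's float16/float32_muladd):

#include <math.h>
#include <stdio.h>

/* Neon VFMA lane op: d = d + (n * m), fused (one rounding). */
static float vfma_lane(float d, float n, float m)
{
    return fmaf(n, m, d);
}

/* Neon VFMS lane op: d = d - (n * m); the multiplicand is negated first. */
static float vfms_lane(float d, float n, float m)
{
    return fmaf(-n, m, d);
}

int main(void)
{
    printf("%g\n", vfma_lane(1.0f, 2.0f, 3.0f));   /* 7 */
    printf("%g\n", vfms_lane(1.0f, 2.0f, 3.0f));   /* -5 */
    return 0;
}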
1
Implement the VFP fp16 variant of VMOV that transfers a 16-bit
1
Implement the new-in-v8.1M VLDR/VSTR variants which directly
2
value between a general purpose register and a VFP register.
2
read or write FP system registers to memory.
3
4
Note that Rt == 15 is UNPREDICTABLE; since this insn is v8 and later
5
only we have no need to replicate the old "updates CPSR.NZCV"
6
behaviour that the singleprec version of this insn does.
7
3
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200828183354.27913-22-peter.maydell@linaro.org
6
Message-id: 20201119215617.29887-10-peter.maydell@linaro.org
11
---
7
---
12
target/arm/vfp.decode | 1 +
8
target/arm/vfp.decode | 14 ++++++
13
target/arm/translate-vfp.c.inc | 34 ++++++++++++++++++++++++++++++++++
9
target/arm/translate-vfp.c.inc | 91 ++++++++++++++++++++++++++++++++++
14
2 files changed, 35 insertions(+)
10
2 files changed, 105 insertions(+)
15
11
16
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
12
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
17
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/vfp.decode
14
--- a/target/arm/vfp.decode
19
+++ b/target/arm/vfp.decode
15
+++ b/target/arm/vfp.decode
20
@@ -XXX,XX +XXX,XX @@ VDUP ---- 1110 1 b:1 q:1 0 .... rt:4 1011 . 0 e:1 1 0000 \
16
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR_hp ---- 1101 u:1 .0 l:1 rn:4 .... 1001 imm:8 vd=%vd_sp
21
vn=%vn_dp
17
VLDR_VSTR_sp ---- 1101 u:1 .0 l:1 rn:4 .... 1010 imm:8 vd=%vd_sp
22
18
VLDR_VSTR_dp ---- 1101 u:1 .0 l:1 rn:4 .... 1011 imm:8 vd=%vd_dp
23
VMSR_VMRS ---- 1110 111 l:1 reg:4 rt:4 1010 0001 0000
19
24
+VMOV_half ---- 1110 000 l:1 .... rt:4 1001 . 001 0000 vn=%vn_sp
20
+# M-profile VLDR/VSTR to sysreg
25
VMOV_single ---- 1110 000 l:1 .... rt:4 1010 . 001 0000 vn=%vn_sp
21
+%vldr_sysreg 22:1 13:3
26
22
+%imm7_0x4 0:7 !function=times_4
27
VMOV_64_sp ---- 1100 010 op:1 rt2:4 rt:4 1010 00.1 .... vm=%vm_sp
23
+
24
+&vldr_sysreg rn reg imm a w p
25
+@vldr_sysreg .... ... . a:1 . . . rn:4 ... . ... .. ....... \
26
+ reg=%vldr_sysreg imm=%imm7_0x4 &vldr_sysreg
27
+
28
+# P=0 W=0 is SEE "Related encodings", so split into two patterns
29
+VLDR_sysreg ---- 110 1 . . w:1 1 .... ... 0 111 11 ....... @vldr_sysreg p=1
30
+VLDR_sysreg ---- 110 0 . . 1 1 .... ... 0 111 11 ....... @vldr_sysreg p=0 w=1
31
+VSTR_sysreg ---- 110 1 . . w:1 0 .... ... 0 111 11 ....... @vldr_sysreg p=1
32
+VSTR_sysreg ---- 110 0 . . 1 0 .... ... 0 111 11 ....... @vldr_sysreg p=0 w=1
33
+
34
# We split the load/store multiple up into two patterns to avoid
35
# overlap with other insns in the "Advanced SIMD load/store and 64-bit move"
36
# grouping:
28
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
37
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
29
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-vfp.c.inc
39
--- a/target/arm/translate-vfp.c.inc
31
+++ b/target/arm/translate-vfp.c.inc
40
+++ b/target/arm/translate-vfp.c.inc
32
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
41
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
33
return true;
42
return true;
34
}
43
}
35
44
36
+static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a)
45
+static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
37
+{
46
+{
38
+ TCGv_i32 tmp;
47
+ arg_vldr_sysreg *a = opaque;
48
+ uint32_t offset = a->imm;
49
+ TCGv_i32 addr;
39
+
50
+
40
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
51
+ if (!a->a) {
52
+ offset = - offset;
53
+ }
54
+
55
+ addr = load_reg(s, a->rn);
56
+ if (a->p) {
57
+ tcg_gen_addi_i32(addr, addr, offset);
58
+ }
59
+
60
+ if (s->v8m_stackcheck && a->rn == 13 && a->w) {
61
+ gen_helper_v8m_stackcheck(cpu_env, addr);
62
+ }
63
+
64
+ gen_aa32_st_i32(s, value, addr, get_mem_index(s),
65
+ MO_UL | MO_ALIGN | s->be_data);
66
+ tcg_temp_free_i32(value);
67
+
68
+ if (a->w) {
69
+ /* writeback */
70
+ if (!a->p) {
71
+ tcg_gen_addi_i32(addr, addr, offset);
72
+ }
73
+ store_reg(s, a->rn, addr);
74
+ } else {
75
+ tcg_temp_free_i32(addr);
76
+ }
77
+}
78
+
79
+static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque)
80
+{
81
+ arg_vldr_sysreg *a = opaque;
82
+ uint32_t offset = a->imm;
83
+ TCGv_i32 addr;
84
+ TCGv_i32 value = tcg_temp_new_i32();
85
+
86
+ if (!a->a) {
87
+ offset = - offset;
88
+ }
89
+
90
+ addr = load_reg(s, a->rn);
91
+ if (a->p) {
92
+ tcg_gen_addi_i32(addr, addr, offset);
93
+ }
94
+
95
+ if (s->v8m_stackcheck && a->rn == 13 && a->w) {
96
+ gen_helper_v8m_stackcheck(cpu_env, addr);
97
+ }
98
+
99
+ gen_aa32_ld_i32(s, value, addr, get_mem_index(s),
100
+ MO_UL | MO_ALIGN | s->be_data);
101
+
102
+ if (a->w) {
103
+ /* writeback */
104
+ if (!a->p) {
105
+ tcg_gen_addi_i32(addr, addr, offset);
106
+ }
107
+ store_reg(s, a->rn, addr);
108
+ } else {
109
+ tcg_temp_free_i32(addr);
110
+ }
111
+ return value;
112
+}
113
+
114
+static bool trans_VLDR_sysreg(DisasContext *s, arg_vldr_sysreg *a)
115
+{
116
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
41
+ return false;
117
+ return false;
42
+ }
118
+ }
43
+
119
+ if (a->rn == 15) {
44
+ if (a->rt == 15) {
45
+ /* UNPREDICTABLE; we choose to UNDEF */
46
+ return false;
120
+ return false;
47
+ }
121
+ }
48
+
122
+ return gen_M_fp_sysreg_write(s, a->reg, memory_to_fp_sysreg, a);
49
+ if (!vfp_access_check(s)) {
50
+ return true;
51
+ }
52
+
53
+ if (a->l) {
54
+ /* VFP to general purpose register */
55
+ tmp = tcg_temp_new_i32();
56
+ neon_load_reg32(tmp, a->vn);
57
+ tcg_gen_andi_i32(tmp, tmp, 0xffff);
58
+ store_reg(s, a->rt, tmp);
59
+ } else {
60
+ /* general purpose register to VFP */
61
+ tmp = load_reg(s, a->rt);
62
+ tcg_gen_andi_i32(tmp, tmp, 0xffff);
63
+ neon_store_reg32(tmp, a->vn);
64
+ tcg_temp_free_i32(tmp);
65
+ }
66
+
67
+ return true;
68
+}
123
+}
69
+
124
+
70
static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a)
125
+static bool trans_VSTR_sysreg(DisasContext *s, arg_vldr_sysreg *a)
126
+{
127
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
128
+ return false;
129
+ }
130
+ if (a->rn == 15) {
131
+ return false;
132
+ }
133
+ return gen_M_fp_sysreg_read(s, a->reg, fp_sysreg_to_memory, a);
134
+}
135
+
136
static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a)
71
{
137
{
72
TCGv_i32 tmp;
138
TCGv_i32 tmp;
73
--
139
--
74
2.20.1
140
2.20.1
75
141
76
142
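The VLDR/VSTR sysreg forms in the patch above use the usual T32 addressing behaviour: the U bit (field 'a' in the decode) selects add or subtract of the scaled immediate, P selects whether the offset applies to the access itself, and W requests base-register writeback. A standalone sketch of the address arithmetic (illustrative only, not the QEMU implementation; the example operand syntax is only indicative):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Compute the address used for the access and the value written back to
 * the base register (if writeback is requested).
 */
static uint32_t sysreg_ldst_address(uint32_t rn, uint32_t imm,
                                    bool add, bool p, bool w,
                                    uint32_t *writeback)
{
    int32_t offset = add ? (int32_t)imm : -(int32_t)imm;
    uint32_t addr = p ? rn + offset : rn;       /* pre-indexed if P is set */

    if (w) {
        *writeback = rn + offset;               /* post-indexed updates too */
    } else {
        *writeback = rn;                        /* base unchanged */
    }
    return addr;
}

int main(void)
{
    uint32_t wb;
    /* e.g. a pre-indexed store with writeback, subtracting 8: P=1, W=1, U=0 */
    uint32_t addr = sysreg_ldst_address(0x1000, 8, false, true, true, &wb);
    printf("addr=0x%x writeback=0x%x\n", addr, wb);   /* 0xff8 0xff8 */
    return 0;
}

In both the pre-indexed and post-indexed cases the written-back base ends up as Rn plus or minus the offset, which is what the two writeback paths in fp_sysreg_to_memory()/memory_to_fp_sysreg() compute.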
1
The fp16 extension includes a new instruction VMOVX, which copies the
1
v8.1M defines a new FP system register FPSCR_nzcvqc; this behaves
2
upper 16 bits of a 32-bit source VFP register into the lower 16
2
like the existing FPSCR, except that it reads and writes only bits
3
bits of the destination and zeroes the high half of the destination.
3
[31:27] of the FPSCR (the N, Z, C, V and QC flag bits). (Unlike the
4
Implement it.
4
FPSCR, the special case for Rt=15 of writing the CPSR.NZCV is not
5
permitted.)
6
7
Implement the register. Since we don't yet implement MVE, we handle
8
the QC bit as RES0, with todo comments for where we will need to add
9
support later.
5
10
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200828183354.27913-21-peter.maydell@linaro.org
13
Message-id: 20201119215617.29887-11-peter.maydell@linaro.org
9
---
14
---
10
target/arm/vfp-uncond.decode | 3 +++
15
target/arm/cpu.h | 13 +++++++++++++
11
target/arm/translate-vfp.c.inc | 25 +++++++++++++++++++++++++
16
target/arm/translate-vfp.c.inc | 27 +++++++++++++++++++++++++++
12
2 files changed, 28 insertions(+)
17
2 files changed, 40 insertions(+)
13
18
14
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
19
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
15
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/vfp-uncond.decode
21
--- a/target/arm/cpu.h
17
+++ b/target/arm/vfp-uncond.decode
22
+++ b/target/arm/cpu.h
18
@@ -XXX,XX +XXX,XX @@ VCVT 1111 1110 1.11 11 rm:2 .... 1010 op:1 1.0 .... \
23
@@ -XXX,XX +XXX,XX @@ void vfp_set_fpscr(CPUARMState *env, uint32_t val);
19
VCVT 1111 1110 1.11 11 rm:2 .... 1011 op:1 1.0 .... \
24
#define FPCR_FZ (1 << 24) /* Flush-to-zero enable bit */
20
vm=%vm_dp vd=%vd_sp sz=3
25
#define FPCR_DN (1 << 25) /* Default NaN enable bit */
21
26
#define FPCR_QC (1 << 27) /* Cumulative saturation bit */
22
+VMOVX 1111 1110 1.11 0000 .... 1010 01 . 0 .... \
27
+#define FPCR_V (1 << 28) /* FP overflow flag */
23
+ vd=%vd_sp vm=%vm_sp
28
+#define FPCR_C (1 << 29) /* FP carry flag */
29
+#define FPCR_Z (1 << 30) /* FP zero flag */
30
+#define FPCR_N (1 << 31) /* FP negative flag */
24
+
31
+
25
VINS 1111 1110 1.11 0000 .... 1010 11 . 0 .... \
32
+#define FPCR_NZCV_MASK (FPCR_N | FPCR_Z | FPCR_C | FPCR_V)
26
vd=%vd_sp vm=%vm_sp
33
+#define FPCR_NZCVQC_MASK (FPCR_NZCV_MASK | FPCR_QC)
34
35
static inline uint32_t vfp_get_fpsr(CPUARMState *env)
36
{
37
@@ -XXX,XX +XXX,XX @@ enum arm_cpu_mode {
38
#define ARM_VFP_FPEXC 8
39
#define ARM_VFP_FPINST 9
40
#define ARM_VFP_FPINST2 10
41
+/* These ones are M-profile only */
42
+#define ARM_VFP_FPSCR_NZCVQC 2
43
+#define ARM_VFP_VPR 12
44
+#define ARM_VFP_P0 13
45
+#define ARM_VFP_FPCXT_NS 14
46
+#define ARM_VFP_FPCXT_S 15
47
48
/* QEMU-internal value meaning "FPSCR, but we care only about NZCV" */
49
#define QEMU_VFP_FPSCR_NZCV 0xffff
27
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
50
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
28
index XXXXXXX..XXXXXXX 100644
51
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/translate-vfp.c.inc
52
--- a/target/arm/translate-vfp.c.inc
30
+++ b/target/arm/translate-vfp.c.inc
53
+++ b/target/arm/translate-vfp.c.inc
31
@@ -XXX,XX +XXX,XX @@ static bool trans_VINS(DisasContext *s, arg_VINS *a)
54
@@ -XXX,XX +XXX,XX @@ static FPSysRegCheckResult fp_sysreg_checks(DisasContext *s, int regno)
32
tcg_temp_free_i32(rd);
55
case ARM_VFP_FPSCR:
33
return true;
56
case QEMU_VFP_FPSCR_NZCV:
34
}
57
break;
35
+
58
+ case ARM_VFP_FPSCR_NZCVQC:
36
+static bool trans_VMOVX(DisasContext *s, arg_VINS *a)
59
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
37
+{
60
+ return false;
38
+ TCGv_i32 rm;
61
+ }
39
+
62
+ break;
40
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
63
default:
41
+ return false;
64
return FPSysRegCheckFailed;
65
}
66
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
67
tcg_temp_free_i32(tmp);
68
gen_lookup_tb(s);
69
break;
70
+ case ARM_VFP_FPSCR_NZCVQC:
71
+ {
72
+ TCGv_i32 fpscr;
73
+ tmp = loadfn(s, opaque);
74
+ /*
75
+ * TODO: when we implement MVE, write the QC bit.
76
+ * For non-MVE, QC is RES0.
77
+ */
78
+ tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
79
+ fpscr = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
80
+ tcg_gen_andi_i32(fpscr, fpscr, ~FPCR_NZCV_MASK);
81
+ tcg_gen_or_i32(fpscr, fpscr, tmp);
82
+ store_cpu_field(fpscr, vfp.xregs[ARM_VFP_FPSCR]);
83
+ tcg_temp_free_i32(tmp);
84
+ break;
42
+ }
85
+ }
43
+
86
default:
44
+ if (s->vec_len != 0 || s->vec_stride != 0) {
87
g_assert_not_reached();
45
+ return false;
88
}
46
+ }
89
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
47
+
90
gen_helper_vfp_get_fpscr(tmp, cpu_env);
48
+ if (!vfp_access_check(s)) {
91
storefn(s, opaque, tmp);
49
+ return true;
92
break;
50
+ }
93
+ case ARM_VFP_FPSCR_NZCVQC:
51
+
94
+ /*
52
+ /* Set Vd to high half of Vm */
95
+ * TODO: MVE has a QC bit, which we probably won't store
53
+ rm = tcg_temp_new_i32();
96
+ * in the xregs[] field. For non-MVE, where QC is RES0,
54
+ neon_load_reg32(rm, a->vm);
97
+ * we can just fall through to the FPSCR_NZCV case.
55
+ tcg_gen_shri_i32(rm, rm, 16);
98
+ */
56
+ neon_store_reg32(rm, a->vd);
99
case QEMU_VFP_FPSCR_NZCV:
57
+ tcg_temp_free_i32(rm);
100
/*
58
+ return true;
101
* Read just NZCV; this is a special case to avoid the
59
+}
60
--
102
--
61
2.20.1
103
2.20.1
62
104
63
105
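For illustration only (this helper is not part of the patch; the real logic
is the TCG sequence in gen_M_fp_sysreg_write() above), the new named masks
and the v8.1M FPSCR_NZCVQC write behaviour work out as follows:

    /* FPCR_NZCV_MASK   == (1 << 31)|(1 << 30)|(1 << 29)|(1 << 28) == 0xf0000000 */
    /* FPCR_NZCVQC_MASK == FPCR_NZCV_MASK | (1 << 27)              == 0xf8000000 */

    /* Rough C model of a write to FPSCR_NZCVQC when MVE is not implemented,
     * so QC is RES0: only the NZCV bits of FPSCR change. */
    static uint32_t model_fpscr_nzcvqc_write(uint32_t fpscr, uint32_t val)
    {
        return (fpscr & ~FPCR_NZCV_MASK) | (val & FPCR_NZCV_MASK);
    }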
1
Macroify the uses of do_vfp_2op_sp() and do_vfp_2op_dp(); this will
1
We defined a constant name for the mask of NZCV bits in the FPCR/FPSCR
2
make it easier to add the halfprec support.
2
in the previous commit; use it in a couple of places in existing code,
3
where we're masking out everything except NZCV for the "load to Rt=15
4
sets CPSR.NZCV" special case.
3
5
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-8-peter.maydell@linaro.org
8
Message-id: 20201119215617.29887-12-peter.maydell@linaro.org
7
---
9
---
8
target/arm/translate-vfp.c.inc | 49 ++++++++++------------------------
10
target/arm/translate-vfp.c.inc | 4 ++--
9
1 file changed, 14 insertions(+), 35 deletions(-)
11
1 file changed, 2 insertions(+), 2 deletions(-)
10
12
11
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
13
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
12
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-vfp.c.inc
15
--- a/target/arm/translate-vfp.c.inc
14
+++ b/target/arm/translate-vfp.c.inc
16
+++ b/target/arm/translate-vfp.c.inc
15
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
17
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
16
return true;
18
* helper call for the "VMRS to CPSR.NZCV" insn.
17
}
19
*/
18
20
tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
19
-static bool trans_VMOV_reg_sp(DisasContext *s, arg_VMOV_reg_sp *a)
21
- tcg_gen_andi_i32(tmp, tmp, 0xf0000000);
20
-{
22
+ tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
21
- return do_vfp_2op_sp(s, tcg_gen_mov_i32, a->vd, a->vm);
23
storefn(s, opaque, tmp);
22
-}
24
break;
23
+#define DO_VFP_2OP(INSN, PREC, FN) \
25
default:
24
+ static bool trans_##INSN##_##PREC(DisasContext *s, \
26
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
25
+ arg_##INSN##_##PREC *a) \
27
case ARM_VFP_FPSCR:
26
+ { \
28
if (a->rt == 15) {
27
+ return do_vfp_2op_##PREC(s, FN, a->vd, a->vm); \
29
tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
28
+ }
30
- tcg_gen_andi_i32(tmp, tmp, 0xf0000000);
29
31
+ tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
30
-static bool trans_VMOV_reg_dp(DisasContext *s, arg_VMOV_reg_dp *a)
32
} else {
31
-{
33
tmp = tcg_temp_new_i32();
32
- return do_vfp_2op_dp(s, tcg_gen_mov_i64, a->vd, a->vm);
34
gen_helper_vfp_get_fpscr(tmp, cpu_env);
33
-}
34
+DO_VFP_2OP(VMOV_reg, sp, tcg_gen_mov_i32)
35
+DO_VFP_2OP(VMOV_reg, dp, tcg_gen_mov_i64)
36
37
-static bool trans_VABS_sp(DisasContext *s, arg_VABS_sp *a)
38
-{
39
- return do_vfp_2op_sp(s, gen_helper_vfp_abss, a->vd, a->vm);
40
-}
41
+DO_VFP_2OP(VABS, sp, gen_helper_vfp_abss)
42
+DO_VFP_2OP(VABS, dp, gen_helper_vfp_absd)
43
44
-static bool trans_VABS_dp(DisasContext *s, arg_VABS_dp *a)
45
-{
46
- return do_vfp_2op_dp(s, gen_helper_vfp_absd, a->vd, a->vm);
47
-}
48
-
49
-static bool trans_VNEG_sp(DisasContext *s, arg_VNEG_sp *a)
50
-{
51
- return do_vfp_2op_sp(s, gen_helper_vfp_negs, a->vd, a->vm);
52
-}
53
-
54
-static bool trans_VNEG_dp(DisasContext *s, arg_VNEG_dp *a)
55
-{
56
- return do_vfp_2op_dp(s, gen_helper_vfp_negd, a->vd, a->vm);
57
-}
58
+DO_VFP_2OP(VNEG, sp, gen_helper_vfp_negs)
59
+DO_VFP_2OP(VNEG, dp, gen_helper_vfp_negd)
60
61
static void gen_VSQRT_sp(TCGv_i32 vd, TCGv_i32 vm)
62
{
63
gen_helper_vfp_sqrts(vd, vm, cpu_env);
64
}
65
66
-static bool trans_VSQRT_sp(DisasContext *s, arg_VSQRT_sp *a)
67
-{
68
- return do_vfp_2op_sp(s, gen_VSQRT_sp, a->vd, a->vm);
69
-}
70
-
71
static void gen_VSQRT_dp(TCGv_i64 vd, TCGv_i64 vm)
72
{
73
gen_helper_vfp_sqrtd(vd, vm, cpu_env);
74
}
75
76
-static bool trans_VSQRT_dp(DisasContext *s, arg_VSQRT_dp *a)
77
-{
78
- return do_vfp_2op_dp(s, gen_VSQRT_dp, a->vd, a->vm);
79
-}
80
+DO_VFP_2OP(VSQRT, sp, gen_VSQRT_sp)
81
+DO_VFP_2OP(VSQRT, dp, gen_VSQRT_dp)
82
83
static bool trans_VCMP_sp(DisasContext *s, arg_VCMP_sp *a)
84
{
85
--
35
--
86
2.20.1
36
2.20.1
87
37
88
38
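To make the DO_VFP_2OP() conversion above concrete: an invocation such as
DO_VFP_2OP(VABS, sp, gen_helper_vfp_abss) expands to exactly the hand-written
trans function it replaces (expansion shown for illustration only):

    static bool trans_VABS_sp(DisasContext *s, arg_VABS_sp *a)
    {
        return do_vfp_2op_sp(s, gen_helper_vfp_abss, a->vd, a->vm);
    }

so the conversion is purely mechanical and makes no behavioural change.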
1
Implement VFP fp16 support for simple binary-operator VFP insns VADD,
1
Factor out the code which handles M-profile lazy FP state preservation
2
VSUB, VMUL, VDIV, VMINNM and VMAXNM:
2
from full_vfp_access_check(); accesses to the FPCXT_NS register are
3
3
a special case which need to do just this part (corresponding in the
4
* make the VFP_BINOP() macro generate float16 helpers as well as
4
pseudocode to the PreserveFPState() function), and not the full
5
float32 and float64
5
set of actions matching the pseudocode ExecuteFPCheck() which
6
* implement a do_vfp_3op_hp() function similar to the existing
6
normal FP instructions need to do.
7
do_vfp_3op_sp()
8
* add decode for the half-precision insn patterns
9
10
Note that the VFP_BINOP macro use creates a couple of unused helper
11
functions vfp_maxh and vfp_minh, but they're small so it's not worth
12
splitting the BINOP operations into "needs halfprec" and "no
13
halfprec" groups.
14
7
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
17
Message-id: 20200828183354.27913-4-peter.maydell@linaro.org
10
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
11
Message-id: 20201119215617.29887-13-peter.maydell@linaro.org
18
---
12
---
19
target/arm/helper.h | 8 ++++
13
target/arm/translate-vfp.c.inc | 45 ++++++++++++++++++++--------------
20
target/arm/vfp-uncond.decode | 3 ++
14
1 file changed, 27 insertions(+), 18 deletions(-)
21
target/arm/vfp.decode | 4 ++
22
target/arm/vfp_helper.c | 5 ++
23
target/arm/translate-vfp.c.inc | 86 ++++++++++++++++++++++++++++++++++
24
5 files changed, 106 insertions(+)
25
15
26
diff --git a/target/arm/helper.h b/target/arm/helper.h
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/helper.h
29
+++ b/target/arm/helper.h
30
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(probe_access, TCG_CALL_NO_WG, void, env, tl, i32, i32, i32)
31
DEF_HELPER_1(vfp_get_fpscr, i32, env)
32
DEF_HELPER_2(vfp_set_fpscr, void, env, i32)
33
34
+DEF_HELPER_3(vfp_addh, f16, f16, f16, ptr)
35
DEF_HELPER_3(vfp_adds, f32, f32, f32, ptr)
36
DEF_HELPER_3(vfp_addd, f64, f64, f64, ptr)
37
+DEF_HELPER_3(vfp_subh, f16, f16, f16, ptr)
38
DEF_HELPER_3(vfp_subs, f32, f32, f32, ptr)
39
DEF_HELPER_3(vfp_subd, f64, f64, f64, ptr)
40
+DEF_HELPER_3(vfp_mulh, f16, f16, f16, ptr)
41
DEF_HELPER_3(vfp_muls, f32, f32, f32, ptr)
42
DEF_HELPER_3(vfp_muld, f64, f64, f64, ptr)
43
+DEF_HELPER_3(vfp_divh, f16, f16, f16, ptr)
44
DEF_HELPER_3(vfp_divs, f32, f32, f32, ptr)
45
DEF_HELPER_3(vfp_divd, f64, f64, f64, ptr)
46
+DEF_HELPER_3(vfp_maxh, f16, f16, f16, ptr)
47
DEF_HELPER_3(vfp_maxs, f32, f32, f32, ptr)
48
DEF_HELPER_3(vfp_maxd, f64, f64, f64, ptr)
49
+DEF_HELPER_3(vfp_minh, f16, f16, f16, ptr)
50
DEF_HELPER_3(vfp_mins, f32, f32, f32, ptr)
51
DEF_HELPER_3(vfp_mind, f64, f64, f64, ptr)
52
+DEF_HELPER_3(vfp_maxnumh, f16, f16, f16, ptr)
53
DEF_HELPER_3(vfp_maxnums, f32, f32, f32, ptr)
54
DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, ptr)
55
+DEF_HELPER_3(vfp_minnumh, f16, f16, f16, ptr)
56
DEF_HELPER_3(vfp_minnums, f32, f32, f32, ptr)
57
DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr)
58
DEF_HELPER_1(vfp_negs, f32, f32)
59
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
60
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/vfp-uncond.decode
62
+++ b/target/arm/vfp-uncond.decode
63
@@ -XXX,XX +XXX,XX @@ VSEL 1111 1110 0. cc:2 .... .... 1010 .0.0 .... \
64
VSEL 1111 1110 0. cc:2 .... .... 1011 .0.0 .... \
65
vm=%vm_dp vn=%vn_dp vd=%vd_dp dp=1
66
67
+VMAXNM_hp 1111 1110 1.00 .... .... 1001 .0.0 .... @vfp_dnm_s
68
+VMINNM_hp 1111 1110 1.00 .... .... 1001 .1.0 .... @vfp_dnm_s
69
+
70
VMAXNM_sp 1111 1110 1.00 .... .... 1010 .0.0 .... @vfp_dnm_s
71
VMINNM_sp 1111 1110 1.00 .... .... 1010 .1.0 .... @vfp_dnm_s
72
73
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
74
index XXXXXXX..XXXXXXX 100644
75
--- a/target/arm/vfp.decode
76
+++ b/target/arm/vfp.decode
77
@@ -XXX,XX +XXX,XX @@ VNMLS_dp ---- 1110 0.01 .... .... 1011 .0.0 .... @vfp_dnm_d
78
VNMLA_sp ---- 1110 0.01 .... .... 1010 .1.0 .... @vfp_dnm_s
79
VNMLA_dp ---- 1110 0.01 .... .... 1011 .1.0 .... @vfp_dnm_d
80
81
+VMUL_hp ---- 1110 0.10 .... .... 1001 .0.0 .... @vfp_dnm_s
82
VMUL_sp ---- 1110 0.10 .... .... 1010 .0.0 .... @vfp_dnm_s
83
VMUL_dp ---- 1110 0.10 .... .... 1011 .0.0 .... @vfp_dnm_d
84
85
VNMUL_sp ---- 1110 0.10 .... .... 1010 .1.0 .... @vfp_dnm_s
86
VNMUL_dp ---- 1110 0.10 .... .... 1011 .1.0 .... @vfp_dnm_d
87
88
+VADD_hp ---- 1110 0.11 .... .... 1001 .0.0 .... @vfp_dnm_s
89
VADD_sp ---- 1110 0.11 .... .... 1010 .0.0 .... @vfp_dnm_s
90
VADD_dp ---- 1110 0.11 .... .... 1011 .0.0 .... @vfp_dnm_d
91
92
+VSUB_hp ---- 1110 0.11 .... .... 1001 .1.0 .... @vfp_dnm_s
93
VSUB_sp ---- 1110 0.11 .... .... 1010 .1.0 .... @vfp_dnm_s
94
VSUB_dp ---- 1110 0.11 .... .... 1011 .1.0 .... @vfp_dnm_d
95
96
+VDIV_hp ---- 1110 1.00 .... .... 1001 .0.0 .... @vfp_dnm_s
97
VDIV_sp ---- 1110 1.00 .... .... 1010 .0.0 .... @vfp_dnm_s
98
VDIV_dp ---- 1110 1.00 .... .... 1011 .0.0 .... @vfp_dnm_d
99
100
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/target/arm/vfp_helper.c
103
+++ b/target/arm/vfp_helper.c
104
@@ -XXX,XX +XXX,XX @@ void vfp_set_fpscr(CPUARMState *env, uint32_t val)
105
#define VFP_HELPER(name, p) HELPER(glue(glue(vfp_,name),p))
106
107
#define VFP_BINOP(name) \
108
+dh_ctype_f16 VFP_HELPER(name, h)(dh_ctype_f16 a, dh_ctype_f16 b, void *fpstp) \
109
+{ \
110
+ float_status *fpst = fpstp; \
111
+ return float16_ ## name(a, b, fpst); \
112
+} \
113
float32 VFP_HELPER(name, s)(float32 a, float32 b, void *fpstp) \
114
{ \
115
float_status *fpst = fpstp; \
116
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
16
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
117
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
118
--- a/target/arm/translate-vfp.c.inc
18
--- a/target/arm/translate-vfp.c.inc
119
+++ b/target/arm/translate-vfp.c.inc
19
+++ b/target/arm/translate-vfp.c.inc
120
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn,
20
@@ -XXX,XX +XXX,XX @@ static inline long vfp_f16_offset(unsigned reg, bool top)
121
return true;
21
return offs;
122
}
22
}
123
23
124
+static bool do_vfp_3op_hp(DisasContext *s, VFPGen3OpSPFn *fn,
24
+/*
125
+ int vd, int vn, int vm, bool reads_vd)
25
+ * Generate code for M-profile lazy FP state preservation if needed;
26
+ * this corresponds to the pseudocode PreserveFPState() function.
27
+ */
28
+static void gen_preserve_fp_state(DisasContext *s)
126
+{
29
+{
127
+ /*
30
+ if (s->v7m_lspact) {
128
+ * Do a half-precision operation. Functionally this is
31
+ /*
129
+ * the same as do_vfp_3op_sp(), except:
32
+ * Lazy state saving affects external memory and also the NVIC,
130
+ * - it uses the FPST_FPCR_F16
33
+ * so we must mark it as an IO operation for icount (and cause
131
+ * - it doesn't need the VFP vector handling (fp16 is a
34
+ * this to be the last insn in the TB).
132
+ * v8 feature, and in v8 VFP vectors don't exist)
35
+ */
133
+ * - it does the aa32_fp16_arith feature test
36
+ if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
134
+ */
37
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
135
+ TCGv_i32 f0, f1, fd;
38
+ gen_io_start();
136
+ TCGv_ptr fpst;
39
+ }
137
+
40
+ gen_helper_v7m_preserve_fp_state(cpu_env);
138
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
41
+ /*
139
+ return false;
42
+ * If the preserve_fp_state helper doesn't throw an exception
43
+ * then it will clear LSPACT; we don't need to repeat this for
44
+ * any further FP insns in this TB.
45
+ */
46
+ s->v7m_lspact = false;
140
+ }
47
+ }
141
+
142
+ if (s->vec_len != 0 || s->vec_stride != 0) {
143
+ return false;
144
+ }
145
+
146
+ if (!vfp_access_check(s)) {
147
+ return true;
148
+ }
149
+
150
+ f0 = tcg_temp_new_i32();
151
+ f1 = tcg_temp_new_i32();
152
+ fd = tcg_temp_new_i32();
153
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
154
+
155
+ neon_load_reg32(f0, vn);
156
+ neon_load_reg32(f1, vm);
157
+
158
+ if (reads_vd) {
159
+ neon_load_reg32(fd, vd);
160
+ }
161
+ fn(fd, f0, f1, fpst);
162
+ neon_store_reg32(fd, vd);
163
+
164
+ tcg_temp_free_i32(f0);
165
+ tcg_temp_free_i32(f1);
166
+ tcg_temp_free_i32(fd);
167
+ tcg_temp_free_ptr(fpst);
168
+
169
+ return true;
170
+}
48
+}
171
+
49
+
172
static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
50
/*
173
int vd, int vn, int vm, bool reads_vd)
51
* Check that VFP access is enabled. If it is, do the necessary
174
{
52
* M-profile lazy-FP handling and then return true.
175
@@ -XXX,XX +XXX,XX @@ static bool trans_VNMLA_dp(DisasContext *s, arg_VNMLA_dp *a)
53
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
176
return do_vfp_3op_dp(s, gen_VNMLA_dp, a->vd, a->vn, a->vm, true);
54
/* Handle M-profile lazy FP state mechanics */
177
}
55
178
56
/* Trigger lazy-state preservation if necessary */
179
+static bool trans_VMUL_hp(DisasContext *s, arg_VMUL_sp *a)
57
- if (s->v7m_lspact) {
180
+{
58
- /*
181
+ return do_vfp_3op_hp(s, gen_helper_vfp_mulh, a->vd, a->vn, a->vm, false);
59
- * Lazy state saving affects external memory and also the NVIC,
182
+}
60
- * so we must mark it as an IO operation for icount (and cause
183
+
61
- * this to be the last insn in the TB).
184
static bool trans_VMUL_sp(DisasContext *s, arg_VMUL_sp *a)
62
- */
185
{
63
- if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
186
return do_vfp_3op_sp(s, gen_helper_vfp_muls, a->vd, a->vn, a->vm, false);
64
- s->base.is_jmp = DISAS_UPDATE_EXIT;
187
@@ -XXX,XX +XXX,XX @@ static bool trans_VNMUL_dp(DisasContext *s, arg_VNMUL_dp *a)
65
- gen_io_start();
188
return do_vfp_3op_dp(s, gen_VNMUL_dp, a->vd, a->vn, a->vm, false);
66
- }
189
}
67
- gen_helper_v7m_preserve_fp_state(cpu_env);
190
68
- /*
191
+static bool trans_VADD_hp(DisasContext *s, arg_VADD_sp *a)
69
- * If the preserve_fp_state helper doesn't throw an exception
192
+{
70
- * then it will clear LSPACT; we don't need to repeat this for
193
+ return do_vfp_3op_hp(s, gen_helper_vfp_addh, a->vd, a->vn, a->vm, false);
71
- * any further FP insns in this TB.
194
+}
72
- */
195
+
73
- s->v7m_lspact = false;
196
static bool trans_VADD_sp(DisasContext *s, arg_VADD_sp *a)
74
- }
197
{
75
+ gen_preserve_fp_state(s);
198
return do_vfp_3op_sp(s, gen_helper_vfp_adds, a->vd, a->vn, a->vm, false);
76
199
@@ -XXX,XX +XXX,XX @@ static bool trans_VADD_dp(DisasContext *s, arg_VADD_dp *a)
77
/* Update ownership of FP context: set FPCCR.S to match current state */
200
return do_vfp_3op_dp(s, gen_helper_vfp_addd, a->vd, a->vn, a->vm, false);
78
if (s->v8m_fpccr_s_wrong) {
201
}
202
203
+static bool trans_VSUB_hp(DisasContext *s, arg_VSUB_sp *a)
204
+{
205
+ return do_vfp_3op_hp(s, gen_helper_vfp_subh, a->vd, a->vn, a->vm, false);
206
+}
207
+
208
static bool trans_VSUB_sp(DisasContext *s, arg_VSUB_sp *a)
209
{
210
return do_vfp_3op_sp(s, gen_helper_vfp_subs, a->vd, a->vn, a->vm, false);
211
@@ -XXX,XX +XXX,XX @@ static bool trans_VSUB_dp(DisasContext *s, arg_VSUB_dp *a)
212
return do_vfp_3op_dp(s, gen_helper_vfp_subd, a->vd, a->vn, a->vm, false);
213
}
214
215
+static bool trans_VDIV_hp(DisasContext *s, arg_VDIV_sp *a)
216
+{
217
+ return do_vfp_3op_hp(s, gen_helper_vfp_divh, a->vd, a->vn, a->vm, false);
218
+}
219
+
220
static bool trans_VDIV_sp(DisasContext *s, arg_VDIV_sp *a)
221
{
222
return do_vfp_3op_sp(s, gen_helper_vfp_divs, a->vd, a->vn, a->vm, false);
223
@@ -XXX,XX +XXX,XX @@ static bool trans_VDIV_dp(DisasContext *s, arg_VDIV_dp *a)
224
return do_vfp_3op_dp(s, gen_helper_vfp_divd, a->vd, a->vn, a->vm, false);
225
}
226
227
+static bool trans_VMINNM_hp(DisasContext *s, arg_VMINNM_sp *a)
228
+{
229
+ if (!dc_isar_feature(aa32_vminmaxnm, s)) {
230
+ return false;
231
+ }
232
+ return do_vfp_3op_hp(s, gen_helper_vfp_minnumh,
233
+ a->vd, a->vn, a->vm, false);
234
+}
235
+
236
+static bool trans_VMAXNM_hp(DisasContext *s, arg_VMAXNM_sp *a)
237
+{
238
+ if (!dc_isar_feature(aa32_vminmaxnm, s)) {
239
+ return false;
240
+ }
241
+ return do_vfp_3op_hp(s, gen_helper_vfp_maxnumh,
242
+ a->vd, a->vn, a->vm, false);
243
+}
244
+
245
static bool trans_VMINNM_sp(DisasContext *s, arg_VMINNM_sp *a)
246
{
247
if (!dc_isar_feature(aa32_vminmaxnm, s)) {
248
--
79
--
249
2.20.1
80
2.20.1
250
81
251
82
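As an illustration of what the VFP_BINOP() change above generates: for the
'add' case the new first clause of the macro produces, roughly, the fp16
helper below (shown pre-expanded for clarity, not as it appears in the file):

    dh_ctype_f16 helper_vfp_addh(dh_ctype_f16 a, dh_ctype_f16 b, void *fpstp)
    {
        float_status *fpst = fpstp;
        return float16_add(a, b, fpst);
    }

which is what the new DEF_HELPER_3(vfp_addh, f16, f16, f16, ptr) declaration
and trans_VADD_hp() above hook up to.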
1
Macroify creation of the trans functions for single and double
1
Implement the new-in-v8.1M FPCXT_S floating point system register.
2
precision VFMA, VFMS, VFNMA, VFNMS. The repetition was OK for
2
This is for saving and restoring the secure floating point context,
3
two sizes, but we're about to add halfprec and it will get to be a bit
3
and it reads and writes bits [27:0] from the FPSCR and the
4
more than seems reasonable.
4
CONTROL.SFPA bit in bit [31].
5
5
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200828183354.27913-6-peter.maydell@linaro.org
8
Message-id: 20201119215617.29887-14-peter.maydell@linaro.org
9
---
9
---
10
target/arm/translate-vfp.c.inc | 50 +++++++++-------------------------
10
target/arm/translate-vfp.c.inc | 58 ++++++++++++++++++++++++++++++++++
11
1 file changed, 13 insertions(+), 37 deletions(-)
11
1 file changed, 58 insertions(+)
12
12
13
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
13
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
14
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate-vfp.c.inc
15
--- a/target/arm/translate-vfp.c.inc
16
+++ b/target/arm/translate-vfp.c.inc
16
+++ b/target/arm/translate-vfp.c.inc
17
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
17
@@ -XXX,XX +XXX,XX @@ static FPSysRegCheckResult fp_sysreg_checks(DisasContext *s, int regno)
18
return true;
18
return false;
19
}
19
}
20
20
break;
21
-static bool trans_VFMA_sp(DisasContext *s, arg_VFMA_sp *a)
21
+ case ARM_VFP_FPCXT_S:
22
-{
22
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
23
- return do_vfm_sp(s, a, false, false);
23
+ return false;
24
-}
24
+ }
25
-
25
+ if (!s->v8m_secure) {
26
-static bool trans_VFMS_sp(DisasContext *s, arg_VFMS_sp *a)
26
+ return false;
27
-{
27
+ }
28
- return do_vfm_sp(s, a, true, false);
28
+ break;
29
-}
29
default:
30
-
30
return FPSysRegCheckFailed;
31
-static bool trans_VFNMA_sp(DisasContext *s, arg_VFNMA_sp *a)
31
}
32
-{
32
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
33
- return do_vfm_sp(s, a, false, true);
33
tcg_temp_free_i32(tmp);
34
-}
34
break;
35
-
35
}
36
-static bool trans_VFNMS_sp(DisasContext *s, arg_VFNMS_sp *a)
36
+ case ARM_VFP_FPCXT_S:
37
-{
37
+ {
38
- return do_vfm_sp(s, a, true, true);
38
+ TCGv_i32 sfpa, control, fpscr;
39
-}
39
+ /* Set FPSCR[27:0] and CONTROL.SFPA from value */
40
-
40
+ tmp = loadfn(s, opaque);
41
static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
41
+ sfpa = tcg_temp_new_i32();
42
{
42
+ tcg_gen_shri_i32(sfpa, tmp, 31);
43
/*
43
+ control = load_cpu_field(v7m.control[M_REG_S]);
44
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
44
+ tcg_gen_deposit_i32(control, control, sfpa,
45
return true;
45
+ R_V7M_CONTROL_SFPA_SHIFT, 1);
46
}
46
+ store_cpu_field(control, v7m.control[M_REG_S]);
47
47
+ fpscr = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
48
-static bool trans_VFMA_dp(DisasContext *s, arg_VFMA_dp *a)
48
+ tcg_gen_andi_i32(fpscr, fpscr, FPCR_NZCV_MASK);
49
-{
49
+ tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
50
- return do_vfm_dp(s, a, false, false);
50
+ tcg_gen_or_i32(fpscr, fpscr, tmp);
51
-}
51
+ store_cpu_field(fpscr, vfp.xregs[ARM_VFP_FPSCR]);
52
+#define MAKE_ONE_VFM_TRANS_FN(INSN, PREC, NEGN, NEGD) \
52
+ tcg_temp_free_i32(tmp);
53
+ static bool trans_##INSN##_##PREC(DisasContext *s, \
53
+ tcg_temp_free_i32(sfpa);
54
+ arg_##INSN##_##PREC *a) \
54
+ break;
55
+ { \
56
+ return do_vfm_##PREC(s, a, NEGN, NEGD); \
57
+ }
55
+ }
58
56
default:
59
-static bool trans_VFMS_dp(DisasContext *s, arg_VFMS_dp *a)
57
g_assert_not_reached();
60
-{
58
}
61
- return do_vfm_dp(s, a, true, false);
59
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
62
-}
60
tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
63
+#define MAKE_VFM_TRANS_FNS(PREC) \
61
storefn(s, opaque, tmp);
64
+ MAKE_ONE_VFM_TRANS_FN(VFMA, PREC, false, false) \
62
break;
65
+ MAKE_ONE_VFM_TRANS_FN(VFMS, PREC, true, false) \
63
+ case ARM_VFP_FPCXT_S:
66
+ MAKE_ONE_VFM_TRANS_FN(VFNMA, PREC, false, true) \
64
+ {
67
+ MAKE_ONE_VFM_TRANS_FN(VFNMS, PREC, true, true)
65
+ TCGv_i32 control, sfpa, fpscr;
68
66
+ /* Bits [27:0] from FPSCR, bit [31] from CONTROL.SFPA */
69
-static bool trans_VFNMA_dp(DisasContext *s, arg_VFNMA_dp *a)
67
+ tmp = tcg_temp_new_i32();
70
-{
68
+ sfpa = tcg_temp_new_i32();
71
- return do_vfm_dp(s, a, false, true);
69
+ gen_helper_vfp_get_fpscr(tmp, cpu_env);
72
-}
70
+ tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
73
-
71
+ control = load_cpu_field(v7m.control[M_REG_S]);
74
-static bool trans_VFNMS_dp(DisasContext *s, arg_VFNMS_dp *a)
72
+ tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
75
-{
73
+ tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
76
- return do_vfm_dp(s, a, true, true);
74
+ tcg_gen_or_i32(tmp, tmp, sfpa);
77
-}
75
+ tcg_temp_free_i32(sfpa);
78
+MAKE_VFM_TRANS_FNS(sp)
76
+ /*
79
+MAKE_VFM_TRANS_FNS(dp)
77
+ * Store result before updating FPSCR etc, in case
80
78
+ * it is a memory write which causes an exception.
81
static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
79
+ */
82
{
80
+ storefn(s, opaque, tmp);
81
+ /*
82
+ * Now we must reset FPSCR from FPDSCR_NS, and clear
83
+ * CONTROL.SFPA; so we'll end the TB here.
84
+ */
85
+ tcg_gen_andi_i32(control, control, ~R_V7M_CONTROL_SFPA_MASK);
86
+ store_cpu_field(control, v7m.control[M_REG_S]);
87
+ fpscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
88
+ gen_helper_vfp_set_fpscr(cpu_env, fpscr);
89
+ tcg_temp_free_i32(fpscr);
90
+ gen_lookup_tb(s);
91
+ break;
92
+ }
93
default:
94
g_assert_not_reached();
95
}
83
--
96
--
84
2.20.1
97
2.20.1
85
98
86
99
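Purely as an illustrative model of the FPCXT_S behaviour implemented above
(the function and accessor names here are invented, they are not QEMU APIs):

    /* Write: bit [31] -> CONTROL_S.SFPA, bits [27:0] -> FPSCR,
     * leaving FPSCR.NZCV unchanged. */
    void model_fpcxt_s_write(uint32_t val)
    {
        set_control_s_sfpa(val >> 31);
        write_fpscr((read_fpscr() & FPCR_NZCV_MASK) | (val & ~FPCR_NZCV_MASK));
    }

    /* Read: FPSCR bits [27:0] plus CONTROL_S.SFPA in bit [31]; the read then
     * resets FPSCR from FPDSCR_NS and clears CONTROL_S.SFPA. */
    uint32_t model_fpcxt_s_read(void)
    {
        uint32_t val = (read_fpscr() & ~FPCR_NZCV_MASK)
                       | (get_control_s_sfpa() << 31);
        write_fpscr(read_fpdscr_ns());
        set_control_s_sfpa(0);
        return val;
    }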
1
Set the MVFR1 ID register FPHP and SIMDHP fields to indicate
1
The FPDSCR register has a similar layout to the FPSCR. In v8.1M it
2
that our "-cpu max" has v8.2-FP16.
2
gains new fields FZ16 (if half-precision floating point is supported)
3
and LTPSIZE (always reads as 4). Update the reset value and the code
4
that handles writes to this register accordingly.
3
5
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-46-peter.maydell@linaro.org
8
Message-id: 20201119215617.29887-16-peter.maydell@linaro.org
7
---
9
---
8
target/arm/cpu.c | 3 ++-
10
target/arm/cpu.h | 5 +++++
9
target/arm/cpu64.c | 10 ++++------
11
hw/intc/armv7m_nvic.c | 9 ++++++++-
10
2 files changed, 6 insertions(+), 7 deletions(-)
12
target/arm/cpu.c | 3 +++
13
3 files changed, 16 insertions(+), 1 deletion(-)
11
14
15
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.h
18
+++ b/target/arm/cpu.h
19
@@ -XXX,XX +XXX,XX @@ void vfp_set_fpscr(CPUARMState *env, uint32_t val);
20
#define FPCR_IXE (1 << 12) /* Inexact exception trap enable */
21
#define FPCR_IDE (1 << 15) /* Input Denormal exception trap enable */
22
#define FPCR_FZ16 (1 << 19) /* ARMv8.2+, FP16 flush-to-zero */
23
+#define FPCR_RMODE_MASK (3 << 22) /* Rounding mode */
24
#define FPCR_FZ (1 << 24) /* Flush-to-zero enable bit */
25
#define FPCR_DN (1 << 25) /* Default NaN enable bit */
26
+#define FPCR_AHP (1 << 26) /* Alternative half-precision */
27
#define FPCR_QC (1 << 27) /* Cumulative saturation bit */
28
#define FPCR_V (1 << 28) /* FP overflow flag */
29
#define FPCR_C (1 << 29) /* FP carry flag */
30
#define FPCR_Z (1 << 30) /* FP zero flag */
31
#define FPCR_N (1 << 31) /* FP negative flag */
32
33
+#define FPCR_LTPSIZE_SHIFT 16 /* LTPSIZE, M-profile only */
34
+#define FPCR_LTPSIZE_MASK (7 << FPCR_LTPSIZE_SHIFT)
35
+
36
#define FPCR_NZCV_MASK (FPCR_N | FPCR_Z | FPCR_C | FPCR_V)
37
#define FPCR_NZCVQC_MASK (FPCR_NZCV_MASK | FPCR_QC)
38
39
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/hw/intc/armv7m_nvic.c
42
+++ b/hw/intc/armv7m_nvic.c
43
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
44
break;
45
case 0xf3c: /* FPDSCR */
46
if (cpu_isar_feature(aa32_vfp_simd, cpu)) {
47
- value &= 0x07c00000;
48
+ uint32_t mask = FPCR_AHP | FPCR_DN | FPCR_FZ | FPCR_RMODE_MASK;
49
+ if (cpu_isar_feature(any_fp16, cpu)) {
50
+ mask |= FPCR_FZ16;
51
+ }
52
+ value &= mask;
53
+ if (cpu_isar_feature(aa32_lob, cpu)) {
54
+ value |= 4 << FPCR_LTPSIZE_SHIFT;
55
+ }
56
cpu->env.v7m.fpdscr[attrs.secure] = value;
57
}
58
break;
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
59
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
index XXXXXXX..XXXXXXX 100644
60
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.c
61
--- a/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
62
+++ b/target/arm/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
63
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
17
cpu->isar.id_isar6 = t;
64
* always reset to 4.
18
65
*/
19
t = cpu->isar.mvfr1;
66
env->v7m.ltpsize = 4;
20
- t = FIELD_DP32(t, MVFR1, FPHP, 2); /* v8.0 FP support */
67
+ /* The LTPSIZE field in FPDSCR is constant and reads as 4. */
21
+ t = FIELD_DP32(t, MVFR1, FPHP, 3); /* v8.2-FP16 */
68
+ env->v7m.fpdscr[M_REG_NS] = 4 << FPCR_LTPSIZE_SHIFT;
22
+ t = FIELD_DP32(t, MVFR1, SIMDHP, 2); /* v8.2-FP16 */
69
+ env->v7m.fpdscr[M_REG_S] = 4 << FPCR_LTPSIZE_SHIFT;
23
cpu->isar.mvfr1 = t;
70
}
24
71
25
t = cpu->isar.mvfr2;
72
if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
26
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/cpu64.c
29
+++ b/target/arm/cpu64.c
30
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
31
u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
32
cpu->isar.id_dfr0 = u;
33
34
- /*
35
- * FIXME: We do not yet support ARMv8.2-fp16 for AArch32 yet,
36
- * so do not set MVFR1.FPHP. Strictly speaking this is not legal,
37
- * but it is also not legal to enable SVE without support for FP16,
38
- * and enabling SVE in system mode is more useful in the short term.
39
- */
40
+ u = cpu->isar.mvfr1;
41
+ u = FIELD_DP32(u, MVFR1, FPHP, 3); /* v8.2-FP16 */
42
+ u = FIELD_DP32(u, MVFR1, SIMDHP, 2); /* v8.2-FP16 */
43
+ cpu->isar.mvfr1 = u;
44
45
#ifdef CONFIG_USER_ONLY
46
/* For usermode -cpu max we can use a larger and more efficient DCZ
47
--
73
--
48
2.20.1
74
2.20.1
49
75
50
76
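For reference, a small arithmetic cross-check of the FPDSCR change above
(the values simply follow from the #defines in the patch): the old literal
mask 0x07c00000 is exactly the AHP, DN, FZ and RMODE bits, and the constant
LTPSIZE value of 4 sits in bits [18:16]:

    FPCR_AHP | FPCR_DN | FPCR_FZ | FPCR_RMODE_MASK
        == (1 << 26) | (1 << 25) | (1 << 24) | (3 << 22)
        == 0x07c00000
    FPCR_FZ16               == (1 << 19)   /* added to the mask when fp16 exists */
    4 << FPCR_LTPSIZE_SHIFT == 4 << 16 == 0x00040000  /* reset value, reads as 4 */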
1
Implement fp16 for the Neon VCVT insns which convert between
1
In v8.0M, on exception entry the registers R0-R3, R12, APSR and EPSR
2
float and fixed-point.
2
are zeroed for an exception taken to Non-secure state; for an
3
exception taken to Secure state they become UNKNOWN, and we chose to
4
leave them at their previous values.
5
6
In v8.1M the behaviour is specified more tightly and these registers
7
are always zeroed regardless of the security state that the exception
8
targets (see rule R_KPZV). Implement this.
3
9
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-39-peter.maydell@linaro.org
12
Message-id: 20201119215617.29887-17-peter.maydell@linaro.org
7
---
13
---
8
target/arm/helper.h | 5 +++++
14
target/arm/m_helper.c | 16 ++++++++++++----
9
target/arm/neon-dp.decode | 8 +++++++-
15
1 file changed, 12 insertions(+), 4 deletions(-)
10
target/arm/vec_helper.c | 4 ++++
11
target/arm/translate-neon.c.inc | 5 +++++
12
4 files changed, 21 insertions(+), 1 deletion(-)
13
16
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
15
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
19
--- a/target/arm/m_helper.c
17
+++ b/target/arm/helper.h
20
+++ b/target/arm/m_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_vcvt_uf, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
19
DEF_HELPER_FLAGS_4(gvec_vcvt_fs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
* Clear registers if necessary to prevent non-secure exception
20
DEF_HELPER_FLAGS_4(gvec_vcvt_fu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
* code being able to see register values from secure code.
21
24
* Where register values become architecturally UNKNOWN we leave
22
+DEF_HELPER_FLAGS_4(gvec_vcvt_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
- * them with their previous values.
23
+DEF_HELPER_FLAGS_4(gvec_vcvt_uh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+ * them with their previous values. v8.1M is tighter than v8.0M
24
+DEF_HELPER_FLAGS_4(gvec_vcvt_hs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
+ * here and always zeroes the caller-saved registers regardless
25
+DEF_HELPER_FLAGS_4(gvec_vcvt_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+ * of the security state the exception is targeting.
26
+
29
*/
27
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
28
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
- if (!targets_secure) {
29
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
+ if (!targets_secure || arm_feature(env, ARM_FEATURE_V8_1M)) {
30
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
33
/*
31
index XXXXXXX..XXXXXXX 100644
34
* Always clear the caller-saved registers (they have been
32
--- a/target/arm/neon-dp.decode
35
* pushed to the stack earlier in v7m_push_stack()).
33
+++ b/target/arm/neon-dp.decode
36
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
34
@@ -XXX,XX +XXX,XX @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
37
* v7m_push_callee_stack()).
35
# We use size=0 for fp32 and size=1 for fp16 to match the 3-same encodings.
38
*/
36
@2reg_vcvt .... ... . . . 1 ..... .... .... . q:1 . . .... \
39
int i;
37
&2reg_shift vm=%vm_dp vd=%vd_dp size=0 shift=%neon_rshift_i5
40
+ /*
38
+@2reg_vcvt_f16 .... ... . . . 11 .... .... .... . q:1 . . .... \
41
+ * r4..r11 are callee-saves, zero only if background
39
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 shift=%neon_rshift_i4
42
+ * state was Secure (EXCRET.S == 1) and exception
40
43
+ * targets Non-secure state
41
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
44
+ */
42
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
45
+ bool zero_callee_saves = !targets_secure &&
43
@@ -XXX,XX +XXX,XX @@ VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
46
+ (lr & R_V7M_EXCRET_S_MASK);
44
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
47
45
48
for (i = 0; i < 13; i++) {
46
# VCVT fixed<->float conversions
49
- /* r4..r11 are callee-saves, zero only if EXCRET.S == 1 */
47
-# TODO: FP16 fixed<->float conversions are opc==0b1100 and 0b1101
50
- if (i < 4 || i > 11 || (lr & R_V7M_EXCRET_S_MASK)) {
48
+VCVT_SH_2sh 1111 001 0 1 . ...... .... 1100 0 . . 1 .... @2reg_vcvt_f16
51
+ if (i < 4 || i > 11 || zero_callee_saves) {
49
+VCVT_UH_2sh 1111 001 1 1 . ...... .... 1100 0 . . 1 .... @2reg_vcvt_f16
52
env->regs[i] = 0;
50
+VCVT_HS_2sh 1111 001 0 1 . ...... .... 1101 0 . . 1 .... @2reg_vcvt_f16
53
}
51
+VCVT_HU_2sh 1111 001 1 1 . ...... .... 1101 0 . . 1 .... @2reg_vcvt_f16
54
}
52
+
53
VCVT_SF_2sh 1111 001 0 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
54
VCVT_UF_2sh 1111 001 1 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
55
VCVT_FS_2sh 1111 001 0 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
56
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/vec_helper.c
59
+++ b/target/arm/vec_helper.c
60
@@ -XXX,XX +XXX,XX @@ DO_VCVT_FIXED(gvec_vcvt_sf, helper_vfp_sltos, uint32_t)
61
DO_VCVT_FIXED(gvec_vcvt_uf, helper_vfp_ultos, uint32_t)
62
DO_VCVT_FIXED(gvec_vcvt_fs, helper_vfp_tosls_round_to_zero, uint32_t)
63
DO_VCVT_FIXED(gvec_vcvt_fu, helper_vfp_touls_round_to_zero, uint32_t)
64
+DO_VCVT_FIXED(gvec_vcvt_sh, helper_vfp_shtoh, uint16_t)
65
+DO_VCVT_FIXED(gvec_vcvt_uh, helper_vfp_uhtoh, uint16_t)
66
+DO_VCVT_FIXED(gvec_vcvt_hs, helper_vfp_toshh_round_to_zero, uint16_t)
67
+DO_VCVT_FIXED(gvec_vcvt_hu, helper_vfp_touhh_round_to_zero, uint16_t)
68
69
#undef DO_VCVT_FIXED
70
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate-neon.c.inc
73
+++ b/target/arm/translate-neon.c.inc
74
@@ -XXX,XX +XXX,XX @@ DO_FP_2SH(VCVT_UF, gen_helper_gvec_vcvt_uf)
75
DO_FP_2SH(VCVT_FS, gen_helper_gvec_vcvt_fs)
76
DO_FP_2SH(VCVT_FU, gen_helper_gvec_vcvt_fu)
77
78
+DO_FP_2SH(VCVT_SH, gen_helper_gvec_vcvt_sh)
79
+DO_FP_2SH(VCVT_UH, gen_helper_gvec_vcvt_uh)
80
+DO_FP_2SH(VCVT_HS, gen_helper_gvec_vcvt_hs)
81
+DO_FP_2SH(VCVT_HU, gen_helper_gvec_vcvt_hu)
82
+
83
static uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
84
{
85
/*
86
--
55
--
87
2.20.1
56
2.20.1
88
57
89
58
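A condensed restatement of the register-clearing rule handled above, as a
hedged C sketch of just the r0..r12 loop (the xPSR handling is separate, and
this helper is illustrative rather than code from the patch):

    /* Should env->regs[i] be zeroed on exception entry? (i = 0..12) */
    static bool model_zero_reg(int i, bool v8_1m, bool targets_secure,
                               bool excret_s)
    {
        bool caller_saved = (i < 4 || i > 11);      /* r0-r3 and r12 */
        if (caller_saved) {
            return v8_1m || !targets_secure;        /* v8.1M: always zeroed */
        }
        /* r4-r11: only if the background state was Secure and the
         * exception targets Non-secure */
        return !targets_secure && excret_s;
    }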
1
The fp16 extension includes a new instruction VINS, which copies the
1
In v8.1M, vector table fetch failures don't set HFSR.FORCED (see rule
2
lower 16 bits of a 32-bit source VFP register into the upper 16 bits
2
R_LLRP). (In previous versions of the architecture this was either
3
of the destination. Implement it.
3
required or IMPDEF.)
4
4
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200828183354.27913-20-peter.maydell@linaro.org
7
Message-id: 20201119215617.29887-18-peter.maydell@linaro.org
8
---
8
---
9
target/arm/vfp-uncond.decode | 3 +++
9
target/arm/m_helper.c | 6 +++++-
10
target/arm/translate-vfp.c.inc | 28 ++++++++++++++++++++++++++++
10
1 file changed, 5 insertions(+), 1 deletion(-)
11
2 files changed, 31 insertions(+)
12
11
13
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
12
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/vfp-uncond.decode
14
--- a/target/arm/m_helper.c
16
+++ b/target/arm/vfp-uncond.decode
15
+++ b/target/arm/m_helper.c
17
@@ -XXX,XX +XXX,XX @@ VCVT 1111 1110 1.11 11 rm:2 .... 1010 op:1 1.0 .... \
16
@@ -XXX,XX +XXX,XX @@ load_fail:
18
vm=%vm_sp vd=%vd_sp sz=2
17
* The HardFault is Secure if BFHFNMINS is 0 (meaning that all HFs are
19
VCVT 1111 1110 1.11 11 rm:2 .... 1011 op:1 1.0 .... \
18
* secure); otherwise it targets the same security state as the
20
vm=%vm_dp vd=%vd_sp sz=3
19
* underlying exception.
21
+
20
+ * In v8.1M HardFaults from vector table fetch fails don't set FORCED.
22
+VINS 1111 1110 1.11 0000 .... 1010 11 . 0 .... \
21
*/
23
+ vd=%vd_sp vm=%vm_sp
22
if (!(cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK)) {
24
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
23
exc_secure = true;
25
index XXXXXXX..XXXXXXX 100644
24
}
26
--- a/target/arm/translate-vfp.c.inc
25
- env->v7m.hfsr |= R_V7M_HFSR_VECTTBL_MASK | R_V7M_HFSR_FORCED_MASK;
27
+++ b/target/arm/translate-vfp.c.inc
26
+ env->v7m.hfsr |= R_V7M_HFSR_VECTTBL_MASK;
28
@@ -XXX,XX +XXX,XX @@ static bool trans_NOCP(DisasContext *s, arg_NOCP *a)
27
+ if (!arm_feature(env, ARM_FEATURE_V8_1M)) {
29
28
+ env->v7m.hfsr |= R_V7M_HFSR_FORCED_MASK;
29
+ }
30
armv7m_nvic_set_pending_derived(env->nvic, ARMV7M_EXCP_HARD, exc_secure);
30
return false;
31
return false;
31
}
32
}
32
+
33
+static bool trans_VINS(DisasContext *s, arg_VINS *a)
34
+{
35
+ TCGv_i32 rd, rm;
36
+
37
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
38
+ return false;
39
+ }
40
+
41
+ if (s->vec_len != 0 || s->vec_stride != 0) {
42
+ return false;
43
+ }
44
+
45
+ if (!vfp_access_check(s)) {
46
+ return true;
47
+ }
48
+
49
+ /* Insert low half of Vm into high half of Vd */
50
+ rm = tcg_temp_new_i32();
51
+ rd = tcg_temp_new_i32();
52
+ neon_load_reg32(rm, a->vm);
53
+ neon_load_reg32(rd, a->vd);
54
+ tcg_gen_deposit_i32(rd, rd, rm, 16, 16);
55
+ neon_store_reg32(rd, a->vd);
56
+ tcg_temp_free_i32(rm);
57
+ tcg_temp_free_i32(rd);
58
+ return true;
59
+}
60
--
33
--
61
2.20.1
34
2.20.1
62
35
63
36
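A one-line model of the data movement that the trans_VINS() code above
performs with tcg_gen_deposit_i32() (illustrative only):

    /* VINS: Sd[31:16] = Sm[15:0], Sd[15:0] unchanged */
    static uint32_t model_vins(uint32_t sd, uint32_t sm)
    {
        return (sd & 0x0000ffff) | (sm << 16);
    }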
1
Convert the Neon float-integer VCVT insns to gvec, and use this
1
In v8.1M a REVIDR register is defined, which is at address 0xe00ecfc
2
to implement fp16 support for them.
2
and is a read-only IMPDEF register providing implementation specific
3
3
minor revision information, like the v8A REVIDR_EL1. Implement this.
4
Note that unlike the VFP int<->fp16 VCVT insns we converted
5
earlier and which convert to/from a 32-bit integer, these
6
Neon insns convert to/from 16-bit integers. So we can use
7
the existing vfp conversion helpers for the f32<->u32/i32
8
case but need to provide our own for f16<->u16/i16.
9
4
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20200828183354.27913-37-peter.maydell@linaro.org
7
Message-id: 20201119215617.29887-19-peter.maydell@linaro.org
13
---
8
---
14
target/arm/helper.h | 9 +++++++++
9
hw/intc/armv7m_nvic.c | 5 +++++
15
target/arm/vec_helper.c | 29 +++++++++++++++++++++++++++++
10
1 file changed, 5 insertions(+)
16
target/arm/translate-neon.c.inc | 15 ++++-----------
17
3 files changed, 42 insertions(+), 11 deletions(-)
18
11
19
diff --git a/target/arm/helper.h b/target/arm/helper.h
12
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
20
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.h
14
--- a/hw/intc/armv7m_nvic.c
22
+++ b/target/arm/helper.h
15
+++ b/hw/intc/armv7m_nvic.c
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(neon_padds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
24
DEF_HELPER_FLAGS_5(neon_pmaxs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
17
}
25
DEF_HELPER_FLAGS_5(neon_pmins, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
return val;
26
19
}
27
+DEF_HELPER_FLAGS_4(gvec_sstoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
+ case 0xcfc:
28
+DEF_HELPER_FLAGS_4(gvec_sitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
+ if (!arm_feature(&cpu->env, ARM_FEATURE_V8_1M)) {
29
+DEF_HELPER_FLAGS_4(gvec_ustoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
+ goto bad_offset;
30
+DEF_HELPER_FLAGS_4(gvec_uitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+ }
31
+DEF_HELPER_FLAGS_4(gvec_tosszh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+ return cpu->revidr;
32
+DEF_HELPER_FLAGS_4(gvec_tosizs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
case 0xd00: /* CPUID Base. */
33
+DEF_HELPER_FLAGS_4(gvec_touszh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
return cpu->midr;
34
+DEF_HELPER_FLAGS_4(gvec_touizs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
case 0xd04: /* Interrupt Control State (ICSR) */
35
+
36
DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/vec_helper.c
42
+++ b/target/arm/vec_helper.c
43
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_acgt(float32 op1, float32 op2, float_status *stat)
44
return -float32_lt(float32_abs(op2), float32_abs(op1), stat);
45
}
46
47
+static int16_t vfp_tosszh(float16 x, void *fpstp)
48
+{
49
+ float_status *fpst = fpstp;
50
+ if (float16_is_any_nan(x)) {
51
+ float_raise(float_flag_invalid, fpst);
52
+ return 0;
53
+ }
54
+ return float16_to_int16_round_to_zero(x, fpst);
55
+}
56
+
57
+static uint16_t vfp_touszh(float16 x, void *fpstp)
58
+{
59
+ float_status *fpst = fpstp;
60
+ if (float16_is_any_nan(x)) {
61
+ float_raise(float_flag_invalid, fpst);
62
+ return 0;
63
+ }
64
+ return float16_to_uint16_round_to_zero(x, fpst);
65
+}
66
+
67
#define DO_2OP(NAME, FUNC, TYPE) \
68
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
69
{ \
70
@@ -XXX,XX +XXX,XX @@ DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
71
DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
72
DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
73
74
+DO_2OP(gvec_sitos, helper_vfp_sitos, int32_t)
75
+DO_2OP(gvec_uitos, helper_vfp_uitos, uint32_t)
76
+DO_2OP(gvec_tosizs, helper_vfp_tosizs, float32)
77
+DO_2OP(gvec_touizs, helper_vfp_touizs, float32)
78
+DO_2OP(gvec_sstoh, int16_to_float16, int16_t)
79
+DO_2OP(gvec_ustoh, uint16_to_float16, uint16_t)
80
+DO_2OP(gvec_tosszh, vfp_tosszh, float16)
81
+DO_2OP(gvec_touszh, vfp_touszh, float16)
82
+
83
#define WRAP_CMP0_FWD(FN, CMPOP, TYPE) \
84
static TYPE TYPE##_##FN##0(TYPE op, float_status *stat) \
85
{ \
86
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/translate-neon.c.inc
89
+++ b/target/arm/translate-neon.c.inc
90
@@ -XXX,XX +XXX,XX @@ static bool do_2misc_fp(DisasContext *s, arg_2misc *a,
91
return true;
92
}
93
94
-#define DO_2MISC_FP(INSN, FUNC) \
95
- static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
96
- { \
97
- return do_2misc_fp(s, a, FUNC); \
98
- }
99
-
100
-DO_2MISC_FP(VCVT_FS, gen_helper_vfp_sitos)
101
-DO_2MISC_FP(VCVT_FU, gen_helper_vfp_uitos)
102
-DO_2MISC_FP(VCVT_SF, gen_helper_vfp_tosizs)
103
-DO_2MISC_FP(VCVT_UF, gen_helper_vfp_touizs)
104
-
105
#define DO_2MISC_FP_VEC(INSN, HFUNC, SFUNC) \
106
static void gen_##INSN(unsigned vece, uint32_t rd_ofs, \
107
uint32_t rm_ofs, \
108
@@ -XXX,XX +XXX,XX @@ DO_2MISC_FP_VEC(VCGE0_F, gen_helper_gvec_fcge0_h, gen_helper_gvec_fcge0_s)
109
DO_2MISC_FP_VEC(VCEQ0_F, gen_helper_gvec_fceq0_h, gen_helper_gvec_fceq0_s)
110
DO_2MISC_FP_VEC(VCLT0_F, gen_helper_gvec_fclt0_h, gen_helper_gvec_fclt0_s)
111
DO_2MISC_FP_VEC(VCLE0_F, gen_helper_gvec_fcle0_h, gen_helper_gvec_fcle0_s)
112
+DO_2MISC_FP_VEC(VCVT_FS, gen_helper_gvec_sstoh, gen_helper_gvec_sitos)
113
+DO_2MISC_FP_VEC(VCVT_FU, gen_helper_gvec_ustoh, gen_helper_gvec_uitos)
114
+DO_2MISC_FP_VEC(VCVT_SF, gen_helper_gvec_tosszh, gen_helper_gvec_tosizs)
115
+DO_2MISC_FP_VEC(VCVT_UF, gen_helper_gvec_touszh, gen_helper_gvec_touizs)
116
117
static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
118
{
119
--
28
--
120
2.20.1
29
2.20.1
121
30
122
31
1
Convert the Neon VRSQRTS insn to using a gvec helper,
1
In v8.1M a new exception return check is added which may cause a NOCP
2
and use this to implement the fp16 case.
2
UsageFault (see rule R_XLTP): before we clear s0..s15 and the FPSCR
3
we must check whether access to CP10 from the Security state of the
4
returning exception is disabled; if it is then we must take a fault.
3
5
4
As with VRECPS, we adjust the phrasing of the new implementation
6
(Note that for our implementation CPPWR is always RAZ/WI and so can
5
slightly so that the fp32 version parallels the fp16 one.
7
never cause CP10 accesses to fail.)
8
9
The other v8.1M change to this register-clearing code is that if MVE
10
is implemented VPR must also be cleared, so add a TODO comment to
11
that effect.
6
12
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200828183354.27913-35-peter.maydell@linaro.org
15
Message-id: 20201119215617.29887-20-peter.maydell@linaro.org
10
---
16
---
11
target/arm/helper.h | 4 +++-
17
target/arm/m_helper.c | 22 +++++++++++++++++++++-
12
target/arm/vec_helper.c | 30 ++++++++++++++++++++++++++++++
18
1 file changed, 21 insertions(+), 1 deletion(-)
13
target/arm/vfp_helper.c | 15 ---------------
14
target/arm/translate-neon.c.inc | 21 +--------------------
15
4 files changed, 34 insertions(+), 36 deletions(-)
16
19
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
20
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
18
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.h
22
--- a/target/arm/m_helper.c
20
+++ b/target/arm/helper.h
23
+++ b/target/arm/m_helper.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
24
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
22
DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
25
v7m_exception_taken(cpu, excret, true, false);
23
DEF_HELPER_4(vfp_muladdh, f16, f16, f16, f16, ptr)
26
return;
24
27
} else {
25
-DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
28
- /* Clear s0..s15 and FPSCR */
26
DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
29
+ if (arm_feature(env, ARM_FEATURE_V8_1M)) {
27
DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
30
+ /* v8.1M adds this NOCP check */
28
DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
31
+ bool nsacr_pass = exc_secure ||
29
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i3
32
+ extract32(env->v7m.nsacr, 10, 1);
30
DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
+ bool cpacr_pass = v7m_cpacr_pass(env, exc_secure, true);
31
DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+ if (!nsacr_pass) {
32
35
+ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, true);
33
+DEF_HELPER_FLAGS_5(gvec_rsqrts_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
36
+ env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
34
+DEF_HELPER_FLAGS_5(gvec_rsqrts_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
37
+ qemu_log_mask(CPU_LOG_INT, "...taking UsageFault on existing "
35
+
38
+ "stackframe: NSACR prevents clearing FPU registers\n");
36
DEF_HELPER_FLAGS_5(gvec_fmla_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
+ v7m_exception_taken(cpu, excret, true, false);
37
DEF_HELPER_FLAGS_5(gvec_fmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
40
+ } else if (!cpacr_pass) {
38
41
+ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
39
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
42
+ exc_secure);
40
index XXXXXXX..XXXXXXX 100644
43
+ env->v7m.cfsr[exc_secure] |= R_V7M_CFSR_NOCP_MASK;
41
--- a/target/arm/vec_helper.c
44
+ qemu_log_mask(CPU_LOG_INT, "...taking UsageFault on existing "
42
+++ b/target/arm/vec_helper.c
45
+ "stackframe: CPACR prevents clearing FPU registers\n");
43
@@ -XXX,XX +XXX,XX @@ static float32 float32_recps_nf(float32 op1, float32 op2, float_status *stat)
46
+ v7m_exception_taken(cpu, excret, true, false);
44
return float32_sub(float32_two, float32_mul(op1, op2, stat), stat);
47
+ }
45
}
48
+ }
46
49
+ /* Clear s0..s15 and FPSCR; TODO also VPR when MVE is implemented */
47
+/* Reciprocal square-root step. AArch32 non-fused semantics. */
50
int i;
48
+static float16 float16_rsqrts_nf(float16 op1, float16 op2, float_status *stat)
51
49
+{
52
for (i = 0; i < 16; i += 2) {
50
+ op1 = float16_squash_input_denormal(op1, stat);
51
+ op2 = float16_squash_input_denormal(op2, stat);
52
+
53
+ if ((float16_is_infinity(op1) && float16_is_zero(op2)) ||
54
+ (float16_is_infinity(op2) && float16_is_zero(op1))) {
55
+ return float16_one_point_five;
56
+ }
57
+ op1 = float16_sub(float16_three, float16_mul(op1, op2, stat), stat);
58
+ return float16_div(op1, float16_two, stat);
59
+}
60
+
61
+static float32 float32_rsqrts_nf(float32 op1, float32 op2, float_status *stat)
62
+{
63
+ op1 = float32_squash_input_denormal(op1, stat);
64
+ op2 = float32_squash_input_denormal(op2, stat);
65
+
66
+ if ((float32_is_infinity(op1) && float32_is_zero(op2)) ||
67
+ (float32_is_infinity(op2) && float32_is_zero(op1))) {
68
+ return float32_one_point_five;
69
+ }
70
+ op1 = float32_sub(float32_three, float32_mul(op1, op2, stat), stat);
71
+ return float32_div(op1, float32_two, stat);
72
+}
73
+
74
#define DO_3OP(NAME, FUNC, TYPE) \
75
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
76
{ \
77
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_fminnum_s, float32_minnum, float32)
78
DO_3OP(gvec_recps_nf_h, float16_recps_nf, float16)
79
DO_3OP(gvec_recps_nf_s, float32_recps_nf, float32)
80
81
+DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16)
82
+DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32)
83
+
84
#ifdef TARGET_AARCH64
85
86
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
87
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
88
index XXXXXXX..XXXXXXX 100644
89
--- a/target/arm/vfp_helper.c
90
+++ b/target/arm/vfp_helper.c
91
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
92
return r;
93
}
94
95
-float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
96
-{
97
- float_status *s = &env->vfp.standard_fp_status;
98
- float32 product;
99
- if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
100
- (float32_is_infinity(b) && float32_is_zero_or_denormal(a))) {
101
- if (!(float32_is_zero(a) || float32_is_zero(b))) {
102
- float_raise(float_flag_input_denormal, s);
103
- }
104
- return float32_one_point_five;
105
- }
106
- product = float32_mul(a, b, s);
107
- return float32_div(float32_sub(float32_three, product, s), float32_two, s);
108
-}
109
-
110
/* NEON helpers. */
111
112
/* Constants 256 and 512 are used in some helpers; we avoid relying on
113
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
114
index XXXXXXX..XXXXXXX 100644
115
--- a/target/arm/translate-neon.c.inc
116
+++ b/target/arm/translate-neon.c.inc
117
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMLS, gen_helper_gvec_fmls_s, gen_helper_gvec_fmls_h)
118
DO_3S_FP_GVEC(VFMA, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_h)
119
DO_3S_FP_GVEC(VFMS, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_h)
120
DO_3S_FP_GVEC(VRECPS, gen_helper_gvec_recps_nf_s, gen_helper_gvec_recps_nf_h)
121
+DO_3S_FP_GVEC(VRSQRTS, gen_helper_gvec_rsqrts_nf_s, gen_helper_gvec_rsqrts_nf_h)
122
123
WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
124
WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
125
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
126
return do_3same(s, a, gen_VMINNM_fp32_3s);
127
}
128
129
-WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
130
-
131
-static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
132
- uint32_t rn_ofs, uint32_t rm_ofs,
133
- uint32_t oprsz, uint32_t maxsz)
134
-{
135
- static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
136
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
137
-}
138
-
139
-static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
140
-{
141
- if (a->size != 0) {
142
- /* TODO fp16 support */
143
- return false;
144
- }
145
-
146
- return do_3same(s, a, gen_VRSQRTS_fp_3s);
147
-}
148
-
149
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
150
{
151
/* FP operations handled pairwise 32 bits at a time */
152
--
53
--
153
2.20.1
54
2.20.1
154
55
155
56
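For context on the arithmetic in the new rsqrts helpers above: apart from
the denormal squashing and the infinity-times-zero special case, the step
computes (3 - a*b) / 2, i.e. the factor used in a Newton-Raphson refinement
of 1/sqrt(x). A hedged scalar sketch (not the actual helper, which operates
on float16/float32 with an explicit float_status):

    #include <math.h>   /* for isinf() */

    static float model_rsqrts_step(float a, float b)
    {
        if ((isinf(a) && b == 0.0f) || (isinf(b) && a == 0.0f)) {
            return 1.5f;
        }
        return (3.0f - a * b) / 2.0f;
    }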
1
Implement the fp16 versions of the VFP VSEL instruction.
1
v8.1M adds new encodings of VLLDM and VLSTM (where bit 7 is set).
2
The only difference is that:
3
* the old T1 encodings UNDEF if the implementation implements 32
4
Dregs (this is currently architecturally impossible for M-profile)
5
* the new T2 encodings have the implementation-defined option to
6
read from memory (discarding the data) or write UNKNOWN values to
7
memory for the stack slots that would be D16-D31
8
9
We choose not to make those accesses, so for us the two
10
instructions behave identically assuming they don't UNDEF.
2
11
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20200828183354.27913-18-peter.maydell@linaro.org
14
Message-id: 20201119215617.29887-21-peter.maydell@linaro.org
6
---
15
---
7
target/arm/vfp-uncond.decode | 6 ++++--
16
target/arm/m-nocp.decode | 2 +-
8
target/arm/translate-vfp.c.inc | 16 ++++++++++++----
17
target/arm/translate-vfp.c.inc | 25 +++++++++++++++++++++++++
9
2 files changed, 16 insertions(+), 6 deletions(-)
18
2 files changed, 26 insertions(+), 1 deletion(-)
10
19
11
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
20
diff --git a/target/arm/m-nocp.decode b/target/arm/m-nocp.decode
12
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/vfp-uncond.decode
22
--- a/target/arm/m-nocp.decode
14
+++ b/target/arm/vfp-uncond.decode
23
+++ b/target/arm/m-nocp.decode
15
@@ -XXX,XX +XXX,XX @@
24
@@ -XXX,XX +XXX,XX @@
16
@vfp_dnm_s ................................ vm=%vm_sp vn=%vn_sp vd=%vd_sp
25
17
@vfp_dnm_d ................................ vm=%vm_dp vn=%vn_dp vd=%vd_dp
26
{
18
27
# Special cases which do not take an early NOCP: VLLDM and VLSTM
19
+VSEL 1111 1110 0. cc:2 .... .... 1001 .0.0 .... \
28
- VLLDM_VLSTM 1110 1100 001 l:1 rn:4 0000 1010 0000 0000
20
+ vm=%vm_sp vn=%vn_sp vd=%vd_sp sz=1
29
+ VLLDM_VLSTM 1110 1100 001 l:1 rn:4 0000 1010 op:1 000 0000
21
VSEL 1111 1110 0. cc:2 .... .... 1010 .0.0 .... \
30
# VSCCLRM (new in v8.1M) is similar:
22
- vm=%vm_sp vn=%vn_sp vd=%vd_sp dp=0
31
VSCCLRM 1110 1100 1.01 1111 .... 1011 imm:7 0 vd=%vd_dp size=3
23
+ vm=%vm_sp vn=%vn_sp vd=%vd_sp sz=2
32
VSCCLRM 1110 1100 1.01 1111 .... 1010 imm:8 vd=%vd_sp size=2
24
VSEL 1111 1110 0. cc:2 .... .... 1011 .0.0 .... \
25
- vm=%vm_dp vn=%vn_dp vd=%vd_dp dp=1
26
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp sz=3
27
28
VMAXNM_hp 1111 1110 1.00 .... .... 1001 .0.0 .... @vfp_dnm_s
29
VMINNM_hp 1111 1110 1.00 .... .... 1001 .1.0 .... @vfp_dnm_s
30
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
33
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
31
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/translate-vfp.c.inc
35
--- a/target/arm/translate-vfp.c.inc
33
+++ b/target/arm/translate-vfp.c.inc
36
+++ b/target/arm/translate-vfp.c.inc
34
@@ -XXX,XX +XXX,XX @@ static bool vfp_access_check(DisasContext *s)
37
@@ -XXX,XX +XXX,XX @@ static bool trans_VLLDM_VLSTM(DisasContext *s, arg_VLLDM_VLSTM *a)
35
static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
38
!arm_dc_feature(s, ARM_FEATURE_V8)) {
36
{
37
uint32_t rd, rn, rm;
38
- bool dp = a->dp;
39
+ int sz = a->sz;
40
41
if (!dc_isar_feature(aa32_vsel, s)) {
42
return false;
39
return false;
43
}
40
}
44
41
+
45
- if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
42
+ if (a->op) {
46
+ if (sz == 3 && !dc_isar_feature(aa32_fpdp_v2, s)) {
43
+ /*
47
+ return false;
44
+ * T2 encoding ({D0-D31} reglist): v8.1M and up. We choose not
45
+ * to take the IMPDEF option to make memory accesses to the stack
46
+ * slots that correspond to the D16-D31 registers (discarding
47
+ * read data and writing UNKNOWN values), so for us the T2
48
+ * encoding behaves identically to the T1 encoding.
49
+ */
50
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
51
+ return false;
52
+ }
53
+ } else {
54
+ /*
55
+ * T1 encoding ({D0-D15} reglist); undef if we have 32 Dregs.
56
+ * This is currently architecturally impossible, but we add the
57
+ * check to stay in line with the pseudocode. Note that we must
58
+ * emit code for the UNDEF so it takes precedence over the NOCP.
59
+ */
60
+ if (dc_isar_feature(aa32_simd_r32, s)) {
61
+ unallocated_encoding(s);
62
+ return true;
63
+ }
48
+ }
64
+ }
49
+
65
+
50
+ if (sz == 1 && !dc_isar_feature(aa32_fp16_arith, s)) {
66
/*
51
return false;
67
* If not secure, UNDEF. We must emit code for this
52
}
68
* rather than returning false so that this takes
53
54
/* UNDEF accesses to D16-D31 if they don't exist */
55
- if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
56
+ if (sz == 3 && !dc_isar_feature(aa32_simd_r32, s) &&
57
((a->vm | a->vn | a->vd) & 0x10)) {
58
return false;
59
}
60
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
61
return true;
62
}
63
64
- if (dp) {
65
+ if (sz == 3) {
66
TCGv_i64 frn, frm, dest;
67
TCGv_i64 tmp, zero, zf, nf, vf;
68
69
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
70
tcg_temp_free_i32(tmp);
71
break;
72
}
73
+ /* For fp16 the top half is always zeroes */
74
+ if (sz == 1) {
75
+ tcg_gen_andi_i32(dest, dest, 0xffff);
76
+ }
77
neon_store_reg32(dest, rd);
78
tcg_temp_free_i32(frn);
79
tcg_temp_free_i32(frm);
80
--
69
--
81
2.20.1
70
2.20.1
82
71
83
72
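As a quick aside on the fp16 VSEL change above, here is a sketch (mine, not code from the patch; the helper name vsel_fp16 is invented) of what the new encoding computes at execution time:

    /* The condition field picks Vn or Vm; because an fp16 value lives in
     * the low 16 bits of a 32-bit S register, the stored result has its
     * top half cleared, which is what the new
     * tcg_gen_andi_i32(dest, dest, 0xffff) in the patch implements.
     */
    static uint32_t vsel_fp16(bool cond_holds, uint32_t vn, uint32_t vm)
    {
        uint32_t dest = cond_holds ? vn : vm;
        return dest & 0xffff;   /* fp16 results always have a zero top half */
    }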
1
The aa32_fp16_arith feature check function currently looks at the
1
v8.1M introduces a new TRD flag in the CCR register, which enables
2
AArch64 ID_AA64PFR0 register. This is (as the comment notes) not
2
checking for stack frame integrity signatures on SG instructions.
3
correct. The bogus check was put in mostly to allow testing of the
3
This bit is not banked, and is always RAZ/WI to Non-secure code.
4
fp16 variants of the VCMLA instructions and it was something of
4
Adjust the code for handling CCR reads and writes to handle this.
5
a mistake that we allowed them to exist in master.
6
7
Switch the feature check function to testing MVFR1.FPHP, which is
8
what it ought to be.
9
10
This will remove emulation of the VCMLA and VCADD insns from
11
AArch32 code running on an AArch64 '-cpu max' using system emulation.
12
(They were never enabled for aarch32 linux-user and system-emulation.)
13
Since we weren't advertising their existence via the AArch32 ID
14
register, well-behaved guests wouldn't have been using them anyway.
15
16
Once we have implemented all the AArch32 support for the FP16 extension
17
we will advertise it in the MVFR1 ID register field, which will reenable
18
these insns along with all the others.
19
5
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
22
Message-id: 20200828183354.27913-3-peter.maydell@linaro.org
8
Message-id: 20201119215617.29887-23-peter.maydell@linaro.org
23
---
9
---
24
target/arm/cpu.h | 7 +------
10
target/arm/cpu.h | 2 ++
25
1 file changed, 1 insertion(+), 6 deletions(-)
11
hw/intc/armv7m_nvic.c | 26 ++++++++++++++++++--------
12
2 files changed, 20 insertions(+), 8 deletions(-)
26
13
27
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
14
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
28
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/cpu.h
16
--- a/target/arm/cpu.h
30
+++ b/target/arm/cpu.h
17
+++ b/target/arm/cpu.h
31
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_predinv(const ARMISARegisters *id)
18
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CCR, STKOFHFNMIGN, 10, 1)
32
19
FIELD(V7M_CCR, DC, 16, 1)
33
static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
20
FIELD(V7M_CCR, IC, 17, 1)
34
{
21
FIELD(V7M_CCR, BP, 18, 1)
35
- /*
22
+FIELD(V7M_CCR, LOB, 19, 1)
36
- * This is a placeholder for use by VCMA until the rest of
23
+FIELD(V7M_CCR, TRD, 20, 1)
37
- * the ARMv8.2-FP16 extension is implemented for aa32 mode.
24
38
- * At which point we can properly set and check MVFR1.FPHP.
25
/* V7M SCR bits */
39
- */
26
FIELD(V7M_SCR, SLEEPONEXIT, 1, 1)
40
- return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
27
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
41
+ return FIELD_EX32(id->mvfr1, MVFR1, FPHP) >= 3;
28
index XXXXXXX..XXXXXXX 100644
42
}
29
--- a/hw/intc/armv7m_nvic.c
43
30
+++ b/hw/intc/armv7m_nvic.c
44
static inline bool isar_feature_aa32_vfp_simd(const ARMISARegisters *id)
31
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
32
}
33
return cpu->env.v7m.scr[attrs.secure];
34
case 0xd14: /* Configuration Control. */
35
- /* The BFHFNMIGN bit is the only non-banked bit; we
36
- * keep it in the non-secure copy of the register.
37
+ /*
38
+ * Non-banked bits: BFHFNMIGN (stored in the NS copy of the register)
39
+ * and TRD (stored in the S copy of the register)
40
*/
41
val = cpu->env.v7m.ccr[attrs.secure];
42
val |= cpu->env.v7m.ccr[M_REG_NS] & R_V7M_CCR_BFHFNMIGN_MASK;
43
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
44
cpu->env.v7m.scr[attrs.secure] = value;
45
break;
46
case 0xd14: /* Configuration Control. */
47
+ {
48
+ uint32_t mask;
49
+
50
if (!arm_feature(&cpu->env, ARM_FEATURE_M_MAIN)) {
51
goto bad_offset;
52
}
53
54
/* Enforce RAZ/WI on reserved and must-RAZ/WI bits */
55
- value &= (R_V7M_CCR_STKALIGN_MASK |
56
- R_V7M_CCR_BFHFNMIGN_MASK |
57
- R_V7M_CCR_DIV_0_TRP_MASK |
58
- R_V7M_CCR_UNALIGN_TRP_MASK |
59
- R_V7M_CCR_USERSETMPEND_MASK |
60
- R_V7M_CCR_NONBASETHRDENA_MASK);
61
+ mask = R_V7M_CCR_STKALIGN_MASK |
62
+ R_V7M_CCR_BFHFNMIGN_MASK |
63
+ R_V7M_CCR_DIV_0_TRP_MASK |
64
+ R_V7M_CCR_UNALIGN_TRP_MASK |
65
+ R_V7M_CCR_USERSETMPEND_MASK |
66
+ R_V7M_CCR_NONBASETHRDENA_MASK;
67
+ if (arm_feature(&cpu->env, ARM_FEATURE_V8_1M) && attrs.secure) {
68
+ /* TRD is always RAZ/WI from NS */
69
+ mask |= R_V7M_CCR_TRD_MASK;
70
+ }
71
+ value &= mask;
72
73
if (arm_feature(&cpu->env, ARM_FEATURE_V8)) {
74
/* v8M makes NONBASETHRDENA and STKALIGN be RES1 */
75
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
76
77
cpu->env.v7m.ccr[attrs.secure] = value;
78
break;
79
+ }
80
case 0xd24: /* System Handler Control and State (SHCSR) */
81
if (!arm_feature(&cpu->env, ARM_FEATURE_V7)) {
82
goto bad_offset;
45
--
83
--
46
2.20.1
84
2.20.1
47
85
48
86
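For anyone not steeped in the AArch32 ID register scheme, the corrected check from the patch above is reproduced here with a note on the threshold; the meaning of the MVFR1.FPHP values is my reading of the architecture rather than something the patch states:

    /* FPHP = 1 or 2 advertises only fp16 conversion instructions;
     * FPHP = 3 additionally advertises fp16 arithmetic, which is what
     * these insns need, hence the ">= 3" test.
     */
    static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
    {
        return FIELD_EX32(id->mvfr1, MVFR1, FPHP) >= 3;
    }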
1
Implement the fp16 versions of the VFP VCVT instruction forms which
1
v8.1M introduces a new TRD flag in the CCR register, which enables
2
convert between floating point and integer.
2
checking for stack frame integrity signatures on SG instructions.
3
Add the code in the SG insn implementation for the new behaviour.
3
4
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-13-peter.maydell@linaro.org
7
Message-id: 20201119215617.29887-24-peter.maydell@linaro.org
7
---
8
---
8
target/arm/vfp.decode | 4 +++
9
target/arm/m_helper.c | 86 +++++++++++++++++++++++++++++++++++++++++++
9
target/arm/translate-vfp.c.inc | 65 ++++++++++++++++++++++++++++++++++
10
1 file changed, 86 insertions(+)
10
2 files changed, 69 insertions(+)
11
11
12
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
12
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
13
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/vfp.decode
14
--- a/target/arm/m_helper.c
15
+++ b/target/arm/vfp.decode
15
+++ b/target/arm/m_helper.c
16
@@ -XXX,XX +XXX,XX @@ VCVT_sp ---- 1110 1.11 0111 .... 1010 11.0 .... @vfp_dm_ds
16
@@ -XXX,XX +XXX,XX @@ static bool v7m_read_half_insn(ARMCPU *cpu, ARMMMUIdx mmu_idx,
17
VCVT_dp ---- 1110 1.11 0111 .... 1011 11.0 .... @vfp_dm_sd
18
19
# VCVT from integer to floating point: Vm always single; Vd depends on size
20
+VCVT_int_hp ---- 1110 1.11 1000 .... 1001 s:1 1.0 .... \
21
+ vd=%vd_sp vm=%vm_sp
22
VCVT_int_sp ---- 1110 1.11 1000 .... 1010 s:1 1.0 .... \
23
vd=%vd_sp vm=%vm_sp
24
VCVT_int_dp ---- 1110 1.11 1000 .... 1011 s:1 1.0 .... \
25
@@ -XXX,XX +XXX,XX @@ VCVT_fix_dp ---- 1110 1.11 1.1. .... 1011 .1.0 .... \
26
vd=%vd_dp imm=%vm_sp opc=%vcvt_fix_op
27
28
# VCVT float to integer (VCVT and VCVTR): Vd always single; Vd depends on size
29
+VCVT_hp_int ---- 1110 1.11 110 s:1 .... 1001 rz:1 1.0 .... \
30
+ vd=%vd_sp vm=%vm_sp
31
VCVT_sp_int ---- 1110 1.11 110 s:1 .... 1010 rz:1 1.0 .... \
32
vd=%vd_sp vm=%vm_sp
33
VCVT_dp_int ---- 1110 1.11 110 s:1 .... 1011 rz:1 1.0 .... \
34
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
35
index XXXXXXX..XXXXXXX 100644
36
--- a/target/arm/translate-vfp.c.inc
37
+++ b/target/arm/translate-vfp.c.inc
38
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
39
return true;
17
return true;
40
}
18
}
41
19
42
+static bool trans_VCVT_int_hp(DisasContext *s, arg_VCVT_int_sp *a)
20
+static bool v7m_read_sg_stack_word(ARMCPU *cpu, ARMMMUIdx mmu_idx,
21
+ uint32_t addr, uint32_t *spdata)
43
+{
22
+{
44
+ TCGv_i32 vm;
23
+ /*
45
+ TCGv_ptr fpst;
24
+ * Read a word of data from the stack for the SG instruction,
25
+ * writing the value into *spdata. If the load succeeds, return
26
+ * true; otherwise pend an appropriate exception and return false.
27
+ * (We can't use data load helpers here that throw an exception
28
+ * because of the context we're called in, which is halfway through
29
+ * arm_v7m_cpu_do_interrupt().)
30
+ */
31
+ CPUState *cs = CPU(cpu);
32
+ CPUARMState *env = &cpu->env;
33
+ MemTxAttrs attrs = {};
34
+ MemTxResult txres;
35
+ target_ulong page_size;
36
+ hwaddr physaddr;
37
+ int prot;
38
+ ARMMMUFaultInfo fi = {};
39
+ ARMCacheAttrs cacheattrs = {};
40
+ uint32_t value;
46
+
41
+
47
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
42
+ if (get_phys_addr(env, addr, MMU_DATA_LOAD, mmu_idx, &physaddr,
43
+ &attrs, &prot, &page_size, &fi, &cacheattrs)) {
44
+ /* MPU/SAU lookup failed */
45
+ if (fi.type == ARMFault_QEMU_SFault) {
46
+ qemu_log_mask(CPU_LOG_INT,
47
+ "...SecureFault during stack word read\n");
48
+ env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK | R_V7M_SFSR_SFARVALID_MASK;
49
+ env->v7m.sfar = addr;
50
+ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
51
+ } else {
52
+ qemu_log_mask(CPU_LOG_INT,
53
+ "...MemManageFault during stack word read\n");
54
+ env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_DACCVIOL_MASK |
55
+ R_V7M_CFSR_MMARVALID_MASK;
56
+ env->v7m.mmfar[M_REG_S] = addr;
57
+ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_MEM, false);
58
+ }
59
+ return false;
60
+ }
61
+ value = address_space_ldl(arm_addressspace(cs, attrs), physaddr,
62
+ attrs, &txres);
63
+ if (txres != MEMTX_OK) {
64
+ /* BusFault trying to read the data */
65
+ qemu_log_mask(CPU_LOG_INT,
66
+ "...BusFault during stack word read\n");
67
+ env->v7m.cfsr[M_REG_NS] |=
68
+ (R_V7M_CFSR_PRECISERR_MASK | R_V7M_CFSR_BFARVALID_MASK);
69
+ env->v7m.bfar = addr;
70
+ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_BUS, false);
48
+ return false;
71
+ return false;
49
+ }
72
+ }
50
+
73
+
51
+ if (!vfp_access_check(s)) {
74
+ *spdata = value;
52
+ return true;
53
+ }
54
+
55
+ vm = tcg_temp_new_i32();
56
+ neon_load_reg32(vm, a->vm);
57
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
58
+ if (a->s) {
59
+ /* i32 -> f16 */
60
+ gen_helper_vfp_sitoh(vm, vm, fpst);
61
+ } else {
62
+ /* u32 -> f16 */
63
+ gen_helper_vfp_uitoh(vm, vm, fpst);
64
+ }
65
+ neon_store_reg32(vm, a->vd);
66
+ tcg_temp_free_i32(vm);
67
+ tcg_temp_free_ptr(fpst);
68
+ return true;
75
+ return true;
69
+}
76
+}
70
+
77
+
71
static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a)
78
static bool v7m_handle_execute_nsc(ARMCPU *cpu)
72
{
79
{
73
TCGv_i32 vm;
80
/*
74
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
81
@@ -XXX,XX +XXX,XX @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
75
return true;
82
*/
76
}
83
qemu_log_mask(CPU_LOG_INT, "...really an SG instruction at 0x%08" PRIx32
77
84
", executing it\n", env->regs[15]);
78
+static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a)
79
+{
80
+ TCGv_i32 vm;
81
+ TCGv_ptr fpst;
82
+
85
+
83
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
86
+ if (cpu_isar_feature(aa32_m_sec_state, cpu) &&
84
+ return false;
87
+ !arm_v7m_is_handler_mode(env)) {
88
+ /*
89
+ * v8.1M exception stack frame integrity check. Note that we
90
+ * must perform the memory access even if CCR_S.TRD is zero
91
+ * and we aren't going to check what the data loaded is.
92
+ */
93
+ uint32_t spdata, sp;
94
+
95
+ /*
96
+ * We know we are currently NS, so the S stack pointers must be
97
+ * in other_ss_{psp,msp}, not in regs[13]/other_sp.
98
+ */
99
+ sp = v7m_using_psp(env) ? env->v7m.other_ss_psp : env->v7m.other_ss_msp;
100
+ if (!v7m_read_sg_stack_word(cpu, mmu_idx, sp, &spdata)) {
101
+ /* Stack access failed and an exception has been pended */
102
+ return false;
103
+ }
104
+
105
+ if (env->v7m.ccr[M_REG_S] & R_V7M_CCR_TRD_MASK) {
106
+ if (((spdata & ~1) == 0xfefa125a) ||
107
+ !(env->v7m.control[M_REG_S] & 1)) {
108
+ goto gen_invep;
109
+ }
110
+ }
85
+ }
111
+ }
86
+
112
+
87
+ if (!vfp_access_check(s)) {
113
env->regs[14] &= ~1;
88
+ return true;
114
env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
89
+ }
115
switch_v7m_security_state(env, true);
90
+
91
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
92
+ vm = tcg_temp_new_i32();
93
+ neon_load_reg32(vm, a->vm);
94
+
95
+ if (a->s) {
96
+ if (a->rz) {
97
+ gen_helper_vfp_tosizh(vm, vm, fpst);
98
+ } else {
99
+ gen_helper_vfp_tosih(vm, vm, fpst);
100
+ }
101
+ } else {
102
+ if (a->rz) {
103
+ gen_helper_vfp_touizh(vm, vm, fpst);
104
+ } else {
105
+ gen_helper_vfp_touih(vm, vm, fpst);
106
+ }
107
+ }
108
+ neon_store_reg32(vm, a->vd);
109
+ tcg_temp_free_i32(vm);
110
+ tcg_temp_free_ptr(fpst);
111
+ return true;
112
+}
113
+
114
static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a)
115
{
116
TCGv_i32 vm;
117
--
116
--
118
2.20.1
117
2.20.1
119
118
120
119
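A small worked example for the fp16 float/integer conversions added above (the numbers are mine, not from the patch):

    /* VCVT.F16.S32 of the integer 7 produces the half-precision encoding
     * 0x4700 in the low 16 bits of the destination register. In the other
     * direction, the 'rz' form rounds toward zero, so converting 7.75
     * (encoding 0x47c0) to a signed integer gives 7; the non-'rz' form
     * uses the current FPSCR rounding mode instead.
     */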
1
Convert the neon floating-point vector compare-vs-0 insns VCEQ0,
1
In commit 077d7449100d824a4 we added code to handle the v8M
2
VCGT0, VCLE0, VCGE0 and VCLT0 to use a gvec helper, and use this to
2
requirement that returns from NMI or HardFault forcibly deactivate
3
implement the fp16 case.
3
those exceptions regardless of what interrupt the guest is trying to
4
deactivate. Unfortunately this broke the handling of the "illegal
5
exception return because the returning exception number is not
6
active" check for those cases. In the pseudocode this test is done
7
on the exception the guest asks to return from, but because our
8
implementation was doing this in armv7m_nvic_complete_irq() after the
9
new "deactivate NMI/HardFault regardless" code we ended up doing the
10
test on the VecInfo for that exception instead, which usually meant
11
failing to raise the illegal exception return fault.
12
13
In the case for "configurable exception targeting the opposite
14
security state" we detected the illegal-return case but went ahead
15
and deactivated the VecInfo anyway, which is wrong because that is
16
the VecInfo for the other security state.
17
18
Rearrange the code so that we first identify the illegal return
19
cases, then see if we really need to deactivate NMI or HardFault
20
instead, and finally do the deactivation.
4
21
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
23
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200828183354.27913-33-peter.maydell@linaro.org
24
Message-id: 20201119215617.29887-25-peter.maydell@linaro.org
8
---
25
---
9
target/arm/helper.h | 15 +++++++++++++++
26
hw/intc/armv7m_nvic.c | 59 +++++++++++++++++++++++--------------------
10
target/arm/vec_helper.c | 25 +++++++++++++++++++++++++
27
1 file changed, 32 insertions(+), 27 deletions(-)
11
target/arm/translate-neon.c.inc | 33 +++++----------------------------
12
3 files changed, 45 insertions(+), 28 deletions(-)
13
28
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
29
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
15
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
31
--- a/hw/intc/armv7m_nvic.c
17
+++ b/target/arm/helper.h
32
+++ b/hw/intc/armv7m_nvic.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
@@ -XXX,XX +XXX,XX @@ int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure)
19
DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
{
20
DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
NVICState *s = (NVICState *)opaque;
21
36
VecInfo *vec = NULL;
22
+DEF_HELPER_FLAGS_4(gvec_fcgt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
- int ret;
23
+DEF_HELPER_FLAGS_4(gvec_fcgt0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
+ int ret = 0;
39
40
assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
41
42
+ trace_nvic_complete_irq(irq, secure);
24
+
43
+
25
+DEF_HELPER_FLAGS_4(gvec_fcge0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
+ if (secure && exc_is_banked(irq)) {
26
+DEF_HELPER_FLAGS_4(gvec_fcge0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
45
+ vec = &s->sec_vectors[irq];
27
+
46
+ } else {
28
+DEF_HELPER_FLAGS_4(gvec_fceq0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
+ vec = &s->vectors[irq];
29
+DEF_HELPER_FLAGS_4(gvec_fceq0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
+
31
+DEF_HELPER_FLAGS_4(gvec_fcle0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(gvec_fcle0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+
34
+DEF_HELPER_FLAGS_4(gvec_fclt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(gvec_fclt0_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
+
37
DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
40
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/vec_helper.c
43
+++ b/target/arm/vec_helper.c
44
@@ -XXX,XX +XXX,XX @@ DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
45
DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
46
DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
47
48
+#define WRAP_CMP0_FWD(FN, CMPOP, TYPE) \
49
+ static TYPE TYPE##_##FN##0(TYPE op, float_status *stat) \
50
+ { \
51
+ return TYPE##_##CMPOP(op, TYPE##_zero, stat); \
52
+ }
48
+ }
53
+
49
+
54
+#define WRAP_CMP0_REV(FN, CMPOP, TYPE) \
50
+ /*
55
+ static TYPE TYPE##_##FN##0(TYPE op, float_status *stat) \
51
+ * Identify illegal exception return cases. We can't immediately
56
+ { \
52
+ * return at this point because we still need to deactivate
57
+ return TYPE##_##CMPOP(TYPE##_zero, op, stat); \
53
+ * (either this exception or NMI/HardFault) first.
54
+ */
55
+ if (!exc_is_banked(irq) && exc_targets_secure(s, irq) != secure) {
56
+ /*
57
+ * Return from a configurable exception targeting the opposite
58
+ * security state from the one we're trying to complete it for.
59
+ * Clear vec because it's not really the VecInfo for this
60
+ * (irq, secstate) so we mustn't deactivate it.
61
+ */
62
+ ret = -1;
63
+ vec = NULL;
64
+ } else if (!vec->active) {
65
+ /* Return from an inactive interrupt */
66
+ ret = -1;
67
+ } else {
68
+ /* Legal return, we will return the RETTOBASE bit value to the caller */
69
+ ret = nvic_rettobase(s);
58
+ }
70
+ }
59
+
71
+
60
+#define DO_2OP_CMP0(FN, CMPOP, DIRN) \
72
/*
61
+ WRAP_CMP0_##DIRN(FN, CMPOP, float16) \
73
* For negative priorities, v8M will forcibly deactivate the appropriate
62
+ WRAP_CMP0_##DIRN(FN, CMPOP, float32) \
74
* NMI or HardFault regardless of what interrupt we're being asked to
63
+ DO_2OP(gvec_f##FN##0_h, float16_##FN##0, float16) \
75
@@ -XXX,XX +XXX,XX @@ int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure)
64
+ DO_2OP(gvec_f##FN##0_s, float32_##FN##0, float32)
76
}
65
+
77
66
+DO_2OP_CMP0(cgt, cgt, FWD)
78
if (!vec) {
67
+DO_2OP_CMP0(cge, cge, FWD)
79
- if (secure && exc_is_banked(irq)) {
68
+DO_2OP_CMP0(ceq, ceq, FWD)
80
- vec = &s->sec_vectors[irq];
69
+DO_2OP_CMP0(clt, cgt, REV)
81
- } else {
70
+DO_2OP_CMP0(cle, cge, REV)
82
- vec = &s->vectors[irq];
71
+
83
- }
72
#undef DO_2OP
73
+#undef DO_2OP_CMP0
74
75
/* Floating-point trigonometric starting value.
76
* See the ARM ARM pseudocode function FPTrigSMul.
77
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
78
index XXXXXXX..XXXXXXX 100644
79
--- a/target/arm/translate-neon.c.inc
80
+++ b/target/arm/translate-neon.c.inc
81
@@ -XXX,XX +XXX,XX @@ DO_2MISC_FP(VCVT_UF, gen_helper_vfp_touizs)
82
83
DO_2MISC_FP_VEC(VRECPE_F, gen_helper_gvec_frecpe_h, gen_helper_gvec_frecpe_s)
84
DO_2MISC_FP_VEC(VRSQRTE_F, gen_helper_gvec_frsqrte_h, gen_helper_gvec_frsqrte_s)
85
+DO_2MISC_FP_VEC(VCGT0_F, gen_helper_gvec_fcgt0_h, gen_helper_gvec_fcgt0_s)
86
+DO_2MISC_FP_VEC(VCGE0_F, gen_helper_gvec_fcge0_h, gen_helper_gvec_fcge0_s)
87
+DO_2MISC_FP_VEC(VCEQ0_F, gen_helper_gvec_fceq0_h, gen_helper_gvec_fceq0_s)
88
+DO_2MISC_FP_VEC(VCLT0_F, gen_helper_gvec_fclt0_h, gen_helper_gvec_fclt0_s)
89
+DO_2MISC_FP_VEC(VCLE0_F, gen_helper_gvec_fcle0_h, gen_helper_gvec_fcle0_s)
90
91
static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
92
{
93
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
94
return do_2misc_fp(s, a, gen_helper_rints_exact);
95
}
96
97
-#define WRAP_FP_CMP0_FWD(WRAPNAME, FUNC) \
98
- static void WRAPNAME(TCGv_i32 d, TCGv_i32 m, TCGv_ptr fpst) \
99
- { \
100
- TCGv_i32 zero = tcg_const_i32(0); \
101
- FUNC(d, m, zero, fpst); \
102
- tcg_temp_free_i32(zero); \
103
- }
104
-#define WRAP_FP_CMP0_REV(WRAPNAME, FUNC) \
105
- static void WRAPNAME(TCGv_i32 d, TCGv_i32 m, TCGv_ptr fpst) \
106
- { \
107
- TCGv_i32 zero = tcg_const_i32(0); \
108
- FUNC(d, zero, m, fpst); \
109
- tcg_temp_free_i32(zero); \
110
- }
84
- }
111
-
85
-
112
-#define DO_FP_CMP0(INSN, FUNC, REV) \
86
- trace_nvic_complete_irq(irq, secure);
113
- WRAP_FP_CMP0_##REV(gen_##INSN, FUNC) \
87
-
114
- static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
88
- if (!vec->active) {
115
- { \
89
- /* Tell the caller this was an illegal exception return */
116
- return do_2misc_fp(s, a, gen_##INSN); \
90
- return -1;
117
- }
91
- }
118
-
92
-
119
-DO_FP_CMP0(VCGT0_F, gen_helper_neon_cgt_f32, FWD)
93
- /*
120
-DO_FP_CMP0(VCGE0_F, gen_helper_neon_cge_f32, FWD)
94
- * If this is a configurable exception and it is currently
121
-DO_FP_CMP0(VCEQ0_F, gen_helper_neon_ceq_f32, FWD)
95
- * targeting the opposite security state from the one we're trying
122
-DO_FP_CMP0(VCLE0_F, gen_helper_neon_cge_f32, REV)
96
- * to complete it for, this counts as an illegal exception return.
123
-DO_FP_CMP0(VCLT0_F, gen_helper_neon_cgt_f32, REV)
97
- * We still need to deactivate whatever vector the logic above has
124
-
98
- * selected, though, as it might not be the same as the one for the
125
static bool do_vrint(DisasContext *s, arg_2misc *a, int rmode)
99
- * requested exception number.
126
{
100
- */
127
/*
101
- if (!exc_is_banked(irq) && exc_targets_secure(s, irq) != secure) {
102
- ret = -1;
103
- } else {
104
- ret = nvic_rettobase(s);
105
+ return ret;
106
}
107
108
vec->active = 0;
128
--
109
--
129
2.20.1
110
2.20.1
130
111
131
112
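To make the macro machinery in the compare-with-zero conversion above easier to follow, this is roughly what one of the REV cases expands to (my manual expansion, not text from the patch; as I understand it the per-element helpers such as float16_cgt already exist in vec_helper.c):

    /* DO_2OP_CMP0(clt, cgt, REV) -> WRAP_CMP0_REV(clt, cgt, float16): */
    static float16 float16_clt0(float16 op, float_status *stat)
    {
        /* "op < 0" is evaluated as "0 > op", hence the reversed operands */
        return float16_cgt(float16_zero, op, stat);
    }
    /* DO_2OP(gvec_fclt0_h, float16_clt0, float16) then wraps this into the
     * vector helper used by the new VCLT0_F trans function.
     */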
1
Convert the Neon VRECPS insn to using a gvec helper, and
1
For v8.1M the architecture mandates that CPUs provide at
2
use this to implement the fp16 case.
2
least the "minimal RAS implementation" from the Reliability,
3
3
Availability and Serviceability extension. This consists of:
4
The phrasing of the new float32_recps_nf() is slightly different from
4
* an ESB instruction which is a NOP
5
the old recps_f32() so that it parallels the f16 version; for f16 we
5
-- since it is in the HINT space we need only add a comment
6
can't assume that flush-to-zero is always enabled.
6
* an RFSR register which will RAZ/WI
7
* a RAZ/WI AIRCR.IESB bit
8
-- the code which handles writes to AIRCR does not allow setting
9
of RES0 bits, so we already treat this as RAZ/WI; add a comment
10
noting that this is deliberate
11
* minimal implementation of the RAS register block at 0xe0005000
12
-- this will be in a subsequent commit
13
* setting the ID_PFR0.RAS field to 0b0010
14
-- we will do this when we add the Cortex-M55 CPU model
7
15
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
17
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200828183354.27913-34-peter.maydell@linaro.org
18
Message-id: 20201119215617.29887-26-peter.maydell@linaro.org
11
---
19
---
12
target/arm/helper.h | 4 +++-
20
target/arm/cpu.h | 14 ++++++++++++++
13
target/arm/vec_helper.c | 31 +++++++++++++++++++++++++++++++
21
target/arm/t32.decode | 4 ++++
14
target/arm/vfp_helper.c | 13 -------------
22
hw/intc/armv7m_nvic.c | 13 +++++++++++++
15
target/arm/translate-neon.c.inc | 21 +--------------------
23
3 files changed, 31 insertions(+)
16
4 files changed, 35 insertions(+), 34 deletions(-)
17
24
18
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
19
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper.h
27
--- a/target/arm/cpu.h
21
+++ b/target/arm/helper.h
28
+++ b/target/arm/cpu.h
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
29
@@ -XXX,XX +XXX,XX @@ FIELD(ID_MMFR4, LSM, 20, 4)
23
DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
30
FIELD(ID_MMFR4, CCIDX, 24, 4)
24
DEF_HELPER_4(vfp_muladdh, f16, f16, f16, f16, ptr)
31
FIELD(ID_MMFR4, EVT, 28, 4)
25
32
26
-DEF_HELPER_3(recps_f32, f32, env, f32, f32)
33
+FIELD(ID_PFR0, STATE0, 0, 4)
27
DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
34
+FIELD(ID_PFR0, STATE1, 4, 4)
28
DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
35
+FIELD(ID_PFR0, STATE2, 8, 4)
29
DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
36
+FIELD(ID_PFR0, STATE3, 12, 4)
30
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i3
37
+FIELD(ID_PFR0, CSV2, 16, 4)
31
DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
+FIELD(ID_PFR0, AMU, 20, 4)
32
DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
+FIELD(ID_PFR0, DIT, 24, 4)
33
40
+FIELD(ID_PFR0, RAS, 28, 4)
34
+DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
36
+
41
+
37
DEF_HELPER_FLAGS_5(gvec_fmla_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
42
FIELD(ID_PFR1, PROGMOD, 0, 4)
38
DEF_HELPER_FLAGS_5(gvec_fmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
43
FIELD(ID_PFR1, SECURITY, 4, 4)
39
44
FIELD(ID_PFR1, MPROGMOD, 8, 4)
40
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
45
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_predinv(const ARMISARegisters *id)
41
index XXXXXXX..XXXXXXX 100644
46
return FIELD_EX32(id->id_isar6, ID_ISAR6, SPECRES) != 0;
42
--- a/target/arm/vec_helper.c
43
+++ b/target/arm/vec_helper.c
44
@@ -XXX,XX +XXX,XX @@ static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
45
return float32_abs(float32_sub(op1, op2, stat));
46
}
47
}
47
48
48
+/*
49
+static inline bool isar_feature_aa32_ras(const ARMISARegisters *id)
49
+ * Reciprocal step. These are the AArch32 version which uses a
50
+ * non-fused multiply-and-subtract.
51
+ */
52
+static float16 float16_recps_nf(float16 op1, float16 op2, float_status *stat)
53
+{
50
+{
54
+ op1 = float16_squash_input_denormal(op1, stat);
51
+ return FIELD_EX32(id->id_pfr0, ID_PFR0, RAS) != 0;
55
+ op2 = float16_squash_input_denormal(op2, stat);
56
+
57
+ if ((float16_is_infinity(op1) && float16_is_zero(op2)) ||
58
+ (float16_is_infinity(op2) && float16_is_zero(op1))) {
59
+ return float16_two;
60
+ }
61
+ return float16_sub(float16_two, float16_mul(op1, op2, stat), stat);
62
+}
52
+}
63
+
53
+
64
+static float32 float32_recps_nf(float32 op1, float32 op2, float_status *stat)
54
static inline bool isar_feature_aa32_mprofile(const ARMISARegisters *id)
65
+{
55
{
66
+ op1 = float32_squash_input_denormal(op1, stat);
56
return FIELD_EX32(id->id_pfr1, ID_PFR1, MPROGMOD) != 0;
67
+ op2 = float32_squash_input_denormal(op2, stat);
57
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/t32.decode
60
+++ b/target/arm/t32.decode
61
@@ -XXX,XX +XXX,XX @@ CLZ 1111 1010 1011 ---- 1111 .... 1000 .... @rdm
62
# SEV 1111 0011 1010 1111 1000 0000 0000 0100
63
# SEVL 1111 0011 1010 1111 1000 0000 0000 0101
64
65
+ # For M-profile minimal-RAS ESB can be a NOP, which is the
66
+ # default behaviour since it is in the hint space.
67
+ # ESB 1111 0011 1010 1111 1000 0000 0001 0000
68
+
68
+
69
+ if ((float32_is_infinity(op1) && float32_is_zero(op2)) ||
69
# The canonical nop ends in 0000 0000, but the whole rest
70
+ (float32_is_infinity(op2) && float32_is_zero(op1))) {
70
# of the space is "reserved hint, behaves as nop".
71
+ return float32_two;
71
NOP 1111 0011 1010 1111 1000 0000 ---- ----
72
+ }
72
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
73
+ return float32_sub(float32_two, float32_mul(op1, op2, stat), stat);
74
+}
75
+
76
#define DO_3OP(NAME, FUNC, TYPE) \
77
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
78
{ \
79
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_fmaxnum_s, float32_maxnum, float32)
80
DO_3OP(gvec_fminnum_h, float16_minnum, float16)
81
DO_3OP(gvec_fminnum_s, float32_minnum, float32)
82
83
+DO_3OP(gvec_recps_nf_h, float16_recps_nf, float16)
84
+DO_3OP(gvec_recps_nf_s, float32_recps_nf, float32)
85
+
86
#ifdef TARGET_AARCH64
87
88
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
89
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
90
index XXXXXXX..XXXXXXX 100644
73
index XXXXXXX..XXXXXXX 100644
91
--- a/target/arm/vfp_helper.c
74
--- a/hw/intc/armv7m_nvic.c
92
+++ b/target/arm/vfp_helper.c
75
+++ b/hw/intc/armv7m_nvic.c
93
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
76
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
94
return r;
77
return 0;
95
}
78
}
96
79
return cpu->env.v7m.sfar;
97
-float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
80
+ case 0xf04: /* RFSR */
98
-{
81
+ if (!cpu_isar_feature(aa32_ras, cpu)) {
99
- float_status *s = &env->vfp.standard_fp_status;
82
+ goto bad_offset;
100
- if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
83
+ }
101
- (float32_is_infinity(b) && float32_is_zero_or_denormal(a))) {
84
+ /* We provide minimal-RAS only: RFSR is RAZ/WI */
102
- if (!(float32_is_zero(a) || float32_is_zero(b))) {
85
+ return 0;
103
- float_raise(float_flag_input_denormal, s);
86
case 0xf34: /* FPCCR */
104
- }
87
if (!cpu_isar_feature(aa32_vfp_simd, cpu)) {
105
- return float32_two;
88
return 0;
106
- }
89
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
107
- return float32_sub(float32_two, float32_mul(a, b, s), s);
90
R_V7M_AIRCR_PRIGROUP_SHIFT,
108
-}
91
R_V7M_AIRCR_PRIGROUP_LENGTH);
109
-
92
}
110
float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
93
+ /* AIRCR.IESB is RAZ/WI because we implement only minimal RAS */
111
{
94
if (attrs.secure) {
112
float_status *s = &env->vfp.standard_fp_status;
95
/* These bits are only writable by secure */
113
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
96
cpu->env.v7m.aircr = value &
114
index XXXXXXX..XXXXXXX 100644
97
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
115
--- a/target/arm/translate-neon.c.inc
98
}
116
+++ b/target/arm/translate-neon.c.inc
99
break;
117
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMLA, gen_helper_gvec_fmla_s, gen_helper_gvec_fmla_h)
100
}
118
DO_3S_FP_GVEC(VMLS, gen_helper_gvec_fmls_s, gen_helper_gvec_fmls_h)
101
+ case 0xf04: /* RFSR */
119
DO_3S_FP_GVEC(VFMA, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_h)
102
+ if (!cpu_isar_feature(aa32_ras, cpu)) {
120
DO_3S_FP_GVEC(VFMS, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_h)
103
+ goto bad_offset;
121
+DO_3S_FP_GVEC(VRECPS, gen_helper_gvec_recps_nf_s, gen_helper_gvec_recps_nf_h)
104
+ }
122
105
+ /* We provide minimal-RAS only: RFSR is RAZ/WI */
123
WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
106
+ break;
124
WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
107
case 0xf34: /* FPCCR */
125
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
108
if (cpu_isar_feature(aa32_vfp_simd, cpu)) {
126
return do_3same(s, a, gen_VMINNM_fp32_3s);
109
/* Not all bits here are banked. */
127
}
128
129
-WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
130
-
131
-static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
132
- uint32_t rn_ofs, uint32_t rm_ofs,
133
- uint32_t oprsz, uint32_t maxsz)
134
-{
135
- static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
136
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
137
-}
138
-
139
-static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
140
-{
141
- if (a->size != 0) {
142
- /* TODO fp16 support */
143
- return false;
144
- }
145
-
146
- return do_3same(s, a, gen_VRECPS_fp_3s);
147
-}
148
-
149
WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
150
151
static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
152
--
110
--
153
2.20.1
111
2.20.1
154
112
155
113
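As background on the VRECPS conversion above (general context, not part of the patch; the function name recip_refine is mine): the 2 - op1 * op2 value computed by the recps_nf helpers is the Newton-Raphson correction term that guests combine with a VRECPE estimate, roughly:

    /* One refinement step towards 1/d, starting from an estimate x0. */
    static float recip_refine(float d, float x0)
    {
        return x0 * (2.0f - d * x0);
    }
    /* Defining inf * 0 to return exactly 2.0 keeps the iteration producing
     * the correct zero or infinity for those inputs instead of a NaN.
     */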
1
Implement the fp16 version of the VFP VRINT* insns.
1
The RAS feature has a block of memory-mapped registers at offset
2
0x5000 within the PPB. For a "minimal RAS" implementation we provide
3
no error records and so the only registers that exist in the block
4
are ERRIIDR and ERRDEVID.
5
6
The "RAZ/WI for privileged, BusFault for nonprivileged" behaviour
7
of the "nvic-default" region is actually valid for minimal-RAS,
8
so the main benefit of providing an explicit implementation of
9
the register block is more accurate LOG_UNIMP messages, and a
10
framework for where we could add a real RAS implementation later
11
if necessary.
2
12
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20200828183354.27913-19-peter.maydell@linaro.org
15
Message-id: 20201119215617.29887-27-peter.maydell@linaro.org
6
---
16
---
7
target/arm/helper.h | 2 +
17
include/hw/intc/armv7m_nvic.h | 1 +
8
target/arm/vfp-uncond.decode | 6 ++-
18
hw/intc/armv7m_nvic.c | 56 +++++++++++++++++++++++++++++++++++
9
target/arm/vfp.decode | 3 ++
19
2 files changed, 57 insertions(+)
10
target/arm/vfp_helper.c | 21 ++++++++
11
target/arm/translate-vfp.c.inc | 98 +++++++++++++++++++++++++++++++---
12
5 files changed, 122 insertions(+), 8 deletions(-)
13
20
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
21
diff --git a/include/hw/intc/armv7m_nvic.h b/include/hw/intc/armv7m_nvic.h
15
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
23
--- a/include/hw/intc/armv7m_nvic.h
17
+++ b/target/arm/helper.h
24
+++ b/include/hw/intc/armv7m_nvic.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(shr_cc, i32, env, i32, i32)
25
@@ -XXX,XX +XXX,XX @@ struct NVICState {
19
DEF_HELPER_3(sar_cc, i32, env, i32, i32)
26
MemoryRegion sysreg_ns_mem;
20
DEF_HELPER_3(ror_cc, i32, env, i32, i32)
27
MemoryRegion systickmem;
21
28
MemoryRegion systick_ns_mem;
22
+DEF_HELPER_FLAGS_2(rinth_exact, TCG_CALL_NO_RWG, f16, f16, ptr)
29
+ MemoryRegion ras_mem;
23
DEF_HELPER_FLAGS_2(rints_exact, TCG_CALL_NO_RWG, f32, f32, ptr)
30
MemoryRegion container;
24
DEF_HELPER_FLAGS_2(rintd_exact, TCG_CALL_NO_RWG, f64, f64, ptr)
31
MemoryRegion defaultmem;
25
+DEF_HELPER_FLAGS_2(rinth, TCG_CALL_NO_RWG, f16, f16, ptr)
32
26
DEF_HELPER_FLAGS_2(rints, TCG_CALL_NO_RWG, f32, f32, ptr)
33
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
27
DEF_HELPER_FLAGS_2(rintd, TCG_CALL_NO_RWG, f64, f64, ptr)
28
29
diff --git a/target/arm/vfp-uncond.decode b/target/arm/vfp-uncond.decode
30
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/vfp-uncond.decode
35
--- a/hw/intc/armv7m_nvic.c
32
+++ b/target/arm/vfp-uncond.decode
36
+++ b/hw/intc/armv7m_nvic.c
33
@@ -XXX,XX +XXX,XX @@ VMINNM_sp 1111 1110 1.00 .... .... 1010 .1.0 .... @vfp_dnm_s
37
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps nvic_systick_ops = {
34
VMAXNM_dp 1111 1110 1.00 .... .... 1011 .0.0 .... @vfp_dnm_d
38
.endianness = DEVICE_NATIVE_ENDIAN,
35
VMINNM_dp 1111 1110 1.00 .... .... 1011 .1.0 .... @vfp_dnm_d
39
};
36
40
37
+VRINT 1111 1110 1.11 10 rm:2 .... 1001 01.0 .... \
41
+
38
+ vm=%vm_sp vd=%vd_sp sz=1
42
+static MemTxResult ras_read(void *opaque, hwaddr addr,
39
VRINT 1111 1110 1.11 10 rm:2 .... 1010 01.0 .... \
43
+ uint64_t *data, unsigned size,
40
- vm=%vm_sp vd=%vd_sp dp=0
44
+ MemTxAttrs attrs)
41
+ vm=%vm_sp vd=%vd_sp sz=2
42
VRINT 1111 1110 1.11 10 rm:2 .... 1011 01.0 .... \
43
- vm=%vm_dp vd=%vd_dp dp=1
44
+ vm=%vm_dp vd=%vd_dp sz=3
45
46
# VCVT float to int with specified rounding mode; Vd is always single-precision
47
VCVT 1111 1110 1.11 11 rm:2 .... 1001 op:1 1.0 .... \
48
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/vfp.decode
51
+++ b/target/arm/vfp.decode
52
@@ -XXX,XX +XXX,XX @@ VCVT_f16_f32 ---- 1110 1.11 0011 .... 1010 t:1 1.0 .... \
53
VCVT_f16_f64 ---- 1110 1.11 0011 .... 1011 t:1 1.0 .... \
54
vd=%vd_sp vm=%vm_dp
55
56
+VRINTR_hp ---- 1110 1.11 0110 .... 1001 01.0 .... @vfp_dm_ss
57
VRINTR_sp ---- 1110 1.11 0110 .... 1010 01.0 .... @vfp_dm_ss
58
VRINTR_dp ---- 1110 1.11 0110 .... 1011 01.0 .... @vfp_dm_dd
59
60
+VRINTZ_hp ---- 1110 1.11 0110 .... 1001 11.0 .... @vfp_dm_ss
61
VRINTZ_sp ---- 1110 1.11 0110 .... 1010 11.0 .... @vfp_dm_ss
62
VRINTZ_dp ---- 1110 1.11 0110 .... 1011 11.0 .... @vfp_dm_dd
63
64
+VRINTX_hp ---- 1110 1.11 0111 .... 1001 01.0 .... @vfp_dm_ss
65
VRINTX_sp ---- 1110 1.11 0111 .... 1010 01.0 .... @vfp_dm_ss
66
VRINTX_dp ---- 1110 1.11 0111 .... 1011 01.0 .... @vfp_dm_dd
67
68
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
69
index XXXXXXX..XXXXXXX 100644
70
--- a/target/arm/vfp_helper.c
71
+++ b/target/arm/vfp_helper.c
72
@@ -XXX,XX +XXX,XX @@ float64 VFP_HELPER(muladd, d)(float64 a, float64 b, float64 c, void *fpstp)
73
}
74
75
/* ARMv8 round to integral */
76
+dh_ctype_f16 HELPER(rinth_exact)(dh_ctype_f16 x, void *fp_status)
77
+{
45
+{
78
+ return float16_round_to_int(x, fp_status);
46
+ if (attrs.user) {
47
+ return MEMTX_ERROR;
48
+ }
49
+
50
+ switch (addr) {
51
+ case 0xe10: /* ERRIIDR */
52
+ /* architect field = Arm; product/variant/revision 0 */
53
+ *data = 0x43b;
54
+ break;
55
+ case 0xfc8: /* ERRDEVID */
56
+ /* Minimal RAS: we implement 0 error record indexes */
57
+ *data = 0;
58
+ break;
59
+ default:
60
+ qemu_log_mask(LOG_UNIMP, "Read RAS register offset 0x%x\n",
61
+ (uint32_t)addr);
62
+ *data = 0;
63
+ break;
64
+ }
65
+ return MEMTX_OK;
79
+}
66
+}
80
+
67
+
81
float32 HELPER(rints_exact)(float32 x, void *fp_status)
68
+static MemTxResult ras_write(void *opaque, hwaddr addr,
82
{
69
+ uint64_t value, unsigned size,
83
return float32_round_to_int(x, fp_status);
70
+ MemTxAttrs attrs)
84
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rintd_exact)(float64 x, void *fp_status)
85
return float64_round_to_int(x, fp_status);
86
}
87
88
+dh_ctype_f16 HELPER(rinth)(dh_ctype_f16 x, void *fp_status)
89
+{
71
+{
90
+ int old_flags = get_float_exception_flags(fp_status), new_flags;
72
+ if (attrs.user) {
91
+ float16 ret;
73
+ return MEMTX_ERROR;
92
+
93
+ ret = float16_round_to_int(x, fp_status);
94
+
95
+ /* Suppress any inexact exceptions the conversion produced */
96
+ if (!(old_flags & float_flag_inexact)) {
97
+ new_flags = get_float_exception_flags(fp_status);
98
+ set_float_exception_flags(new_flags & ~float_flag_inexact, fp_status);
99
+ }
74
+ }
100
+
75
+
101
+ return ret;
76
+ switch (addr) {
77
+ default:
78
+ qemu_log_mask(LOG_UNIMP, "Write to RAS register offset 0x%x\n",
79
+ (uint32_t)addr);
80
+ break;
81
+ }
82
+ return MEMTX_OK;
102
+}
83
+}
103
+
84
+
104
float32 HELPER(rints)(float32 x, void *fp_status)
85
+static const MemoryRegionOps ras_ops = {
105
{
86
+ .read_with_attrs = ras_read,
106
int old_flags = get_float_exception_flags(fp_status), new_flags;
87
+ .write_with_attrs = ras_write,
107
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
88
+ .endianness = DEVICE_NATIVE_ENDIAN,
108
index XXXXXXX..XXXXXXX 100644
89
+};
109
--- a/target/arm/translate-vfp.c.inc
90
+
110
+++ b/target/arm/translate-vfp.c.inc
91
/*
111
@@ -XXX,XX +XXX,XX @@ static const uint8_t fp_decode_rm[] = {
92
* Unassigned portions of the PPB space are RAZ/WI for privileged
112
static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
93
* accesses, and fault for non-privileged accesses.
113
{
94
@@ -XXX,XX +XXX,XX @@ static void armv7m_nvic_realize(DeviceState *dev, Error **errp)
114
uint32_t rd, rm;
95
&s->systick_ns_mem, 1);
115
- bool dp = a->dp;
116
+ int sz = a->sz;
117
TCGv_ptr fpst;
118
TCGv_i32 tcg_rmode;
119
int rounding = fp_decode_rm[a->rm];
120
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
121
return false;
122
}
96
}
123
97
124
- if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
98
+ if (cpu_isar_feature(aa32_ras, s->cpu)) {
125
+ if (sz == 3 && !dc_isar_feature(aa32_fpdp_v2, s)) {
99
+ memory_region_init_io(&s->ras_mem, OBJECT(s),
126
+ return false;
100
+ &ras_ops, s, "nvic_ras", 0x1000);
101
+ memory_region_add_subregion(&s->container, 0x5000, &s->ras_mem);
127
+ }
102
+ }
128
+
103
+
129
+ if (sz == 1 && !dc_isar_feature(aa32_fp16_arith, s)) {
104
sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container);
130
return false;
131
}
132
133
/* UNDEF accesses to D16-D31 if they don't exist */
134
- if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
135
+ if (sz == 3 && !dc_isar_feature(aa32_simd_r32, s) &&
136
((a->vm | a->vd) & 0x10)) {
137
return false;
138
}
139
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
140
return true;
141
}
142
143
- fpst = fpstatus_ptr(FPST_FPCR);
144
+ if (sz == 1) {
145
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
146
+ } else {
147
+ fpst = fpstatus_ptr(FPST_FPCR);
148
+ }
149
150
tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
151
gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
152
153
- if (dp) {
154
+ if (sz == 3) {
155
TCGv_i64 tcg_op;
156
TCGv_i64 tcg_res;
157
tcg_op = tcg_temp_new_i64();
158
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
159
tcg_op = tcg_temp_new_i32();
160
tcg_res = tcg_temp_new_i32();
161
neon_load_reg32(tcg_op, rm);
162
- gen_helper_rints(tcg_res, tcg_op, fpst);
163
+ if (sz == 1) {
164
+ gen_helper_rinth(tcg_res, tcg_op, fpst);
165
+ } else {
166
+ gen_helper_rints(tcg_res, tcg_op, fpst);
167
+ }
168
neon_store_reg32(tcg_res, rd);
169
tcg_temp_free_i32(tcg_op);
170
tcg_temp_free_i32(tcg_res);
171
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
172
return true;
173
}
105
}
174
106
175
+static bool trans_VRINTR_hp(DisasContext *s, arg_VRINTR_sp *a)
176
+{
177
+ TCGv_ptr fpst;
178
+ TCGv_i32 tmp;
179
+
180
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
181
+ return false;
182
+ }
183
+
184
+ if (!vfp_access_check(s)) {
185
+ return true;
186
+ }
187
+
188
+ tmp = tcg_temp_new_i32();
189
+ neon_load_reg32(tmp, a->vm);
190
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
191
+ gen_helper_rinth(tmp, tmp, fpst);
192
+ neon_store_reg32(tmp, a->vd);
193
+ tcg_temp_free_ptr(fpst);
194
+ tcg_temp_free_i32(tmp);
195
+ return true;
196
+}
197
+
198
static bool trans_VRINTR_sp(DisasContext *s, arg_VRINTR_sp *a)
199
{
200
TCGv_ptr fpst;
201
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
202
return true;
203
}
204
205
+static bool trans_VRINTZ_hp(DisasContext *s, arg_VRINTZ_sp *a)
206
+{
207
+ TCGv_ptr fpst;
208
+ TCGv_i32 tmp;
209
+ TCGv_i32 tcg_rmode;
210
+
211
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
212
+ return false;
213
+ }
214
+
215
+ if (!vfp_access_check(s)) {
216
+ return true;
217
+ }
218
+
219
+ tmp = tcg_temp_new_i32();
220
+ neon_load_reg32(tmp, a->vm);
221
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
222
+ tcg_rmode = tcg_const_i32(float_round_to_zero);
223
+ gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
224
+ gen_helper_rinth(tmp, tmp, fpst);
225
+ gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
226
+ neon_store_reg32(tmp, a->vd);
227
+ tcg_temp_free_ptr(fpst);
228
+ tcg_temp_free_i32(tcg_rmode);
229
+ tcg_temp_free_i32(tmp);
230
+ return true;
231
+}
232
+
233
static bool trans_VRINTZ_sp(DisasContext *s, arg_VRINTZ_sp *a)
234
{
235
TCGv_ptr fpst;
236
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
237
return true;
238
}
239
240
+static bool trans_VRINTX_hp(DisasContext *s, arg_VRINTX_sp *a)
241
+{
242
+ TCGv_ptr fpst;
243
+ TCGv_i32 tmp;
244
+
245
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
246
+ return false;
247
+ }
248
+
249
+ if (!vfp_access_check(s)) {
250
+ return true;
251
+ }
252
+
253
+ tmp = tcg_temp_new_i32();
254
+ neon_load_reg32(tmp, a->vm);
255
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
256
+ gen_helper_rinth_exact(tmp, tmp, fpst);
257
+ neon_store_reg32(tmp, a->vd);
258
+ tcg_temp_free_ptr(fpst);
259
+ tcg_temp_free_i32(tmp);
260
+ return true;
261
+}
262
+
263
static bool trans_VRINTX_sp(DisasContext *s, arg_VRINTX_sp *a)
264
{
265
TCGv_ptr fpst;
266
--
107
--
267
2.20.1
108
2.20.1
268
109
269
110
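One note on the rounding-mode handling in the VRINT patch above, since the save/restore idiom appears several times: as far as I know the set_rmode helper installs the new mode and returns the previous one, which is what makes the pattern work. Annotated copy of the lines from trans_VRINTZ_hp():

    tcg_rmode = tcg_const_i32(float_round_to_zero);
    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); /* install RZ; tcg_rmode now holds the old mode */
    gen_helper_rinth(tmp, tmp, fpst);                 /* round to integral using RZ */
    gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); /* restore the previous mode */
    tcg_temp_free_i32(tcg_rmode);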
1
Implement VFP fp16 support for fused multiply-add insns
1
Correct a typo in the name we give the NVIC object.
2
VFNMA, VFNMS, VFMA, VFMS.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200828183354.27913-7-peter.maydell@linaro.org
6
Message-id: 20201119215617.29887-28-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper.h | 1 +
8
hw/arm/armv7m.c | 2 +-
9
target/arm/vfp.decode | 5 +++
9
1 file changed, 1 insertion(+), 1 deletion(-)
10
target/arm/vfp_helper.c | 7 ++++
11
target/arm/translate-vfp.c.inc | 64 ++++++++++++++++++++++++++++++++++
12
4 files changed, 77 insertions(+)
13
10
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
11
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
15
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
13
--- a/hw/arm/armv7m.c
17
+++ b/target/arm/helper.h
14
+++ b/hw/arm/armv7m.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
15
@@ -XXX,XX +XXX,XX @@ static void armv7m_instance_init(Object *obj)
19
16
20
DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
17
memory_region_init(&s->container, obj, "armv7m-container", UINT64_MAX);
21
DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
18
22
+DEF_HELPER_4(vfp_muladdh, f16, f16, f16, f16, ptr)
19
- object_initialize_child(obj, "nvnic", &s->nvic, TYPE_NVIC);
23
20
+ object_initialize_child(obj, "nvic", &s->nvic, TYPE_NVIC);
24
DEF_HELPER_3(recps_f32, f32, env, f32, f32)
21
object_property_add_alias(obj, "num-irq",
25
DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
22
OBJECT(&s->nvic), "num-irq");
26
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/vfp.decode
29
+++ b/target/arm/vfp.decode
30
@@ -XXX,XX +XXX,XX @@ VDIV_hp ---- 1110 1.00 .... .... 1001 .0.0 .... @vfp_dnm_s
31
VDIV_sp ---- 1110 1.00 .... .... 1010 .0.0 .... @vfp_dnm_s
32
VDIV_dp ---- 1110 1.00 .... .... 1011 .0.0 .... @vfp_dnm_d
33
34
+VFMA_hp ---- 1110 1.10 .... .... 1001 .0. 0 .... @vfp_dnm_s
35
+VFMS_hp ---- 1110 1.10 .... .... 1001 .1. 0 .... @vfp_dnm_s
36
+VFNMA_hp ---- 1110 1.01 .... .... 1001 .0. 0 .... @vfp_dnm_s
37
+VFNMS_hp ---- 1110 1.01 .... .... 1001 .1. 0 .... @vfp_dnm_s
38
+
39
VFMA_sp ---- 1110 1.10 .... .... 1010 .0. 0 .... @vfp_dnm_s
40
VFMS_sp ---- 1110 1.10 .... .... 1010 .1. 0 .... @vfp_dnm_s
41
VFNMA_sp ---- 1110 1.01 .... .... 1010 .0. 0 .... @vfp_dnm_s
42
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/vfp_helper.c
45
+++ b/target/arm/vfp_helper.c
46
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(rsqrte_u32)(uint32_t a)
47
}
48
49
/* VFPv4 fused multiply-accumulate */
50
+dh_ctype_f16 VFP_HELPER(muladd, h)(dh_ctype_f16 a, dh_ctype_f16 b,
51
+ dh_ctype_f16 c, void *fpstp)
52
+{
53
+ float_status *fpst = fpstp;
54
+ return float16_muladd(a, b, c, 0, fpst);
55
+}
56
+
57
float32 VFP_HELPER(muladd, s)(float32 a, float32 b, float32 c, void *fpstp)
58
{
59
float_status *fpst = fpstp;
60
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/translate-vfp.c.inc
63
+++ b/target/arm/translate-vfp.c.inc
64
@@ -XXX,XX +XXX,XX @@ static bool trans_VMAXNM_dp(DisasContext *s, arg_VMAXNM_dp *a)
65
a->vd, a->vn, a->vm, false);
66
}
67
68
+static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
69
+{
70
+ /*
71
+ * VFNMA : fd = muladd(-fd, fn, fm)
72
+ * VFNMS : fd = muladd(-fd, -fn, fm)
73
+ * VFMA : fd = muladd( fd, fn, fm)
74
+ * VFMS : fd = muladd( fd, -fn, fm)
75
+ *
76
+ * These are fused multiply-add, and must be done as one floating
77
+ * point operation with no rounding between the multiplication and
78
+ * addition steps. NB that doing the negations here as separate
79
+ * steps is correct : an input NaN should come out with its sign
80
+ * bit flipped if it is a negated-input.
81
+ */
82
+ TCGv_ptr fpst;
83
+ TCGv_i32 vn, vm, vd;
84
+
85
+ /*
86
+ * Present in VFPv4 only, and only with the FP16 extension.
87
+ * Note that we can't rely on the SIMDFMAC check alone, because
88
+ * in a Neon-no-VFP core that ID register field will be non-zero.
89
+ */
90
+ if (!dc_isar_feature(aa32_fp16_arith, s) ||
91
+ !dc_isar_feature(aa32_simdfmac, s) ||
92
+ !dc_isar_feature(aa32_fpsp_v2, s)) {
93
+ return false;
94
+ }
95
+
96
+ if (s->vec_len != 0 || s->vec_stride != 0) {
97
+ return false;
98
+ }
99
+
100
+ if (!vfp_access_check(s)) {
101
+ return true;
102
+ }
103
+
104
+ vn = tcg_temp_new_i32();
105
+ vm = tcg_temp_new_i32();
106
+ vd = tcg_temp_new_i32();
107
+
108
+ neon_load_reg32(vn, a->vn);
109
+ neon_load_reg32(vm, a->vm);
110
+ if (neg_n) {
111
+ /* VFNMS, VFMS */
112
+ gen_helper_vfp_negh(vn, vn);
113
+ }
114
+ neon_load_reg32(vd, a->vd);
115
+ if (neg_d) {
116
+ /* VFNMA, VFNMS */
117
+ gen_helper_vfp_negh(vd, vd);
118
+ }
119
+ fpst = fpstatus_ptr(FPST_FPCR_F16);
120
+ gen_helper_vfp_muladdh(vd, vn, vm, vd, fpst);
121
+ neon_store_reg32(vd, a->vd);
122
+
123
+ tcg_temp_free_ptr(fpst);
124
+ tcg_temp_free_i32(vn);
125
+ tcg_temp_free_i32(vm);
126
+ tcg_temp_free_i32(vd);
127
+
128
+ return true;
129
+}
130
+
131
static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
132
{
133
/*
134
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
135
MAKE_ONE_VFM_TRANS_FN(VFNMA, PREC, false, true) \
136
MAKE_ONE_VFM_TRANS_FN(VFNMS, PREC, true, true)
137
138
+MAKE_VFM_TRANS_FNS(hp)
139
MAKE_VFM_TRANS_FNS(sp)
140
MAKE_VFM_TRANS_FNS(dp)
141
23
142
--
24
--
143
2.20.1
25
2.20.1
144
26
145
27
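Restating the four fused multiply-add forms from the patch above as plain equations (the same information as the comment in do_vfm_hp(), just in a different notation):

    /* With a single-rounding muladd(a, b, c) = a*b + c:
     *   VFMA:  fd = muladd( fn, fm,  fd) =  fd + fn*fm
     *   VFMS:  fd = muladd(-fn, fm,  fd) =  fd - fn*fm
     *   VFNMA: fd = muladd( fn, fm, -fd) = -fd + fn*fm
     *   VFNMS: fd = muladd(-fn, fm, -fd) = -fd - fn*fm
     */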
Deleted patch
1
Implement the fp16 versions of the VFP VLDR/VSTR (immediate).
2
1
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20200828183354.27913-12-peter.maydell@linaro.org
6
---
7
target/arm/vfp.decode | 3 +--
8
target/arm/translate-vfp.c.inc | 35 ++++++++++++++++++++++++++++++++++
9
2 files changed, 36 insertions(+), 2 deletions(-)
10
11
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/vfp.decode
14
+++ b/target/arm/vfp.decode
15
@@ -XXX,XX +XXX,XX @@ VMOV_single ---- 1110 000 l:1 .... rt:4 1010 . 001 0000 vn=%vn_sp
16
VMOV_64_sp ---- 1100 010 op:1 rt2:4 rt:4 1010 00.1 .... vm=%vm_sp
17
VMOV_64_dp ---- 1100 010 op:1 rt2:4 rt:4 1011 00.1 .... vm=%vm_dp
18
19
-# Note that the half-precision variants of VLDR and VSTR are
20
-# not part of this decodetree at all because they have bits [9:8] == 0b01
21
+VLDR_VSTR_hp ---- 1101 u:1 .0 l:1 rn:4 .... 1001 imm:8 vd=%vd_sp
22
VLDR_VSTR_sp ---- 1101 u:1 .0 l:1 rn:4 .... 1010 imm:8 vd=%vd_sp
23
VLDR_VSTR_dp ---- 1101 u:1 .0 l:1 rn:4 .... 1011 imm:8 vd=%vd_dp
24
25
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/translate-vfp.c.inc
28
+++ b/target/arm/translate-vfp.c.inc
29
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_dp *a)
30
return true;
31
}
32
33
+static bool trans_VLDR_VSTR_hp(DisasContext *s, arg_VLDR_VSTR_sp *a)
34
+{
35
+ uint32_t offset;
36
+ TCGv_i32 addr, tmp;
37
+
38
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
39
+ return false;
40
+ }
41
+
42
+ if (!vfp_access_check(s)) {
43
+ return true;
44
+ }
45
+
46
+ /* imm8 field is offset/2 for fp16, unlike fp32 and fp64 */
47
+ offset = a->imm << 1;
48
+ if (!a->u) {
49
+ offset = -offset;
50
+ }
51
+
52
+ /* For thumb, use of PC is UNPREDICTABLE. */
53
+ addr = add_reg_for_lit(s, a->rn, offset);
54
+ tmp = tcg_temp_new_i32();
55
+ if (a->l) {
56
+ gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
57
+ neon_store_reg32(tmp, a->vd);
58
+ } else {
59
+ neon_load_reg32(tmp, a->vd);
60
+ gen_aa32_st16(s, tmp, addr, get_mem_index(s));
61
+ }
62
+ tcg_temp_free_i32(tmp);
63
+ tcg_temp_free_i32(addr);
64
+
65
+ return true;
66
+}
67
+
68
static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
69
{
70
uint32_t offset;
71
--
72
2.20.1
73
74
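On the "imm8 field is offset/2 for fp16" comment in the (now dropped) VLDR/VSTR patch above: the offset field is scaled by the access size. The fp32/fp64 scaling below is my recollection of the architecture, not something shown in this patch:

    /* fp16:      offset = imm8 * 2,  e.g. imm8 = 3 -> base +/- 6 bytes   */
    /* fp32/fp64: offset = imm8 * 4,  e.g. imm8 = 3 -> base +/- 12 bytes  */
    /* hence the "offset = a->imm << 1" in trans_VLDR_VSTR_hp() above.    */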
Deleted patch
1
Currently the VFP_CONV_FIX macros take a single fsz argument for the
2
size of the float type, which is used both to select the name of
3
the functions to call (eg float32_is_any_nan()) and also for the
4
type to use for the float inputs and outputs (eg float32).
5
1
6
Separate these into fsz and ftype arguments, so that we can use them
7
for fp16, which uses 'float16' in the function names but is still
8
passing inputs and outputs in a 32-bit sized type.
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20200828183354.27913-14-peter.maydell@linaro.org
13
---
14
target/arm/vfp_helper.c | 46 ++++++++++++++++++++---------------------
15
1 file changed, 23 insertions(+), 23 deletions(-)
16
17
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/vfp_helper.c
20
+++ b/target/arm/vfp_helper.c
21
@@ -XXX,XX +XXX,XX @@ float32 VFP_HELPER(fcvts, d)(float64 x, CPUARMState *env)
22
}
23
24
/* VFP3 fixed point conversion. */
25
-#define VFP_CONV_FIX_FLOAT(name, p, fsz, isz, itype) \
26
-float##fsz HELPER(vfp_##name##to##p)(uint##isz##_t x, uint32_t shift, \
27
+#define VFP_CONV_FIX_FLOAT(name, p, fsz, ftype, isz, itype) \
28
+ftype HELPER(vfp_##name##to##p)(uint##isz##_t x, uint32_t shift, \
29
void *fpstp) \
30
{ return itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); }
31
32
-#define VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, ROUND, suff) \
33
-uint##isz##_t HELPER(vfp_to##name##p##suff)(float##fsz x, uint32_t shift, \
34
+#define VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, ROUND, suff) \
35
+uint##isz##_t HELPER(vfp_to##name##p##suff)(ftype x, uint32_t shift, \
36
void *fpst) \
37
{ \
38
if (unlikely(float##fsz##_is_any_nan(x))) { \
39
@@ -XXX,XX +XXX,XX @@ uint##isz##_t HELPER(vfp_to##name##p##suff)(float##fsz x, uint32_t shift, \
40
return float##fsz##_to_##itype##_scalbn(x, ROUND, shift, fpst); \
41
}
42
43
-#define VFP_CONV_FIX(name, p, fsz, isz, itype) \
44
-VFP_CONV_FIX_FLOAT(name, p, fsz, isz, itype) \
45
-VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, \
46
+#define VFP_CONV_FIX(name, p, fsz, ftype, isz, itype) \
47
+VFP_CONV_FIX_FLOAT(name, p, fsz, ftype, isz, itype) \
48
+VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, \
49
float_round_to_zero, _round_to_zero) \
50
-VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, \
51
+VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, \
52
get_float_rounding_mode(fpst), )
53
54
-#define VFP_CONV_FIX_A64(name, p, fsz, isz, itype) \
55
-VFP_CONV_FIX_FLOAT(name, p, fsz, isz, itype) \
56
-VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, \
57
+#define VFP_CONV_FIX_A64(name, p, fsz, ftype, isz, itype) \
58
+VFP_CONV_FIX_FLOAT(name, p, fsz, ftype, isz, itype) \
59
+VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, \
60
get_float_rounding_mode(fpst), )
61
62
-VFP_CONV_FIX(sh, d, 64, 64, int16)
63
-VFP_CONV_FIX(sl, d, 64, 64, int32)
64
-VFP_CONV_FIX_A64(sq, d, 64, 64, int64)
65
-VFP_CONV_FIX(uh, d, 64, 64, uint16)
66
-VFP_CONV_FIX(ul, d, 64, 64, uint32)
67
-VFP_CONV_FIX_A64(uq, d, 64, 64, uint64)
68
-VFP_CONV_FIX(sh, s, 32, 32, int16)
69
-VFP_CONV_FIX(sl, s, 32, 32, int32)
70
-VFP_CONV_FIX_A64(sq, s, 32, 64, int64)
71
-VFP_CONV_FIX(uh, s, 32, 32, uint16)
72
-VFP_CONV_FIX(ul, s, 32, 32, uint32)
73
-VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
74
+VFP_CONV_FIX(sh, d, 64, float64, 64, int16)
75
+VFP_CONV_FIX(sl, d, 64, float64, 64, int32)
76
+VFP_CONV_FIX_A64(sq, d, 64, float64, 64, int64)
77
+VFP_CONV_FIX(uh, d, 64, float64, 64, uint16)
78
+VFP_CONV_FIX(ul, d, 64, float64, 64, uint32)
79
+VFP_CONV_FIX_A64(uq, d, 64, float64, 64, uint64)
80
+VFP_CONV_FIX(sh, s, 32, float32, 32, int16)
81
+VFP_CONV_FIX(sl, s, 32, float32, 32, int32)
82
+VFP_CONV_FIX_A64(sq, s, 32, float32, 64, int64)
83
+VFP_CONV_FIX(uh, s, 32, float32, 32, uint16)
84
+VFP_CONV_FIX(ul, s, 32, float32, 32, uint32)
85
+VFP_CONV_FIX_A64(uq, s, 32, float32, 64, uint64)
86
87
#undef VFP_CONV_FIX
88
#undef VFP_CONV_FIX_FLOAT
89
--
2.20.1

Deleted patch

Now that the VFP_CONV_FIX macros can handle fp16's distinction between
the width of the operation and the width of the type used to pass
operands, use the macros rather than the open-coded functions.

This creates an extra six helper functions, all of which we are going
to need for the AArch32 VFP fp16 instructions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-15-peter.maydell@linaro.org
---
target/arm/helper.h | 6 +++
target/arm/vfp_helper.c | 86 +++--------------------------------------
2 files changed, 12 insertions(+), 80 deletions(-)

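A minimal sketch of the calling convention this depends on (standalone C, not the actual dh_ctype_f16 plumbing; the names are invented for the example): the operation is 16 bits wide, but the operand is carried in a 32-bit argument of which only the low half is meaningful.

    #include <stdint.h>
    #include <stdio.h>

    /* An fp16 operand carried in a 32-bit slot: only the low 16 bits are
     * the IEEE binary16 encoding, the upper bits are ignored. */
    typedef uint32_t f16_in_u32;

    static uint16_t f16_bits(f16_in_u32 v)
    {
        return (uint16_t)v;      /* the operation itself is 16 bits wide */
    }

    int main(void)
    {
        f16_in_u32 one = 0x3c00; /* 1.0 in binary16, padded to 32 bits */
        printf("0x%04x\n", (unsigned)f16_bits(one));
        return 0;
    }
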
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.h
19
+++ b/target/arm/helper.h
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(vfp_tosizh, s32, f16, ptr)
21
DEF_HELPER_2(vfp_tosizs, s32, f32, ptr)
22
DEF_HELPER_2(vfp_tosizd, s32, f64, ptr)
23
24
+DEF_HELPER_3(vfp_toshh_round_to_zero, i32, f16, i32, ptr)
25
+DEF_HELPER_3(vfp_toslh_round_to_zero, i32, f16, i32, ptr)
26
+DEF_HELPER_3(vfp_touhh_round_to_zero, i32, f16, i32, ptr)
27
+DEF_HELPER_3(vfp_toulh_round_to_zero, i32, f16, i32, ptr)
28
DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr)
29
DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr)
30
DEF_HELPER_3(vfp_touhs_round_to_zero, i32, f32, i32, ptr)
31
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_sqtod, f64, i64, i32, ptr)
32
DEF_HELPER_3(vfp_uhtod, f64, i64, i32, ptr)
33
DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
34
DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
35
+DEF_HELPER_3(vfp_shtoh, f16, i32, i32, ptr)
36
+DEF_HELPER_3(vfp_uhtoh, f16, i32, i32, ptr)
37
DEF_HELPER_3(vfp_sltoh, f16, i32, i32, ptr)
38
DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
39
DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
40
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/vfp_helper.c
43
+++ b/target/arm/vfp_helper.c
44
@@ -XXX,XX +XXX,XX @@ VFP_CONV_FIX_A64(sq, s, 32, float32, 64, int64)
45
VFP_CONV_FIX(uh, s, 32, float32, 32, uint16)
46
VFP_CONV_FIX(ul, s, 32, float32, 32, uint32)
47
VFP_CONV_FIX_A64(uq, s, 32, float32, 64, uint64)
48
+VFP_CONV_FIX(sh, h, 16, dh_ctype_f16, 32, int16)
49
+VFP_CONV_FIX(sl, h, 16, dh_ctype_f16, 32, int32)
50
+VFP_CONV_FIX_A64(sq, h, 16, dh_ctype_f16, 64, int64)
51
+VFP_CONV_FIX(uh, h, 16, dh_ctype_f16, 32, uint16)
52
+VFP_CONV_FIX(ul, h, 16, dh_ctype_f16, 32, uint32)
53
+VFP_CONV_FIX_A64(uq, h, 16, dh_ctype_f16, 64, uint64)
54
55
#undef VFP_CONV_FIX
56
#undef VFP_CONV_FIX_FLOAT
57
#undef VFP_CONV_FLOAT_FIX_ROUND
58
#undef VFP_CONV_FIX_A64
59
60
-uint32_t HELPER(vfp_sltoh)(uint32_t x, uint32_t shift, void *fpst)
61
-{
62
- return int32_to_float16_scalbn(x, -shift, fpst);
63
-}
64
-
65
-uint32_t HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
66
-{
67
- return uint32_to_float16_scalbn(x, -shift, fpst);
68
-}
69
-
70
-uint32_t HELPER(vfp_sqtoh)(uint64_t x, uint32_t shift, void *fpst)
71
-{
72
- return int64_to_float16_scalbn(x, -shift, fpst);
73
-}
74
-
75
-uint32_t HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
76
-{
77
- return uint64_to_float16_scalbn(x, -shift, fpst);
78
-}
79
-
80
-uint32_t HELPER(vfp_toshh)(uint32_t x, uint32_t shift, void *fpst)
81
-{
82
- if (unlikely(float16_is_any_nan(x))) {
83
- float_raise(float_flag_invalid, fpst);
84
- return 0;
85
- }
86
- return float16_to_int16_scalbn(x, get_float_rounding_mode(fpst),
87
- shift, fpst);
88
-}
89
-
90
-uint32_t HELPER(vfp_touhh)(uint32_t x, uint32_t shift, void *fpst)
91
-{
92
- if (unlikely(float16_is_any_nan(x))) {
93
- float_raise(float_flag_invalid, fpst);
94
- return 0;
95
- }
96
- return float16_to_uint16_scalbn(x, get_float_rounding_mode(fpst),
97
- shift, fpst);
98
-}
99
-
100
-uint32_t HELPER(vfp_toslh)(uint32_t x, uint32_t shift, void *fpst)
101
-{
102
- if (unlikely(float16_is_any_nan(x))) {
103
- float_raise(float_flag_invalid, fpst);
104
- return 0;
105
- }
106
- return float16_to_int32_scalbn(x, get_float_rounding_mode(fpst),
107
- shift, fpst);
108
-}
109
-
110
-uint32_t HELPER(vfp_toulh)(uint32_t x, uint32_t shift, void *fpst)
111
-{
112
- if (unlikely(float16_is_any_nan(x))) {
113
- float_raise(float_flag_invalid, fpst);
114
- return 0;
115
- }
116
- return float16_to_uint32_scalbn(x, get_float_rounding_mode(fpst),
117
- shift, fpst);
118
-}
119
-
120
-uint64_t HELPER(vfp_tosqh)(uint32_t x, uint32_t shift, void *fpst)
121
-{
122
- if (unlikely(float16_is_any_nan(x))) {
123
- float_raise(float_flag_invalid, fpst);
124
- return 0;
125
- }
126
- return float16_to_int64_scalbn(x, get_float_rounding_mode(fpst),
127
- shift, fpst);
128
-}
129
-
130
-uint64_t HELPER(vfp_touqh)(uint32_t x, uint32_t shift, void *fpst)
131
-{
132
- if (unlikely(float16_is_any_nan(x))) {
133
- float_raise(float_flag_invalid, fpst);
134
- return 0;
135
- }
136
- return float16_to_uint64_scalbn(x, get_float_rounding_mode(fpst),
137
- shift, fpst);
138
-}
139
-
140
/* Set the current fp rounding mode and return the old one.
141
* The argument is a softfloat float_round_ value.
142
*/
143
--
2.20.1

Deleted patch

Implement FP16 support for the Neon insns which use the DO_3S_FP_GVEC
macro: VADD, VSUB, VABD, VMUL.

For VABD this requires us to implement a new gvec_fabd_h helper
using the machinery we have already for the other helpers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-24-peter.maydell@linaro.org
---
target/arm/helper.h | 1 +
target/arm/vec_helper.c | 6 ++++++
target/arm/translate-neon.c.inc | 36 +++++++++++++++++----------------
3 files changed, 26 insertions(+), 17 deletions(-)

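The new gvec_fabd_h helper has the same shape as the existing float32 one: absolute difference is a subtraction followed by dropping the sign. A minimal standalone sketch of that idea, using plain C float rather than QEMU's softfloat float16 (so rounding and NaN details are only approximate):

    #include <math.h>
    #include <stdio.h>

    /* Absolute difference: |a - b|, i.e. subtract, then take the abs. */
    static float fabd(float a, float b)
    {
        return fabsf(a - b);
    }

    int main(void)
    {
        printf("%f\n", fabd(1.5f, 4.0f));   /* prints 2.500000 */
        return 0;
    }
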
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.h
19
+++ b/target/arm/helper.h
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
21
DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
24
+DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
27
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
28
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/vec_helper.c
31
+++ b/target/arm/vec_helper.c
32
@@ -XXX,XX +XXX,XX @@ static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
33
return result;
34
}
35
36
+static float16 float16_abd(float16 op1, float16 op2, float_status *stat)
37
+{
38
+ return float16_abs(float16_sub(op1, op2, stat));
39
+}
40
+
41
static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
42
{
43
return float32_abs(float32_sub(op1, op2, stat));
44
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
45
DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
46
DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
47
48
+DO_3OP(gvec_fabd_h, float16_abd, float16)
49
DO_3OP(gvec_fabd_s, float32_abd, float32)
50
51
#ifdef TARGET_AARCH64
52
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/translate-neon.c.inc
55
+++ b/target/arm/translate-neon.c.inc
56
@@ -XXX,XX +XXX,XX @@ static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
57
return true;
58
}
59
60
-/*
61
- * For all the functions using this macro, size == 1 means fp16,
62
- * which is an architecture extension we don't implement yet.
63
- */
64
-#define DO_3S_FP_GVEC(INSN,FUNC) \
65
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
66
- uint32_t rn_ofs, uint32_t rm_ofs, \
67
- uint32_t oprsz, uint32_t maxsz) \
68
+#define WRAP_FP_GVEC(WRAPNAME, FPST, FUNC) \
69
+ static void WRAPNAME(unsigned vece, uint32_t rd_ofs, \
70
+ uint32_t rn_ofs, uint32_t rm_ofs, \
71
+ uint32_t oprsz, uint32_t maxsz) \
72
{ \
73
- TCGv_ptr fpst = fpstatus_ptr(FPST_STD); \
74
+ TCGv_ptr fpst = fpstatus_ptr(FPST); \
75
tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpst, \
76
oprsz, maxsz, 0, FUNC); \
77
tcg_temp_free_ptr(fpst); \
78
- } \
79
+ }
80
+
81
+#define DO_3S_FP_GVEC(INSN,SFUNC,HFUNC) \
82
+ WRAP_FP_GVEC(gen_##INSN##_fp32_3s, FPST_STD, SFUNC) \
83
+ WRAP_FP_GVEC(gen_##INSN##_fp16_3s, FPST_STD_F16, HFUNC) \
84
static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
85
{ \
86
if (a->size != 0) { \
87
- /* TODO fp16 support */ \
88
- return false; \
89
+ if (!dc_isar_feature(aa32_fp16_arith, s)) { \
90
+ return false; \
91
+ } \
92
+ return do_3same(s, a, gen_##INSN##_fp16_3s); \
93
} \
94
- return do_3same(s, a, gen_##INSN##_3s); \
95
+ return do_3same(s, a, gen_##INSN##_fp32_3s); \
96
}
97
98
99
-DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
100
-DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
101
-DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
102
-DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
103
+DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s, gen_helper_gvec_fadd_h)
104
+DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s, gen_helper_gvec_fsub_h)
105
+DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s, gen_helper_gvec_fabd_h)
106
+DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s, gen_helper_gvec_fmul_h)
107
108
/*
109
* For all the functions using this macro, size == 1 means fp16,
110
--
2.20.1

Deleted patch

We already have gvec helpers for floating point VRECPE and
VRSQRTE, so convert the Neon decoder to use them and
add the fp16 support.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-25-peter.maydell@linaro.org
---
target/arm/translate-neon.c.inc | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

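The DO_2MISC_FP_VEC expander added below picks a helper from a small table indexed by the element size, with NULL entries for sizes these insns do not support. A rough standalone model of that dispatch (the enum values and helper names here are made up for the sketch):

    #include <stddef.h>
    #include <stdio.h>

    enum { MO_8, MO_16, MO_32, MO_64 };   /* element-size codes */

    typedef void (*helper_fn)(void);

    static void helper_fp16(void) { puts("fp16 helper"); }
    static void helper_fp32(void) { puts("fp32 helper"); }

    /* Only 16-bit and 32-bit elements are valid here. */
    static const helper_fn fns[4] = { NULL, helper_fp16, helper_fp32, NULL };

    int main(void)
    {
        int vece = MO_16;                 /* decoded element size */
        if (fns[vece] == NULL) {
            puts("unsupported element size");
            return 1;
        }
        fns[vece]();                      /* prints "fp16 helper" */
        return 0;
    }
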
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-neon.c.inc
15
+++ b/target/arm/translate-neon.c.inc
16
@@ -XXX,XX +XXX,XX @@ static bool do_2misc_fp(DisasContext *s, arg_2misc *a,
17
return do_2misc_fp(s, a, FUNC); \
18
}
19
20
-DO_2MISC_FP(VRECPE_F, gen_helper_recpe_f32)
21
-DO_2MISC_FP(VRSQRTE_F, gen_helper_rsqrte_f32)
22
DO_2MISC_FP(VCVT_FS, gen_helper_vfp_sitos)
23
DO_2MISC_FP(VCVT_FU, gen_helper_vfp_uitos)
24
DO_2MISC_FP(VCVT_SF, gen_helper_vfp_tosizs)
25
DO_2MISC_FP(VCVT_UF, gen_helper_vfp_touizs)
26
27
+#define DO_2MISC_FP_VEC(INSN, HFUNC, SFUNC) \
28
+ static void gen_##INSN(unsigned vece, uint32_t rd_ofs, \
29
+ uint32_t rm_ofs, \
30
+ uint32_t oprsz, uint32_t maxsz) \
31
+ { \
32
+ static gen_helper_gvec_2_ptr * const fns[4] = { \
33
+ NULL, HFUNC, SFUNC, NULL, \
34
+ }; \
35
+ TCGv_ptr fpst; \
36
+ fpst = fpstatus_ptr(vece == MO_16 ? FPST_STD_F16 : FPST_STD); \
37
+ tcg_gen_gvec_2_ptr(rd_ofs, rm_ofs, fpst, oprsz, maxsz, 0, \
38
+ fns[vece]); \
39
+ tcg_temp_free_ptr(fpst); \
40
+ } \
41
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
42
+ { \
43
+ if (a->size == MO_16) { \
44
+ if (!dc_isar_feature(aa32_fp16_arith, s)) { \
45
+ return false; \
46
+ } \
47
+ } else if (a->size != MO_32) { \
48
+ return false; \
49
+ } \
50
+ return do_2misc_vec(s, a, gen_##INSN); \
51
+ }
52
+
53
+DO_2MISC_FP_VEC(VRECPE_F, gen_helper_gvec_frecpe_h, gen_helper_gvec_frecpe_s)
54
+DO_2MISC_FP_VEC(VRSQRTE_F, gen_helper_gvec_frsqrte_h, gen_helper_gvec_frsqrte_s)
55
+
56
static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
57
{
58
if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
59
--
2.20.1

Deleted patch

Rewrite Neon VABS/VNEG of floats to use gvec logical AND and XOR, so
that we can implement the fp16 version of the insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-26-peter.maydell@linaro.org
---
target/arm/translate-neon.c.inc | 34 +++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)

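Doing float abs/neg as pure bit operations is what lets one gvec AND/XOR path cover both widths: only the mask changes (0x7fffffff/0x80000000 for fp32, 0x7fff/0x8000 for fp16). A standalone sketch with raw binary16 bit patterns (not QEMU code):

    #include <stdint.h>
    #include <stdio.h>

    /* IEEE binary16 values handled as raw bit patterns. */
    static uint16_t f16_abs(uint16_t v) { return v & 0x7fff; } /* clear sign */
    static uint16_t f16_neg(uint16_t v) { return v ^ 0x8000; } /* flip sign  */

    int main(void)
    {
        uint16_t minus_two = 0xc000;      /* -2.0 in binary16 */
        printf("abs: 0x%04x  neg: 0x%04x\n",
               (unsigned)f16_abs(minus_two), (unsigned)f16_neg(minus_two));
        return 0;                         /* both print 0x4000, i.e. +2.0 */
    }
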
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-neon.c.inc
14
+++ b/target/arm/translate-neon.c.inc
15
@@ -XXX,XX +XXX,XX @@ static bool trans_VCNT(DisasContext *s, arg_2misc *a)
16
return do_2misc(s, a, gen_helper_neon_cnt_u8);
17
}
18
19
+static void gen_VABS_F(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
20
+ uint32_t oprsz, uint32_t maxsz)
21
+{
22
+ tcg_gen_gvec_andi(vece, rd_ofs, rm_ofs,
23
+ vece == MO_16 ? 0x7fff : 0x7fffffff,
24
+ oprsz, maxsz);
25
+}
26
+
27
static bool trans_VABS_F(DisasContext *s, arg_2misc *a)
28
{
29
- if (a->size != 2) {
30
+ if (a->size == MO_16) {
31
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
32
+ return false;
33
+ }
34
+ } else if (a->size != MO_32) {
35
return false;
36
}
37
- /* TODO: FP16 : size == 1 */
38
- return do_2misc(s, a, gen_helper_vfp_abss);
39
+ return do_2misc_vec(s, a, gen_VABS_F);
40
+}
41
+
42
+static void gen_VNEG_F(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
43
+ uint32_t oprsz, uint32_t maxsz)
44
+{
45
+ tcg_gen_gvec_xori(vece, rd_ofs, rm_ofs,
46
+ vece == MO_16 ? 0x8000 : 0x80000000,
47
+ oprsz, maxsz);
48
}
49
50
static bool trans_VNEG_F(DisasContext *s, arg_2misc *a)
51
{
52
- if (a->size != 2) {
53
+ if (a->size == MO_16) {
54
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
55
+ return false;
56
+ }
57
+ } else if (a->size != MO_32) {
58
return false;
59
}
60
- /* TODO: FP16 : size == 1 */
61
- return do_2misc(s, a, gen_helper_vfp_negs);
62
+ return do_2misc_vec(s, a, gen_VNEG_F);
63
}
64
65
static bool trans_VRECPE(DisasContext *s, arg_2misc *a)
66
--
2.20.1

Deleted patch

Convert the Neon floating-point vector comparison ops VCEQ,
VCGE and VCGT over to using a gvec helper and use this to
implement the fp16 case.

(We put the float16_ceq() etc functions above the DO_2OP()
macro definition because later when we convert the
compare-against-zero instructions we'll want their
definitions to be visible at that point in the source file.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-27-peter.maydell@linaro.org
---
target/arm/helper.h | 9 +++++++
target/arm/vec_helper.c | 44 +++++++++++++++++++++++++++++++++
target/arm/translate-neon.c.inc | 6 ++---
3 files changed, 56 insertions(+), 3 deletions(-)

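The helpers below turn softfloat's 0-or-1 comparison result into the 0/all-ones lane mask Neon expects by negating it in an unsigned type; GE and GT are then just the le/lt comparisons with the operands swapped. A tiny standalone demonstration of the mask trick (uint16_t standing in for an fp16 lane, not QEMU code):

    #include <stdint.h>
    #include <stdio.h>

    /* Turn a 0-or-1 comparison result into a 0x0000-or-0xffff lane mask. */
    static uint16_t to_mask16(int cmp_result)
    {
        return -(uint16_t)cmp_result;     /* 1 -> 0xffff, 0 -> 0x0000 */
    }

    int main(void)
    {
        printf("true  -> 0x%04x\n", (unsigned)to_mask16(1));
        printf("false -> 0x%04x\n", (unsigned)to_mask16(0));
        return 0;
    }
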
diff --git a/target/arm/helper.h b/target/arm/helper.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.h
22
+++ b/target/arm/helper.h
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
24
DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
27
+DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_5(gvec_fcge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
+
33
+DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
+
36
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
37
void, ptr, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
39
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/vec_helper.c
42
+++ b/target/arm/vec_helper.c
43
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
44
clear_tail(d, opr_sz, simd_maxsz(desc));
45
}
46
47
+/*
48
+ * Floating point comparisons producing an integer result (all 1s or all 0s).
49
+ * Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
50
+ * Softfloat routines return 0/1, which we convert to the 0/-1 Neon requires.
51
+ */
52
+static uint16_t float16_ceq(float16 op1, float16 op2, float_status *stat)
53
+{
54
+ return -float16_eq_quiet(op1, op2, stat);
55
+}
56
+
57
+static uint32_t float32_ceq(float32 op1, float32 op2, float_status *stat)
58
+{
59
+ return -float32_eq_quiet(op1, op2, stat);
60
+}
61
+
62
+static uint16_t float16_cge(float16 op1, float16 op2, float_status *stat)
63
+{
64
+ return -float16_le(op2, op1, stat);
65
+}
66
+
67
+static uint32_t float32_cge(float32 op1, float32 op2, float_status *stat)
68
+{
69
+ return -float32_le(op2, op1, stat);
70
+}
71
+
72
+static uint16_t float16_cgt(float16 op1, float16 op2, float_status *stat)
73
+{
74
+ return -float16_lt(op2, op1, stat);
75
+}
76
+
77
+static uint32_t float32_cgt(float32 op1, float32 op2, float_status *stat)
78
+{
79
+ return -float32_lt(op2, op1, stat);
80
+}
81
+
82
#define DO_2OP(NAME, FUNC, TYPE) \
83
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
84
{ \
85
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
86
DO_3OP(gvec_fabd_h, float16_abd, float16)
87
DO_3OP(gvec_fabd_s, float32_abd, float32)
88
89
+DO_3OP(gvec_fceq_h, float16_ceq, float16)
90
+DO_3OP(gvec_fceq_s, float32_ceq, float32)
91
+
92
+DO_3OP(gvec_fcge_h, float16_cge, float16)
93
+DO_3OP(gvec_fcge_s, float32_cge, float32)
94
+
95
+DO_3OP(gvec_fcgt_h, float16_cgt, float16)
96
+DO_3OP(gvec_fcgt_s, float32_cgt, float32)
97
+
98
#ifdef TARGET_AARCH64
99
100
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
101
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
102
index XXXXXXX..XXXXXXX 100644
103
--- a/target/arm/translate-neon.c.inc
104
+++ b/target/arm/translate-neon.c.inc
105
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s, gen_helper_gvec_fadd_h)
106
DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s, gen_helper_gvec_fsub_h)
107
DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s, gen_helper_gvec_fabd_h)
108
DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s, gen_helper_gvec_fmul_h)
109
+DO_3S_FP_GVEC(VCEQ, gen_helper_gvec_fceq_s, gen_helper_gvec_fceq_h)
110
+DO_3S_FP_GVEC(VCGE, gen_helper_gvec_fcge_s, gen_helper_gvec_fcge_h)
111
+DO_3S_FP_GVEC(VCGT, gen_helper_gvec_fcgt_s, gen_helper_gvec_fcgt_h)
112
113
/*
114
* For all the functions using this macro, size == 1 means fp16,
115
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s, gen_helper_gvec_fmul_h)
116
return do_3same_fp(s, a, FUNC, READS_VD); \
117
}
118
119
-DO_3S_FP(VCEQ, gen_helper_neon_ceq_f32, false)
120
-DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
121
-DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
122
DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
123
DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
124
DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
125
--
2.20.1

Deleted patch

Convert the Neon floating-point VMAX and VMIN insns over to using
a gvec helper, and use this to implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-29-peter.maydell@linaro.org
---
target/arm/helper.h | 6 ++++++
target/arm/vec_helper.c | 6 ++++++
target/arm/translate-neon.c.inc | 5 ++---
3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
16
+++ b/target/arm/helper.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
20
21
+DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
+
24
+DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+
27
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
28
void, ptr, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
30
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/vec_helper.c
33
+++ b/target/arm/vec_helper.c
34
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_facge_s, float32_acge, float32)
35
DO_3OP(gvec_facgt_h, float16_acgt, float16)
36
DO_3OP(gvec_facgt_s, float32_acgt, float32)
37
38
+DO_3OP(gvec_fmax_h, float16_max, float16)
39
+DO_3OP(gvec_fmax_s, float32_max, float32)
40
+
41
+DO_3OP(gvec_fmin_h, float16_min, float16)
42
+DO_3OP(gvec_fmin_s, float32_min, float32)
43
+
44
#ifdef TARGET_AARCH64
45
46
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
47
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/translate-neon.c.inc
50
+++ b/target/arm/translate-neon.c.inc
51
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VCGE, gen_helper_gvec_fcge_s, gen_helper_gvec_fcge_h)
52
DO_3S_FP_GVEC(VCGT, gen_helper_gvec_fcgt_s, gen_helper_gvec_fcgt_h)
53
DO_3S_FP_GVEC(VACGE, gen_helper_gvec_facge_s, gen_helper_gvec_facge_h)
54
DO_3S_FP_GVEC(VACGT, gen_helper_gvec_facgt_s, gen_helper_gvec_facgt_h)
55
+DO_3S_FP_GVEC(VMAX, gen_helper_gvec_fmax_s, gen_helper_gvec_fmax_h)
56
+DO_3S_FP_GVEC(VMIN, gen_helper_gvec_fmin_s, gen_helper_gvec_fmin_h)
57
58
/*
59
* For all the functions using this macro, size == 1 means fp16,
60
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VACGT, gen_helper_gvec_facgt_s, gen_helper_gvec_facgt_h)
61
return do_3same_fp(s, a, FUNC, READS_VD); \
62
}
63
64
-DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
65
-DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
66
-
67
static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
68
TCGv_ptr fpstatus)
69
{
70
--
2.20.1

Deleted patch

Convert the Neon floating-point VMAXNM and VMINNM insns over to
using a gvec helper and use this to implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-30-peter.maydell@linaro.org
---
target/arm/helper.h | 6 ++++++
target/arm/vec_helper.c | 6 ++++++
target/arm/translate-neon.c.inc | 23 +++++++++++++++--------
3 files changed, 27 insertions(+), 8 deletions(-)

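The NM variants differ from plain VMAX/VMIN in their quiet-NaN handling: with the maxNum/minNum rules a single quiet NaN operand is ignored rather than producing a NaN result. C's fmax() follows the same rule, so a quick standalone illustration (double here, not fp16, and nothing to do with the patch's actual helpers):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double x = 2.5;
        double qnan = nan("");

        /* maxNum-style: the quiet NaN operand is ignored. */
        printf("fmax(NaN, 2.5)      = %g\n", fmax(qnan, x));       /* 2.5 */

        /* A naive (a > b ? a : b) is order-sensitive once a NaN appears. */
        printf("naive max(2.5, NaN) = %g\n", x > qnan ? x : qnan); /* nan */
        return 0;
    }
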
diff --git a/target/arm/helper.h b/target/arm/helper.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
16
+++ b/target/arm/helper.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
20
21
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
+
24
+DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+
27
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
28
void, ptr, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
30
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/vec_helper.c
33
+++ b/target/arm/vec_helper.c
34
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_fmax_s, float32_max, float32)
35
DO_3OP(gvec_fmin_h, float16_min, float16)
36
DO_3OP(gvec_fmin_s, float32_min, float32)
37
38
+DO_3OP(gvec_fmaxnum_h, float16_maxnum, float16)
39
+DO_3OP(gvec_fmaxnum_s, float32_maxnum, float32)
40
+
41
+DO_3OP(gvec_fminnum_h, float16_minnum, float16)
42
+DO_3OP(gvec_fminnum_s, float32_minnum, float32)
43
+
44
#ifdef TARGET_AARCH64
45
46
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
47
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/translate-neon.c.inc
50
+++ b/target/arm/translate-neon.c.inc
51
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
52
DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
53
DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
54
55
+WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
56
+WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
57
+WRAP_FP_GVEC(gen_VMINNM_fp32_3s, FPST_STD, gen_helper_gvec_fminnum_s)
58
+WRAP_FP_GVEC(gen_VMINNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fminnum_h)
59
+
60
static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
61
{
62
if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
63
@@ -XXX,XX +XXX,XX @@ static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
64
}
65
66
if (a->size != 0) {
67
- /* TODO fp16 support */
68
- return false;
69
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
70
+ return false;
71
+ }
72
+ return do_3same(s, a, gen_VMAXNM_fp16_3s);
73
}
74
-
75
- return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
76
+ return do_3same(s, a, gen_VMAXNM_fp32_3s);
77
}
78
79
static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
80
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
81
}
82
83
if (a->size != 0) {
84
- /* TODO fp16 support */
85
- return false;
86
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
87
+ return false;
88
+ }
89
+ return do_3same(s, a, gen_VMINNM_fp16_3s);
90
}
91
-
92
- return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
93
+ return do_3same(s, a, gen_VMINNM_fp32_3s);
94
}
95
96
WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
97
--
2.20.1

Deleted patch

Convert the Neon floating-point VMLA and VMLS insns over to using a
gvec helper, and use this to implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-31-peter.maydell@linaro.org
---
target/arm/helper.h | 6 +++++
target/arm/vec_helper.c | 42 +++++++++++++++++++++++++++++++++
target/arm/translate-neon.c.inc | 33 ++------------------------
3 files changed, 50 insertions(+), 31 deletions(-)

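The gvec_fmla_h/gvec_fmla_s helpers added here deliberately use a non-fused multiply-add (a rounded multiply, then a rounded add), which is what VMLA has always meant, rather than the single-rounding fused float16_muladd. A standalone double-precision sketch of why the two can differ (values picked so the intermediate rounding is visible; build with fp-contraction disabled so the compiler does not quietly fuse the first expression):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* x*x is exactly 1 + 2^-26 + 2^-54, which is not representable as a
         * double, so the standalone product rounds down to 1 + 2^-26. */
        double x = 1.0 + ldexp(1.0, -27);
        double dest = -(1.0 + ldexp(1.0, -26));

        double nonfused = dest + x * x;   /* two roundings: product, then sum */
        double fused = fma(x, x, dest);   /* one rounding at the very end */

        printf("non-fused: %g\n", nonfused);   /* 0 */
        printf("fused:     %g\n", fused);      /* 5.55112e-17 (= 2^-54) */
        return 0;
    }
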
diff --git a/target/arm/helper.h b/target/arm/helper.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper.h
16
+++ b/target/arm/helper.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i3
18
DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
20
21
+DEF_HELPER_FLAGS_5(gvec_fmla_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+DEF_HELPER_FLAGS_5(gvec_fmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
+
24
+DEF_HELPER_FLAGS_5(gvec_fmls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+
27
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
28
void, ptr, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
30
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/vec_helper.c
33
+++ b/target/arm/vec_helper.c
34
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)
35
#endif
36
#undef DO_3OP
37
38
+/* Non-fused multiply-add (unlike float16_muladd etc, which are fused) */
39
+static float16 float16_muladd_nf(float16 dest, float16 op1, float16 op2,
40
+ float_status *stat)
41
+{
42
+ return float16_add(dest, float16_mul(op1, op2, stat), stat);
43
+}
44
+
45
+static float32 float32_muladd_nf(float32 dest, float32 op1, float32 op2,
46
+ float_status *stat)
47
+{
48
+ return float32_add(dest, float32_mul(op1, op2, stat), stat);
49
+}
50
+
51
+static float16 float16_mulsub_nf(float16 dest, float16 op1, float16 op2,
52
+ float_status *stat)
53
+{
54
+ return float16_sub(dest, float16_mul(op1, op2, stat), stat);
55
+}
56
+
57
+static float32 float32_mulsub_nf(float32 dest, float32 op1, float32 op2,
58
+ float_status *stat)
59
+{
60
+ return float32_sub(dest, float32_mul(op1, op2, stat), stat);
61
+}
62
+
63
+#define DO_MULADD(NAME, FUNC, TYPE) \
64
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
65
+{ \
66
+ intptr_t i, oprsz = simd_oprsz(desc); \
67
+ TYPE *d = vd, *n = vn, *m = vm; \
68
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
69
+ d[i] = FUNC(d[i], n[i], m[i], stat); \
70
+ } \
71
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
72
+}
73
+
74
+DO_MULADD(gvec_fmla_h, float16_muladd_nf, float16)
75
+DO_MULADD(gvec_fmla_s, float32_muladd_nf, float32)
76
+
77
+DO_MULADD(gvec_fmls_h, float16_mulsub_nf, float16)
78
+DO_MULADD(gvec_fmls_s, float32_mulsub_nf, float32)
79
+
80
/* For the indexed ops, SVE applies the index per 128-bit vector segment.
81
* For AdvSIMD, there is of course only one such vector segment.
82
*/
83
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
84
index XXXXXXX..XXXXXXX 100644
85
--- a/target/arm/translate-neon.c.inc
86
+++ b/target/arm/translate-neon.c.inc
87
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VACGE, gen_helper_gvec_facge_s, gen_helper_gvec_facge_h)
88
DO_3S_FP_GVEC(VACGT, gen_helper_gvec_facgt_s, gen_helper_gvec_facgt_h)
89
DO_3S_FP_GVEC(VMAX, gen_helper_gvec_fmax_s, gen_helper_gvec_fmax_h)
90
DO_3S_FP_GVEC(VMIN, gen_helper_gvec_fmin_s, gen_helper_gvec_fmin_h)
91
-
92
-/*
93
- * For all the functions using this macro, size == 1 means fp16,
94
- * which is an architecture extension we don't implement yet.
95
- */
96
-#define DO_3S_FP(INSN,FUNC,READS_VD) \
97
- static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
98
- { \
99
- if (a->size != 0) { \
100
- /* TODO fp16 support */ \
101
- return false; \
102
- } \
103
- return do_3same_fp(s, a, FUNC, READS_VD); \
104
- }
105
-
106
-static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
107
- TCGv_ptr fpstatus)
108
-{
109
- gen_helper_vfp_muls(vn, vn, vm, fpstatus);
110
- gen_helper_vfp_adds(vd, vd, vn, fpstatus);
111
-}
112
-
113
-static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
114
- TCGv_ptr fpstatus)
115
-{
116
- gen_helper_vfp_muls(vn, vn, vm, fpstatus);
117
- gen_helper_vfp_subs(vd, vd, vn, fpstatus);
118
-}
119
-
120
-DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
121
-DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
122
+DO_3S_FP_GVEC(VMLA, gen_helper_gvec_fmla_s, gen_helper_gvec_fmla_h)
123
+DO_3S_FP_GVEC(VMLS, gen_helper_gvec_fmls_s, gen_helper_gvec_fmls_h)
124
125
WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
126
WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
127
--
2.20.1