1
As promised, another pullreq... This one's mostly RTH's patches.
1
Mostly this is patches from me and RTH cleaning up and doing
2
more decodetree conversion for AArch32 Neon. The major new feature
3
is Dongjiu Geng's patchset to report host memory errors to KVM guests;
4
also a new aspeed board from Patrick Williams.
2
5
3
thanks
6
thanks
4
-- PMM
7
-- PMM
5
8
6
The following changes since commit 784c2e4f232adf5ef47a84a262ec72a07d068d6a:
9
The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:
7
10
8
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2018-10-19 15:30:40 +0100)
11
Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)
9
12
10
are available in the Git repository at:
13
are available in the Git repository at:
11
14
12
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20181019
15
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514
13
16
14
for you to fetch changes up to 88c9add25e7120e8622796c81ad3f3fb7f8d40e7:
17
for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:
15
18
16
target/arm: Only flush tlb if ASID changes (2018-10-19 17:38:48 +0100)
19
target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)
17
20
18
----------------------------------------------------------------
21
----------------------------------------------------------------
19
target-arm queue:
22
target-arm queue:
20
* ssi-sd: Make devices picking up backends unavailable with -device
23
* target/arm: Use correct GDB XML for M-profile cores
21
* Add support for VCPU event states
24
* target/arm: Code cleanup to use gvec APIs better
22
* Move towards making ID registers the source of truth for
25
* aspeed: Add support for the sonorapass-bmc board
23
whether a guest CPU implements a feature, rather than having
26
* target/arm: Support reporting KVM host memory errors
24
parallel ID registers and feature bit flags
27
to the guest via ACPI notifications
25
* Implement various HCR hypervisor trap/config bits
28
* target/arm: Finish conversion of Neon 3-reg-same insns to decodetree
26
* Get IL bit correct for v7 syndrome values
27
* Report correct syndrome for FP/SIMD traps to Hyp mode
28
* hw/arm/boot: Increase compliance with kernel arm64 boot protocol
29
* Refactor A32 Neon to use generic vector infrastructure
30
* Fix a bug in A32 VLD2 "(multiple 2-element structures)" insn
31
* net: cadence_gem: Report features correctly in ID register
32
* Avoid some unnecessary TLB flushes on TTBR register writes
33
29
34
----------------------------------------------------------------
30
----------------------------------------------------------------
35
Dongjiu Geng (1):
31
Dongjiu Geng (10):
36
target/arm: Add support for VCPU event states
32
acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
33
hw/arm/virt: Introduce a RAS machine option
34
docs: APEI GHES generation and CPER record description
35
ACPI: Build related register address fields via hardware error fw_cfg blob
36
ACPI: Build Hardware Error Source Table
37
ACPI: Record the Generic Error Status Block address
38
KVM: Move hwpoison page related functions into kvm-all.c
39
ACPI: Record Generic Error Status Block(GESB) table
40
target-arm: kvm64: handle SIGBUS signal from kernel or KVM
41
MAINTAINERS: Add ACPI/HEST/GHES entries
37
42
38
Edgar E. Iglesias (2):
43
Patrick Williams (1):
39
net: cadence_gem: Announce availability of priority queues
44
aspeed: Add support for the sonorapass-bmc board
40
net: cadence_gem: Announce 64bit addressing support
41
45
42
Markus Armbruster (1):
46
Peter Maydell (18):
43
ssi-sd: Make devices picking up backends unavailable with -device
47
target/arm: Use correct GDB XML for M-profile cores
48
target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
49
target/arm: Convert Neon 3-reg-same SHA to decodetree
50
target/arm: Convert Neon 64-bit element 3-reg-same insns
51
target/arm: Convert Neon VHADD 3-reg-same insns
52
target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
53
target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
54
target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
55
target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
56
target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
57
target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
58
target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
59
target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
60
target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
61
target/arm: Convert Neon 3-reg-same compare insns to decodetree
62
target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
63
target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
64
target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
44
65
45
Peter Maydell (10):
66
Richard Henderson (16):
46
target/arm: Improve debug logging of AArch32 exception return
67
target/arm: Create gen_gvec_[us]sra
47
target/arm: Make switch_mode() file-local
68
target/arm: Create gen_gvec_{u,s}{rshr,rsra}
48
target/arm: Implement HCR.FB
69
target/arm: Create gen_gvec_{sri,sli}
49
target/arm: Implement HCR.DC
70
target/arm: Remove unnecessary range check for VSHL
50
target/arm: ISR_EL1 bits track virtual interrupts if IMO/FMO set
71
target/arm: Tidy handle_vec_simd_shri
51
target/arm: Implement HCR.VI and VF
72
target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
52
target/arm: Implement HCR.PTW
73
target/arm: Create gen_gvec_{mla,mls}
53
target/arm: New utility function to extract EC from syndrome
74
target/arm: Swap argument order for VSHL during decode
54
target/arm: Get IL bit correct for v7 syndrome values
75
target/arm: Create gen_gvec_{cmtst,ushl,sshl}
55
target/arm: Report correct syndrome for FP/SIMD traps to Hyp mode
76
target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
77
target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
78
target/arm: Create gen_gvec_{qrdmla,qrdmls}
79
target/arm: Pass pointer to qc to qrdmla/qrdmls
80
target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
81
target/arm: Vectorize SABD/UABD
82
target/arm: Vectorize SABA/UABA
56
83
57
Richard Henderson (30):
84
docs/specs/acpi_hest_ghes.rst | 110 ++
58
target/arm: Move some system registers into a substructure
85
docs/specs/index.rst | 1 +
59
target/arm: V8M should not imply V7VE
86
configure | 4 +-
60
target/arm: Convert v8 extensions from feature bits to isar tests
87
default-configs/arm-softmmu.mak | 1 +
61
target/arm: Convert division from feature bits to isar0 tests
88
include/hw/acpi/aml-build.h | 1 +
62
target/arm: Convert jazelle from feature bit to isar1 test
89
include/hw/acpi/generic_event_device.h | 2 +
63
target/arm: Convert t32ee from feature bit to isar3 test
90
include/hw/acpi/ghes.h | 74 +
64
target/arm: Convert sve from feature bit to aa64pfr0 test
91
include/hw/arm/virt.h | 1 +
65
target/arm: Convert v8.2-fp16 from feature bit to aa64pfr0 test
92
include/qemu/uuid.h | 27 +
66
target/arm: Hoist address increment for vector memory ops
93
include/sysemu/kvm.h | 3 +-
67
target/arm: Don't call tcg_clear_temp_count
94
include/sysemu/kvm_int.h | 12 +
68
target/arm: Use tcg_gen_gvec_dup_i64 for LD[1-4]R
95
target/arm/cpu.h | 4 +
69
target/arm: Promote consecutive memory ops for aa64
96
target/arm/helper.h | 78 +-
70
target/arm: Mark some arrays const
97
target/arm/internals.h | 5 +-
71
target/arm: Use gvec for NEON VDUP
98
target/arm/translate.h | 84 +-
72
target/arm: Use gvec for NEON VMOV, VMVN, VBIC & VORR (immediate)
99
target/i386/cpu.h | 2 +
73
target/arm: Use gvec for NEON_3R_LOGIC insns
100
target/arm/neon-dp.decode | 119 +-
74
target/arm: Use gvec for NEON_3R_VADD_VSUB insns
101
accel/kvm/kvm-all.c | 36 +
75
target/arm: Use gvec for NEON_2RM_VMN, NEON_2RM_VNEG
102
hw/acpi/aml-build.c | 2 +
76
target/arm: Use gvec for NEON_3R_VMUL
103
hw/acpi/generic_event_device.c | 19 +
77
target/arm: Use gvec for VSHR, VSHL
104
hw/acpi/ghes.c | 448 ++++++
78
target/arm: Use gvec for VSRA
105
hw/acpi/nvdimm.c | 10 +-
79
target/arm: Use gvec for VSRI, VSLI
106
hw/arm/aspeed.c | 78 ++
80
target/arm: Use gvec for NEON_3R_VML
107
hw/arm/virt-acpi-build.c | 15 +
81
target/arm: Use gvec for NEON_3R_VTST_VCEQ, NEON_3R_VCGT, NEON_3R_VCGE
108
hw/arm/virt.c | 23 +
82
target/arm: Use gvec for NEON VLD all lanes
109
target/arm/cpu_tcg.c | 1 +
83
target/arm: Reorg NEON VLD/VST all elements
110
target/arm/gdbstub.c | 22 +-
84
target/arm: Promote consecutive memory ops for aa32
111
target/arm/helper.c | 2 +-
85
target/arm: Reorg NEON VLD/VST single element to one lane
112
target/arm/kvm64.c | 77 ++
86
target/arm: Remove writefn from TTBR0_EL3
113
target/arm/neon_helper.c | 17 -
87
target/arm: Only flush tlb if ASID changes
114
target/arm/tlb_helper.c | 2 +-
115
target/arm/translate-a64.c | 210 +--
116
target/arm/translate-neon.inc.c | 682 +++++++++-
117
target/arm/translate.c | 2349 +++++++++++++++++---------------
118
target/arm/vec_helper.c | 240 +++-
119
target/arm/vfp_helper.c | 9 +-
120
target/i386/kvm.c | 36 -
121
MAINTAINERS | 9 +
122
gdb-xml/arm-m-profile.xml | 27 +
123
hw/acpi/Kconfig | 4 +
124
hw/acpi/Makefile.objs | 1 +
125
41 files changed, 3402 insertions(+), 1445 deletions(-)
126
create mode 100644 docs/specs/acpi_hest_ghes.rst
127
create mode 100644 include/hw/acpi/ghes.h
128
create mode 100644 hw/acpi/ghes.c
129
create mode 100644 gdb-xml/arm-m-profile.xml
88
130
89
Stewart Hildebrand (1):
90
hw/arm/boot: Increase compliance with kernel arm64 boot protocol
91
92
target/arm/cpu.h | 227 ++++++-
93
target/arm/internals.h | 45 +-
94
target/arm/kvm_arm.h | 24 +
95
target/arm/translate.h | 21 +
96
hw/arm/boot.c | 18 +
97
hw/intc/armv7m_nvic.c | 12 +-
98
hw/net/cadence_gem.c | 9 +-
99
hw/sd/ssi-sd.c | 2 +
100
linux-user/aarch64/signal.c | 4 +-
101
linux-user/elfload.c | 60 +-
102
linux-user/syscall.c | 10 +-
103
target/arm/cpu.c | 242 ++++----
104
target/arm/cpu64.c | 148 +++--
105
target/arm/helper.c | 397 ++++++++----
106
target/arm/kvm.c | 60 ++
107
target/arm/kvm32.c | 13 +
108
target/arm/kvm64.c | 15 +-
109
target/arm/machine.c | 28 +-
110
target/arm/op_helper.c | 2 +-
111
target/arm/translate-a64.c | 715 ++++-----------------
112
target/arm/translate.c | 1451 ++++++++++++++++++++++++++++---------------
113
21 files changed, 2021 insertions(+), 1482 deletions(-)
114
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
GDB's remote protocol requires M-profile cores to use the feature
2
name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
3
feature used for A- and R-profile cores. We weren't doing this, which
4
meant GDB treated our M-profile cores like A-profile ones. This mostly
5
doesn't matter, but for instance means that it doesn't correctly
6
handle backtraces where an M-profile exception frame is involved.
2
7
3
Instantiating mps2-an505 (cortex-m33) will fail make check when
8
Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
4
V7VE asserts that ID_ISAR0.Divide includes ARM division. It is
9
cores. The integer registers have the same offsets as the
5
also wrong to include ARM_FEATURE_LPAE.
10
arm-core.xml, but register 25 is the M-profile XPSR rather than the
11
A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
12
arm_cpu_gdb_write_register() to handle XPSR reads and writes.
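
The write-side restriction described above (GDB may update XPSR but not its exception-number field) can be sketched as a masked register write. This is a simplified model that treats XPSR as one 32-bit word; the names SKETCH_XPSR_EXCP, masked_write() and gdb_write_xpsr() are hypothetical and are not QEMU's helpers. The mask value assumes the exception number occupies bits [8:0].

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the exception number lives in XPSR bits [8:0]. */
#define SKETCH_XPSR_EXCP 0x1ffu

/* Update only the bits selected by mask; all other bits keep their old value. */
static uint32_t masked_write(uint32_t old, uint32_t val, uint32_t mask)
{
    return (old & ~mask) | (val & mask);
}

/* Model of a debugger write to XPSR that leaves the exception field alone. */
static uint32_t gdb_write_xpsr(uint32_t old_xpsr, uint32_t val)
{
    assert((SKETCH_XPSR_EXCP & ~0x1ffu) == 0);
    return masked_write(old_xpsr, val, ~SKETCH_XPSR_EXCP);
}
```

Calling gdb_write_xpsr(old, val) with any val leaves the low nine bits of old unchanged, so a debugger write in this model cannot move the core into or out of handler mode.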
6
13
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
8
Message-id: 20181016223115.24100-3-richard.henderson@linaro.org
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
17
Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
11
---
18
---
12
target/arm/cpu.c | 6 +++++-
19
configure | 4 ++--
13
1 file changed, 5 insertions(+), 1 deletion(-)
20
target/arm/cpu_tcg.c | 1 +
21
target/arm/gdbstub.c | 22 ++++++++++++++++++----
22
gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
23
4 files changed, 48 insertions(+), 6 deletions(-)
24
create mode 100644 gdb-xml/arm-m-profile.xml
14
25
15
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
26
diff --git a/configure b/configure
27
index XXXXXXX..XXXXXXX 100755
28
--- a/configure
29
+++ b/configure
30
@@ -XXX,XX +XXX,XX @@ case "$target_name" in
31
TARGET_SYSTBL_ABI=common,oabi
32
bflt="yes"
33
mttcg="yes"
34
- gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
35
+ gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
36
;;
37
aarch64|aarch64_be)
38
TARGET_ARCH=aarch64
39
TARGET_BASE_ARCH=arm
40
bflt="yes"
41
mttcg="yes"
42
- gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
43
+ gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
44
;;
45
cris)
46
;;
47
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
16
index XXXXXXX..XXXXXXX 100644
48
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.c
49
--- a/target/arm/cpu_tcg.c
18
+++ b/target/arm/cpu.c
50
+++ b/target/arm/cpu_tcg.c
19
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
51
@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
20
52
#endif
21
/* Some features automatically imply others: */
53
22
if (arm_feature(env, ARM_FEATURE_V8)) {
54
cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
23
- set_feature(env, ARM_FEATURE_V7VE);
55
+ cc->gdb_core_xml_file = "arm-m-profile.xml";
56
}
57
58
static const ARMCPUInfo arm_tcg_cpus[] = {
59
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
60
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/gdbstub.c
62
+++ b/target/arm/gdbstub.c
63
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
64
}
65
return gdb_get_reg32(mem_buf, 0);
66
case 25:
67
- /* CPSR */
68
- return gdb_get_reg32(mem_buf, cpsr_read(env));
69
+ /* CPSR, or XPSR for M-profile */
24
+ if (arm_feature(env, ARM_FEATURE_M)) {
70
+ if (arm_feature(env, ARM_FEATURE_M)) {
25
+ set_feature(env, ARM_FEATURE_V7);
71
+ return gdb_get_reg32(mem_buf, xpsr_read(env));
26
+ } else {
72
+ } else {
27
+ set_feature(env, ARM_FEATURE_V7VE);
73
+ return gdb_get_reg32(mem_buf, cpsr_read(env));
28
+ }
74
+ }
29
}
75
}
30
if (arm_feature(env, ARM_FEATURE_V7VE)) {
76
/* Unknown register. */
31
/* v7 Virtualization Extensions. In real hardware this implies
77
return 0;
78
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
79
}
80
return 4;
81
case 25:
82
- /* CPSR */
83
- cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
84
+ /* CPSR, or XPSR for M-profile */
85
+ if (arm_feature(env, ARM_FEATURE_M)) {
86
+ /*
87
+ * Don't allow writing to XPSR.Exception as it can cause
88
+ * a transition into or out of handler mode (it's not
89
+ * writeable via the MSR insn so this is a reasonable
90
+ * restriction). Other fields are safe to update.
91
+ */
92
+ xpsr_write(env, tmp, ~XPSR_EXCP);
93
+ } else {
94
+ cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
95
+ }
96
return 4;
97
}
98
/* Unknown register. */
99
diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
100
new file mode 100644
101
index XXXXXXX..XXXXXXX
102
--- /dev/null
103
+++ b/gdb-xml/arm-m-profile.xml
104
@@ -XXX,XX +XXX,XX @@
105
+<?xml version="1.0"?>
106
+<!-- Copyright (C) 2010-2020 Free Software Foundation, Inc.
107
+
108
+ Copying and distribution of this file, with or without modification,
109
+ are permitted in any medium without royalty provided the copyright
110
+ notice and this notice are preserved. -->
111
+
112
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
113
+<feature name="org.gnu.gdb.arm.m-profile">
114
+ <reg name="r0" bitsize="32"/>
115
+ <reg name="r1" bitsize="32"/>
116
+ <reg name="r2" bitsize="32"/>
117
+ <reg name="r3" bitsize="32"/>
118
+ <reg name="r4" bitsize="32"/>
119
+ <reg name="r5" bitsize="32"/>
120
+ <reg name="r6" bitsize="32"/>
121
+ <reg name="r7" bitsize="32"/>
122
+ <reg name="r8" bitsize="32"/>
123
+ <reg name="r9" bitsize="32"/>
124
+ <reg name="r10" bitsize="32"/>
125
+ <reg name="r11" bitsize="32"/>
126
+ <reg name="r12" bitsize="32"/>
127
+ <reg name="sp" bitsize="32" type="data_ptr"/>
128
+ <reg name="lr" bitsize="32"/>
129
+ <reg name="pc" bitsize="32" type="code_ptr"/>
130
+ <reg name="xpsr" bitsize="32" regnum="25"/>
131
+</feature>
32
--
132
--
33
2.19.1
133
2.20.1
34
134
35
135
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Instead of shifts and masks, use direct loads and stores from
3
The functions eliminate duplication of the special cases for
4
the neon register file.
4
this operation. They match up with the GVecGen2iFn typedef.
5
5
6
Add out-of-line helpers. We got away with only having inline
7
expanders because the neon vector size is only 16 bytes, and
8
we know that the inline expansion will always succeed.
9
When we reuse this for SVE, tcg-gvec-op may decide to use an
10
out-of-line helper due to longer vector lengths.
11
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20181011205206.3552-21-richard.henderson@linaro.org
14
Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
16
---
11
target/arm/translate.c | 92 +++++++++++++++++++++++-------------------
17
target/arm/helper.h | 10 +++
12
1 file changed, 50 insertions(+), 42 deletions(-)
18
target/arm/translate.h | 7 +-
13
19
target/arm/translate-a64.c | 15 +---
20
target/arm/translate.c | 161 ++++++++++++++++++++++---------------
21
target/arm/vec_helper.c | 25 ++++++
22
5 files changed, 139 insertions(+), 79 deletions(-)
23
24
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/helper.h
27
+++ b/target/arm/helper.h
28
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
30
DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
32
+DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
41
+
42
#ifdef TARGET_AARCH64
43
#include "helper-a64.h"
44
#include "helper-sve.h"
45
diff --git a/target/arm/translate.h b/target/arm/translate.h
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/translate.h
48
+++ b/target/arm/translate.h
49
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
50
extern const GVecGen3 cmtst_op[4];
51
extern const GVecGen3 sshl_op[4];
52
extern const GVecGen3 ushl_op[4];
53
-extern const GVecGen2i ssra_op[4];
54
-extern const GVecGen2i usra_op[4];
55
extern const GVecGen2i sri_op[4];
56
extern const GVecGen2i sli_op[4];
57
extern const GVecGen4 uqadd_op[4];
58
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
59
void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
60
void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
61
62
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
63
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
64
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
65
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
66
+
67
/*
68
* Forward to the isar_feature_* tests given a DisasContext pointer.
69
*/
70
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate-a64.c
73
+++ b/target/arm/translate-a64.c
74
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
75
76
switch (opcode) {
77
case 0x02: /* SSRA / USRA (accumulate) */
78
- if (is_u) {
79
- /* Shift count same as element size produces zero to add. */
80
- if (shift == 8 << size) {
81
- goto done;
82
- }
83
- gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]);
84
- } else {
85
- /* Shift count same as element size produces all sign to add. */
86
- if (shift == 8 << size) {
87
- shift -= 1;
88
- }
89
- gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]);
90
- }
91
+ gen_gvec_fn2i(s, is_q, rd, rn, shift,
92
+ is_u ? gen_gvec_usra : gen_gvec_ssra, size);
93
return;
94
case 0x08: /* SRI */
95
/* Shift count same as element size is valid but does nothing. */
14
diff --git a/target/arm/translate.c b/target/arm/translate.c
96
diff --git a/target/arm/translate.c b/target/arm/translate.c
15
index XXXXXXX..XXXXXXX 100644
97
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/translate.c
98
--- a/target/arm/translate.c
17
+++ b/target/arm/translate.c
99
+++ b/target/arm/translate.c
18
@@ -XXX,XX +XXX,XX @@ static TCGv_i32 neon_load_reg(int reg, int pass)
100
@@ -XXX,XX +XXX,XX @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
19
return tmp;
101
tcg_gen_add_vec(vece, d, d, a);
20
}
102
}
21
103
22
+static void neon_load_element(TCGv_i32 var, int reg, int ele, TCGMemOp mop)
104
-static const TCGOpcode vecop_list_ssra[] = {
105
- INDEX_op_sari_vec, INDEX_op_add_vec, 0
106
-};
107
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
108
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
23
+{
109
+{
24
+ long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
110
+ static const TCGOpcode vecop_list[] = {
25
+
111
+ INDEX_op_sari_vec, INDEX_op_add_vec, 0
26
+ switch (mop) {
112
+ };
27
+ case MO_UB:
113
+ static const GVecGen2i ops[4] = {
28
+ tcg_gen_ld8u_i32(var, cpu_env, offset);
114
+ { .fni8 = gen_ssra8_i64,
29
+ break;
115
+ .fniv = gen_ssra_vec,
30
+ case MO_UW:
116
+ .fno = gen_helper_gvec_ssra_b,
31
+ tcg_gen_ld16u_i32(var, cpu_env, offset);
117
+ .load_dest = true,
32
+ break;
118
+ .opt_opc = vecop_list,
33
+ case MO_UL:
119
+ .vece = MO_8 },
34
+ tcg_gen_ld_i32(var, cpu_env, offset);
120
+ { .fni8 = gen_ssra16_i64,
35
+ break;
121
+ .fniv = gen_ssra_vec,
36
+ default:
122
+ .fno = gen_helper_gvec_ssra_h,
37
+ g_assert_not_reached();
123
+ .load_dest = true,
124
+ .opt_opc = vecop_list,
125
+ .vece = MO_16 },
126
+ { .fni4 = gen_ssra32_i32,
127
+ .fniv = gen_ssra_vec,
128
+ .fno = gen_helper_gvec_ssra_s,
129
+ .load_dest = true,
130
+ .opt_opc = vecop_list,
131
+ .vece = MO_32 },
132
+ { .fni8 = gen_ssra64_i64,
133
+ .fniv = gen_ssra_vec,
134
+ .fno = gen_helper_gvec_ssra_d,
135
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
136
+ .opt_opc = vecop_list,
137
+ .load_dest = true,
138
+ .vece = MO_64 },
139
+ };
140
141
-const GVecGen2i ssra_op[4] = {
142
- { .fni8 = gen_ssra8_i64,
143
- .fniv = gen_ssra_vec,
144
- .load_dest = true,
145
- .opt_opc = vecop_list_ssra,
146
- .vece = MO_8 },
147
- { .fni8 = gen_ssra16_i64,
148
- .fniv = gen_ssra_vec,
149
- .load_dest = true,
150
- .opt_opc = vecop_list_ssra,
151
- .vece = MO_16 },
152
- { .fni4 = gen_ssra32_i32,
153
- .fniv = gen_ssra_vec,
154
- .load_dest = true,
155
- .opt_opc = vecop_list_ssra,
156
- .vece = MO_32 },
157
- { .fni8 = gen_ssra64_i64,
158
- .fniv = gen_ssra_vec,
159
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
160
- .opt_opc = vecop_list_ssra,
161
- .load_dest = true,
162
- .vece = MO_64 },
163
-};
164
+ /* tszimm encoding produces immediates in the range [1..esize]. */
165
+ tcg_debug_assert(shift > 0);
166
+ tcg_debug_assert(shift <= (8 << vece));
167
+
168
+ /*
169
+ * Shifts larger than the element size are architecturally valid.
170
+ * Signed results in all sign bits.
171
+ */
172
+ shift = MIN(shift, (8 << vece) - 1);
173
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
174
+}
175
176
static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
177
{
178
@@ -XXX,XX +XXX,XX @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
179
tcg_gen_add_vec(vece, d, d, a);
180
}
181
182
-static const TCGOpcode vecop_list_usra[] = {
183
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
184
-};
185
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
186
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
187
+{
188
+ static const TCGOpcode vecop_list[] = {
189
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
190
+ };
191
+ static const GVecGen2i ops[4] = {
192
+ { .fni8 = gen_usra8_i64,
193
+ .fniv = gen_usra_vec,
194
+ .fno = gen_helper_gvec_usra_b,
195
+ .load_dest = true,
196
+ .opt_opc = vecop_list,
197
+ .vece = MO_8, },
198
+ { .fni8 = gen_usra16_i64,
199
+ .fniv = gen_usra_vec,
200
+ .fno = gen_helper_gvec_usra_h,
201
+ .load_dest = true,
202
+ .opt_opc = vecop_list,
203
+ .vece = MO_16, },
204
+ { .fni4 = gen_usra32_i32,
205
+ .fniv = gen_usra_vec,
206
+ .fno = gen_helper_gvec_usra_s,
207
+ .load_dest = true,
208
+ .opt_opc = vecop_list,
209
+ .vece = MO_32, },
210
+ { .fni8 = gen_usra64_i64,
211
+ .fniv = gen_usra_vec,
212
+ .fno = gen_helper_gvec_usra_d,
213
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
214
+ .load_dest = true,
215
+ .opt_opc = vecop_list,
216
+ .vece = MO_64, },
217
+ };
218
219
-const GVecGen2i usra_op[4] = {
220
- { .fni8 = gen_usra8_i64,
221
- .fniv = gen_usra_vec,
222
- .load_dest = true,
223
- .opt_opc = vecop_list_usra,
224
- .vece = MO_8, },
225
- { .fni8 = gen_usra16_i64,
226
- .fniv = gen_usra_vec,
227
- .load_dest = true,
228
- .opt_opc = vecop_list_usra,
229
- .vece = MO_16, },
230
- { .fni4 = gen_usra32_i32,
231
- .fniv = gen_usra_vec,
232
- .load_dest = true,
233
- .opt_opc = vecop_list_usra,
234
- .vece = MO_32, },
235
- { .fni8 = gen_usra64_i64,
236
- .fniv = gen_usra_vec,
237
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
238
- .load_dest = true,
239
- .opt_opc = vecop_list_usra,
240
- .vece = MO_64, },
241
-};
242
+ /* tszimm encoding produces immediates in the range [1..esize]. */
243
+ tcg_debug_assert(shift > 0);
244
+ tcg_debug_assert(shift <= (8 << vece));
245
+
246
+ /*
247
+ * Shifts larger than the element size are architecturally valid.
248
+ * Unsigned results in all zeros as input to accumulate: nop.
249
+ */
250
+ if (shift < (8 << vece)) {
251
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
252
+ } else {
253
+ /* Nop, but we do need to clear the tail. */
254
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
38
+ }
255
+ }
39
+}
256
+}
40
+
257
41
static void neon_load_element64(TCGv_i64 var, int reg, int ele, TCGMemOp mop)
258
static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
42
{
259
{
43
long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
260
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
44
@@ -XXX,XX +XXX,XX @@ static void neon_store_reg(int reg, int pass, TCGv_i32 var)
261
case 1: /* VSRA */
45
tcg_temp_free_i32(var);
262
/* Right shift comes here negative. */
263
shift = -shift;
264
- /* Shifts larger than the element size are architecturally
265
- * valid. Unsigned results in all zeros; signed results
266
- * in all sign bits.
267
- */
268
- if (!u) {
269
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
270
- MIN(shift, (8 << size) - 1),
271
- &ssra_op[size]);
272
- } else if (shift >= 8 << size) {
273
- /* rd += 0 */
274
+ if (u) {
275
+ gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
276
+ vec_size, vec_size);
277
} else {
278
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
279
- shift, &usra_op[size]);
280
+ gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
281
+ vec_size, vec_size);
282
}
283
return 0;
284
285
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
286
index XXXXXXX..XXXXXXX 100644
287
--- a/target/arm/vec_helper.c
288
+++ b/target/arm/vec_helper.c
289
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
290
clear_tail(d, oprsz, simd_maxsz(desc));
46
}
291
}
47
292
48
+static void neon_store_element(int reg, int ele, TCGMemOp size, TCGv_i32 var)
293
+
49
+{
294
+#define DO_SRA(NAME, TYPE) \
50
+ long offset = neon_element_offset(reg, ele, size);
295
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
51
+
296
+{ \
52
+ switch (size) {
297
+ intptr_t i, oprsz = simd_oprsz(desc); \
53
+ case MO_8:
298
+ int shift = simd_data(desc); \
54
+ tcg_gen_st8_i32(var, cpu_env, offset);
299
+ TYPE *d = vd, *n = vn; \
55
+ break;
300
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
56
+ case MO_16:
301
+ d[i] += n[i] >> shift; \
57
+ tcg_gen_st16_i32(var, cpu_env, offset);
302
+ } \
58
+ break;
303
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
59
+ case MO_32:
60
+ tcg_gen_st_i32(var, cpu_env, offset);
61
+ break;
62
+ default:
63
+ g_assert_not_reached();
64
+ }
65
+}
304
+}
66
+
305
+
67
static void neon_store_element64(int reg, int ele, TCGMemOp size, TCGv_i64 var)
306
+DO_SRA(gvec_ssra_b, int8_t)
68
{
307
+DO_SRA(gvec_ssra_h, int16_t)
69
long offset = neon_element_offset(reg, ele, size);
308
+DO_SRA(gvec_ssra_s, int32_t)
70
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
309
+DO_SRA(gvec_ssra_d, int64_t)
71
int stride;
310
+
72
int size;
311
+DO_SRA(gvec_usra_b, uint8_t)
73
int reg;
312
+DO_SRA(gvec_usra_h, uint16_t)
74
- int pass;
313
+DO_SRA(gvec_usra_s, uint32_t)
75
int load;
314
+DO_SRA(gvec_usra_d, uint64_t)
76
- int shift;
315
+
77
int n;
316
+#undef DO_SRA
78
int vec_size;
317
+
79
int mmu_idx;
318
/*
80
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
319
* Convert float16 to float32, raising no exceptions and
81
} else {
320
* preserving exceptional values, including SNaN.
82
/* Single element. */
83
int idx = (insn >> 4) & 0xf;
84
- pass = (insn >> 7) & 1;
85
+ int reg_idx;
86
switch (size) {
87
case 0:
88
- shift = ((insn >> 5) & 3) * 8;
89
+ reg_idx = (insn >> 5) & 7;
90
stride = 1;
91
break;
92
case 1:
93
- shift = ((insn >> 6) & 1) * 16;
94
+ reg_idx = (insn >> 6) & 3;
95
stride = (insn & (1 << 5)) ? 2 : 1;
96
break;
97
case 2:
98
- shift = 0;
99
+ reg_idx = (insn >> 7) & 1;
100
stride = (insn & (1 << 6)) ? 2 : 1;
101
break;
102
default:
103
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
104
*/
105
return 1;
106
}
107
+ tmp = tcg_temp_new_i32();
108
addr = tcg_temp_new_i32();
109
load_reg_var(s, addr, rn);
110
for (reg = 0; reg < nregs; reg++) {
111
if (load) {
112
- tmp = tcg_temp_new_i32();
113
- switch (size) {
114
- case 0:
115
- gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
116
- break;
117
- case 1:
118
- gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
119
- break;
120
- case 2:
121
- gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
122
- break;
123
- default: /* Avoid compiler warnings. */
124
- abort();
125
- }
126
- if (size != 2) {
127
- tmp2 = neon_load_reg(rd, pass);
128
- tcg_gen_deposit_i32(tmp, tmp2, tmp,
129
- shift, size ? 16 : 8);
130
- tcg_temp_free_i32(tmp2);
131
- }
132
- neon_store_reg(rd, pass, tmp);
133
+ gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
134
+ s->be_data | size);
135
+ neon_store_element(rd, reg_idx, size, tmp);
136
} else { /* Store */
137
- tmp = neon_load_reg(rd, pass);
138
- if (shift)
139
- tcg_gen_shri_i32(tmp, tmp, shift);
140
- switch (size) {
141
- case 0:
142
- gen_aa32_st8(s, tmp, addr, get_mem_index(s));
143
- break;
144
- case 1:
145
- gen_aa32_st16(s, tmp, addr, get_mem_index(s));
146
- break;
147
- case 2:
148
- gen_aa32_st32(s, tmp, addr, get_mem_index(s));
149
- break;
150
- }
151
- tcg_temp_free_i32(tmp);
152
+ neon_load_element(tmp, rd, reg_idx, size);
153
+ gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
154
+ s->be_data | size);
155
}
156
rd += stride;
157
tcg_gen_addi_i32(addr, addr, 1 << size);
158
}
159
tcg_temp_free_i32(addr);
160
+ tcg_temp_free_i32(tmp);
161
stride = nregs * (1 << size);
162
}
163
}
164
--
321
--
165
2.19.1
322
2.20.1
166
323
167
324
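Reader's note on the single-element hunk above: the old decode named a lane with a (pass, shift) pair, i.e. which 32-bit half of the D register and a bit offset inside it, while the new decode uses a single element index, reg_idx. The sketch below is plain C with no QEMU dependencies and checks that the two decodes pick the same lane for every size. The names OldLane, old_decode, new_decode, old_as_index and decodes_agree are hypothetical; only the bit-field extractions are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Old scheme: 32-bit pass within the D register plus a bit offset. */
typedef struct {
    int pass;
    int shift;
} OldLane;

static OldLane old_decode(uint32_t insn, int size)
{
    OldLane l;

    assert(size >= 0 && size <= 2);
    l.pass = (insn >> 7) & 1;
    switch (size) {
    case 0:
        l.shift = ((insn >> 5) & 3) * 8;
        break;
    case 1:
        l.shift = ((insn >> 6) & 1) * 16;
        break;
    default:
        l.shift = 0;
        break;
    }
    return l;
}

/* New scheme: one element index, counted in units of the element size. */
static int new_decode(uint32_t insn, int size)
{
    assert(size >= 0 && size <= 2);
    switch (size) {
    case 0:
        return (insn >> 5) & 7;
    case 1:
        return (insn >> 6) & 3;
    default:
        return (insn >> 7) & 1;
    }
}

/* Element index implied by an old (pass, shift) pair. */
static int old_as_index(OldLane l, int size)
{
    return l.pass * (4 >> size) + l.shift / (8 << size);
}

/* Returns 1 if both decodes agree for all values of insn bits [7:5]. */
static int decodes_agree(void)
{
    for (int size = 0; size <= 2; size++) {
        for (uint32_t bits = 0; bits < 8; bits++) {
            uint32_t insn = bits << 5;
            if (old_as_index(old_decode(insn, size), size)
                != new_decode(insn, size)) {
                return 0;
            }
        }
    }
    return 1;
}
```

Calling decodes_agree() from a test harness is enough to exercise all 24 combinations.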
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Move ssra_op and usra_op expanders from translate-a64.c.
3
Create vectorized versions of handle_shri_with_rndacc
4
for shift+round and shift+round+accumulate. Add out-of-line
5
helpers in preparation for longer vector lengths from SVE.
4
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181011205206.3552-14-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
11
---
10
target/arm/translate.h | 2 +
12
target/arm/helper.h | 20 ++
11
target/arm/translate-a64.c | 106 ----------------------------
13
target/arm/translate.h | 9 +
12
target/arm/translate.c | 139 ++++++++++++++++++++++++++++++++++---
14
target/arm/translate-a64.c | 11 +-
13
3 files changed, 130 insertions(+), 117 deletions(-)
15
target/arm/translate.c | 463 +++++++++++++++++++++++++++++++++++--
16
target/arm/vec_helper.c | 50 ++++
17
5 files changed, 527 insertions(+), 26 deletions(-)
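
Reader's note on the 8-bit and 16-bit integer expanders in this patch: they shift the whole 64-bit word by one less than the requested amount, mask the low bit of each lane to get the rounding bit, and add that to the per-lane shifted value. The sketch below models the unsigned 8-bit case in plain C. It is an illustration under assumptions, not the TCG implementation: shr8 and add8 are hand-written stand-ins for the per-lane shift and add operations, and every name in it is hypothetical. The shift is limited to 1..7 here because the patch handles shift == esize separately.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate an 8-bit value into all eight lanes of a 64-bit word. */
#define LANES8(x) ((uint64_t)(x) * 0x0101010101010101ull)

/* Per-lane logical shift right of eight packed 8-bit lanes. */
static uint64_t shr8(uint64_t a, int sh)
{
    return (a >> sh) & LANES8(0xff >> sh);
}

/* Per-lane add; carries do not cross lane boundaries. */
static uint64_t add8(uint64_t a, uint64_t b)
{
    uint64_t msb = LANES8(0x80);

    return ((a & ~msb) + (b & ~msb)) ^ ((a ^ b) & msb);
}

/* Rounding shift right on packed lanes: one plain shift finds all
 * rounding bits, because only bit 0 of each lane survives the mask. */
static uint64_t urshr8_swar(uint64_t a, int sh)
{
    uint64_t round;

    assert(sh >= 1 && sh <= 7);
    round = (a >> (sh - 1)) & LANES8(1);
    return add8(shr8(a, sh), round);
}

/* Lane-by-lane reference using the formula from the out-of-line helper. */
static uint64_t urshr8_ref(uint64_t a, int sh)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        unsigned lane = (a >> (8 * i)) & 0xff;
        unsigned tmp = lane >> (sh - 1);

        r |= (uint64_t)(((tmp >> 1) + (tmp & 1)) & 0xff) << (8 * i);
    }
    return r;
}

/* Returns 1 if the packed and reference versions agree for sh = 1..7. */
static int swar_matches_reference(uint64_t v)
{
    for (int sh = 1; sh <= 7; sh++) {
        if (urshr8_swar(v, sh) != urshr8_ref(v, sh)) {
            return 0;
        }
    }
    return 1;
}
```

The plain 64-bit shift is safe for the rounding bit because any bits that leak in from the neighbouring lane land above bit 0 and are masked away.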
14
18
19
diff --git a/target/arm/helper.h b/target/arm/helper.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.h
22
+++ b/target/arm/helper.h
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
24
DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
26
27
+DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
31
+
32
+DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
41
+
42
+DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
44
+DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
46
+
47
#ifdef TARGET_AARCH64
48
#include "helper-a64.h"
49
#include "helper-sve.h"
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
50
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
index XXXXXXX..XXXXXXX 100644
51
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
52
--- a/target/arm/translate.h
18
+++ b/target/arm/translate.h
53
+++ b/target/arm/translate.h
19
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
54
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
20
extern const GVecGen3 bsl_op;
55
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
21
extern const GVecGen3 bit_op;
56
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
22
extern const GVecGen3 bif_op;
57
23
+extern const GVecGen2i ssra_op[4];
58
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
24
+extern const GVecGen2i usra_op[4];
59
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
25
60
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
61
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
62
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
63
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
64
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
65
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
66
+
26
/*
67
/*
27
* Forward to the isar_feature_* tests given a DisasContext pointer.
68
* Forward to the isar_feature_* tests given a DisasContext pointer.
69
*/
28
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
70
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
29
index XXXXXXX..XXXXXXX 100644
71
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-a64.c
72
--- a/target/arm/translate-a64.c
31
+++ b/target/arm/translate-a64.c
73
+++ b/target/arm/translate-a64.c
32
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
74
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
75
return;
76
77
case 0x04: /* SRSHR / URSHR (rounding) */
78
- break;
79
+ gen_gvec_fn2i(s, is_q, rd, rn, shift,
80
+ is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
81
+ return;
82
+
83
case 0x06: /* SRSRA / URSRA (accum + rounding) */
84
- accumulate = true;
85
- break;
86
+ gen_gvec_fn2i(s, is_q, rd, rn, shift,
87
+ is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
88
+ return;
89
+
90
default:
91
g_assert_not_reached();
33
}
92
}
34
}
35
36
-static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
37
-{
38
- tcg_gen_vec_sar8i_i64(a, a, shift);
39
- tcg_gen_vec_add8_i64(d, d, a);
40
-}
41
-
42
-static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
43
-{
44
- tcg_gen_vec_sar16i_i64(a, a, shift);
45
- tcg_gen_vec_add16_i64(d, d, a);
46
-}
47
-
48
-static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
49
-{
50
- tcg_gen_sari_i32(a, a, shift);
51
- tcg_gen_add_i32(d, d, a);
52
-}
53
-
54
-static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
55
-{
56
- tcg_gen_sari_i64(a, a, shift);
57
- tcg_gen_add_i64(d, d, a);
58
-}
59
-
60
-static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
61
-{
62
- tcg_gen_sari_vec(vece, a, a, sh);
63
- tcg_gen_add_vec(vece, d, d, a);
64
-}
65
-
66
-static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
67
-{
68
- tcg_gen_vec_shr8i_i64(a, a, shift);
69
- tcg_gen_vec_add8_i64(d, d, a);
70
-}
71
-
72
-static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
73
-{
74
- tcg_gen_vec_shr16i_i64(a, a, shift);
75
- tcg_gen_vec_add16_i64(d, d, a);
76
-}
77
-
78
-static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
79
-{
80
- tcg_gen_shri_i32(a, a, shift);
81
- tcg_gen_add_i32(d, d, a);
82
-}
83
-
84
-static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
85
-{
86
- tcg_gen_shri_i64(a, a, shift);
87
- tcg_gen_add_i64(d, d, a);
88
-}
89
-
90
-static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
91
-{
92
- tcg_gen_shri_vec(vece, a, a, sh);
93
- tcg_gen_add_vec(vece, d, d, a);
94
-}
95
-
96
static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
97
{
98
uint64_t mask = dup_const(MO_8, 0xff >> shift);
99
@@ -XXX,XX +XXX,XX @@ static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
100
static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
101
int immh, int immb, int opcode, int rn, int rd)
102
{
103
- static const GVecGen2i ssra_op[4] = {
104
- { .fni8 = gen_ssra8_i64,
105
- .fniv = gen_ssra_vec,
106
- .load_dest = true,
107
- .opc = INDEX_op_sari_vec,
108
- .vece = MO_8 },
109
- { .fni8 = gen_ssra16_i64,
110
- .fniv = gen_ssra_vec,
111
- .load_dest = true,
112
- .opc = INDEX_op_sari_vec,
113
- .vece = MO_16 },
114
- { .fni4 = gen_ssra32_i32,
115
- .fniv = gen_ssra_vec,
116
- .load_dest = true,
117
- .opc = INDEX_op_sari_vec,
118
- .vece = MO_32 },
119
- { .fni8 = gen_ssra64_i64,
120
- .fniv = gen_ssra_vec,
121
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
122
- .load_dest = true,
123
- .opc = INDEX_op_sari_vec,
124
- .vece = MO_64 },
125
- };
126
- static const GVecGen2i usra_op[4] = {
127
- { .fni8 = gen_usra8_i64,
128
- .fniv = gen_usra_vec,
129
- .load_dest = true,
130
- .opc = INDEX_op_shri_vec,
131
- .vece = MO_8, },
132
- { .fni8 = gen_usra16_i64,
133
- .fniv = gen_usra_vec,
134
- .load_dest = true,
135
- .opc = INDEX_op_shri_vec,
136
- .vece = MO_16, },
137
- { .fni4 = gen_usra32_i32,
138
- .fniv = gen_usra_vec,
139
- .load_dest = true,
140
- .opc = INDEX_op_shri_vec,
141
- .vece = MO_32, },
142
- { .fni8 = gen_usra64_i64,
143
- .fniv = gen_usra_vec,
144
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
145
- .load_dest = true,
146
- .opc = INDEX_op_shri_vec,
147
- .vece = MO_64, },
148
- };
149
static const GVecGen2i sri_op[4] = {
150
{ .fni8 = gen_shr8_ins_i64,
151
.fniv = gen_shr_ins_vec,
152
diff --git a/target/arm/translate.c b/target/arm/translate.c
93
diff --git a/target/arm/translate.c b/target/arm/translate.c
153
index XXXXXXX..XXXXXXX 100644
94
index XXXXXXX..XXXXXXX 100644
154
--- a/target/arm/translate.c
95
--- a/target/arm/translate.c
155
+++ b/target/arm/translate.c
96
+++ b/target/arm/translate.c
156
@@ -XXX,XX +XXX,XX @@ const GVecGen3 bif_op = {
97
@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
157
.load_dest = true
98
}
158
};
99
}
159
100
160
+static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
101
+/*
161
+{
102
+ * Shift one less than the requested amount, and the low bit is
162
+ tcg_gen_vec_sar8i_i64(a, a, shift);
103
+ * the rounding bit. For the 8 and 16-bit operations, because we
163
+ tcg_gen_vec_add8_i64(d, d, a);
104
+ * mask the low bit, we can perform a normal integer shift instead
164
+}
105
+ * of a vector shift.
165
+
106
+ */
166
+static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
107
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
167
+{
108
+{
168
+ tcg_gen_vec_sar16i_i64(a, a, shift);
109
+ TCGv_i64 t = tcg_temp_new_i64();
169
+ tcg_gen_vec_add16_i64(d, d, a);
110
+
170
+}
111
+ tcg_gen_shri_i64(t, a, sh - 1);
171
+
112
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
172
+static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
113
+ tcg_gen_vec_sar8i_i64(d, a, sh);
173
+{
114
+ tcg_gen_vec_add8_i64(d, d, t);
174
+ tcg_gen_sari_i32(a, a, shift);
115
+ tcg_temp_free_i64(t);
175
+ tcg_gen_add_i32(d, d, a);
116
+}
176
+}
117
+
177
+
118
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
178
+static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
119
+{
179
+{
120
+ TCGv_i64 t = tcg_temp_new_i64();
180
+ tcg_gen_sari_i64(a, a, shift);
121
+
181
+ tcg_gen_add_i64(d, d, a);
122
+ tcg_gen_shri_i64(t, a, sh - 1);
182
+}
123
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
183
+
124
+ tcg_gen_vec_sar16i_i64(d, a, sh);
184
+static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
125
+ tcg_gen_vec_add16_i64(d, d, t);
185
+{
126
+ tcg_temp_free_i64(t);
186
+ tcg_gen_sari_vec(vece, a, a, sh);
127
+}
187
+ tcg_gen_add_vec(vece, d, d, a);
128
+
188
+}
129
+static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
189
+
130
+{
190
+const GVecGen2i ssra_op[4] = {
131
+ TCGv_i32 t = tcg_temp_new_i32();
191
+ { .fni8 = gen_ssra8_i64,
132
+
192
+ .fniv = gen_ssra_vec,
133
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
193
+ .load_dest = true,
134
+ tcg_gen_sari_i32(d, a, sh);
194
+ .opc = INDEX_op_sari_vec,
135
+ tcg_gen_add_i32(d, d, t);
195
+ .vece = MO_8 },
136
+ tcg_temp_free_i32(t);
196
+ { .fni8 = gen_ssra16_i64,
137
+}
197
+ .fniv = gen_ssra_vec,
138
+
198
+ .load_dest = true,
139
+static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
199
+ .opc = INDEX_op_sari_vec,
140
+{
200
+ .vece = MO_16 },
141
+ TCGv_i64 t = tcg_temp_new_i64();
201
+ { .fni4 = gen_ssra32_i32,
142
+
202
+ .fniv = gen_ssra_vec,
143
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
203
+ .load_dest = true,
144
+ tcg_gen_sari_i64(d, a, sh);
204
+ .opc = INDEX_op_sari_vec,
145
+ tcg_gen_add_i64(d, d, t);
205
+ .vece = MO_32 },
146
+ tcg_temp_free_i64(t);
206
+ { .fni8 = gen_ssra64_i64,
147
+}
207
+ .fniv = gen_ssra_vec,
148
+
208
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
149
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
209
+ .load_dest = true,
150
+{
210
+ .opc = INDEX_op_sari_vec,
151
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
211
+ .vece = MO_64 },
152
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
212
+};
153
+
213
+
154
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
214
+static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
155
+ tcg_gen_dupi_vec(vece, ones, 1);
215
+{
156
+ tcg_gen_and_vec(vece, t, t, ones);
216
+ tcg_gen_vec_shr8i_i64(a, a, shift);
157
+ tcg_gen_sari_vec(vece, d, a, sh);
217
+ tcg_gen_vec_add8_i64(d, d, a);
158
+ tcg_gen_add_vec(vece, d, d, t);
218
+}
159
+
219
+
160
+ tcg_temp_free_vec(t);
220
+static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
161
+ tcg_temp_free_vec(ones);
221
+{
162
+}
222
+ tcg_gen_vec_shr16i_i64(a, a, shift);
163
+
223
+ tcg_gen_vec_add16_i64(d, d, a);
164
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
224
+}
165
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
225
+
166
+{
226
+static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
167
+ static const TCGOpcode vecop_list[] = {
227
+{
168
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
228
+ tcg_gen_shri_i32(a, a, shift);
169
+ };
229
+ tcg_gen_add_i32(d, d, a);
170
+ static const GVecGen2i ops[4] = {
230
+}
171
+ { .fni8 = gen_srshr8_i64,
231
+
172
+ .fniv = gen_srshr_vec,
232
+static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
173
+ .fno = gen_helper_gvec_srshr_b,
233
+{
174
+ .opt_opc = vecop_list,
234
+ tcg_gen_shri_i64(a, a, shift);
175
+ .vece = MO_8 },
235
+ tcg_gen_add_i64(d, d, a);
176
+ { .fni8 = gen_srshr16_i64,
236
+}
177
+ .fniv = gen_srshr_vec,
237
+
178
+ .fno = gen_helper_gvec_srshr_h,
238
+static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
179
+ .opt_opc = vecop_list,
239
+{
180
+ .vece = MO_16 },
240
+ tcg_gen_shri_vec(vece, a, a, sh);
181
+ { .fni4 = gen_srshr32_i32,
241
+ tcg_gen_add_vec(vece, d, d, a);
182
+ .fniv = gen_srshr_vec,
242
+}
183
+ .fno = gen_helper_gvec_srshr_s,
243
+
184
+ .opt_opc = vecop_list,
244
+const GVecGen2i usra_op[4] = {
185
+ .vece = MO_32 },
245
+ { .fni8 = gen_usra8_i64,
186
+ { .fni8 = gen_srshr64_i64,
246
+ .fniv = gen_usra_vec,
187
+ .fniv = gen_srshr_vec,
247
+ .load_dest = true,
188
+ .fno = gen_helper_gvec_srshr_d,
248
+ .opc = INDEX_op_shri_vec,
189
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
249
+ .vece = MO_8, },
190
+ .opt_opc = vecop_list,
250
+ { .fni8 = gen_usra16_i64,
191
+ .vece = MO_64 },
251
+ .fniv = gen_usra_vec,
192
+ };
252
+ .load_dest = true,
193
+
253
+ .opc = INDEX_op_shri_vec,
194
+ /* tszimm encoding produces immediates in the range [1..esize] */
254
+ .vece = MO_16, },
195
+ tcg_debug_assert(shift > 0);
255
+ { .fni4 = gen_usra32_i32,
196
+ tcg_debug_assert(shift <= (8 << vece));
256
+ .fniv = gen_usra_vec,
197
+
257
+ .load_dest = true,
198
+ if (shift == (8 << vece)) {
258
+ .opc = INDEX_op_shri_vec,
199
+ /*
259
+ .vece = MO_32, },
200
+ * Shifts larger than the element size are architecturally valid.
260
+ { .fni8 = gen_usra64_i64,
201
+ * Signed results in all sign bits. With rounding, this produces
261
+ .fniv = gen_usra_vec,
202
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
262
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
203
+ * I.e. always zero.
263
+ .load_dest = true,
204
+ */
264
+ .opc = INDEX_op_shri_vec,
205
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
265
+ .vece = MO_64, },
206
+ } else {
266
+};
207
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
267
208
+ }
268
/* Translate a NEON data processing instruction. Return nonzero if the
209
+}
269
instruction is invalid.
210
+
211
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
212
+{
213
+ TCGv_i64 t = tcg_temp_new_i64();
214
+
215
+ gen_srshr8_i64(t, a, sh);
216
+ tcg_gen_vec_add8_i64(d, d, t);
217
+ tcg_temp_free_i64(t);
218
+}
219
+
220
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
221
+{
222
+ TCGv_i64 t = tcg_temp_new_i64();
223
+
224
+ gen_srshr16_i64(t, a, sh);
225
+ tcg_gen_vec_add16_i64(d, d, t);
226
+ tcg_temp_free_i64(t);
227
+}
228
+
229
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
230
+{
231
+ TCGv_i32 t = tcg_temp_new_i32();
232
+
233
+ gen_srshr32_i32(t, a, sh);
234
+ tcg_gen_add_i32(d, d, t);
235
+ tcg_temp_free_i32(t);
236
+}
237
+
238
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
239
+{
240
+ TCGv_i64 t = tcg_temp_new_i64();
241
+
242
+ gen_srshr64_i64(t, a, sh);
243
+ tcg_gen_add_i64(d, d, t);
244
+ tcg_temp_free_i64(t);
245
+}
246
+
247
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
248
+{
249
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
250
+
251
+ gen_srshr_vec(vece, t, a, sh);
252
+ tcg_gen_add_vec(vece, d, d, t);
253
+ tcg_temp_free_vec(t);
254
+}
255
+
256
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
257
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
258
+{
259
+ static const TCGOpcode vecop_list[] = {
260
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
261
+ };
262
+ static const GVecGen2i ops[4] = {
263
+ { .fni8 = gen_srsra8_i64,
264
+ .fniv = gen_srsra_vec,
265
+ .fno = gen_helper_gvec_srsra_b,
266
+ .opt_opc = vecop_list,
267
+ .load_dest = true,
268
+ .vece = MO_8 },
269
+ { .fni8 = gen_srsra16_i64,
270
+ .fniv = gen_srsra_vec,
271
+ .fno = gen_helper_gvec_srsra_h,
272
+ .opt_opc = vecop_list,
273
+ .load_dest = true,
274
+ .vece = MO_16 },
275
+ { .fni4 = gen_srsra32_i32,
276
+ .fniv = gen_srsra_vec,
277
+ .fno = gen_helper_gvec_srsra_s,
278
+ .opt_opc = vecop_list,
279
+ .load_dest = true,
280
+ .vece = MO_32 },
281
+ { .fni8 = gen_srsra64_i64,
282
+ .fniv = gen_srsra_vec,
283
+ .fno = gen_helper_gvec_srsra_d,
284
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
285
+ .opt_opc = vecop_list,
286
+ .load_dest = true,
287
+ .vece = MO_64 },
288
+ };
289
+
290
+ /* tszimm encoding produces immediates in the range [1..esize] */
291
+ tcg_debug_assert(shift > 0);
292
+ tcg_debug_assert(shift <= (8 << vece));
293
+
294
+ /*
295
+ * Shifts larger than the element size are architecturally valid.
296
+ * Signed results in all sign bits. With rounding, this produces
297
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
298
+ * I.e. always zero. With accumulation, this leaves D unchanged.
299
+ */
300
+ if (shift == (8 << vece)) {
301
+ /* Nop, but we do need to clear the tail. */
302
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
303
+ } else {
304
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
305
+ }
306
+}
307
+
308
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
309
+{
310
+ TCGv_i64 t = tcg_temp_new_i64();
311
+
312
+ tcg_gen_shri_i64(t, a, sh - 1);
313
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
314
+ tcg_gen_vec_shr8i_i64(d, a, sh);
315
+ tcg_gen_vec_add8_i64(d, d, t);
316
+ tcg_temp_free_i64(t);
317
+}
318
+
319
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
320
+{
321
+ TCGv_i64 t = tcg_temp_new_i64();
322
+
323
+ tcg_gen_shri_i64(t, a, sh - 1);
324
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
325
+ tcg_gen_vec_shr16i_i64(d, a, sh);
326
+ tcg_gen_vec_add16_i64(d, d, t);
327
+ tcg_temp_free_i64(t);
328
+}
329
+
330
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
331
+{
332
+ TCGv_i32 t = tcg_temp_new_i32();
333
+
334
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
335
+ tcg_gen_shri_i32(d, a, sh);
336
+ tcg_gen_add_i32(d, d, t);
337
+ tcg_temp_free_i32(t);
338
+}
339
+
340
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
341
+{
342
+ TCGv_i64 t = tcg_temp_new_i64();
343
+
344
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
345
+ tcg_gen_shri_i64(d, a, sh);
346
+ tcg_gen_add_i64(d, d, t);
347
+ tcg_temp_free_i64(t);
348
+}
349
+
350
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
351
+{
352
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
353
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
354
+
355
+ tcg_gen_shri_vec(vece, t, a, shift - 1);
356
+ tcg_gen_dupi_vec(vece, ones, 1);
357
+ tcg_gen_and_vec(vece, t, t, ones);
358
+ tcg_gen_shri_vec(vece, d, a, shift);
359
+ tcg_gen_add_vec(vece, d, d, t);
360
+
361
+ tcg_temp_free_vec(t);
362
+ tcg_temp_free_vec(ones);
363
+}
364
+
365
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
366
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
367
+{
368
+ static const TCGOpcode vecop_list[] = {
369
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
370
+ };
371
+ static const GVecGen2i ops[4] = {
372
+ { .fni8 = gen_urshr8_i64,
373
+ .fniv = gen_urshr_vec,
374
+ .fno = gen_helper_gvec_urshr_b,
375
+ .opt_opc = vecop_list,
376
+ .vece = MO_8 },
377
+ { .fni8 = gen_urshr16_i64,
378
+ .fniv = gen_urshr_vec,
379
+ .fno = gen_helper_gvec_urshr_h,
380
+ .opt_opc = vecop_list,
381
+ .vece = MO_16 },
382
+ { .fni4 = gen_urshr32_i32,
383
+ .fniv = gen_urshr_vec,
384
+ .fno = gen_helper_gvec_urshr_s,
385
+ .opt_opc = vecop_list,
386
+ .vece = MO_32 },
387
+ { .fni8 = gen_urshr64_i64,
388
+ .fniv = gen_urshr_vec,
389
+ .fno = gen_helper_gvec_urshr_d,
390
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
391
+ .opt_opc = vecop_list,
392
+ .vece = MO_64 },
393
+ };
394
+
395
+ /* tszimm encoding produces immediates in the range [1..esize] */
396
+ tcg_debug_assert(shift > 0);
397
+ tcg_debug_assert(shift <= (8 << vece));
398
+
399
+ if (shift == (8 << vece)) {
400
+ /*
401
+ * Shifts larger than the element size are architecturally valid.
402
+ * Unsigned results in zero. With rounding, this produces a
403
+ * copy of the most significant bit.
404
+ */
405
+ tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
406
+ } else {
407
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
408
+ }
409
+}
410
+
411
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
412
+{
413
+ TCGv_i64 t = tcg_temp_new_i64();
414
+
415
+ if (sh == 8) {
416
+ tcg_gen_vec_shr8i_i64(t, a, 7);
417
+ } else {
418
+ gen_urshr8_i64(t, a, sh);
419
+ }
420
+ tcg_gen_vec_add8_i64(d, d, t);
421
+ tcg_temp_free_i64(t);
422
+}
423
+
424
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
425
+{
426
+ TCGv_i64 t = tcg_temp_new_i64();
427
+
428
+ if (sh == 16) {
429
+ tcg_gen_vec_shr16i_i64(t, a, 15);
430
+ } else {
431
+ gen_urshr16_i64(t, a, sh);
432
+ }
433
+ tcg_gen_vec_add16_i64(d, d, t);
434
+ tcg_temp_free_i64(t);
435
+}
436
+
437
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
438
+{
439
+ TCGv_i32 t = tcg_temp_new_i32();
440
+
441
+ if (sh == 32) {
442
+ tcg_gen_shri_i32(t, a, 31);
443
+ } else {
444
+ gen_urshr32_i32(t, a, sh);
445
+ }
446
+ tcg_gen_add_i32(d, d, t);
447
+ tcg_temp_free_i32(t);
448
+}
449
+
450
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
451
+{
452
+ TCGv_i64 t = tcg_temp_new_i64();
453
+
454
+ if (sh == 64) {
455
+ tcg_gen_shri_i64(t, a, 63);
456
+ } else {
457
+ gen_urshr64_i64(t, a, sh);
458
+ }
459
+ tcg_gen_add_i64(d, d, t);
460
+ tcg_temp_free_i64(t);
461
+}
462
+
463
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
464
+{
465
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
466
+
467
+ if (sh == (8 << vece)) {
468
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
469
+ } else {
470
+ gen_urshr_vec(vece, t, a, sh);
471
+ }
472
+ tcg_gen_add_vec(vece, d, d, t);
473
+ tcg_temp_free_vec(t);
474
+}
475
+
476
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
477
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
478
+{
479
+ static const TCGOpcode vecop_list[] = {
480
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
481
+ };
482
+ static const GVecGen2i ops[4] = {
483
+ { .fni8 = gen_ursra8_i64,
484
+ .fniv = gen_ursra_vec,
485
+ .fno = gen_helper_gvec_ursra_b,
486
+ .opt_opc = vecop_list,
487
+ .load_dest = true,
488
+ .vece = MO_8 },
489
+ { .fni8 = gen_ursra16_i64,
490
+ .fniv = gen_ursra_vec,
491
+ .fno = gen_helper_gvec_ursra_h,
492
+ .opt_opc = vecop_list,
493
+ .load_dest = true,
494
+ .vece = MO_16 },
495
+ { .fni4 = gen_ursra32_i32,
496
+ .fniv = gen_ursra_vec,
497
+ .fno = gen_helper_gvec_ursra_s,
498
+ .opt_opc = vecop_list,
499
+ .load_dest = true,
500
+ .vece = MO_32 },
501
+ { .fni8 = gen_ursra64_i64,
502
+ .fniv = gen_ursra_vec,
503
+ .fno = gen_helper_gvec_ursra_d,
504
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
505
+ .opt_opc = vecop_list,
506
+ .load_dest = true,
507
+ .vece = MO_64 },
508
+ };
509
+
510
+ /* tszimm encoding produces immediates in the range [1..esize] */
511
+ tcg_debug_assert(shift > 0);
512
+ tcg_debug_assert(shift <= (8 << vece));
513
+
514
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
515
+}
516
+
517
static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
518
{
519
uint64_t mask = dup_const(MO_8, 0xff >> shift);
270
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
520
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
271
}
521
}
272
return 0;
522
return 0;
273
523
274
+ case 1: /* VSRA */
524
+ case 2: /* VRSHR */
275
+ /* Right shift comes here negative. */
525
+ /* Right shift comes here negative. */
276
+ shift = -shift;
526
+ shift = -shift;
277
+ /* Shifts larger than the element size are architecturally
527
+ if (u) {
278
+ * valid. Unsigned results in all zeros; signed results
528
+ gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
279
+ * in all sign bits.
529
+ vec_size, vec_size);
280
+ */
281
+ if (!u) {
282
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
283
+ MIN(shift, (8 << size) - 1),
284
+ &ssra_op[size]);
285
+ } else if (shift >= 8 << size) {
286
+ /* rd += 0 */
287
+ } else {
530
+ } else {
288
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
531
+ gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
289
+ shift, &usra_op[size]);
532
+ vec_size, vec_size);
290
+ }
533
+ }
291
+ return 0;
534
+ return 0;
292
+
535
+
293
case 5: /* VSHL, VSLI */
536
+ case 3: /* VRSRA */
294
if (!u) { /* VSHL */
537
+ /* Right shift comes here negative. */
295
/* Shifts larger than the element size are
538
+ shift = -shift;
539
+ if (u) {
540
+ gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
541
+ vec_size, vec_size);
542
+ } else {
543
+ gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
544
+ vec_size, vec_size);
545
+ }
546
+ return 0;
547
+
548
case 4: /* VSRI */
549
if (!u) {
550
return 1;
296
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
551
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
297
neon_load_reg64(cpu_V0, rm + pass);
552
neon_load_reg64(cpu_V0, rm + pass);
298
tcg_gen_movi_i64(cpu_V1, imm);
553
tcg_gen_movi_i64(cpu_V1, imm);
299
switch (op) {
554
switch (op) {
300
- case 1: /* VSRA */
555
- case 2: /* VRSHR */
556
- case 3: /* VRSRA */
301
- if (u)
557
- if (u)
302
- gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
558
- gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
303
- else
559
- else
304
- gen_helper_neon_shl_s64(cpu_V0, cpu_V0, cpu_V1);
560
- gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
305
- break;
561
- break;
306
case 2: /* VRSHR */
562
case 6: /* VQSHLU */
307
case 3: /* VRSRA */
563
gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
308
if (u)
564
cpu_V0, cpu_V1);
309
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
565
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
310
default:
566
default:
311
g_assert_not_reached();
567
g_assert_not_reached();
312
}
568
}
313
- if (op == 1 || op == 3) {
569
- if (op == 3) {
314
+ if (op == 3) {
570
- /* Accumulate. */
315
/* Accumulate. */
571
- neon_load_reg64(cpu_V1, rd + pass);
316
neon_load_reg64(cpu_V1, rd + pass);
572
- tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
317
tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
573
- }
574
neon_store_reg64(cpu_V0, rd + pass);
575
} else { /* size < 3 */
576
/* Operands in T0 and T1. */
318
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
577
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
319
tmp2 = tcg_temp_new_i32();
578
tmp2 = tcg_temp_new_i32();
320
tcg_gen_movi_i32(tmp2, imm);
579
tcg_gen_movi_i32(tmp2, imm);
321
switch (op) {
580
switch (op) {
322
- case 1: /* VSRA */
581
- case 2: /* VRSHR */
323
- GEN_NEON_INTEGER_OP(shl);
582
- case 3: /* VRSRA */
583
- GEN_NEON_INTEGER_OP(rshl);
324
- break;
584
- break;
325
case 2: /* VRSHR */
585
case 6: /* VQSHLU */
326
case 3: /* VRSRA */
586
switch (size) {
327
GEN_NEON_INTEGER_OP(rshl);
587
case 0:
328
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
588
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
589
g_assert_not_reached();
329
}
590
}
330
tcg_temp_free_i32(tmp2);
591
tcg_temp_free_i32(tmp2);
331
592
-
332
- if (op == 1 || op == 3) {
593
- if (op == 3) {
333
+ if (op == 3) {
594
- /* Accumulate. */
334
/* Accumulate. */
595
- tmp2 = neon_load_reg(rd, pass);
335
tmp2 = neon_load_reg(rd, pass);
596
- gen_neon_add(size, tmp, tmp2);
336
gen_neon_add(size, tmp, tmp2);
597
- tcg_temp_free_i32(tmp2);
598
- }
599
neon_store_reg(rd, pass, tmp);
600
}
601
} /* for pass */
602
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
603
index XXXXXXX..XXXXXXX 100644
604
--- a/target/arm/vec_helper.c
605
+++ b/target/arm/vec_helper.c
606
@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
607
608
#undef DO_SRA
609
610
+#define DO_RSHR(NAME, TYPE) \
611
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
612
+{ \
613
+ intptr_t i, oprsz = simd_oprsz(desc); \
614
+ int shift = simd_data(desc); \
615
+ TYPE *d = vd, *n = vn; \
616
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
617
+ TYPE tmp = n[i] >> (shift - 1); \
618
+ d[i] = (tmp >> 1) + (tmp & 1); \
619
+ } \
620
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
621
+}
622
+
623
+DO_RSHR(gvec_srshr_b, int8_t)
624
+DO_RSHR(gvec_srshr_h, int16_t)
625
+DO_RSHR(gvec_srshr_s, int32_t)
626
+DO_RSHR(gvec_srshr_d, int64_t)
627
+
628
+DO_RSHR(gvec_urshr_b, uint8_t)
629
+DO_RSHR(gvec_urshr_h, uint16_t)
630
+DO_RSHR(gvec_urshr_s, uint32_t)
631
+DO_RSHR(gvec_urshr_d, uint64_t)
632
+
633
+#undef DO_RSHR
634
+
635
+#define DO_RSRA(NAME, TYPE) \
636
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
637
+{ \
638
+ intptr_t i, oprsz = simd_oprsz(desc); \
639
+ int shift = simd_data(desc); \
640
+ TYPE *d = vd, *n = vn; \
641
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
642
+ TYPE tmp = n[i] >> (shift - 1); \
643
+ d[i] += (tmp >> 1) + (tmp & 1); \
644
+ } \
645
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
646
+}
647
+
648
+DO_RSRA(gvec_srsra_b, int8_t)
649
+DO_RSRA(gvec_srsra_h, int16_t)
650
+DO_RSRA(gvec_srsra_s, int32_t)
651
+DO_RSRA(gvec_srsra_d, int64_t)
652
+
653
+DO_RSRA(gvec_ursra_b, uint8_t)
654
+DO_RSRA(gvec_ursra_h, uint16_t)
655
+DO_RSRA(gvec_ursra_s, uint32_t)
656
+DO_RSRA(gvec_ursra_d, uint64_t)
657
+
658
+#undef DO_RSRA
659
+
660
/*
661
* Convert float16 to float32, raising no exceptions and
662
* preserving exceptional values, including SNaN.
337
--
663
--
338
2.19.1
664
2.20.1
339
665
340
666
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Move shi_op and sli_op expanders from translate-a64.c.
3
The functions eliminate duplication of the special cases for
4
4
this operation. They match up with the GVecGen2iFn typedef.
5
6
Add out-of-line helpers. We got away with only having inline
7
expanders because the neon vector size is only 16 bytes, and
8
we know that the inline expansion will always succeed.
9
When we reuse this for SVE, tcg-gvec-op may decide to use an
10
out-of-line helper due to longer vector lengths.
11
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181011205206.3552-15-richard.henderson@linaro.org
14
Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
16
---
10
target/arm/translate.h | 2 +
17
target/arm/helper.h | 10 ++
11
target/arm/translate-a64.c | 152 +----------------------
18
target/arm/translate.h | 7 +-
12
target/arm/translate.c | 244 ++++++++++++++++++++++++++-----------
19
target/arm/translate-a64.c | 20 +---
13
3 files changed, 179 insertions(+), 219 deletions(-)
20
target/arm/translate.c | 186 +++++++++++++++++++++----------------
14
21
target/arm/vec_helper.c | 38 ++++++++
22
5 files changed, 160 insertions(+), 101 deletions(-)
23
24
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/helper.h
27
+++ b/target/arm/helper.h
28
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
30
DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
31
32
+DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
41
+
42
#ifdef TARGET_AARCH64
43
#include "helper-a64.h"
44
#include "helper-sve.h"
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
45
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
47
--- a/target/arm/translate.h
18
+++ b/target/arm/translate.h
48
+++ b/target/arm/translate.h
19
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 bit_op;
49
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
20
extern const GVecGen3 bif_op;
50
extern const GVecGen3 cmtst_op[4];
21
extern const GVecGen2i ssra_op[4];
51
extern const GVecGen3 sshl_op[4];
22
extern const GVecGen2i usra_op[4];
52
extern const GVecGen3 ushl_op[4];
23
+extern const GVecGen2i sri_op[4];
53
-extern const GVecGen2i sri_op[4];
24
+extern const GVecGen2i sli_op[4];
54
-extern const GVecGen2i sli_op[4];
25
55
extern const GVecGen4 uqadd_op[4];
56
extern const GVecGen4 sqadd_op[4];
57
extern const GVecGen4 uqsub_op[4];
58
@@ -XXX,XX +XXX,XX @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
59
void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
60
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
61
62
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
63
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
64
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
65
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
66
+
26
/*
67
/*
27
* Forward to the isar_feature_* tests given a DisasContext pointer.
68
* Forward to the isar_feature_* tests given a DisasContext pointer.
69
*/
28
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
70
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
29
index XXXXXXX..XXXXXXX 100644
71
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-a64.c
72
--- a/target/arm/translate-a64.c
31
+++ b/target/arm/translate-a64.c
73
+++ b/target/arm/translate-a64.c
32
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
74
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
33
}
75
is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
34
}
76
}
35
77
36
-static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
78
-/* Expand a 2-operand + immediate AdvSIMD vector operation using
79
- * an op descriptor.
80
- */
81
-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
82
- int rn, int64_t imm, const GVecGen2i *gvec_op)
37
-{
83
-{
38
- uint64_t mask = dup_const(MO_8, 0xff >> shift);
84
- tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
39
- TCGv_i64 t = tcg_temp_new_i64();
85
- is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
40
-
41
- tcg_gen_shri_i64(t, a, shift);
42
- tcg_gen_andi_i64(t, t, mask);
43
- tcg_gen_andi_i64(d, d, ~mask);
44
- tcg_gen_or_i64(d, d, t);
45
- tcg_temp_free_i64(t);
46
-}
86
-}
47
-
87
-
48
-static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
88
/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
49
-{
89
static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
50
- uint64_t mask = dup_const(MO_16, 0xffff >> shift);
90
int rn, int rm, const GVecGen3 *gvec_op)
51
- TCGv_i64 t = tcg_temp_new_i64();
52
-
53
- tcg_gen_shri_i64(t, a, shift);
54
- tcg_gen_andi_i64(t, t, mask);
55
- tcg_gen_andi_i64(d, d, ~mask);
56
- tcg_gen_or_i64(d, d, t);
57
- tcg_temp_free_i64(t);
58
-}
59
-
60
-static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
61
-{
62
- tcg_gen_shri_i32(a, a, shift);
63
- tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
64
-}
65
-
66
-static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
67
-{
68
- tcg_gen_shri_i64(a, a, shift);
69
- tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
70
-}
71
-
72
-static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
73
-{
74
- uint64_t mask = (2ull << ((8 << vece) - 1)) - 1;
75
- TCGv_vec t = tcg_temp_new_vec_matching(d);
76
- TCGv_vec m = tcg_temp_new_vec_matching(d);
77
-
78
- tcg_gen_dupi_vec(vece, m, mask ^ (mask >> sh));
79
- tcg_gen_shri_vec(vece, t, a, sh);
80
- tcg_gen_and_vec(vece, d, d, m);
81
- tcg_gen_or_vec(vece, d, d, t);
82
-
83
- tcg_temp_free_vec(t);
84
- tcg_temp_free_vec(m);
85
-}
86
-
87
/* SSHR[RA]/USHR[RA] - Vector shift right (optional rounding/accumulate) */
88
static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
89
int immh, int immb, int opcode, int rn, int rd)
90
{
91
- static const GVecGen2i sri_op[4] = {
92
- { .fni8 = gen_shr8_ins_i64,
93
- .fniv = gen_shr_ins_vec,
94
- .load_dest = true,
95
- .opc = INDEX_op_shri_vec,
96
- .vece = MO_8 },
97
- { .fni8 = gen_shr16_ins_i64,
98
- .fniv = gen_shr_ins_vec,
99
- .load_dest = true,
100
- .opc = INDEX_op_shri_vec,
101
- .vece = MO_16 },
102
- { .fni4 = gen_shr32_ins_i32,
103
- .fniv = gen_shr_ins_vec,
104
- .load_dest = true,
105
- .opc = INDEX_op_shri_vec,
106
- .vece = MO_32 },
107
- { .fni8 = gen_shr64_ins_i64,
108
- .fniv = gen_shr_ins_vec,
109
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
110
- .load_dest = true,
111
- .opc = INDEX_op_shri_vec,
112
- .vece = MO_64 },
113
- };
114
-
115
int size = 32 - clz32(immh) - 1;
116
int immhb = immh << 3 | immb;
117
int shift = 2 * (8 << size) - immhb;
118
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
91
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
92
gen_gvec_fn2i(s, is_q, rd, rn, shift,
93
is_u ? gen_gvec_usra : gen_gvec_ssra, size);
94
return;
95
+
96
case 0x08: /* SRI */
97
- /* Shift count same as element size is valid but does nothing. */
98
- if (shift == 8 << size) {
99
- goto done;
100
- }
101
- gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
102
+ gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
103
return;
104
105
case 0x00: /* SSHR / USHR */
106
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
107
}
108
tcg_temp_free_i64(tcg_round);
109
110
- done:
119
clear_vec_high(s, is_q, rd);
111
clear_vec_high(s, is_q, rd);
120
}
112
}
121
113
122
-static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
123
-{
124
- uint64_t mask = dup_const(MO_8, 0xff << shift);
125
- TCGv_i64 t = tcg_temp_new_i64();
126
-
127
- tcg_gen_shli_i64(t, a, shift);
128
- tcg_gen_andi_i64(t, t, mask);
129
- tcg_gen_andi_i64(d, d, ~mask);
130
- tcg_gen_or_i64(d, d, t);
131
- tcg_temp_free_i64(t);
132
-}
133
-
134
-static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
135
-{
136
- uint64_t mask = dup_const(MO_16, 0xffff << shift);
137
- TCGv_i64 t = tcg_temp_new_i64();
138
-
139
- tcg_gen_shli_i64(t, a, shift);
140
- tcg_gen_andi_i64(t, t, mask);
141
- tcg_gen_andi_i64(d, d, ~mask);
142
- tcg_gen_or_i64(d, d, t);
143
- tcg_temp_free_i64(t);
144
-}
145
-
146
-static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
147
-{
148
- tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
149
-}
150
-
151
-static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
152
-{
153
- tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
154
-}
155
-
156
-static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
157
-{
158
- uint64_t mask = (1ull << sh) - 1;
159
- TCGv_vec t = tcg_temp_new_vec_matching(d);
160
- TCGv_vec m = tcg_temp_new_vec_matching(d);
161
-
162
- tcg_gen_dupi_vec(vece, m, mask);
163
- tcg_gen_shli_vec(vece, t, a, sh);
164
- tcg_gen_and_vec(vece, d, d, m);
165
- tcg_gen_or_vec(vece, d, d, t);
166
-
167
- tcg_temp_free_vec(t);
168
- tcg_temp_free_vec(m);
169
-}
170
-
171
/* SHL/SLI - Vector shift left */
172
static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
173
int immh, int immb, int opcode, int rn, int rd)
174
{
175
- static const GVecGen2i shi_op[4] = {
176
- { .fni8 = gen_shl8_ins_i64,
177
- .fniv = gen_shl_ins_vec,
178
- .opc = INDEX_op_shli_vec,
179
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
180
- .load_dest = true,
181
- .vece = MO_8 },
182
- { .fni8 = gen_shl16_ins_i64,
183
- .fniv = gen_shl_ins_vec,
184
- .opc = INDEX_op_shli_vec,
185
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
186
- .load_dest = true,
187
- .vece = MO_16 },
188
- { .fni4 = gen_shl32_ins_i32,
189
- .fniv = gen_shl_ins_vec,
190
- .opc = INDEX_op_shli_vec,
191
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
192
- .load_dest = true,
193
- .vece = MO_32 },
194
- { .fni8 = gen_shl64_ins_i64,
195
- .fniv = gen_shl_ins_vec,
196
- .opc = INDEX_op_shli_vec,
197
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
198
- .load_dest = true,
199
- .vece = MO_64 },
200
- };
201
int size = 32 - clz32(immh) - 1;
202
int immhb = immh << 3 | immb;
203
int shift = immhb - (8 << size);
204
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
114
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
205
}
115
}
206
116
207
if (insert) {
117
if (insert) {
208
- gen_gvec_op2i(s, is_q, rd, rn, shift, &shi_op[size]);
118
- gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
209
+ gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
119
+ gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
210
} else {
120
} else {
211
gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
121
gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
212
}
122
}
213
diff --git a/target/arm/translate.c b/target/arm/translate.c
123
diff --git a/target/arm/translate.c b/target/arm/translate.c
214
index XXXXXXX..XXXXXXX 100644
124
index XXXXXXX..XXXXXXX 100644
215
--- a/target/arm/translate.c
125
--- a/target/arm/translate.c
216
+++ b/target/arm/translate.c
126
+++ b/target/arm/translate.c
217
@@ -XXX,XX +XXX,XX @@ const GVecGen2i usra_op[4] = {
127
@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
218
.vece = MO_64, },
128
219
};
129
static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
220
130
{
221
+static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
131
- if (sh == 0) {
132
- tcg_gen_mov_vec(d, a);
133
- } else {
134
- TCGv_vec t = tcg_temp_new_vec_matching(d);
135
- TCGv_vec m = tcg_temp_new_vec_matching(d);
136
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
137
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
138
139
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
140
- tcg_gen_shri_vec(vece, t, a, sh);
141
- tcg_gen_and_vec(vece, d, d, m);
142
- tcg_gen_or_vec(vece, d, d, t);
143
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
144
+ tcg_gen_shri_vec(vece, t, a, sh);
145
+ tcg_gen_and_vec(vece, d, d, m);
146
+ tcg_gen_or_vec(vece, d, d, t);
147
148
- tcg_temp_free_vec(t);
149
- tcg_temp_free_vec(m);
150
- }
151
+ tcg_temp_free_vec(t);
152
+ tcg_temp_free_vec(m);
153
}
154
155
-static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
156
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
157
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
222
+{
158
+{
223
+ uint64_t mask = dup_const(MO_8, 0xff >> shift);
159
+ static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
224
+ TCGv_i64 t = tcg_temp_new_i64();
160
+ const GVecGen2i ops[4] = {
225
+
161
+ { .fni8 = gen_shr8_ins_i64,
226
+ tcg_gen_shri_i64(t, a, shift);
162
+ .fniv = gen_shr_ins_vec,
227
+ tcg_gen_andi_i64(t, t, mask);
163
+ .fno = gen_helper_gvec_sri_b,
228
+ tcg_gen_andi_i64(d, d, ~mask);
164
+ .load_dest = true,
229
+ tcg_gen_or_i64(d, d, t);
165
+ .opt_opc = vecop_list,
230
+ tcg_temp_free_i64(t);
166
+ .vece = MO_8 },
231
+}
167
+ { .fni8 = gen_shr16_ins_i64,
232
+
168
+ .fniv = gen_shr_ins_vec,
233
+static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
169
+ .fno = gen_helper_gvec_sri_h,
234
+{
170
+ .load_dest = true,
235
+ uint64_t mask = dup_const(MO_16, 0xffff >> shift);
171
+ .opt_opc = vecop_list,
236
+ TCGv_i64 t = tcg_temp_new_i64();
172
+ .vece = MO_16 },
237
+
173
+ { .fni4 = gen_shr32_ins_i32,
238
+ tcg_gen_shri_i64(t, a, shift);
174
+ .fniv = gen_shr_ins_vec,
239
+ tcg_gen_andi_i64(t, t, mask);
175
+ .fno = gen_helper_gvec_sri_s,
240
+ tcg_gen_andi_i64(d, d, ~mask);
176
+ .load_dest = true,
241
+ tcg_gen_or_i64(d, d, t);
177
+ .opt_opc = vecop_list,
242
+ tcg_temp_free_i64(t);
178
+ .vece = MO_32 },
243
+}
179
+ { .fni8 = gen_shr64_ins_i64,
244
+
180
+ .fniv = gen_shr_ins_vec,
245
+static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
181
+ .fno = gen_helper_gvec_sri_d,
246
+{
182
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
247
+ tcg_gen_shri_i32(a, a, shift);
183
+ .load_dest = true,
248
+ tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
184
+ .opt_opc = vecop_list,
249
+}
185
+ .vece = MO_64 },
250
+
186
+ };
251
+static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
187
252
+{
188
-const GVecGen2i sri_op[4] = {
253
+ tcg_gen_shri_i64(a, a, shift);
189
- { .fni8 = gen_shr8_ins_i64,
254
+ tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
190
- .fniv = gen_shr_ins_vec,
255
+}
191
- .load_dest = true,
256
+
192
- .opt_opc = vecop_list_sri,
257
+static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
193
- .vece = MO_8 },
258
+{
194
- { .fni8 = gen_shr16_ins_i64,
259
+ if (sh == 0) {
195
- .fniv = gen_shr_ins_vec,
260
+ tcg_gen_mov_vec(d, a);
196
- .load_dest = true,
197
- .opt_opc = vecop_list_sri,
198
- .vece = MO_16 },
199
- { .fni4 = gen_shr32_ins_i32,
200
- .fniv = gen_shr_ins_vec,
201
- .load_dest = true,
202
- .opt_opc = vecop_list_sri,
203
- .vece = MO_32 },
204
- { .fni8 = gen_shr64_ins_i64,
205
- .fniv = gen_shr_ins_vec,
206
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
207
- .load_dest = true,
208
- .opt_opc = vecop_list_sri,
209
- .vece = MO_64 },
210
-};
211
+ /* tszimm encoding produces immediates in the range [1..esize]. */
212
+ tcg_debug_assert(shift > 0);
213
+ tcg_debug_assert(shift <= (8 << vece));
214
+
215
+ /* Shift of esize leaves destination unchanged. */
216
+ if (shift < (8 << vece)) {
217
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
261
+ } else {
218
+ } else {
262
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
219
+ /* Nop, but we do need to clear the tail. */
263
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
220
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
264
+
265
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
266
+ tcg_gen_shri_vec(vece, t, a, sh);
267
+ tcg_gen_and_vec(vece, d, d, m);
268
+ tcg_gen_or_vec(vece, d, d, t);
269
+
270
+ tcg_temp_free_vec(t);
271
+ tcg_temp_free_vec(m);
272
+ }
221
+ }
273
+}
222
+}
274
+
223
275
+const GVecGen2i sri_op[4] = {
224
static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
276
+ { .fni8 = gen_shr8_ins_i64,
225
{
277
+ .fniv = gen_shr_ins_vec,
226
@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
278
+ .load_dest = true,
227
279
+ .opc = INDEX_op_shri_vec,
228
static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
280
+ .vece = MO_8 },
229
{
281
+ { .fni8 = gen_shr16_ins_i64,
230
- if (sh == 0) {
282
+ .fniv = gen_shr_ins_vec,
231
- tcg_gen_mov_vec(d, a);
283
+ .load_dest = true,
232
- } else {
284
+ .opc = INDEX_op_shri_vec,
233
- TCGv_vec t = tcg_temp_new_vec_matching(d);
285
+ .vece = MO_16 },
234
- TCGv_vec m = tcg_temp_new_vec_matching(d);
286
+ { .fni4 = gen_shr32_ins_i32,
235
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
287
+ .fniv = gen_shr_ins_vec,
236
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
288
+ .load_dest = true,
237
289
+ .opc = INDEX_op_shri_vec,
238
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
290
+ .vece = MO_32 },
239
- tcg_gen_shli_vec(vece, t, a, sh);
291
+ { .fni8 = gen_shr64_ins_i64,
240
- tcg_gen_and_vec(vece, d, d, m);
292
+ .fniv = gen_shr_ins_vec,
241
- tcg_gen_or_vec(vece, d, d, t);
293
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
242
+ tcg_gen_shli_vec(vece, t, a, sh);
294
+ .load_dest = true,
243
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
295
+ .opc = INDEX_op_shri_vec,
244
+ tcg_gen_and_vec(vece, d, d, m);
296
+ .vece = MO_64 },
245
+ tcg_gen_or_vec(vece, d, d, t);
297
+};
246
298
+
247
- tcg_temp_free_vec(t);
299
+static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
248
- tcg_temp_free_vec(m);
249
- }
250
+ tcg_temp_free_vec(t);
251
+ tcg_temp_free_vec(m);
252
}
253
254
-static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
255
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
256
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
300
+{
257
+{
301
+ uint64_t mask = dup_const(MO_8, 0xff << shift);
258
+ static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
302
+ TCGv_i64 t = tcg_temp_new_i64();
259
+ const GVecGen2i ops[4] = {
303
+
260
+ { .fni8 = gen_shl8_ins_i64,
304
+ tcg_gen_shli_i64(t, a, shift);
261
+ .fniv = gen_shl_ins_vec,
305
+ tcg_gen_andi_i64(t, t, mask);
262
+ .fno = gen_helper_gvec_sli_b,
306
+ tcg_gen_andi_i64(d, d, ~mask);
263
+ .load_dest = true,
307
+ tcg_gen_or_i64(d, d, t);
264
+ .opt_opc = vecop_list,
308
+ tcg_temp_free_i64(t);
265
+ .vece = MO_8 },
309
+}
266
+ { .fni8 = gen_shl16_ins_i64,
310
+
267
+ .fniv = gen_shl_ins_vec,
311
+static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
268
+ .fno = gen_helper_gvec_sli_h,
312
+{
269
+ .load_dest = true,
313
+ uint64_t mask = dup_const(MO_16, 0xffff << shift);
270
+ .opt_opc = vecop_list,
314
+ TCGv_i64 t = tcg_temp_new_i64();
271
+ .vece = MO_16 },
315
+
272
+ { .fni4 = gen_shl32_ins_i32,
316
+ tcg_gen_shli_i64(t, a, shift);
273
+ .fniv = gen_shl_ins_vec,
317
+ tcg_gen_andi_i64(t, t, mask);
274
+ .fno = gen_helper_gvec_sli_s,
318
+ tcg_gen_andi_i64(d, d, ~mask);
275
+ .load_dest = true,
319
+ tcg_gen_or_i64(d, d, t);
276
+ .opt_opc = vecop_list,
320
+ tcg_temp_free_i64(t);
277
+ .vece = MO_32 },
321
+}
278
+ { .fni8 = gen_shl64_ins_i64,
322
+
279
+ .fniv = gen_shl_ins_vec,
323
+static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
280
+ .fno = gen_helper_gvec_sli_d,
324
+{
281
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
325
+ tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
282
+ .load_dest = true,
326
+}
283
+ .opt_opc = vecop_list,
327
+
284
+ .vece = MO_64 },
328
+static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
285
+ };
329
+{
286
330
+ tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
287
-const GVecGen2i sli_op[4] = {
331
+}
288
- { .fni8 = gen_shl8_ins_i64,
332
+
289
- .fniv = gen_shl_ins_vec,
333
+static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
290
- .load_dest = true,
334
+{
291
- .opt_opc = vecop_list_sli,
335
+ if (sh == 0) {
292
- .vece = MO_8 },
336
+ tcg_gen_mov_vec(d, a);
293
- { .fni8 = gen_shl16_ins_i64,
294
- .fniv = gen_shl_ins_vec,
295
- .load_dest = true,
296
- .opt_opc = vecop_list_sli,
297
- .vece = MO_16 },
298
- { .fni4 = gen_shl32_ins_i32,
299
- .fniv = gen_shl_ins_vec,
300
- .load_dest = true,
301
- .opt_opc = vecop_list_sli,
302
- .vece = MO_32 },
303
- { .fni8 = gen_shl64_ins_i64,
304
- .fniv = gen_shl_ins_vec,
305
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
306
- .load_dest = true,
307
- .opt_opc = vecop_list_sli,
308
- .vece = MO_64 },
309
-};
310
+ /* tszimm encoding produces immediates in the range [0..esize-1]. */
311
+ tcg_debug_assert(shift >= 0);
312
+ tcg_debug_assert(shift < (8 << vece));
313
+
314
+ if (shift == 0) {
315
+ tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
337
+ } else {
316
+ } else {
338
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
317
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
339
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
340
+
341
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
342
+ tcg_gen_shli_vec(vece, t, a, sh);
343
+ tcg_gen_and_vec(vece, d, d, m);
344
+ tcg_gen_or_vec(vece, d, d, t);
345
+
346
+ tcg_temp_free_vec(t);
347
+ tcg_temp_free_vec(m);
348
+ }
318
+ }
349
+}
319
+}
350
+
320
351
+const GVecGen2i sli_op[4] = {
321
static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
352
+ { .fni8 = gen_shl8_ins_i64,
322
{
353
+ .fniv = gen_shl_ins_vec,
354
+ .load_dest = true,
355
+ .opc = INDEX_op_shli_vec,
356
+ .vece = MO_8 },
357
+ { .fni8 = gen_shl16_ins_i64,
358
+ .fniv = gen_shl_ins_vec,
359
+ .load_dest = true,
360
+ .opc = INDEX_op_shli_vec,
361
+ .vece = MO_16 },
362
+ { .fni4 = gen_shl32_ins_i32,
363
+ .fniv = gen_shl_ins_vec,
364
+ .load_dest = true,
365
+ .opc = INDEX_op_shli_vec,
366
+ .vece = MO_32 },
367
+ { .fni8 = gen_shl64_ins_i64,
368
+ .fniv = gen_shl_ins_vec,
369
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
370
+ .load_dest = true,
371
+ .opc = INDEX_op_shli_vec,
372
+ .vece = MO_64 },
373
+};
374
+
375
/* Translate a NEON data processing instruction. Return nonzero if the
376
instruction is invalid.
377
We process data in a mixture of 32-bit and 64-bit chunks.
378
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
379
int pairwise;
380
int u;
381
int vec_size;
382
- uint32_t imm, mask;
383
+ uint32_t imm;
384
TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
385
TCGv_ptr ptr1, ptr2, ptr3;
386
TCGv_i64 tmp64;
387
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
323
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
388
}
324
}
325
/* Right shift comes here negative. */
326
shift = -shift;
327
- /* Shift out of range leaves destination unchanged. */
328
- if (shift < 8 << size) {
329
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
330
- shift, &sri_op[size]);
331
- }
332
+ gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
333
+ vec_size, vec_size);
389
return 0;
334
return 0;
390
335
391
+ case 4: /* VSRI */
392
+ if (!u) {
393
+ return 1;
394
+ }
395
+ /* Right shift comes here negative. */
396
+ shift = -shift;
397
+ /* Shift out of range leaves destination unchanged. */
398
+ if (shift < 8 << size) {
399
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
400
+ shift, &sri_op[size]);
401
+ }
402
+ return 0;
403
+
404
case 5: /* VSHL, VSLI */
336
case 5: /* VSHL, VSLI */
405
- if (!u) { /* VSHL */
337
if (u) { /* VSLI */
406
+ if (u) { /* VSLI */
338
- /* Shift out of range leaves destination unchanged. */
407
+ /* Shift out of range leaves destination unchanged. */
339
- if (shift < 8 << size) {
408
+ if (shift < 8 << size) {
340
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
409
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
341
- vec_size, shift, &sli_op[size]);
410
+ vec_size, shift, &sli_op[size]);
342
- }
411
+ }
343
+ gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
412
+ } else { /* VSHL */
344
+ vec_size, vec_size);
345
} else { /* VSHL */
413
/* Shifts larger than the element size are
346
/* Shifts larger than the element size are
414
* architecturally valid and results in zero.
347
* architecturally valid and results in zero.
415
*/
348
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
416
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
349
index XXXXXXX..XXXXXXX 100644
417
tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
350
--- a/target/arm/vec_helper.c
418
vec_size, vec_size);
351
+++ b/target/arm/vec_helper.c
419
}
352
@@ -XXX,XX +XXX,XX @@ DO_RSRA(gvec_ursra_d, uint64_t)
420
- return 0;
353
421
}
354
#undef DO_RSRA
422
- break;
355
423
+ return 0;
356
+#define DO_SRI(NAME, TYPE) \
424
}
357
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
425
358
+{ \
426
if (size == 3) {
359
+ intptr_t i, oprsz = simd_oprsz(desc); \
427
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
360
+ int shift = simd_data(desc); \
428
else
361
+ TYPE *d = vd, *n = vn; \
429
gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
362
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
430
break;
363
+ d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \
431
- case 4: /* VSRI */
364
+ } \
432
- case 5: /* VSHL, VSLI */
365
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
433
- gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
366
+}
434
- break;
367
+
435
case 6: /* VQSHLU */
368
+DO_SRI(gvec_sri_b, uint8_t)
436
gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
369
+DO_SRI(gvec_sri_h, uint16_t)
437
cpu_V0, cpu_V1);
370
+DO_SRI(gvec_sri_s, uint32_t)
438
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
371
+DO_SRI(gvec_sri_d, uint64_t)
439
/* Accumulate. */
372
+
440
neon_load_reg64(cpu_V1, rd + pass);
373
+#undef DO_SRI
441
tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
374
+
442
- } else if (op == 4 || (op == 5 && u)) {
375
+#define DO_SLI(NAME, TYPE) \
443
- /* Insert */
376
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
444
- neon_load_reg64(cpu_V1, rd + pass);
377
+{ \
445
- uint64_t mask;
378
+ intptr_t i, oprsz = simd_oprsz(desc); \
446
- if (shift < -63 || shift > 63) {
379
+ int shift = simd_data(desc); \
447
- mask = 0;
380
+ TYPE *d = vd, *n = vn; \
448
- } else {
381
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
449
- if (op == 4) {
382
+ d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \
450
- mask = 0xffffffffffffffffull >> -shift;
383
+ } \
451
- } else {
384
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
452
- mask = 0xffffffffffffffffull << shift;
385
+}
453
- }
386
+
454
- }
387
+DO_SLI(gvec_sli_b, uint8_t)
455
- tcg_gen_andi_i64(cpu_V1, cpu_V1, ~mask);
388
+DO_SLI(gvec_sli_h, uint16_t)
456
- tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
389
+DO_SLI(gvec_sli_s, uint32_t)
457
}
390
+DO_SLI(gvec_sli_d, uint64_t)
458
neon_store_reg64(cpu_V0, rd + pass);
391
+
459
} else { /* size < 3 */
392
+#undef DO_SLI
460
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
393
+
461
case 3: /* VRSRA */
394
/*
462
GEN_NEON_INTEGER_OP(rshl);
395
* Convert float16 to float32, raising no exceptions and
463
break;
396
* preserving exceptional values, including SNaN.
464
- case 4: /* VSRI */
465
- case 5: /* VSHL, VSLI */
466
- switch (size) {
467
- case 0: gen_helper_neon_shl_u8(tmp, tmp, tmp2); break;
468
- case 1: gen_helper_neon_shl_u16(tmp, tmp, tmp2); break;
469
- case 2: gen_helper_neon_shl_u32(tmp, tmp, tmp2); break;
470
- default: abort();
471
- }
472
- break;
473
case 6: /* VQSHLU */
474
switch (size) {
475
case 0:
476
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
477
tmp2 = neon_load_reg(rd, pass);
478
gen_neon_add(size, tmp, tmp2);
479
tcg_temp_free_i32(tmp2);
480
- } else if (op == 4 || (op == 5 && u)) {
481
- /* Insert */
482
- switch (size) {
483
- case 0:
484
- if (op == 4)
485
- mask = 0xff >> -shift;
486
- else
487
- mask = (uint8_t)(0xff << shift);
488
- mask |= mask << 8;
489
- mask |= mask << 16;
490
- break;
491
- case 1:
492
- if (op == 4)
493
- mask = 0xffff >> -shift;
494
- else
495
- mask = (uint16_t)(0xffff << shift);
496
- mask |= mask << 16;
497
- break;
498
- case 2:
499
- if (shift < -31 || shift > 31) {
500
- mask = 0;
501
- } else {
502
- if (op == 4)
503
- mask = 0xffffffffu >> -shift;
504
- else
505
- mask = 0xffffffffu << shift;
506
- }
507
- break;
508
- default:
509
- abort();
510
- }
511
- tmp2 = neon_load_reg(rd, pass);
512
- tcg_gen_andi_i32(tmp, tmp, mask);
513
- tcg_gen_andi_i32(tmp2, tmp2, ~mask);
514
- tcg_gen_or_i32(tmp, tmp, tmp2);
515
- tcg_temp_free_i32(tmp2);
516
}
517
neon_store_reg(rd, pass, tmp);
518
}
519
--
397
--
520
2.19.1
398
2.20.1
521
399
522
400
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
In 1dc8425e551, while converting to gvec, I added an extra range check
4
against the shift count. This was unnecessary because the encoding of
5
the shift count produces 0 to the element size - 1.
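The range claim can be checked exhaustively with a small standalone sketch. It borrows the immh:immb arithmetic visible in handle_vec_simd_shli earlier in this series (size from the highest set bit of immh, shift = immhb - esize) purely as an illustration; the assumption that the AArch32 decode touched by this patch yields the same range comes from the commit message, not from this code. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: index of the highest set bit of a nonzero 4-bit immh. */
static int esize_log2(unsigned immh)
{
    int size = -1;
    while (immh) {
        size++;
        immh >>= 1;
    }
    return size;
}

/* Left-shift count as computed from the immh:immb fields. */
static int left_shift_count(unsigned immh, unsigned immb)
{
    int size = esize_log2(immh);
    int immhb = (int)(immh << 3 | immb);
    return immhb - (8 << size);
}

/* Walk every valid encoding and confirm 0 <= shift < esize. */
static bool all_left_shifts_in_range(void)
{
    for (unsigned immh = 1; immh < 16; immh++) {
        for (unsigned immb = 0; immb < 8; immb++) {
            int esize = 8 << esize_log2(immh);
            int shift = left_shift_count(immh, immb);
            if (shift < 0 || shift >= esize) {
                return false;
            }
        }
    }
    return true;
}
```

Because no encoding produces a count of esize or more, a separate "shift >= esize gives zero" branch can never be reached for the left-shift case.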
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-11-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/translate.c | 16 ++++++++--------
12
target/arm/translate.c | 12 ++----------
9
1 file changed, 8 insertions(+), 8 deletions(-)
13
1 file changed, 2 insertions(+), 10 deletions(-)
10
14
11
diff --git a/target/arm/translate.c b/target/arm/translate.c
15
diff --git a/target/arm/translate.c b/target/arm/translate.c
12
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate.c
17
--- a/target/arm/translate.c
14
+++ b/target/arm/translate.c
18
+++ b/target/arm/translate.c
15
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
19
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
16
tcg_temp_free_ptr(ptr1);
20
gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
17
tcg_temp_free_ptr(ptr2);
21
vec_size, vec_size);
18
break;
22
} else { /* VSHL */
19
+
23
- /* Shifts larger than the element size are
20
+ case NEON_2RM_VMVN:
24
- * architecturally valid and results in zero.
21
+ tcg_gen_gvec_not(0, rd_ofs, rm_ofs, vec_size, vec_size);
25
- */
22
+ break;
26
- if (shift >= 8 << size) {
23
+ case NEON_2RM_VNEG:
27
- tcg_gen_gvec_dup_imm(size, rd_ofs,
24
+ tcg_gen_gvec_neg(size, rd_ofs, rm_ofs, vec_size, vec_size);
28
- vec_size, vec_size, 0);
25
+ break;
29
- } else {
26
+
30
- tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
27
default:
31
- vec_size, vec_size);
28
elementwise:
32
- }
29
for (pass = 0; pass < (q ? 4 : 2); pass++) {
33
+ tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
30
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
34
+ vec_size, vec_size);
31
case NEON_2RM_VCNT:
35
}
32
gen_helper_neon_cnt_u8(tmp, tmp);
36
return 0;
33
break;
37
}
34
- case NEON_2RM_VMVN:
35
- tcg_gen_not_i32(tmp, tmp);
36
- break;
37
case NEON_2RM_VQABS:
38
switch (size) {
39
case 0:
40
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
41
default: abort();
42
}
43
break;
44
- case NEON_2RM_VNEG:
45
- tmp2 = tcg_const_i32(0);
46
- gen_neon_rsb(size, tmp, tmp2);
47
- tcg_temp_free_i32(tmp2);
48
- break;
49
case NEON_2RM_VCGT0_F:
50
{
51
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
52
--
38
--
53
2.19.1
39
2.20.1
54
40
55
41
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
For a sequence of loads or stores from a single register,
3
Now that we've converted all cases to gvec, there is quite a bit
4
little-endian operations can be promoted to an 8-byte op.
4
of dead code at the end of the function. Remove it.
5
This can reduce the number of operations by a factor of 8.
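The little-endian promotion described above can be sketched outside of
TCG. The model below is only an illustration, not the translate-a64.c
code: load_le_elements is a hypothetical function that fills a 64-bit
register image from memory using little-endian elements of (1 << size)
bytes and counts the memory operations it needed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model, not QEMU code: fill a 64-bit register image from
 * memory using little-endian elements of (1 << size) bytes each, and
 * count how many separate memory operations that takes.
 */
static uint64_t load_le_elements(const uint8_t *mem, unsigned size,
                                 unsigned *nops)
{
    unsigned ebytes, e, b;
    uint64_t reg = 0;

    assert(size <= 3);
    ebytes = 1u << size;
    *nops = 0;

    for (e = 0; e < 8 / ebytes; e++) {
        uint64_t elt = 0;

        /* One little-endian load of ebytes bytes. */
        for (b = 0; b < ebytes; b++) {
            elt |= (uint64_t)mem[e * ebytes + b] << (8 * b);
        }
        /* Element e occupies bits [8*ebytes*e, 8*ebytes*(e+1)). */
        reg |= elt << (8 * ebytes * e);
        (*nops)++;
    }
    return reg;
}
```

For any size from 0 to 3 the register image is the same, so a run of
byte elements (eight operations) can be replaced by a single 8-byte
little-endian operation. This only holds when consecutive elements go to
the same register, which matches the selem == 1 condition in the patch.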
6
5
6
Sink the call to gen_gvec_fn2i to the end, loading a function
7
pointer within the switch statement.
8
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20181011205206.3552-5-richard.henderson@linaro.org
11
Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
target/arm/translate-a64.c | 66 +++++++++++++++++++++++---------------
14
target/arm/translate-a64.c | 56 ++++++++++----------------------------
13
1 file changed, 40 insertions(+), 26 deletions(-)
15
1 file changed, 14 insertions(+), 42 deletions(-)
14
16
15
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
17
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-a64.c
19
--- a/target/arm/translate-a64.c
18
+++ b/target/arm/translate-a64.c
20
+++ b/target/arm/translate-a64.c
19
@@ -XXX,XX +XXX,XX @@ static void write_vec_element_i32(DisasContext *s, TCGv_i32 tcg_src,
21
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
20
22
int size = 32 - clz32(immh) - 1;
21
/* Store from vector register to memory */
23
int immhb = immh << 3 | immb;
22
static void do_vec_st(DisasContext *s, int srcidx, int element,
24
int shift = 2 * (8 << size) - immhb;
23
- TCGv_i64 tcg_addr, int size)
25
- bool accumulate = false;
24
+ TCGv_i64 tcg_addr, int size, TCGMemOp endian)
26
- int dsize = is_q ? 128 : 64;
25
{
27
- int esize = 8 << size;
26
- TCGMemOp memop = s->be_data + size;
28
- int elements = dsize/esize;
27
TCGv_i64 tcg_tmp = tcg_temp_new_i64();
29
- MemOp memop = size | (is_u ? 0 : MO_SIGN);
28
30
- TCGv_i64 tcg_rn = new_tmp_a64(s);
29
read_vec_element(s, tcg_tmp, srcidx, element, size);
31
- TCGv_i64 tcg_rd = new_tmp_a64(s);
30
- tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
32
- TCGv_i64 tcg_round;
31
+ tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), endian | size);
33
- uint64_t round_const;
32
34
- int i;
33
tcg_temp_free_i64(tcg_tmp);
35
+ GVecGen2iFn *gvec_fn;
36
37
if (extract32(immh, 3, 1) && !is_q) {
38
unallocated_encoding(s);
39
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
40
41
switch (opcode) {
42
case 0x02: /* SSRA / USRA (accumulate) */
43
- gen_gvec_fn2i(s, is_q, rd, rn, shift,
44
- is_u ? gen_gvec_usra : gen_gvec_ssra, size);
45
- return;
46
+ gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
47
+ break;
48
49
case 0x08: /* SRI */
50
- gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
51
- return;
52
+ gvec_fn = gen_gvec_sri;
53
+ break;
54
55
case 0x00: /* SSHR / USHR */
56
if (is_u) {
57
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
58
/* Shift count the same size as element size produces zero. */
59
tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
60
is_q ? 16 : 8, vec_full_reg_size(s), 0);
61
- } else {
62
- gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
63
+ return;
64
}
65
+ gvec_fn = tcg_gen_gvec_shri;
66
} else {
67
/* Shift count the same size as element size produces all sign. */
68
if (shift == 8 << size) {
69
shift -= 1;
70
}
71
- gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
72
+ gvec_fn = tcg_gen_gvec_sari;
73
}
74
- return;
75
+ break;
76
77
case 0x04: /* SRSHR / URSHR (rounding) */
78
- gen_gvec_fn2i(s, is_q, rd, rn, shift,
79
- is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
80
- return;
81
+ gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
82
+ break;
83
84
case 0x06: /* SRSRA / URSRA (accum + rounding) */
85
- gen_gvec_fn2i(s, is_q, rd, rn, shift,
86
- is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
87
- return;
88
+ gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
89
+ break;
90
91
default:
92
g_assert_not_reached();
93
}
94
95
- round_const = 1ULL << (shift - 1);
96
- tcg_round = tcg_const_i64(round_const);
97
-
98
- for (i = 0; i < elements; i++) {
99
- read_vec_element(s, tcg_rn, rn, i, memop);
100
- if (accumulate) {
101
- read_vec_element(s, tcg_rd, rd, i, memop);
102
- }
103
-
104
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
105
- accumulate, is_u, size, shift);
106
-
107
- write_vec_element(s, tcg_rd, rd, i, size);
108
- }
109
- tcg_temp_free_i64(tcg_round);
110
-
111
- clear_vec_high(s, is_q, rd);
112
+ gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
34
}
113
}
35
114
36
/* Load from memory to vector register */
115
/* SHL/SLI - Vector shift left */
37
static void do_vec_ld(DisasContext *s, int destidx, int element,
38
- TCGv_i64 tcg_addr, int size)
39
+ TCGv_i64 tcg_addr, int size, TCGMemOp endian)
40
{
41
- TCGMemOp memop = s->be_data + size;
42
TCGv_i64 tcg_tmp = tcg_temp_new_i64();
43
44
- tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
45
+ tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), endian | size);
46
write_vec_element(s, tcg_tmp, destidx, element, size);
47
48
tcg_temp_free_i64(tcg_tmp);
49
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
50
bool is_postidx = extract32(insn, 23, 1);
51
bool is_q = extract32(insn, 30, 1);
52
TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
53
+ TCGMemOp endian = s->be_data;
54
55
- int ebytes = 1 << size;
56
- int elements = (is_q ? 128 : 64) / (8 << size);
57
+ int ebytes; /* bytes per element */
58
+ int elements; /* elements per vector */
59
int rpt; /* num iterations */
60
int selem; /* structure elements */
61
int r;
62
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
63
gen_check_sp_alignment(s);
64
}
65
66
+ /* For our purposes, bytes are always little-endian. */
67
+ if (size == 0) {
68
+ endian = MO_LE;
69
+ }
70
+
71
+ /* Consecutive little-endian elements from a single register
72
+ * can be promoted to a larger little-endian operation.
73
+ */
74
+ if (selem == 1 && endian == MO_LE) {
75
+ size = 3;
76
+ }
77
+ ebytes = 1 << size;
78
+ elements = (is_q ? 16 : 8) / ebytes;
79
+
80
tcg_rn = cpu_reg_sp(s, rn);
81
tcg_addr = tcg_temp_new_i64();
82
tcg_gen_mov_i64(tcg_addr, tcg_rn);
83
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
84
for (r = 0; r < rpt; r++) {
85
int e;
86
for (e = 0; e < elements; e++) {
87
- int tt = (rt + r) % 32;
88
int xs;
89
for (xs = 0; xs < selem; xs++) {
90
+ int tt = (rt + r + xs) % 32;
91
if (is_store) {
92
- do_vec_st(s, tt, e, tcg_addr, size);
93
+ do_vec_st(s, tt, e, tcg_addr, size, endian);
94
} else {
95
- do_vec_ld(s, tt, e, tcg_addr, size);
96
-
97
- /* For non-quad operations, setting a slice of the low
98
- * 64 bits of the register clears the high 64 bits (in
99
- * the ARM ARM pseudocode this is implicit in the fact
100
- * that 'rval' is a 64 bit wide variable).
101
- * For quad operations, we might still need to zero the
102
- * high bits of SVE. We optimize by noticing that we only
103
- * need to do this the first time we touch a register.
104
- */
105
- if (e == 0 && (r == 0 || xs == selem - 1)) {
106
- clear_vec_high(s, is_q, tt);
107
- }
108
+ do_vec_ld(s, tt, e, tcg_addr, size, endian);
109
}
110
tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
111
- tt = (tt + 1) % 32;
112
}
113
}
114
}
115
116
+ if (!is_store) {
117
+ /* For non-quad operations, setting a slice of the low
118
+ * 64 bits of the register clears the high 64 bits (in
119
+ * the ARM ARM pseudocode this is implicit in the fact
120
+ * that 'rval' is a 64 bit wide variable).
121
+ * For quad operations, we might still need to zero the
122
+ * high bits of SVE.
123
+ */
124
+ for (r = 0; r < rpt * selem; r++) {
125
+ int tt = (rt + r) % 32;
126
+ clear_vec_high(s, is_q, tt);
127
+ }
128
+ }
129
+
130
if (is_postidx) {
131
int rm = extract32(insn, 16, 5);
132
if (rm == 31) {
133
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
134
} else {
135
/* Load/store one element per register */
136
if (is_load) {
137
- do_vec_ld(s, rt, index, tcg_addr, scale);
138
+ do_vec_ld(s, rt, index, tcg_addr, scale, s->be_data);
139
} else {
140
- do_vec_st(s, rt, index, tcg_addr, scale);
141
+ do_vec_st(s, rt, index, tcg_addr, scale, s->be_data);
142
}
143
}
144
tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
145
--
116
--
146
2.19.1
117
2.20.1
147
118
148
119
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This is done generically in translator_loop.
3
Provide a functional interface for the vector expansion.
4
4
This fits better with the existing set of helpers that
5
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
5
we provide for other operations.
6
7
Macro-ize the 5 nearly identical comparisons.
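As a scalar-only sketch of what macro-izing the five comparisons means:
the macro below is a simplified stand-in for GEN_CMP0 that takes a C
operator instead of a TCG condition and generates only the 32-bit
variant. It mirrors the setcond-then-negate pattern (result is all ones
when the comparison against zero holds, zero otherwise). The macro body
and the generated names are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative macro, not the QEMU one: generate a scalar
 * compare-against-zero that returns all ones when the condition holds
 * and zero otherwise.
 */
#define GEN_CMP0(NAME, OP)                                  \
    static int32_t NAME##0_i32(int32_t a)                   \
    {                                                       \
        int32_t d = (a OP 0);   /* setcond: 1 or 0 */       \
        return -d;              /* neg: -1 or 0 */          \
    }

GEN_CMP0(ceq, ==)
GEN_CMP0(cle, <=)
GEN_CMP0(cge, >=)
GEN_CMP0(clt, <)
GEN_CMP0(cgt, >)

#undef GEN_CMP0
```

Each expansion differs only in the comparison, which is the property
that lets one macro replace five hand-written copies.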
8
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
8
Message-id: 20181011205206.3552-3-richard.henderson@linaro.org
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
target/arm/translate-a64.c | 1 -
14
target/arm/translate.h | 16 ++-
13
target/arm/translate.c | 1 -
15
target/arm/translate-a64.c | 22 ++--
14
2 files changed, 2 deletions(-)
16
target/arm/translate.c | 254 ++++++++-----------------------------
15
17
3 files changed, 74 insertions(+), 218 deletions(-)
18
19
diff --git a/target/arm/translate.h b/target/arm/translate.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/translate.h
22
+++ b/target/arm/translate.h
23
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
24
uint64_t vfp_expand_imm(int size, uint8_t imm8);
25
26
/* Vector operations shared between ARM and AArch64. */
27
-extern const GVecGen2 ceq0_op[4];
28
-extern const GVecGen2 clt0_op[4];
29
-extern const GVecGen2 cgt0_op[4];
30
-extern const GVecGen2 cle0_op[4];
31
-extern const GVecGen2 cge0_op[4];
32
+void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
33
+ uint32_t opr_sz, uint32_t max_sz);
34
+void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
35
+ uint32_t opr_sz, uint32_t max_sz);
36
+void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
37
+ uint32_t opr_sz, uint32_t max_sz);
38
+void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
39
+ uint32_t opr_sz, uint32_t max_sz);
40
+void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
41
+ uint32_t opr_sz, uint32_t max_sz);
42
+
43
extern const GVecGen3 mla_op[4];
44
extern const GVecGen3 mls_op[4];
45
extern const GVecGen3 cmtst_op[4];
16
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
46
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
17
index XXXXXXX..XXXXXXX 100644
47
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/translate-a64.c
48
--- a/target/arm/translate-a64.c
19
+++ b/target/arm/translate-a64.c
49
+++ b/target/arm/translate-a64.c
20
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
50
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
21
51
is_q ? 16 : 8, vec_full_reg_size(s));
22
static void aarch64_tr_tb_start(DisasContextBase *db, CPUState *cpu)
23
{
24
- tcg_clear_temp_count();
25
}
52
}
26
53
27
static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
54
-/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
55
-static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
56
- int rn, const GVecGen2 *gvec_op)
57
-{
58
- tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
59
- is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
60
-}
61
-
62
/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
63
static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
64
int rn, int rm, const GVecGen3 *gvec_op)
65
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
66
}
67
break;
68
case 0x8: /* CMGT, CMGE */
69
- gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
70
+ if (u) {
71
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
72
+ } else {
73
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
74
+ }
75
return;
76
case 0x9: /* CMEQ, CMLE */
77
- gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
78
+ if (u) {
79
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
80
+ } else {
81
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
82
+ }
83
return;
84
case 0xa: /* CMLT */
85
- gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
86
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
87
return;
88
case 0xb:
89
if (u) { /* ABS, NEG */
28
diff --git a/target/arm/translate.c b/target/arm/translate.c
90
diff --git a/target/arm/translate.c b/target/arm/translate.c
29
index XXXXXXX..XXXXXXX 100644
91
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate.c
92
--- a/target/arm/translate.c
31
+++ b/target/arm/translate.c
93
+++ b/target/arm/translate.c
32
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_start(DisasContextBase *dcbase, CPUState *cpu)
94
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
33
tcg_gen_movi_i32(tmp, 0);
95
return 1;
34
store_cpu_field(tmp, condexec_bits);
35
}
36
- tcg_clear_temp_count();
37
}
96
}
38
97
39
static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
98
-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
99
-{
100
- tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
101
- tcg_gen_neg_i32(d, d);
102
-}
103
-
104
-static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
105
-{
106
- tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
107
- tcg_gen_neg_i64(d, d);
108
-}
109
-
110
-static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
111
-{
112
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
113
- tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
114
- tcg_temp_free_vec(zero);
115
-}
116
+#define GEN_CMP0(NAME, COND) \
117
+ static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a) \
118
+ { \
119
+ tcg_gen_setcondi_i32(COND, d, a, 0); \
120
+ tcg_gen_neg_i32(d, d); \
121
+ } \
122
+ static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a) \
123
+ { \
124
+ tcg_gen_setcondi_i64(COND, d, a, 0); \
125
+ tcg_gen_neg_i64(d, d); \
126
+ } \
127
+ static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
128
+ { \
129
+ TCGv_vec zero = tcg_const_zeros_vec_matching(d); \
130
+ tcg_gen_cmp_vec(COND, vece, d, a, zero); \
131
+ tcg_temp_free_vec(zero); \
132
+ } \
133
+ void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m, \
134
+ uint32_t opr_sz, uint32_t max_sz) \
135
+ { \
136
+ const GVecGen2 op[4] = { \
137
+ { .fno = gen_helper_gvec_##NAME##0_b, \
138
+ .fniv = gen_##NAME##0_vec, \
139
+ .opt_opc = vecop_list_cmp, \
140
+ .vece = MO_8 }, \
141
+ { .fno = gen_helper_gvec_##NAME##0_h, \
142
+ .fniv = gen_##NAME##0_vec, \
143
+ .opt_opc = vecop_list_cmp, \
144
+ .vece = MO_16 }, \
145
+ { .fni4 = gen_##NAME##0_i32, \
146
+ .fniv = gen_##NAME##0_vec, \
147
+ .opt_opc = vecop_list_cmp, \
148
+ .vece = MO_32 }, \
149
+ { .fni8 = gen_##NAME##0_i64, \
150
+ .fniv = gen_##NAME##0_vec, \
151
+ .opt_opc = vecop_list_cmp, \
152
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64, \
153
+ .vece = MO_64 }, \
154
+ }; \
155
+ tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]); \
156
+ }
157
158
static const TCGOpcode vecop_list_cmp[] = {
159
INDEX_op_cmp_vec, 0
160
};
161
162
-const GVecGen2 ceq0_op[4] = {
163
- { .fno = gen_helper_gvec_ceq0_b,
164
- .fniv = gen_ceq0_vec,
165
- .opt_opc = vecop_list_cmp,
166
- .vece = MO_8 },
167
- { .fno = gen_helper_gvec_ceq0_h,
168
- .fniv = gen_ceq0_vec,
169
- .opt_opc = vecop_list_cmp,
170
- .vece = MO_16 },
171
- { .fni4 = gen_ceq0_i32,
172
- .fniv = gen_ceq0_vec,
173
- .opt_opc = vecop_list_cmp,
174
- .vece = MO_32 },
175
- { .fni8 = gen_ceq0_i64,
176
- .fniv = gen_ceq0_vec,
177
- .opt_opc = vecop_list_cmp,
178
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
179
- .vece = MO_64 },
180
-};
181
+GEN_CMP0(ceq, TCG_COND_EQ)
182
+GEN_CMP0(cle, TCG_COND_LE)
183
+GEN_CMP0(cge, TCG_COND_GE)
184
+GEN_CMP0(clt, TCG_COND_LT)
185
+GEN_CMP0(cgt, TCG_COND_GT)
186
187
-static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
188
-{
189
- tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
190
- tcg_gen_neg_i32(d, d);
191
-}
192
-
193
-static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
194
-{
195
- tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
196
- tcg_gen_neg_i64(d, d);
197
-}
198
-
199
-static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
200
-{
201
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
202
- tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
203
- tcg_temp_free_vec(zero);
204
-}
205
-
206
-const GVecGen2 cle0_op[4] = {
207
- { .fno = gen_helper_gvec_cle0_b,
208
- .fniv = gen_cle0_vec,
209
- .opt_opc = vecop_list_cmp,
210
- .vece = MO_8 },
211
- { .fno = gen_helper_gvec_cle0_h,
212
- .fniv = gen_cle0_vec,
213
- .opt_opc = vecop_list_cmp,
214
- .vece = MO_16 },
215
- { .fni4 = gen_cle0_i32,
216
- .fniv = gen_cle0_vec,
217
- .opt_opc = vecop_list_cmp,
218
- .vece = MO_32 },
219
- { .fni8 = gen_cle0_i64,
220
- .fniv = gen_cle0_vec,
221
- .opt_opc = vecop_list_cmp,
222
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
223
- .vece = MO_64 },
224
-};
225
-
226
-static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
227
-{
228
- tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
229
- tcg_gen_neg_i32(d, d);
230
-}
231
-
232
-static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
233
-{
234
- tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
235
- tcg_gen_neg_i64(d, d);
236
-}
237
-
238
-static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
239
-{
240
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
241
- tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
242
- tcg_temp_free_vec(zero);
243
-}
244
-
245
-const GVecGen2 cge0_op[4] = {
246
- { .fno = gen_helper_gvec_cge0_b,
247
- .fniv = gen_cge0_vec,
248
- .opt_opc = vecop_list_cmp,
249
- .vece = MO_8 },
250
- { .fno = gen_helper_gvec_cge0_h,
251
- .fniv = gen_cge0_vec,
252
- .opt_opc = vecop_list_cmp,
253
- .vece = MO_16 },
254
- { .fni4 = gen_cge0_i32,
255
- .fniv = gen_cge0_vec,
256
- .opt_opc = vecop_list_cmp,
257
- .vece = MO_32 },
258
- { .fni8 = gen_cge0_i64,
259
- .fniv = gen_cge0_vec,
260
- .opt_opc = vecop_list_cmp,
261
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
262
- .vece = MO_64 },
263
-};
264
-
265
-static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
266
-{
267
- tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
268
- tcg_gen_neg_i32(d, d);
269
-}
270
-
271
-static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
272
-{
273
- tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
274
- tcg_gen_neg_i64(d, d);
275
-}
276
-
277
-static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
278
-{
279
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
280
- tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
281
- tcg_temp_free_vec(zero);
282
-}
283
-
284
-const GVecGen2 clt0_op[4] = {
285
- { .fno = gen_helper_gvec_clt0_b,
286
- .fniv = gen_clt0_vec,
287
- .opt_opc = vecop_list_cmp,
288
- .vece = MO_8 },
289
- { .fno = gen_helper_gvec_clt0_h,
290
- .fniv = gen_clt0_vec,
291
- .opt_opc = vecop_list_cmp,
292
- .vece = MO_16 },
293
- { .fni4 = gen_clt0_i32,
294
- .fniv = gen_clt0_vec,
295
- .opt_opc = vecop_list_cmp,
296
- .vece = MO_32 },
297
- { .fni8 = gen_clt0_i64,
298
- .fniv = gen_clt0_vec,
299
- .opt_opc = vecop_list_cmp,
300
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
301
- .vece = MO_64 },
302
-};
303
-
304
-static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
305
-{
306
- tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
307
- tcg_gen_neg_i32(d, d);
308
-}
309
-
310
-static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
311
-{
312
- tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
313
- tcg_gen_neg_i64(d, d);
314
-}
315
-
316
-static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
317
-{
318
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
319
- tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
320
- tcg_temp_free_vec(zero);
321
-}
322
-
323
-const GVecGen2 cgt0_op[4] = {
324
- { .fno = gen_helper_gvec_cgt0_b,
325
- .fniv = gen_cgt0_vec,
326
- .opt_opc = vecop_list_cmp,
327
- .vece = MO_8 },
328
- { .fno = gen_helper_gvec_cgt0_h,
329
- .fniv = gen_cgt0_vec,
330
- .opt_opc = vecop_list_cmp,
331
- .vece = MO_16 },
332
- { .fni4 = gen_cgt0_i32,
333
- .fniv = gen_cgt0_vec,
334
- .opt_opc = vecop_list_cmp,
335
- .vece = MO_32 },
336
- { .fni8 = gen_cgt0_i64,
337
- .fniv = gen_cgt0_vec,
338
- .opt_opc = vecop_list_cmp,
339
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
340
- .vece = MO_64 },
341
-};
342
+#undef GEN_CMP0
343
344
static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
345
{
346
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
347
break;
348
349
case NEON_2RM_VCEQ0:
350
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
351
- vec_size, &ceq0_op[size]);
352
+ gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
353
break;
354
case NEON_2RM_VCGT0:
355
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
356
- vec_size, &cgt0_op[size]);
357
+ gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
358
break;
359
case NEON_2RM_VCLE0:
360
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
361
- vec_size, &cle0_op[size]);
362
+ gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
363
break;
364
case NEON_2RM_VCGE0:
365
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
366
- vec_size, &cge0_op[size]);
367
+ gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
368
break;
369
case NEON_2RM_VCLT0:
370
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
371
- vec_size, &clt0_op[size]);
372
+ gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
373
break;
374
375
default:
40
--
376
--
41
2.19.1
377
2.20.1
42
378
43
379
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
6
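The shape of that change can be sketched in plain C. Instead of
exporting a per-element-size table and having every caller index it,
the table becomes a static local of a function that takes the element
size as its first argument. Everything below (demo_gvec_mla, mla_fn, the
scalar mlaN helpers) is a hypothetical stand-in for illustration and
does not use the TCG gvec types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-element-size expander; stands in for a table entry. */
typedef uint64_t mla_fn(uint64_t d, uint64_t a, uint64_t b);

/* Multiply-accumulate truncated to the element width. */
static uint64_t mla8(uint64_t d, uint64_t a, uint64_t b)
{
    return (uint8_t)(d + a * b);
}

static uint64_t mla16(uint64_t d, uint64_t a, uint64_t b)
{
    return (uint16_t)(d + a * b);
}

static uint64_t mla32(uint64_t d, uint64_t a, uint64_t b)
{
    return (uint32_t)(d + a * b);
}

static uint64_t mla64(uint64_t d, uint64_t a, uint64_t b)
{
    return d + a * b;
}

/*
 * Functional interface: the table is private to this function, so
 * callers pass vece (log2 of the element size in bytes) and never see
 * or index the table themselves.
 */
uint64_t demo_gvec_mla(unsigned vece, uint64_t d, uint64_t a, uint64_t b)
{
    static mla_fn * const ops[4] = { mla8, mla16, mla32, mla64 };

    assert(vece < 4);
    return ops[vece](d, a, b);
}
```

A function with this signature can be passed wherever other expanders
of the same prototype are accepted, which is what allows VMLA and VMLS
to reuse DO_3SAME_NO_SZ_3 in the patch.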
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-4-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/translate-a64.c | 28 +++-------------------------
12
target/arm/translate.h | 7 +-
9
1 file changed, 3 insertions(+), 25 deletions(-)
13
target/arm/translate-a64.c | 4 +-
10
14
target/arm/translate-neon.inc.c | 16 +----
15
target/arm/translate.c | 117 +++++++++++++++++---------------
16
4 files changed, 71 insertions(+), 73 deletions(-)
17
18
diff --git a/target/arm/translate.h b/target/arm/translate.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate.h
21
+++ b/target/arm/translate.h
22
@@ -XXX,XX +XXX,XX @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
23
void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
24
uint32_t opr_sz, uint32_t max_sz);
25
26
-extern const GVecGen3 mla_op[4];
27
-extern const GVecGen3 mls_op[4];
28
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
29
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
30
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
31
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
32
+
33
extern const GVecGen3 cmtst_op[4];
34
extern const GVecGen3 sshl_op[4];
35
extern const GVecGen3 ushl_op[4];
11
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
36
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
12
index XXXXXXX..XXXXXXX 100644
37
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-a64.c
38
--- a/target/arm/translate-a64.c
14
+++ b/target/arm/translate-a64.c
39
+++ b/target/arm/translate-a64.c
15
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
40
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
16
for (xs = 0; xs < selem; xs++) {
41
return;
17
if (replicate) {
42
case 0x12: /* MLA, MLS */
18
/* Load and replicate to all elements */
43
if (u) {
19
- uint64_t mulconst;
44
- gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
20
TCGv_i64 tcg_tmp = tcg_temp_new_i64();
45
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
21
22
tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr,
23
get_mem_index(s), s->be_data + scale);
24
- switch (scale) {
25
- case 0:
26
- mulconst = 0x0101010101010101ULL;
27
- break;
28
- case 1:
29
- mulconst = 0x0001000100010001ULL;
30
- break;
31
- case 2:
32
- mulconst = 0x0000000100000001ULL;
33
- break;
34
- case 3:
35
- mulconst = 0;
36
- break;
37
- default:
38
- g_assert_not_reached();
39
- }
40
- if (mulconst) {
41
- tcg_gen_muli_i64(tcg_tmp, tcg_tmp, mulconst);
42
- }
43
- write_vec_element(s, tcg_tmp, rt, 0, MO_64);
44
- if (is_q) {
45
- write_vec_element(s, tcg_tmp, rt, 1, MO_64);
46
- }
47
+ tcg_gen_gvec_dup_i64(scale, vec_full_reg_offset(s, rt),
48
+ (is_q + 1) * 8, vec_full_reg_size(s),
49
+ tcg_tmp);
50
tcg_temp_free_i64(tcg_tmp);
51
- clear_vec_high(s, is_q, rt);
52
} else {
46
} else {
53
/* Load/store one element per register */
47
- gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
54
if (is_load) {
48
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
49
}
50
return;
51
case 0x11:
52
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/translate-neon.inc.c
55
+++ b/target/arm/translate-neon.inc.c
56
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
57
DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
58
DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
59
DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
60
+DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
61
+DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
62
63
#define DO_3SAME_CMP(INSN, COND) \
64
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
65
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
66
return do_3same(s, a, gen_VMUL_p_3s);
67
}
68
69
-#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY) \
70
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
71
- uint32_t rn_ofs, uint32_t rm_ofs, \
72
- uint32_t oprsz, uint32_t maxsz) \
73
- { \
74
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
75
- oprsz, maxsz, &OPARRAY[vece]); \
76
- } \
77
- DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
78
-
79
-
80
-DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
81
-DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
82
-
83
#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY) \
84
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
85
uint32_t rn_ofs, uint32_t rm_ofs, \
86
diff --git a/target/arm/translate.c b/target/arm/translate.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/translate.c
89
+++ b/target/arm/translate.c
90
@@ -XXX,XX +XXX,XX @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
91
/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
92
* these tables are shared with AArch64 which does support them.
93
*/
94
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
95
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
96
+{
97
+ static const TCGOpcode vecop_list[] = {
98
+ INDEX_op_mul_vec, INDEX_op_add_vec, 0
99
+ };
100
+ static const GVecGen3 ops[4] = {
101
+ { .fni4 = gen_mla8_i32,
102
+ .fniv = gen_mla_vec,
103
+ .load_dest = true,
104
+ .opt_opc = vecop_list,
105
+ .vece = MO_8 },
106
+ { .fni4 = gen_mla16_i32,
107
+ .fniv = gen_mla_vec,
108
+ .load_dest = true,
109
+ .opt_opc = vecop_list,
110
+ .vece = MO_16 },
111
+ { .fni4 = gen_mla32_i32,
112
+ .fniv = gen_mla_vec,
113
+ .load_dest = true,
114
+ .opt_opc = vecop_list,
115
+ .vece = MO_32 },
116
+ { .fni8 = gen_mla64_i64,
117
+ .fniv = gen_mla_vec,
118
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
119
+ .load_dest = true,
120
+ .opt_opc = vecop_list,
121
+ .vece = MO_64 },
122
+ };
123
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
124
+}
125
126
-static const TCGOpcode vecop_list_mla[] = {
127
- INDEX_op_mul_vec, INDEX_op_add_vec, 0
128
-};
129
-
130
-static const TCGOpcode vecop_list_mls[] = {
131
- INDEX_op_mul_vec, INDEX_op_sub_vec, 0
132
-};
133
-
134
-const GVecGen3 mla_op[4] = {
135
- { .fni4 = gen_mla8_i32,
136
- .fniv = gen_mla_vec,
137
- .load_dest = true,
138
- .opt_opc = vecop_list_mla,
139
- .vece = MO_8 },
140
- { .fni4 = gen_mla16_i32,
141
- .fniv = gen_mla_vec,
142
- .load_dest = true,
143
- .opt_opc = vecop_list_mla,
144
- .vece = MO_16 },
145
- { .fni4 = gen_mla32_i32,
146
- .fniv = gen_mla_vec,
147
- .load_dest = true,
148
- .opt_opc = vecop_list_mla,
149
- .vece = MO_32 },
150
- { .fni8 = gen_mla64_i64,
151
- .fniv = gen_mla_vec,
152
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
153
- .load_dest = true,
154
- .opt_opc = vecop_list_mla,
155
- .vece = MO_64 },
156
-};
157
-
158
-const GVecGen3 mls_op[4] = {
159
- { .fni4 = gen_mls8_i32,
160
- .fniv = gen_mls_vec,
161
- .load_dest = true,
162
- .opt_opc = vecop_list_mls,
163
- .vece = MO_8 },
164
- { .fni4 = gen_mls16_i32,
165
- .fniv = gen_mls_vec,
166
- .load_dest = true,
167
- .opt_opc = vecop_list_mls,
168
- .vece = MO_16 },
169
- { .fni4 = gen_mls32_i32,
170
- .fniv = gen_mls_vec,
171
- .load_dest = true,
172
- .opt_opc = vecop_list_mls,
173
- .vece = MO_32 },
174
- { .fni8 = gen_mls64_i64,
175
- .fniv = gen_mls_vec,
176
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
177
- .load_dest = true,
178
- .opt_opc = vecop_list_mls,
179
- .vece = MO_64 },
180
-};
181
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
182
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
183
+{
184
+ static const TCGOpcode vecop_list[] = {
185
+ INDEX_op_mul_vec, INDEX_op_sub_vec, 0
186
+ };
187
+ static const GVecGen3 ops[4] = {
188
+ { .fni4 = gen_mls8_i32,
189
+ .fniv = gen_mls_vec,
190
+ .load_dest = true,
191
+ .opt_opc = vecop_list,
192
+ .vece = MO_8 },
193
+ { .fni4 = gen_mls16_i32,
194
+ .fniv = gen_mls_vec,
195
+ .load_dest = true,
196
+ .opt_opc = vecop_list,
197
+ .vece = MO_16 },
198
+ { .fni4 = gen_mls32_i32,
199
+ .fniv = gen_mls_vec,
200
+ .load_dest = true,
201
+ .opt_opc = vecop_list,
202
+ .vece = MO_32 },
203
+ { .fni8 = gen_mls64_i64,
204
+ .fniv = gen_mls_vec,
205
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
206
+ .load_dest = true,
207
+ .opt_opc = vecop_list,
208
+ .vece = MO_64 },
209
+ };
210
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
211
+}
212
213
/* CMTST : test is "if (X & Y != 0)". */
214
static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
55
--
215
--
56
2.19.1
216
2.20.1
57
217
58
218
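The gen_gvec_mla/gen_gvec_mls change above moves a per-element-size descriptor table from file scope into the function that uses it, so callers pass an element size rather than indexing an exported array. A minimal sketch of that shape in plain C, outside of TCG, might look like the following. The names sketch_gvec_mla, mla_fn and DEFINE_MLA are hypothetical and exist only for illustration; the workers compute on host memory directly and do not emit TCG ops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical worker type: d[i] += a[i] * b[i] over `bytes` bytes of lanes. */
typedef void mla_fn(void *d, const void *a, const void *b, size_t bytes);

/* Multiply in uint64_t so narrow lanes never go through signed int promotion. */
#define DEFINE_MLA(NAME, TYPE)                                             \
    static void NAME(void *d, const void *a, const void *b, size_t bytes)  \
    {                                                                      \
        TYPE *dd = d;                                                      \
        const TYPE *aa = a, *bb = b;                                       \
        for (size_t i = 0; i < bytes / sizeof(TYPE); i++) {                \
            dd[i] = (TYPE)((uint64_t)aa[i] * bb[i] + dd[i]);               \
        }                                                                  \
    }

DEFINE_MLA(mla8, uint8_t)
DEFINE_MLA(mla16, uint16_t)
DEFINE_MLA(mla32, uint32_t)
DEFINE_MLA(mla64, uint64_t)

/*
 * Functional interface: the table is private to this function and is
 * indexed by log2 of the element size (0 = 8-bit ... 3 = 64-bit).
 */
void sketch_gvec_mla(unsigned vece, void *d, const void *a, const void *b,
                     size_t bytes)
{
    static mla_fn * const ops[4] = { mla8, mla16, mla32, mla64 };

    assert(vece < 4);
    ops[vece](d, a, b, bytes);
}
```

Keeping the table static inside the function means no other translation unit can depend on its layout, which is the property the conversion appears to be after.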
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Since QEMU does not implement ASIDs, changes to the ASID must flush the
3
Rather than perform the argument swap during code generation,
4
tlb. However, if the ASID does not change there is no reason to flush.
4
perform it during decode. This means it doesn't have to be
5
special cased later, and we can share code with aarch64 code
6
generation. Hopefully the decode comment addresses any confusion
7
that might arise in between.
5
8
6
In testing a boot of the Ubuntu installer to the first menu, this reduces
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
the number of flushes by 30%, or nearly 600k instances.
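The ASID test described above can be sketched in isolation as follows. This assumes the ASID occupies bits [63:48] of a 64-bit TTBR value, as in the hunk below; extract64_sketch mirrors the behaviour of QEMU's extract64() but is a stand-in written for this example, and asid_changed is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for extract64(): return `length` bits of `value` from bit `start`. */
static inline uint64_t extract64_sketch(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * XOR leaves a 1 wherever the old and new register values differ, so a
 * non-zero result in bits [63:48] means the ASID field changed.
 */
bool asid_changed(uint64_t old_ttbr, uint64_t new_ttbr)
{
    return extract64_sketch(old_ttbr ^ new_ttbr, 48, 16) != 0;
}
```

A write that only changes the table base address leaves the upper 16 bits of the XOR at zero, so no flush would be requested in that case.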
8
9
Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
12
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
13
Message-id: 20181019015617.22583-3-richard.henderson@linaro.org
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
13
---
16
target/arm/helper.c | 8 +++-----
14
target/arm/neon-dp.decode | 17 +++++++++++++++--
17
1 file changed, 3 insertions(+), 5 deletions(-)
15
target/arm/translate-neon.inc.c | 3 +--
16
2 files changed, 16 insertions(+), 4 deletions(-)
18
17
19
diff --git a/target/arm/helper.c b/target/arm/helper.c
18
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
20
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.c
20
--- a/target/arm/neon-dp.decode
22
+++ b/target/arm/helper.c
21
+++ b/target/arm/neon-dp.decode
23
@@ -XXX,XX +XXX,XX @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
22
@@ -XXX,XX +XXX,XX @@ VCGT_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
24
static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
23
VCGE_S_3s 1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
25
uint64_t value)
24
VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
26
{
25
27
- /* 64 bit accesses to the TTBRs can change the ASID and so we
26
-VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
28
- * must flush the TLB.
27
-VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
29
- */
28
+# The _rev suffix indicates that Vn and Vm are reversed. This is
30
- if (cpreg_field_is_64bit(ri)) {
29
+# the case for shifts. In the Arm ARM these insns are documented
31
+ /* If the ASID changes (with a 64-bit write), we must flush the TLB. */
30
+# with the Vm and Vn fields in their usual places, but in the
32
+ if (cpreg_field_is_64bit(ri) &&
31
+# assembly the operands are listed "backwards", ie in the order
33
+ extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
32
+# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
34
ARMCPU *cpu = arm_env_get_cpu(env);
33
+# to consider Vm and Vn as being in different fields in the insn,
35
-
34
+# which allows us to avoid special-casing shifts in the trans_
36
tlb_flush(CPU(cpu));
35
+# function code. We would otherwise need to manually swap the operands
37
}
36
+# over to call Neon helper functions that are shared with AArch64,
38
raw_write(env, ri, value);
37
+# which does not have this odd reversed-operand situation.
38
+@3same_rev .... ... . . . size:2 .... .... .... . q:1 . . .... \
39
+ &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
40
+
41
+VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
42
+VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
43
44
VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
45
VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
46
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/translate-neon.inc.c
49
+++ b/target/arm/translate-neon.inc.c
50
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
51
uint32_t rn_ofs, uint32_t rm_ofs, \
52
uint32_t oprsz, uint32_t maxsz) \
53
{ \
54
- /* Note the operation is vshl vd,vm,vn */ \
55
- tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, \
56
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
57
oprsz, maxsz, &OPARRAY[vece]); \
58
} \
59
DO_3SAME(INSN, gen_##INSN##_3s)
39
--
60
--
40
2.19.1
61
2.20.1
41
62
42
63
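The @3same_rev format above swaps Vn and Vm while the instruction is decoded, so the code that expands the operation never needs to know about the reversal. A minimal sketch of the idea, assuming the usual Neon 3-reg-same field positions (Vd in bits [15:12] with D at bit 22, Vn in bits [19:16] with N at bit 7, Vm in bits [3:0] with M at bit 5), is shown here. The struct and function names are hypothetical and are not the ones decodetree generates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical argument set, modelled loosely on a 3-reg-same decode result. */
typedef struct {
    int vd, vn, vm;
} Args3Same;

/* Build a 5-bit register number from one high bit and a 4-bit low field. */
static int reg_field(uint32_t insn, int hi_bit, int lo_start)
{
    return (int)((((insn >> hi_bit) & 1) << 4) | ((insn >> lo_start) & 0xf));
}

/* Normal format: Vn and Vm are taken from their architectural positions. */
Args3Same decode_3same(uint32_t insn)
{
    Args3Same a;

    assert((insn >> 28) == 0 || (insn >> 28) == 0xf); /* sketch-only sanity check */
    a.vd = reg_field(insn, 22, 12);
    a.vn = reg_field(insn, 7, 16);
    a.vm = reg_field(insn, 5, 0);
    return a;
}

/* Reversed format: same bits, but Vn and Vm trade places at decode time. */
Args3Same decode_3same_rev(uint32_t insn)
{
    Args3Same a = decode_3same(insn);
    int tmp = a.vn;

    a.vn = a.vm;
    a.vm = tmp;
    return a;
}
```

With the swap done here, a shift expander can take (vd, vn, vm) in the same order as every other 3-operand expander.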
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Most of the v8 extensions are self-contained within the ISAR
3
Provide a functional interface for the vector expansion.
4
registers and are not implied by other feature bits, which
4
This fits better with the existing set of helpers that
5
makes them the easiest to convert.
5
we provide for other operations.
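To illustrate the ID-register approach described above, here is a minimal sketch of testing features from a register field rather than from a separate feature flag. It assumes the AES field sits in bits [7:4] of ID_AA64ISAR0, where a value of 1 means AES only and 2 means AES plus PMULL, matching the tests in the cpu.h hunk below. field_ex64 and field_dp64 are simplified stand-ins for the FIELD_EX64/FIELD_DP64 macros, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the AES field within ID_AA64ISAR0. */
#define SKETCH_AES_SHIFT 4
#define SKETCH_AES_LEN   4

/* Extract `len` bits starting at `shift`. */
static uint64_t field_ex64(uint64_t reg, unsigned shift, unsigned len)
{
    assert(len > 0 && shift + len <= 64);
    return (reg >> shift) & (~0ULL >> (64 - len));
}

/* Return `reg` with the field at [shift, shift + len) replaced by `val`. */
static uint64_t field_dp64(uint64_t reg, unsigned shift, unsigned len,
                           uint64_t val)
{
    uint64_t mask;

    assert(len > 0 && shift + len <= 64);
    mask = (~0ULL >> (64 - len)) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Any non-zero AES field value implies the AES instructions. */
bool sketch_feature_aes(uint64_t isar0)
{
    return field_ex64(isar0, SKETCH_AES_SHIFT, SKETCH_AES_LEN) != 0;
}

/* PMULL is only advertised by AES field values above 1. */
bool sketch_feature_pmull(uint64_t isar0)
{
    return field_ex64(isar0, SKETCH_AES_SHIFT, SKETCH_AES_LEN) > 1;
}
```

Because both tests read the same field, setting it to 2 enables AES and PMULL together, which is why the max CPU setup below deposits 2 rather than setting two independent flags.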
6
6
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20181016223115.24100-4-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
11
---
13
target/arm/cpu.h | 131 +++++++++++++++++++++++++++++++++----
12
target/arm/translate.h | 10 ++-
14
target/arm/translate.h | 7 ++
13
target/arm/translate-a64.c | 18 ++--
15
linux-user/elfload.c | 46 ++++++++-----
14
target/arm/translate-neon.inc.c | 23 +----
16
target/arm/cpu.c | 27 +++++---
15
target/arm/translate.c | 146 +++++++++++++++++---------------
17
target/arm/cpu64.c | 57 +++++++++-------
16
4 files changed, 95 insertions(+), 102 deletions(-)
18
target/arm/translate-a64.c | 101 ++++++++++++++--------------
17
19
target/arm/translate.c | 36 +++++-----
20
7 files changed, 273 insertions(+), 132 deletions(-)
21
22
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
23
index XXXXXXX..XXXXXXX 100644
24
--- a/target/arm/cpu.h
25
+++ b/target/arm/cpu.h
26
@@ -XXX,XX +XXX,XX @@ typedef enum ARMPSCIState {
27
PSCI_ON_PENDING = 2
28
} ARMPSCIState;
29
30
+typedef struct ARMISARegisters ARMISARegisters;
31
+
32
/**
33
* ARMCPU:
34
* @env: #CPUARMState
35
@@ -XXX,XX +XXX,XX @@ enum arm_features {
36
ARM_FEATURE_LPAE, /* has Large Physical Address Extension */
37
ARM_FEATURE_V8,
38
ARM_FEATURE_AARCH64, /* supports 64 bit mode */
39
- ARM_FEATURE_V8_AES, /* implements AES part of v8 Crypto Extensions */
40
ARM_FEATURE_CBAR, /* has cp15 CBAR */
41
ARM_FEATURE_CRC, /* ARMv8 CRC instructions */
42
ARM_FEATURE_CBAR_RO, /* has cp15 CBAR and it is read-only */
43
ARM_FEATURE_EL2, /* has EL2 Virtualization support */
44
ARM_FEATURE_EL3, /* has EL3 Secure monitor support */
45
- ARM_FEATURE_V8_SHA1, /* implements SHA1 part of v8 Crypto Extensions */
46
- ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto Extensions */
47
- ARM_FEATURE_V8_PMULL, /* implements PMULL part of v8 Crypto Extensions */
48
ARM_FEATURE_THUMB_DSP, /* DSP insns supported in the Thumb encodings */
49
ARM_FEATURE_PMU, /* has PMU support */
50
ARM_FEATURE_VBAR, /* has cp15 VBAR */
51
ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
52
ARM_FEATURE_JAZELLE, /* has (trivial) Jazelle implementation */
53
ARM_FEATURE_SVE, /* has Scalable Vector Extension */
54
- ARM_FEATURE_V8_SHA512, /* implements SHA512 part of v8 Crypto Extensions */
55
- ARM_FEATURE_V8_SHA3, /* implements SHA3 part of v8 Crypto Extensions */
56
- ARM_FEATURE_V8_SM3, /* implements SM3 part of v8 Crypto Extensions */
57
- ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
58
- ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */
59
- ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
60
- ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */
61
ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
62
- ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions. */
63
ARM_FEATURE_M_MAIN, /* M profile Main Extension */
64
};
65
66
@@ -XXX,XX +XXX,XX @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
67
/* Shared between translate-sve.c and sve_helper.c. */
68
extern const uint64_t pred_esz_masks[4];
69
70
+/*
71
+ * 32-bit feature tests via id registers.
72
+ */
73
+static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
74
+{
75
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
76
+}
77
+
78
+static inline bool isar_feature_aa32_pmull(const ARMISARegisters *id)
79
+{
80
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) > 1;
81
+}
82
+
83
+static inline bool isar_feature_aa32_sha1(const ARMISARegisters *id)
84
+{
85
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, SHA1) != 0;
86
+}
87
+
88
+static inline bool isar_feature_aa32_sha2(const ARMISARegisters *id)
89
+{
90
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, SHA2) != 0;
91
+}
92
+
93
+static inline bool isar_feature_aa32_crc32(const ARMISARegisters *id)
94
+{
95
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, CRC32) != 0;
96
+}
97
+
98
+static inline bool isar_feature_aa32_rdm(const ARMISARegisters *id)
99
+{
100
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, RDM) != 0;
101
+}
102
+
103
+static inline bool isar_feature_aa32_vcma(const ARMISARegisters *id)
104
+{
105
+ return FIELD_EX32(id->id_isar5, ID_ISAR5, VCMA) != 0;
106
+}
107
+
108
+static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
109
+{
110
+ return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
111
+}
112
+
113
+/*
114
+ * 64-bit feature tests via id registers.
115
+ */
116
+static inline bool isar_feature_aa64_aes(const ARMISARegisters *id)
117
+{
118
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) != 0;
119
+}
120
+
121
+static inline bool isar_feature_aa64_pmull(const ARMISARegisters *id)
122
+{
123
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) > 1;
124
+}
125
+
126
+static inline bool isar_feature_aa64_sha1(const ARMISARegisters *id)
127
+{
128
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA1) != 0;
129
+}
130
+
131
+static inline bool isar_feature_aa64_sha256(const ARMISARegisters *id)
132
+{
133
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) != 0;
134
+}
135
+
136
+static inline bool isar_feature_aa64_sha512(const ARMISARegisters *id)
137
+{
138
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) > 1;
139
+}
140
+
141
+static inline bool isar_feature_aa64_crc32(const ARMISARegisters *id)
142
+{
143
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, CRC32) != 0;
144
+}
145
+
146
+static inline bool isar_feature_aa64_atomics(const ARMISARegisters *id)
147
+{
148
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, ATOMIC) != 0;
149
+}
150
+
151
+static inline bool isar_feature_aa64_rdm(const ARMISARegisters *id)
152
+{
153
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, RDM) != 0;
154
+}
155
+
156
+static inline bool isar_feature_aa64_sha3(const ARMISARegisters *id)
157
+{
158
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA3) != 0;
159
+}
160
+
161
+static inline bool isar_feature_aa64_sm3(const ARMISARegisters *id)
162
+{
163
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM3) != 0;
164
+}
165
+
166
+static inline bool isar_feature_aa64_sm4(const ARMISARegisters *id)
167
+{
168
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM4) != 0;
169
+}
170
+
171
+static inline bool isar_feature_aa64_dp(const ARMISARegisters *id)
172
+{
173
+ return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, DP) != 0;
174
+}
175
+
176
+static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
177
+{
178
+ return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
179
+}
180
+
181
+/*
182
+ * Forward to the above feature tests given an ARMCPU pointer.
183
+ */
184
+#define cpu_isar_feature(name, cpu) \
185
+ ({ ARMCPU *cpu_ = (cpu); isar_feature_##name(&cpu_->isar); })
186
+
187
#endif
188
diff --git a/target/arm/translate.h b/target/arm/translate.h
18
diff --git a/target/arm/translate.h b/target/arm/translate.h
189
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
190
--- a/target/arm/translate.h
20
--- a/target/arm/translate.h
191
+++ b/target/arm/translate.h
21
+++ b/target/arm/translate.h
192
@@ -XXX,XX +XXX,XX @@
22
@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
193
/* internal defines */
23
void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
194
typedef struct DisasContext {
24
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
195
DisasContextBase base;
25
196
+ const ARMISARegisters *isar;
26
-extern const GVecGen3 cmtst_op[4];
197
27
-extern const GVecGen3 sshl_op[4];
198
target_ulong pc;
28
-extern const GVecGen3 ushl_op[4];
199
target_ulong page_start;
29
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
200
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
30
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
201
return ret;
31
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
202
}
32
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
203
33
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
204
+/*
34
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
205
+ * Forward to the isar_feature_* tests given a DisasContext pointer.
206
+ */
207
+#define dc_isar_feature(name, ctx) \
208
+ ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
209
+
35
+
210
#endif /* TARGET_ARM_TRANSLATE_H */
36
extern const GVecGen4 uqadd_op[4];
211
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
37
extern const GVecGen4 sqadd_op[4];
212
index XXXXXXX..XXXXXXX 100644
38
extern const GVecGen4 uqsub_op[4];
213
--- a/linux-user/elfload.c
214
+++ b/linux-user/elfload.c
215
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
216
/* probe for the extra features */
217
#define GET_FEATURE(feat, hwcap) \
218
do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
219
+
220
+#define GET_FEATURE_ID(feat, hwcap) \
221
+ do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
222
+
223
/* EDSP is in v5TE and above, but all our v5 CPUs are v5TE */
224
GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
225
GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
226
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap2(void)
227
ARMCPU *cpu = ARM_CPU(thread_cpu);
228
uint32_t hwcaps = 0;
229
230
- GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP2_ARM_AES);
231
- GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP2_ARM_PMULL);
232
- GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP2_ARM_SHA1);
233
- GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP2_ARM_SHA2);
234
- GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP2_ARM_CRC32);
235
+ GET_FEATURE_ID(aa32_aes, ARM_HWCAP2_ARM_AES);
236
+ GET_FEATURE_ID(aa32_pmull, ARM_HWCAP2_ARM_PMULL);
237
+ GET_FEATURE_ID(aa32_sha1, ARM_HWCAP2_ARM_SHA1);
238
+ GET_FEATURE_ID(aa32_sha2, ARM_HWCAP2_ARM_SHA2);
239
+ GET_FEATURE_ID(aa32_crc32, ARM_HWCAP2_ARM_CRC32);
240
return hwcaps;
241
}
242
243
#undef GET_FEATURE
244
+#undef GET_FEATURE_ID
245
246
#else
247
/* 64 bit ARM definitions */
248
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
249
/* probe for the extra features */
250
#define GET_FEATURE(feat, hwcap) \
251
do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
252
- GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP_A64_AES);
253
- GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP_A64_PMULL);
254
- GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP_A64_SHA1);
255
- GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP_A64_SHA2);
256
- GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP_A64_CRC32);
257
- GET_FEATURE(ARM_FEATURE_V8_SHA3, ARM_HWCAP_A64_SHA3);
258
- GET_FEATURE(ARM_FEATURE_V8_SM3, ARM_HWCAP_A64_SM3);
259
- GET_FEATURE(ARM_FEATURE_V8_SM4, ARM_HWCAP_A64_SM4);
260
- GET_FEATURE(ARM_FEATURE_V8_SHA512, ARM_HWCAP_A64_SHA512);
261
+#define GET_FEATURE_ID(feat, hwcap) \
262
+ do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
263
+
264
+ GET_FEATURE_ID(aa64_aes, ARM_HWCAP_A64_AES);
265
+ GET_FEATURE_ID(aa64_pmull, ARM_HWCAP_A64_PMULL);
266
+ GET_FEATURE_ID(aa64_sha1, ARM_HWCAP_A64_SHA1);
267
+ GET_FEATURE_ID(aa64_sha256, ARM_HWCAP_A64_SHA2);
268
+ GET_FEATURE_ID(aa64_sha512, ARM_HWCAP_A64_SHA512);
269
+ GET_FEATURE_ID(aa64_crc32, ARM_HWCAP_A64_CRC32);
270
+ GET_FEATURE_ID(aa64_sha3, ARM_HWCAP_A64_SHA3);
271
+ GET_FEATURE_ID(aa64_sm3, ARM_HWCAP_A64_SM3);
272
+ GET_FEATURE_ID(aa64_sm4, ARM_HWCAP_A64_SM4);
273
GET_FEATURE(ARM_FEATURE_V8_FP16,
274
ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
275
- GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
276
- GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
277
- GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP);
278
- GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
279
+ GET_FEATURE_ID(aa64_atomics, ARM_HWCAP_A64_ATOMICS);
280
+ GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
281
+ GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
282
+ GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
283
GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
284
+
285
#undef GET_FEATURE
286
+#undef GET_FEATURE_ID
287
288
return hwcaps;
289
}
290
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
291
index XXXXXXX..XXXXXXX 100644
292
--- a/target/arm/cpu.c
293
+++ b/target/arm/cpu.c
294
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
295
cortex_a15_initfn(obj);
296
#ifdef CONFIG_USER_ONLY
297
/* We don't set these in system emulation mode for the moment,
298
- * since we don't correctly set the ID registers to advertise them,
299
+ * since we don't correctly set (all of) the ID registers to
300
+ * advertise them.
301
*/
302
set_feature(&cpu->env, ARM_FEATURE_V8);
303
- set_feature(&cpu->env, ARM_FEATURE_V8_AES);
304
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
305
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
306
- set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
307
- set_feature(&cpu->env, ARM_FEATURE_CRC);
308
- set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
309
- set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
310
- set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
311
+ {
312
+ uint32_t t;
313
+
314
+ t = cpu->isar.id_isar5;
315
+ t = FIELD_DP32(t, ID_ISAR5, AES, 2);
316
+ t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
317
+ t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
318
+ t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
319
+ t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
320
+ t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
321
+ cpu->isar.id_isar5 = t;
322
+
323
+ t = cpu->isar.id_isar6;
324
+ t = FIELD_DP32(t, ID_ISAR6, DP, 1);
325
+ cpu->isar.id_isar6 = t;
326
+ }
327
#endif
328
}
329
}
330
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
331
index XXXXXXX..XXXXXXX 100644
332
--- a/target/arm/cpu64.c
333
+++ b/target/arm/cpu64.c
334
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
335
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
336
set_feature(&cpu->env, ARM_FEATURE_AARCH64);
337
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
338
- set_feature(&cpu->env, ARM_FEATURE_V8_AES);
339
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
340
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
341
- set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
342
- set_feature(&cpu->env, ARM_FEATURE_CRC);
343
set_feature(&cpu->env, ARM_FEATURE_EL2);
344
set_feature(&cpu->env, ARM_FEATURE_EL3);
345
set_feature(&cpu->env, ARM_FEATURE_PMU);
346
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
347
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
348
set_feature(&cpu->env, ARM_FEATURE_AARCH64);
349
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
350
- set_feature(&cpu->env, ARM_FEATURE_V8_AES);
351
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
352
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
353
- set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
354
- set_feature(&cpu->env, ARM_FEATURE_CRC);
355
set_feature(&cpu->env, ARM_FEATURE_EL2);
356
set_feature(&cpu->env, ARM_FEATURE_EL3);
357
set_feature(&cpu->env, ARM_FEATURE_PMU);
358
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
359
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
360
set_feature(&cpu->env, ARM_FEATURE_AARCH64);
361
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
362
- set_feature(&cpu->env, ARM_FEATURE_V8_AES);
363
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
364
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
365
- set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
366
- set_feature(&cpu->env, ARM_FEATURE_CRC);
367
set_feature(&cpu->env, ARM_FEATURE_EL2);
368
set_feature(&cpu->env, ARM_FEATURE_EL3);
369
set_feature(&cpu->env, ARM_FEATURE_PMU);
370
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
371
if (kvm_enabled()) {
372
kvm_arm_set_cpu_features_from_host(cpu);
373
} else {
374
+ uint64_t t;
375
+ uint32_t u;
376
aarch64_a57_initfn(obj);
377
+
378
+ t = cpu->isar.id_aa64isar0;
379
+ t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
380
+ t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
381
+ t = FIELD_DP64(t, ID_AA64ISAR0, SHA2, 2); /* SHA512 */
382
+ t = FIELD_DP64(t, ID_AA64ISAR0, CRC32, 1);
383
+ t = FIELD_DP64(t, ID_AA64ISAR0, ATOMIC, 2);
384
+ t = FIELD_DP64(t, ID_AA64ISAR0, RDM, 1);
385
+ t = FIELD_DP64(t, ID_AA64ISAR0, SHA3, 1);
386
+ t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);
387
+ t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);
388
+ t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);
389
+ cpu->isar.id_aa64isar0 = t;
390
+
391
+ t = cpu->isar.id_aa64isar1;
392
+ t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
393
+ cpu->isar.id_aa64isar1 = t;
394
+
395
+ /* Replicate the same data to the 32-bit id registers. */
396
+ u = cpu->isar.id_isar5;
397
+ u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
398
+ u = FIELD_DP32(u, ID_ISAR5, SHA1, 1);
399
+ u = FIELD_DP32(u, ID_ISAR5, SHA2, 1);
400
+ u = FIELD_DP32(u, ID_ISAR5, CRC32, 1);
401
+ u = FIELD_DP32(u, ID_ISAR5, RDM, 1);
402
+ u = FIELD_DP32(u, ID_ISAR5, VCMA, 1);
403
+ cpu->isar.id_isar5 = u;
404
+
405
+ u = cpu->isar.id_isar6;
406
+ u = FIELD_DP32(u, ID_ISAR6, DP, 1);
407
+ cpu->isar.id_isar6 = u;
408
+
409
#ifdef CONFIG_USER_ONLY
410
/* We don't set these in system emulation mode for the moment,
411
* since we don't correctly set the ID registers to advertise them,
412
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
413
* whereas the architecture requires them to be present in both if
414
* present in either.
415
*/
416
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA512);
417
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA3);
418
- set_feature(&cpu->env, ARM_FEATURE_V8_SM3);
419
- set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
420
- set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
421
- set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
422
- set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
423
set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
424
- set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
425
set_feature(&cpu->env, ARM_FEATURE_SVE);
426
/* For usermode -cpu max we can use a larger and more efficient DCZ
427
* blocksize since we don't have to follow what the hardware does.
428
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
39
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
429
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
430
--- a/target/arm/translate-a64.c
41
--- a/target/arm/translate-a64.c
431
+++ b/target/arm/translate-a64.c
42
+++ b/target/arm/translate-a64.c
432
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
43
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
433
}
44
is_q ? 16 : 8, vec_full_reg_size(s));
434
if (rt2 == 31
45
}
435
&& ((rt | rs) & 1) == 0
46
436
- && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
47
-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
437
+ && dc_isar_feature(aa64_atomics, s)) {
48
-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
438
/* CASP / CASPL */
49
- int rn, int rm, const GVecGen3 *gvec_op)
439
gen_compare_and_swap_pair(s, rs, rt, rn, size | 2);
50
-{
440
return;
51
- tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
441
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
52
- vec_full_reg_offset(s, rm), is_q ? 16 : 8,
442
}
53
- vec_full_reg_size(s), gvec_op);
443
if (rt2 == 31
54
-}
444
&& ((rt | rs) & 1) == 0
55
-
445
- && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
56
/* Expand a 3-operand operation using an out-of-line helper. */
446
+ && dc_isar_feature(aa64_atomics, s)) {
57
static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
447
/* CASPA / CASPAL */
58
int rn, int rm, int data, gen_helper_gvec_3 *fn)
448
gen_compare_and_swap_pair(s, rs, rt, rn, size | 2);
59
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
449
return;
60
(u ? uqsub_op : sqsub_op) + size);
450
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
61
return;
451
case 0xb: /* CASL */
62
case 0x08: /* SSHL, USHL */
452
case 0xe: /* CASA */
63
- gen_gvec_op3(s, is_q, rd, rn, rm,
453
case 0xf: /* CASAL */
64
- u ? &ushl_op[size] : &sshl_op[size]);
454
- if (rt2 == 31 && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
65
+ if (u) {
455
+ if (rt2 == 31 && dc_isar_feature(aa64_atomics, s)) {
66
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
456
gen_compare_and_swap(s, rs, rt, rn, size);
67
+ } else {
68
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
69
+ }
70
return;
71
case 0x0c: /* SMAX, UMAX */
72
if (u) {
73
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
74
return;
75
case 0x11:
76
if (!u) { /* CMTST */
77
- gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
78
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
457
return;
79
return;
458
}
80
}
459
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
81
/* else CMEQ */
460
int rs = extract32(insn, 16, 5);
82
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
461
int rn = extract32(insn, 5, 5);
83
index XXXXXXX..XXXXXXX 100644
462
int o3_opc = extract32(insn, 12, 4);
84
--- a/target/arm/translate-neon.inc.c
463
- int feature = ARM_FEATURE_V8_ATOMICS;
85
+++ b/target/arm/translate-neon.inc.c
464
TCGv_i64 tcg_rn, tcg_rs;
86
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
465
AtomicThreeOpFn *fn;
87
DO_3SAME(VORR, tcg_gen_gvec_or)
466
88
DO_3SAME(VORN, tcg_gen_gvec_orc)
467
- if (is_vector) {
89
DO_3SAME(VEOR, tcg_gen_gvec_xor)
468
+ if (is_vector || !dc_isar_feature(aa64_atomics, s)) {
90
+DO_3SAME(VSHL_S, gen_gvec_sshl)
469
unallocated_encoding(s);
91
+DO_3SAME(VSHL_U, gen_gvec_ushl)
470
return;
92
93
/* These insns are all gvec_bitsel but with the inputs in various orders. */
94
#define DO_3SAME_BITSEL(INSN, O1, O2, O3) \
95
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
96
DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
97
DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
98
DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
99
+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
100
101
#define DO_3SAME_CMP(INSN, COND) \
102
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
103
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
104
DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
105
DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
106
107
-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
108
- uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
109
-{
110
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
111
-}
112
-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
113
-
114
#define DO_3SAME_GVEC4(INSN, OPARRAY) \
115
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
116
uint32_t rn_ofs, uint32_t rm_ofs, \
117
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
471
}
118
}
472
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
119
return do_3same(s, a, gen_VMUL_p_3s);
473
unallocated_encoding(s);
120
}
474
return;
121
-
475
}
122
-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY) \
476
- if (!arm_dc_feature(s, feature)) {
123
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
477
- unallocated_encoding(s);
124
- uint32_t rn_ofs, uint32_t rm_ofs, \
478
- return;
125
- uint32_t oprsz, uint32_t maxsz) \
479
- }
126
- { \
480
127
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
481
if (rn == 31) {
128
- oprsz, maxsz, &OPARRAY[vece]); \
482
gen_check_sp_alignment(s);
129
- } \
483
@@ -XXX,XX +XXX,XX @@ static void handle_crc32(DisasContext *s,
130
- DO_3SAME(INSN, gen_##INSN##_3s)
484
TCGv_i64 tcg_acc, tcg_val;
131
-
485
TCGv_i32 tcg_bytes;
132
-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
486
133
-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
487
- if (!arm_dc_feature(s, ARM_FEATURE_CRC)
488
+ if (!dc_isar_feature(aa64_crc32, s)
489
|| (sf == 1 && sz != 3)
490
|| (sf == 0 && sz == 3)) {
491
unallocated_encoding(s);
492
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
493
bool u = extract32(insn, 29, 1);
494
TCGv_i32 ele1, ele2, ele3;
495
TCGv_i64 res;
496
- int feature;
497
+ bool feature;
498
499
switch (u * 16 + opcode) {
500
case 0x10: /* SQRDMLAH (vector) */
501
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
502
unallocated_encoding(s);
503
return;
504
}
505
- feature = ARM_FEATURE_V8_RDM;
506
+ feature = dc_isar_feature(aa64_rdm, s);
507
break;
508
default:
509
unallocated_encoding(s);
510
return;
511
}
512
- if (!arm_dc_feature(s, feature)) {
513
+ if (!feature) {
514
unallocated_encoding(s);
515
return;
516
}
517
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
518
return;
519
}
520
if (size == 3) {
521
- if (!arm_dc_feature(s, ARM_FEATURE_V8_PMULL)) {
522
+ if (!dc_isar_feature(aa64_pmull, s)) {
523
unallocated_encoding(s);
524
return;
525
}
526
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
527
int size = extract32(insn, 22, 2);
528
bool u = extract32(insn, 29, 1);
529
bool is_q = extract32(insn, 30, 1);
530
- int feature, rot;
531
+ bool feature;
532
+ int rot;
533
534
switch (u * 16 + opcode) {
535
case 0x10: /* SQRDMLAH (vector) */
536
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
537
unallocated_encoding(s);
538
return;
539
}
540
- feature = ARM_FEATURE_V8_RDM;
541
+ feature = dc_isar_feature(aa64_rdm, s);
542
break;
543
case 0x02: /* SDOT (vector) */
544
case 0x12: /* UDOT (vector) */
545
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
546
unallocated_encoding(s);
547
return;
548
}
549
- feature = ARM_FEATURE_V8_DOTPROD;
550
+ feature = dc_isar_feature(aa64_dp, s);
551
break;
552
case 0x18: /* FCMLA, #0 */
553
case 0x19: /* FCMLA, #90 */
554
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
555
unallocated_encoding(s);
556
return;
557
}
558
- feature = ARM_FEATURE_V8_FCMA;
559
+ feature = dc_isar_feature(aa64_fcma, s);
560
break;
561
default:
562
unallocated_encoding(s);
563
return;
564
}
565
- if (!arm_dc_feature(s, feature)) {
566
+ if (!feature) {
567
unallocated_encoding(s);
568
return;
569
}
570
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
571
break;
572
case 0x1d: /* SQRDMLAH */
573
case 0x1f: /* SQRDMLSH */
574
- if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
575
+ if (!dc_isar_feature(aa64_rdm, s)) {
576
unallocated_encoding(s);
577
return;
578
}
579
break;
580
case 0x0e: /* SDOT */
581
case 0x1e: /* UDOT */
582
- if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
583
+ if (size != MO_32 || !dc_isar_feature(aa64_dp, s)) {
584
unallocated_encoding(s);
585
return;
586
}
587
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
588
case 0x13: /* FCMLA #90 */
589
case 0x15: /* FCMLA #180 */
590
case 0x17: /* FCMLA #270 */
591
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
592
+ if (!dc_isar_feature(aa64_fcma, s)) {
593
unallocated_encoding(s);
594
return;
595
}
596
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_aes(DisasContext *s, uint32_t insn)
597
TCGv_i32 tcg_decrypt;
598
CryptoThreeOpIntFn *genfn;
599
600
- if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
601
- || size != 0) {
602
+ if (!dc_isar_feature(aa64_aes, s) || size != 0) {
603
unallocated_encoding(s);
604
return;
605
}
606
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
607
int rd = extract32(insn, 0, 5);
608
CryptoThreeOpFn *genfn;
609
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
610
- int feature = ARM_FEATURE_V8_SHA256;
611
+ bool feature;
612
613
if (size != 0) {
614
unallocated_encoding(s);
615
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
616
case 2: /* SHA1M */
617
case 3: /* SHA1SU0 */
618
genfn = NULL;
619
- feature = ARM_FEATURE_V8_SHA1;
620
+ feature = dc_isar_feature(aa64_sha1, s);
621
break;
622
case 4: /* SHA256H */
623
genfn = gen_helper_crypto_sha256h;
624
+ feature = dc_isar_feature(aa64_sha256, s);
625
break;
626
case 5: /* SHA256H2 */
627
genfn = gen_helper_crypto_sha256h2;
628
+ feature = dc_isar_feature(aa64_sha256, s);
629
break;
630
case 6: /* SHA256SU1 */
631
genfn = gen_helper_crypto_sha256su1;
632
+ feature = dc_isar_feature(aa64_sha256, s);
633
break;
634
default:
635
unallocated_encoding(s);
636
return;
637
}
638
639
- if (!arm_dc_feature(s, feature)) {
640
+ if (!feature) {
641
unallocated_encoding(s);
642
return;
643
}
644
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
645
int rn = extract32(insn, 5, 5);
646
int rd = extract32(insn, 0, 5);
647
CryptoTwoOpFn *genfn;
648
- int feature;
649
+ bool feature;
650
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
651
652
if (size != 0) {
653
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
654
655
switch (opcode) {
656
case 0: /* SHA1H */
657
- feature = ARM_FEATURE_V8_SHA1;
658
+ feature = dc_isar_feature(aa64_sha1, s);
659
genfn = gen_helper_crypto_sha1h;
660
break;
661
case 1: /* SHA1SU1 */
662
- feature = ARM_FEATURE_V8_SHA1;
663
+ feature = dc_isar_feature(aa64_sha1, s);
664
genfn = gen_helper_crypto_sha1su1;
665
break;
666
case 2: /* SHA256SU0 */
667
- feature = ARM_FEATURE_V8_SHA256;
668
+ feature = dc_isar_feature(aa64_sha256, s);
669
genfn = gen_helper_crypto_sha256su0;
670
break;
671
default:
672
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
673
return;
674
}
675
676
- if (!arm_dc_feature(s, feature)) {
677
+ if (!feature) {
678
unallocated_encoding(s);
679
return;
680
}
681
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
682
int rm = extract32(insn, 16, 5);
683
int rn = extract32(insn, 5, 5);
684
int rd = extract32(insn, 0, 5);
685
- int feature;
686
+ bool feature;
687
CryptoThreeOpFn *genfn;
688
689
if (o == 0) {
690
switch (opcode) {
691
case 0: /* SHA512H */
692
- feature = ARM_FEATURE_V8_SHA512;
693
+ feature = dc_isar_feature(aa64_sha512, s);
694
genfn = gen_helper_crypto_sha512h;
695
break;
696
case 1: /* SHA512H2 */
697
- feature = ARM_FEATURE_V8_SHA512;
698
+ feature = dc_isar_feature(aa64_sha512, s);
699
genfn = gen_helper_crypto_sha512h2;
700
break;
701
case 2: /* SHA512SU1 */
702
- feature = ARM_FEATURE_V8_SHA512;
703
+ feature = dc_isar_feature(aa64_sha512, s);
704
genfn = gen_helper_crypto_sha512su1;
705
break;
706
case 3: /* RAX1 */
707
- feature = ARM_FEATURE_V8_SHA3;
708
+ feature = dc_isar_feature(aa64_sha3, s);
709
genfn = NULL;
710
break;
711
}
712
} else {
713
switch (opcode) {
714
case 0: /* SM3PARTW1 */
715
- feature = ARM_FEATURE_V8_SM3;
716
+ feature = dc_isar_feature(aa64_sm3, s);
717
genfn = gen_helper_crypto_sm3partw1;
718
break;
719
case 1: /* SM3PARTW2 */
720
- feature = ARM_FEATURE_V8_SM3;
721
+ feature = dc_isar_feature(aa64_sm3, s);
722
genfn = gen_helper_crypto_sm3partw2;
723
break;
724
case 2: /* SM4EKEY */
725
- feature = ARM_FEATURE_V8_SM4;
726
+ feature = dc_isar_feature(aa64_sm4, s);
727
genfn = gen_helper_crypto_sm4ekey;
728
break;
729
default:
730
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
731
}
732
}
733
734
- if (!arm_dc_feature(s, feature)) {
735
+ if (!feature) {
736
unallocated_encoding(s);
737
return;
738
}
739
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
740
int rn = extract32(insn, 5, 5);
741
int rd = extract32(insn, 0, 5);
742
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
743
- int feature;
744
+ bool feature;
745
CryptoTwoOpFn *genfn;
746
747
switch (opcode) {
748
case 0: /* SHA512SU0 */
749
- feature = ARM_FEATURE_V8_SHA512;
750
+ feature = dc_isar_feature(aa64_sha512, s);
751
genfn = gen_helper_crypto_sha512su0;
752
break;
753
case 1: /* SM4E */
754
- feature = ARM_FEATURE_V8_SM4;
755
+ feature = dc_isar_feature(aa64_sm4, s);
756
genfn = gen_helper_crypto_sm4e;
757
break;
758
default:
759
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
760
return;
761
}
762
763
- if (!arm_dc_feature(s, feature)) {
764
+ if (!feature) {
765
unallocated_encoding(s);
766
return;
767
}
768
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
769
int ra = extract32(insn, 10, 5);
770
int rn = extract32(insn, 5, 5);
771
int rd = extract32(insn, 0, 5);
772
- int feature;
773
+ bool feature;
774
775
switch (op0) {
776
case 0: /* EOR3 */
777
case 1: /* BCAX */
778
- feature = ARM_FEATURE_V8_SHA3;
779
+ feature = dc_isar_feature(aa64_sha3, s);
780
break;
781
case 2: /* SM3SS1 */
782
- feature = ARM_FEATURE_V8_SM3;
783
+ feature = dc_isar_feature(aa64_sm3, s);
784
break;
785
default:
786
unallocated_encoding(s);
787
return;
788
}
789
790
- if (!arm_dc_feature(s, feature)) {
791
+ if (!feature) {
792
unallocated_encoding(s);
793
return;
794
}
795
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn)
796
TCGv_i64 tcg_op1, tcg_op2, tcg_res[2];
797
int pass;
798
799
- if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA3)) {
800
+ if (!dc_isar_feature(aa64_sha3, s)) {
801
unallocated_encoding(s);
802
return;
803
}
804
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
805
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
806
TCGv_i32 tcg_imm2, tcg_opcode;
807
808
- if (!arm_dc_feature(s, ARM_FEATURE_V8_SM3)) {
809
+ if (!dc_isar_feature(aa64_sm3, s)) {
810
unallocated_encoding(s);
811
return;
812
}
813
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
814
ARMCPU *arm_cpu = arm_env_get_cpu(env);
815
int bound;
816
817
+ dc->isar = &arm_cpu->isar;
818
dc->pc = dc->base.pc_first;
819
dc->condjmp = 0;
820
821
diff --git a/target/arm/translate.c b/target/arm/translate.c
134
diff --git a/target/arm/translate.c b/target/arm/translate.c
822
index XXXXXXX..XXXXXXX 100644
135
index XXXXXXX..XXXXXXX 100644
823
--- a/target/arm/translate.c
136
--- a/target/arm/translate.c
824
+++ b/target/arm/translate.c
137
+++ b/target/arm/translate.c
825
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
138
@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
826
static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
139
tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
827
int q, int rd, int rn, int rm)
140
}
141
142
-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
143
-
144
-const GVecGen3 cmtst_op[4] = {
145
- { .fni4 = gen_helper_neon_tst_u8,
146
- .fniv = gen_cmtst_vec,
147
- .opt_opc = vecop_list_cmtst,
148
- .vece = MO_8 },
149
- { .fni4 = gen_helper_neon_tst_u16,
150
- .fniv = gen_cmtst_vec,
151
- .opt_opc = vecop_list_cmtst,
152
- .vece = MO_16 },
153
- { .fni4 = gen_cmtst_i32,
154
- .fniv = gen_cmtst_vec,
155
- .opt_opc = vecop_list_cmtst,
156
- .vece = MO_32 },
157
- { .fni8 = gen_cmtst_i64,
158
- .fniv = gen_cmtst_vec,
159
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
160
- .opt_opc = vecop_list_cmtst,
161
- .vece = MO_64 },
162
-};
163
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
164
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
165
+{
166
+ static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
167
+ static const GVecGen3 ops[4] = {
168
+ { .fni4 = gen_helper_neon_tst_u8,
169
+ .fniv = gen_cmtst_vec,
170
+ .opt_opc = vecop_list,
171
+ .vece = MO_8 },
172
+ { .fni4 = gen_helper_neon_tst_u16,
173
+ .fniv = gen_cmtst_vec,
174
+ .opt_opc = vecop_list,
175
+ .vece = MO_16 },
176
+ { .fni4 = gen_cmtst_i32,
177
+ .fniv = gen_cmtst_vec,
178
+ .opt_opc = vecop_list,
179
+ .vece = MO_32 },
180
+ { .fni8 = gen_cmtst_i64,
181
+ .fniv = gen_cmtst_vec,
182
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
183
+ .opt_opc = vecop_list,
184
+ .vece = MO_64 },
185
+ };
186
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
187
+}
188
189
void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
828
{
190
{
829
- if (arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
191
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
830
+ if (dc_isar_feature(aa32_rdm, s)) {
192
tcg_temp_free_vec(rsh);
831
int opr_sz = (1 + q) * 8;
193
}
832
tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
194
833
vfp_reg_offset(1, rn),
195
-static const TCGOpcode ushl_list[] = {
834
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
196
- INDEX_op_neg_vec, INDEX_op_shlv_vec,
835
return 1;
197
- INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
836
}
198
-};
837
if (!u) { /* SHA-1 */
199
-
838
- if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)) {
200
-const GVecGen3 ushl_op[4] = {
839
+ if (!dc_isar_feature(aa32_sha1, s)) {
201
- { .fniv = gen_ushl_vec,
840
return 1;
202
- .fno = gen_helper_gvec_ushl_b,
841
}
203
- .opt_opc = ushl_list,
842
ptr1 = vfp_reg_ptr(true, rd);
204
- .vece = MO_8 },
843
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
205
- { .fniv = gen_ushl_vec,
844
gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
206
- .fno = gen_helper_gvec_ushl_h,
845
tcg_temp_free_i32(tmp4);
207
- .opt_opc = ushl_list,
846
} else { /* SHA-256 */
208
- .vece = MO_16 },
847
- if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA256) || size == 3) {
209
- { .fni4 = gen_ushl_i32,
848
+ if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
210
- .fniv = gen_ushl_vec,
849
return 1;
211
- .opt_opc = ushl_list,
850
}
212
- .vece = MO_32 },
851
ptr1 = vfp_reg_ptr(true, rd);
213
- { .fni8 = gen_ushl_i64,
852
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
214
- .fniv = gen_ushl_vec,
853
if (op == 14 && size == 2) {
215
- .opt_opc = ushl_list,
854
TCGv_i64 tcg_rn, tcg_rm, tcg_rd;
216
- .vece = MO_64 },
855
217
-};
856
- if (!arm_dc_feature(s, ARM_FEATURE_V8_PMULL)) {
218
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
857
+ if (!dc_isar_feature(aa32_pmull, s)) {
219
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
858
return 1;
220
+{
859
}
221
+ static const TCGOpcode vecop_list[] = {
860
tcg_rn = tcg_temp_new_i64();
222
+ INDEX_op_neg_vec, INDEX_op_shlv_vec,
861
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
223
+ INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
862
{
224
+ };
863
NeonGenThreeOpEnvFn *fn;
225
+ static const GVecGen3 ops[4] = {
864
226
+ { .fniv = gen_ushl_vec,
865
- if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
227
+ .fno = gen_helper_gvec_ushl_b,
866
+ if (!dc_isar_feature(aa32_rdm, s)) {
228
+ .opt_opc = vecop_list,
867
return 1;
229
+ .vece = MO_8 },
868
}
230
+ { .fniv = gen_ushl_vec,
869
if (u && ((rd | rn) & 1)) {
231
+ .fno = gen_helper_gvec_ushl_h,
870
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
232
+ .opt_opc = vecop_list,
871
break;
233
+ .vece = MO_16 },
872
}
234
+ { .fni4 = gen_ushl_i32,
873
case NEON_2RM_AESE: case NEON_2RM_AESMC:
235
+ .fniv = gen_ushl_vec,
874
- if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
236
+ .opt_opc = vecop_list,
875
- || ((rm | rd) & 1)) {
237
+ .vece = MO_32 },
876
+ if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
238
+ { .fni8 = gen_ushl_i64,
877
return 1;
239
+ .fniv = gen_ushl_vec,
878
}
240
+ .opt_opc = vecop_list,
879
ptr1 = vfp_reg_ptr(true, rd);
241
+ .vece = MO_64 },
880
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
242
+ };
881
tcg_temp_free_i32(tmp3);
243
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
882
break;
244
+}
883
case NEON_2RM_SHA1H:
245
884
- if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)
246
void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
885
- || ((rm | rd) & 1)) {
247
{
886
+ if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
248
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
887
return 1;
249
tcg_temp_free_vec(tmp);
888
}
250
}
889
ptr1 = vfp_reg_ptr(true, rd);
251
890
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
252
-static const TCGOpcode sshl_list[] = {
891
}
253
- INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
892
/* bit 6 (q): set -> SHA256SU0, cleared -> SHA1SU1 */
254
- INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
893
if (q) {
255
-};
894
- if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA256)) {
256
-
895
+ if (!dc_isar_feature(aa32_sha2, s)) {
257
-const GVecGen3 sshl_op[4] = {
896
return 1;
258
- { .fniv = gen_sshl_vec,
897
}
259
- .fno = gen_helper_gvec_sshl_b,
898
- } else if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)) {
260
- .opt_opc = sshl_list,
899
+ } else if (!dc_isar_feature(aa32_sha1, s)) {
261
- .vece = MO_8 },
900
return 1;
262
- { .fniv = gen_sshl_vec,
901
}
263
- .fno = gen_helper_gvec_sshl_h,
902
ptr1 = vfp_reg_ptr(true, rd);
264
- .opt_opc = sshl_list,
903
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
265
- .vece = MO_16 },
904
/* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
266
- { .fni4 = gen_sshl_i32,
905
int size = extract32(insn, 20, 1);
267
- .fniv = gen_sshl_vec,
906
data = extract32(insn, 23, 2); /* rot */
268
- .opt_opc = sshl_list,
907
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
269
- .vece = MO_32 },
908
+ if (!dc_isar_feature(aa32_vcma, s)
270
- { .fni8 = gen_sshl_i64,
909
|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
271
- .fniv = gen_sshl_vec,
910
return 1;
272
- .opt_opc = sshl_list,
911
}
273
- .vece = MO_64 },
912
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
274
-};
913
/* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
275
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
914
int size = extract32(insn, 20, 1);
276
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
915
data = extract32(insn, 24, 1); /* rot */
277
+{
916
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
278
+ static const TCGOpcode vecop_list[] = {
917
+ if (!dc_isar_feature(aa32_vcma, s)
279
+ INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
918
|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
280
+ INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
919
return 1;
281
+ };
920
}
282
+ static const GVecGen3 ops[4] = {
921
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
283
+ { .fniv = gen_sshl_vec,
922
} else if ((insn & 0xfeb00f00) == 0xfc200d00) {
284
+ .fno = gen_helper_gvec_sshl_b,
923
/* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
285
+ .opt_opc = vecop_list,
924
bool u = extract32(insn, 4, 1);
286
+ .vece = MO_8 },
925
- if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
287
+ { .fniv = gen_sshl_vec,
926
+ if (!dc_isar_feature(aa32_dp, s)) {
288
+ .fno = gen_helper_gvec_sshl_h,
927
return 1;
289
+ .opt_opc = vecop_list,
928
}
290
+ .vece = MO_16 },
929
fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
291
+ { .fni4 = gen_sshl_i32,
930
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
292
+ .fniv = gen_sshl_vec,
931
int size = extract32(insn, 23, 1);
293
+ .opt_opc = vecop_list,
932
int index;
294
+ .vece = MO_32 },
933
295
+ { .fni8 = gen_sshl_i64,
934
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
296
+ .fniv = gen_sshl_vec,
935
+ if (!dc_isar_feature(aa32_vcma, s)) {
297
+ .opt_opc = vecop_list,
936
return 1;
298
+ .vece = MO_64 },
937
}
299
+ };
938
if (size == 0) {
300
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
939
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
301
+}
940
} else if ((insn & 0xffb00f00) == 0xfe200d00) {
302
941
/* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
303
static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
942
int u = extract32(insn, 4, 1);
304
TCGv_vec a, TCGv_vec b)
943
- if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
944
+ if (!dc_isar_feature(aa32_dp, s)) {
945
return 1;
946
}
947
fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
948
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
949
* op1 == 3 is UNPREDICTABLE but handle as UNDEFINED.
950
* Bits 8, 10 and 11 should be zero.
951
*/
952
- if (!arm_dc_feature(s, ARM_FEATURE_CRC) || op1 == 0x3 ||
953
- (c & 0xd) != 0) {
954
+ if (!dc_isar_feature(aa32_crc32, s) || op1 == 0x3 || (c & 0xd) != 0) {
955
goto illegal_op;
956
}
957
958
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
959
case 0x28:
960
case 0x29:
961
case 0x2a:
962
- if (!arm_dc_feature(s, ARM_FEATURE_CRC)) {
963
+ if (!dc_isar_feature(aa32_crc32, s)) {
964
goto illegal_op;
965
}
966
break;
967
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
968
CPUARMState *env = cs->env_ptr;
969
ARMCPU *cpu = arm_env_get_cpu(env);
970
971
+ dc->isar = &cpu->isar;
972
dc->pc = dc->base.pc_first;
973
dc->condjmp = 0;
974
975
--
305
--
976
2.19.1
306
2.20.1
977
307
978
308
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Provide a functional interface for the vector expansion.
4
This fits better with the existing set of helpers that
5
we provide for other operations.
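
For readers unfamiliar with the pattern, here is a minimal standalone sketch of what "a functional interface" means in this context. It is not QEMU code: DemoOp, demo_uqadd and the uqadd8/16/32 helpers are hypothetical names invented for illustration, and scalar saturating addition stands in for the real vector expansion. The point is only that the per-element-size table becomes private to one function, and callers pass vece instead of indexing an exported array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-element-size expansion descriptor. */
typedef struct {
    uint64_t (*fn)(uint64_t a, uint64_t b); /* saturating add at this width */
    unsigned vece;                          /* log2 of element size in bytes */
} DemoOp;

/* Unsigned saturating add, clamped to the given maximum value. */
static uint64_t sat_add(uint64_t a, uint64_t b, uint64_t max)
{
    uint64_t r = (a & max) + (b & max);     /* cannot overflow 64 bits */
    return r > max ? max : r;
}

static uint64_t uqadd8(uint64_t a, uint64_t b)  { return sat_add(a, b, UINT8_MAX); }
static uint64_t uqadd16(uint64_t a, uint64_t b) { return sat_add(a, b, UINT16_MAX); }
static uint64_t uqadd32(uint64_t a, uint64_t b) { return sat_add(a, b, UINT32_MAX); }

/*
 * Functional interface: the table is a static local, so no other
 * translation unit can depend on its layout. Callers select the
 * element size with vece (0 = 8-bit, 1 = 16-bit, 2 = 32-bit).
 */
uint64_t demo_uqadd(unsigned vece, uint64_t a, uint64_t b)
{
    static const DemoOp ops[3] = {
        { .fn = uqadd8,  .vece = 0 },
        { .fn = uqadd16, .vece = 1 },
        { .fn = uqadd32, .vece = 2 },
    };
    assert(vece < 3);
    return ops[vece].fn(a, b);
}
```

A caller would write demo_uqadd(0, 200, 100) for a byte-wide saturating add, rather than reaching into an exported table as in "(u ? uqadd_op : sqadd_op) + size".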
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-18-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
5
[PMM: added parens in ?: expression]
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
11
---
9
target/arm/translate.c | 81 ++++++++++++++----------------------------
12
target/arm/translate.h | 13 +-
10
1 file changed, 26 insertions(+), 55 deletions(-)
13
target/arm/translate-a64.c | 22 ++-
11
14
target/arm/translate-neon.inc.c | 19 +--
15
target/arm/translate.c | 228 +++++++++++++++++---------------
16
4 files changed, 147 insertions(+), 135 deletions(-)
17
18
diff --git a/target/arm/translate.h b/target/arm/translate.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate.h
21
+++ b/target/arm/translate.h
22
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
23
void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
24
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
26
-extern const GVecGen4 uqadd_op[4];
27
-extern const GVecGen4 sqadd_op[4];
28
-extern const GVecGen4 uqsub_op[4];
29
-extern const GVecGen4 sqsub_op[4];
30
void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
31
void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
32
void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
33
void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
34
void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
35
36
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
37
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
38
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
39
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
40
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
41
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
42
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
43
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
44
+
45
void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
46
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
47
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
48
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/translate-a64.c
51
+++ b/target/arm/translate-a64.c
52
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
53
54
switch (opcode) {
55
case 0x01: /* SQADD, UQADD */
56
- tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
57
- offsetof(CPUARMState, vfp.qc),
58
- vec_full_reg_offset(s, rn),
59
- vec_full_reg_offset(s, rm),
60
- is_q ? 16 : 8, vec_full_reg_size(s),
61
- (u ? uqadd_op : sqadd_op) + size);
62
+ if (u) {
63
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
64
+ } else {
65
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
66
+ }
67
return;
68
case 0x05: /* SQSUB, UQSUB */
69
- tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
70
- offsetof(CPUARMState, vfp.qc),
71
- vec_full_reg_offset(s, rn),
72
- vec_full_reg_offset(s, rm),
73
- is_q ? 16 : 8, vec_full_reg_size(s),
74
- (u ? uqsub_op : sqsub_op) + size);
75
+ if (u) {
76
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
77
+ } else {
78
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
79
+ }
80
return;
81
case 0x08: /* SSHL, USHL */
82
if (u) {
83
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
84
index XXXXXXX..XXXXXXX 100644
85
--- a/target/arm/translate-neon.inc.c
86
+++ b/target/arm/translate-neon.inc.c
87
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
88
DO_3SAME(VEOR, tcg_gen_gvec_xor)
89
DO_3SAME(VSHL_S, gen_gvec_sshl)
90
DO_3SAME(VSHL_U, gen_gvec_ushl)
91
+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
92
+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
93
+DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
94
+DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
95
96
/* These insns are all gvec_bitsel but with the inputs in various orders. */
97
#define DO_3SAME_BITSEL(INSN, O1, O2, O3) \
98
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
99
DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
100
DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
101
102
-#define DO_3SAME_GVEC4(INSN, OPARRAY) \
103
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
104
- uint32_t rn_ofs, uint32_t rm_ofs, \
105
- uint32_t oprsz, uint32_t maxsz) \
106
- { \
107
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), \
108
- rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]); \
109
- } \
110
- DO_3SAME(INSN, gen_##INSN##_3s)
111
-
112
-DO_3SAME_GVEC4(VQADD_S, sqadd_op)
113
-DO_3SAME_GVEC4(VQADD_U, uqadd_op)
114
-DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
115
-DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
116
-
117
static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
118
uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
119
{
12
diff --git a/target/arm/translate.c b/target/arm/translate.c
120
diff --git a/target/arm/translate.c b/target/arm/translate.c
13
index XXXXXXX..XXXXXXX 100644
121
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate.c
122
--- a/target/arm/translate.c
15
+++ b/target/arm/translate.c
123
+++ b/target/arm/translate.c
16
@@ -XXX,XX +XXX,XX @@ static void gen_vfp_msr(TCGv_i32 tmp)
124
@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
17
tcg_temp_free_i32(tmp);
125
tcg_temp_free_vec(x);
18
}
126
}
19
127
20
-static void gen_neon_dup_u8(TCGv_i32 var, int shift)
128
-static const TCGOpcode vecop_list_uqadd[] = {
21
-{
129
- INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
22
- TCGv_i32 tmp = tcg_temp_new_i32();
130
-};
23
- if (shift)
131
-
24
- tcg_gen_shri_i32(var, var, shift);
132
-const GVecGen4 uqadd_op[4] = {
25
- tcg_gen_ext8u_i32(var, var);
133
- { .fniv = gen_uqadd_vec,
26
- tcg_gen_shli_i32(tmp, var, 8);
134
- .fno = gen_helper_gvec_uqadd_b,
27
- tcg_gen_or_i32(var, var, tmp);
135
- .write_aofs = true,
28
- tcg_gen_shli_i32(tmp, var, 16);
136
- .opt_opc = vecop_list_uqadd,
29
- tcg_gen_or_i32(var, var, tmp);
137
- .vece = MO_8 },
30
- tcg_temp_free_i32(tmp);
138
- { .fniv = gen_uqadd_vec,
31
-}
139
- .fno = gen_helper_gvec_uqadd_h,
32
-
140
- .write_aofs = true,
33
static void gen_neon_dup_low16(TCGv_i32 var)
141
- .opt_opc = vecop_list_uqadd,
34
{
142
- .vece = MO_16 },
35
TCGv_i32 tmp = tcg_temp_new_i32();
143
- { .fniv = gen_uqadd_vec,
36
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
144
- .fno = gen_helper_gvec_uqadd_s,
37
tcg_temp_free_i32(tmp);
145
- .write_aofs = true,
146
- .opt_opc = vecop_list_uqadd,
147
- .vece = MO_32 },
148
- { .fniv = gen_uqadd_vec,
149
- .fno = gen_helper_gvec_uqadd_d,
150
- .write_aofs = true,
151
- .opt_opc = vecop_list_uqadd,
152
- .vece = MO_64 },
153
-};
154
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
155
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
156
+{
157
+ static const TCGOpcode vecop_list[] = {
158
+ INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
159
+ };
160
+ static const GVecGen4 ops[4] = {
161
+ { .fniv = gen_uqadd_vec,
162
+ .fno = gen_helper_gvec_uqadd_b,
163
+ .write_aofs = true,
164
+ .opt_opc = vecop_list,
165
+ .vece = MO_8 },
166
+ { .fniv = gen_uqadd_vec,
167
+ .fno = gen_helper_gvec_uqadd_h,
168
+ .write_aofs = true,
169
+ .opt_opc = vecop_list,
170
+ .vece = MO_16 },
171
+ { .fniv = gen_uqadd_vec,
172
+ .fno = gen_helper_gvec_uqadd_s,
173
+ .write_aofs = true,
174
+ .opt_opc = vecop_list,
175
+ .vece = MO_32 },
176
+ { .fniv = gen_uqadd_vec,
177
+ .fno = gen_helper_gvec_uqadd_d,
178
+ .write_aofs = true,
179
+ .opt_opc = vecop_list,
180
+ .vece = MO_64 },
181
+ };
182
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
183
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
184
+}
185
186
static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
187
TCGv_vec a, TCGv_vec b)
188
@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
189
tcg_temp_free_vec(x);
38
}
190
}
39
191
40
-static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
192
-static const TCGOpcode vecop_list_sqadd[] = {
41
-{
193
- INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
42
- /* Load a single Neon element and replicate into a 32 bit TCG reg */
194
-};
43
- TCGv_i32 tmp = tcg_temp_new_i32();
195
-
44
- switch (size) {
196
-const GVecGen4 sqadd_op[4] = {
45
- case 0:
197
- { .fniv = gen_sqadd_vec,
46
- gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
198
- .fno = gen_helper_gvec_sqadd_b,
47
- gen_neon_dup_u8(tmp, 0);
199
- .opt_opc = vecop_list_sqadd,
48
- break;
200
- .write_aofs = true,
49
- case 1:
201
- .vece = MO_8 },
50
- gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
202
- { .fniv = gen_sqadd_vec,
51
- gen_neon_dup_low16(tmp);
203
- .fno = gen_helper_gvec_sqadd_h,
52
- break;
204
- .opt_opc = vecop_list_sqadd,
53
- case 2:
205
- .write_aofs = true,
54
- gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
206
- .vece = MO_16 },
55
- break;
207
- { .fniv = gen_sqadd_vec,
56
- default: /* Avoid compiler warnings. */
208
- .fno = gen_helper_gvec_sqadd_s,
57
- abort();
209
- .opt_opc = vecop_list_sqadd,
58
- }
210
- .write_aofs = true,
59
- return tmp;
211
- .vece = MO_32 },
60
-}
212
- { .fniv = gen_sqadd_vec,
61
-
213
- .fno = gen_helper_gvec_sqadd_d,
62
static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
214
- .opt_opc = vecop_list_sqadd,
63
uint32_t dp)
215
- .write_aofs = true,
64
{
216
- .vece = MO_64 },
65
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
217
-};
66
int load;
218
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
67
int shift;
219
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
68
int n;
220
+{
69
+ int vec_size;
221
+ static const TCGOpcode vecop_list[] = {
70
TCGv_i32 addr;
222
+ INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
71
TCGv_i32 tmp;
223
+ };
72
TCGv_i32 tmp2;
224
+ static const GVecGen4 ops[4] = {
73
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
225
+ { .fniv = gen_sqadd_vec,
74
}
226
+ .fno = gen_helper_gvec_sqadd_b,
75
addr = tcg_temp_new_i32();
227
+ .opt_opc = vecop_list,
76
load_reg_var(s, addr, rn);
228
+ .write_aofs = true,
77
- if (nregs == 1) {
229
+ .vece = MO_8 },
78
- /* VLD1 to all lanes: bit 5 indicates how many Dregs to write */
230
+ { .fniv = gen_sqadd_vec,
79
- tmp = gen_load_and_replicate(s, addr, size);
231
+ .fno = gen_helper_gvec_sqadd_h,
80
- tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
232
+ .opt_opc = vecop_list,
81
- tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
233
+ .write_aofs = true,
82
- if (insn & (1 << 5)) {
234
+ .vece = MO_16 },
83
- tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 0));
235
+ { .fniv = gen_sqadd_vec,
84
- tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 1));
236
+ .fno = gen_helper_gvec_sqadd_s,
85
- }
237
+ .opt_opc = vecop_list,
86
- tcg_temp_free_i32(tmp);
238
+ .write_aofs = true,
87
- } else {
239
+ .vece = MO_32 },
88
- /* VLD2/3/4 to all lanes: bit 5 indicates register stride */
240
+ { .fniv = gen_sqadd_vec,
89
- stride = (insn & (1 << 5)) ? 2 : 1;
241
+ .fno = gen_helper_gvec_sqadd_d,
90
- for (reg = 0; reg < nregs; reg++) {
242
+ .opt_opc = vecop_list,
91
- tmp = gen_load_and_replicate(s, addr, size);
243
+ .write_aofs = true,
92
- tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
244
+ .vece = MO_64 },
93
- tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
245
+ };
94
- tcg_temp_free_i32(tmp);
246
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
95
- tcg_gen_addi_i32(addr, addr, 1 << size);
247
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
96
- rd += stride;
248
+}
97
+
249
98
+ /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
250
static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
99
+ * VLD2/3/4 to all lanes: bit 5 indicates register stride.
251
TCGv_vec a, TCGv_vec b)
100
+ */
252
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
101
+ stride = (insn & (1 << 5)) ? 2 : 1;
253
tcg_temp_free_vec(x);
102
+ vec_size = nregs == 1 ? stride * 8 : 8;
254
}
103
+
255
104
+ tmp = tcg_temp_new_i32();
256
-static const TCGOpcode vecop_list_uqsub[] = {
105
+ for (reg = 0; reg < nregs; reg++) {
257
- INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
106
+ gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
258
-};
107
+ s->be_data | size);
259
-
108
+ if ((rd & 1) && vec_size == 16) {
260
-const GVecGen4 uqsub_op[4] = {
109
+ /* We cannot write 16 bytes at once because the
261
- { .fniv = gen_uqsub_vec,
110
+ * destination is unaligned.
262
- .fno = gen_helper_gvec_uqsub_b,
111
+ */
263
- .opt_opc = vecop_list_uqsub,
112
+ tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
264
- .write_aofs = true,
113
+ 8, 8, tmp);
265
- .vece = MO_8 },
114
+ tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
266
- { .fniv = gen_uqsub_vec,
115
+ neon_reg_offset(rd, 0), 8, 8);
267
- .fno = gen_helper_gvec_uqsub_h,
116
+ } else {
268
- .opt_opc = vecop_list_uqsub,
117
+ tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
269
- .write_aofs = true,
118
+ vec_size, vec_size, tmp);
270
- .vece = MO_16 },
119
}
271
- { .fniv = gen_uqsub_vec,
120
+ tcg_gen_addi_i32(addr, addr, 1 << size);
272
- .fno = gen_helper_gvec_uqsub_s,
121
+ rd += stride;
273
- .opt_opc = vecop_list_uqsub,
122
}
274
- .write_aofs = true,
123
+ tcg_temp_free_i32(tmp);
275
- .vece = MO_32 },
124
tcg_temp_free_i32(addr);
276
- { .fniv = gen_uqsub_vec,
125
stride = (1 << size) * nregs;
277
- .fno = gen_helper_gvec_uqsub_d,
126
} else {
278
- .opt_opc = vecop_list_uqsub,
279
- .write_aofs = true,
280
- .vece = MO_64 },
281
-};
282
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
283
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
284
+{
285
+ static const TCGOpcode vecop_list[] = {
286
+ INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
287
+ };
288
+ static const GVecGen4 ops[4] = {
289
+ { .fniv = gen_uqsub_vec,
290
+ .fno = gen_helper_gvec_uqsub_b,
291
+ .opt_opc = vecop_list,
292
+ .write_aofs = true,
293
+ .vece = MO_8 },
294
+ { .fniv = gen_uqsub_vec,
295
+ .fno = gen_helper_gvec_uqsub_h,
296
+ .opt_opc = vecop_list,
297
+ .write_aofs = true,
298
+ .vece = MO_16 },
299
+ { .fniv = gen_uqsub_vec,
300
+ .fno = gen_helper_gvec_uqsub_s,
301
+ .opt_opc = vecop_list,
302
+ .write_aofs = true,
303
+ .vece = MO_32 },
304
+ { .fniv = gen_uqsub_vec,
305
+ .fno = gen_helper_gvec_uqsub_d,
306
+ .opt_opc = vecop_list,
307
+ .write_aofs = true,
308
+ .vece = MO_64 },
309
+ };
310
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
311
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
312
+}
313
314
static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
315
TCGv_vec a, TCGv_vec b)
316
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
317
tcg_temp_free_vec(x);
318
}
319
320
-static const TCGOpcode vecop_list_sqsub[] = {
321
- INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
322
-};
323
-
324
-const GVecGen4 sqsub_op[4] = {
325
- { .fniv = gen_sqsub_vec,
326
- .fno = gen_helper_gvec_sqsub_b,
327
- .opt_opc = vecop_list_sqsub,
328
- .write_aofs = true,
329
- .vece = MO_8 },
330
- { .fniv = gen_sqsub_vec,
331
- .fno = gen_helper_gvec_sqsub_h,
332
- .opt_opc = vecop_list_sqsub,
333
- .write_aofs = true,
334
- .vece = MO_16 },
335
- { .fniv = gen_sqsub_vec,
336
- .fno = gen_helper_gvec_sqsub_s,
337
- .opt_opc = vecop_list_sqsub,
338
- .write_aofs = true,
339
- .vece = MO_32 },
340
- { .fniv = gen_sqsub_vec,
341
- .fno = gen_helper_gvec_sqsub_d,
342
- .opt_opc = vecop_list_sqsub,
343
- .write_aofs = true,
344
- .vece = MO_64 },
345
-};
346
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
347
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
348
+{
349
+ static const TCGOpcode vecop_list[] = {
350
+ INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
351
+ };
352
+ static const GVecGen4 ops[4] = {
353
+ { .fniv = gen_sqsub_vec,
354
+ .fno = gen_helper_gvec_sqsub_b,
355
+ .opt_opc = vecop_list,
356
+ .write_aofs = true,
357
+ .vece = MO_8 },
358
+ { .fniv = gen_sqsub_vec,
359
+ .fno = gen_helper_gvec_sqsub_h,
360
+ .opt_opc = vecop_list,
361
+ .write_aofs = true,
362
+ .vece = MO_16 },
363
+ { .fniv = gen_sqsub_vec,
364
+ .fno = gen_helper_gvec_sqsub_s,
365
+ .opt_opc = vecop_list,
366
+ .write_aofs = true,
367
+ .vece = MO_32 },
368
+ { .fniv = gen_sqsub_vec,
369
+ .fno = gen_helper_gvec_sqsub_d,
370
+ .opt_opc = vecop_list,
371
+ .write_aofs = true,
372
+ .vece = MO_64 },
373
+ };
374
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
375
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
376
+}
377
378
/* Translate a NEON data processing instruction. Return nonzero if the
379
instruction is invalid.
127
--
380
--
128
2.19.1
381
2.20.1
129
382
130
383
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3
These operations do not touch fp_status.
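
For illustration only, the reason the fp_status pointer can be dropped is
that the unsigned estimate is a pure function of its 32-bit input. The
sketch below is not the QEMU helper: the names recip_estimate_sketch and
recpe_u32_sketch are made up for this example, and the 9-bit estimate
arithmetic is an assumption about how such an estimate is commonly computed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical 9-bit reciprocal estimate (an assumption for this sketch).
 * Input and result are both in the range [256, 511].
 */
static int recip_estimate_sketch(int input)
{
    int a, b, r;

    assert(input >= 256 && input < 512);
    a = (input * 2) + 1;
    b = (1 << 19) / a;
    r = (b + 1) >> 1;
    assert(r >= 256 && r < 512);
    return r;
}

/*
 * Unsigned reciprocal estimate: no float_status argument is needed,
 * because nothing here reads rounding modes or raises FP exceptions.
 */
static uint32_t recpe_u32_sketch(uint32_t a)
{
    int input, estimate;

    /* Operands with the top bit clear saturate to all-ones. */
    if ((a & 0x80000000u) == 0) {
        return 0xffffffffu;
    }

    input = (a >> 23) & 0x1ff;              /* top nine bits */
    estimate = recip_estimate_sketch(input);
    return (uint32_t)estimate << 23;        /* place in bits [31:23] */
}
```

Because the result depends only on the argument, a helper of this shape
can be declared without the extra pointer parameter.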
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20181016223115.24100-9-richard.henderson@linaro.org
7
Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
9
---
9
target/arm/cpu.h | 17 +++++++++++++++-
10
target/arm/helper.h | 4 ++--
10
linux-user/elfload.c | 6 +-----
11
target/arm/translate-a64.c | 5 ++---
11
target/arm/cpu64.c | 16 ++++++++-------
12
target/arm/translate.c | 12 ++----------
12
target/arm/helper.c | 2 +-
13
target/arm/vfp_helper.c | 5 ++---
13
target/arm/translate-a64.c | 40 +++++++++++++++++++-------------------
14
4 files changed, 8 insertions(+), 18 deletions(-)
14
target/arm/translate.c | 6 +++---
15
6 files changed, 50 insertions(+), 37 deletions(-)
16
15
17
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
18
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/cpu.h
18
--- a/target/arm/helper.h
20
+++ b/target/arm/cpu.h
19
+++ b/target/arm/helper.h
21
@@ -XXX,XX +XXX,XX @@ enum arm_features {
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
22
ARM_FEATURE_PMU, /* has PMU support */
21
DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
23
ARM_FEATURE_VBAR, /* has cp15 VBAR */
22
DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
24
ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
23
DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
25
- ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
24
-DEF_HELPER_2(recpe_u32, i32, i32, ptr)
26
ARM_FEATURE_M_MAIN, /* M profile Main Extension */
25
-DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr)
27
};
26
+DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
28
27
+DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
29
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
28
DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32)
30
return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
29
31
}
30
DEF_HELPER_3(shl_cc, i32, env, i32, i32)
32
33
+static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
34
+{
35
+ /*
36
+ * This is a placeholder for use by VCMA until the rest of
37
+ * the ARMv8.2-FP16 extension is implemented for aa32 mode.
38
+ * At which point we can properly set and check MVFR1.FPHP.
39
+ */
40
+ return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
41
+}
42
+
43
/*
44
* 64-bit feature tests via id registers.
45
*/
46
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
47
return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
48
}
49
50
+static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
51
+{
52
+ /* We always set the AdvSIMD and FP fields identically wrt FP16. */
53
+ return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
54
+}
55
+
56
static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
57
{
58
return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
59
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
60
index XXXXXXX..XXXXXXX 100644
61
--- a/linux-user/elfload.c
62
+++ b/linux-user/elfload.c
63
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
64
hwcaps |= ARM_HWCAP_A64_ASIMD;
65
66
/* probe for the extra features */
67
-#define GET_FEATURE(feat, hwcap) \
68
- do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
69
#define GET_FEATURE_ID(feat, hwcap) \
70
do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
71
72
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
73
GET_FEATURE_ID(aa64_sha3, ARM_HWCAP_A64_SHA3);
74
GET_FEATURE_ID(aa64_sm3, ARM_HWCAP_A64_SM3);
75
GET_FEATURE_ID(aa64_sm4, ARM_HWCAP_A64_SM4);
76
- GET_FEATURE(ARM_FEATURE_V8_FP16,
77
- ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
78
+ GET_FEATURE_ID(aa64_fp16, ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
79
GET_FEATURE_ID(aa64_atomics, ARM_HWCAP_A64_ATOMICS);
80
GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
81
GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
82
GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
83
GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
84
85
-#undef GET_FEATURE
86
#undef GET_FEATURE_ID
87
88
return hwcaps;
89
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
90
index XXXXXXX..XXXXXXX 100644
91
--- a/target/arm/cpu64.c
92
+++ b/target/arm/cpu64.c
93
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
94
95
t = cpu->isar.id_aa64pfr0;
96
t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
97
+ t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);
98
+ t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);
99
cpu->isar.id_aa64pfr0 = t;
100
101
/* Replicate the same data to the 32-bit id registers. */
102
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
103
u = FIELD_DP32(u, ID_ISAR6, DP, 1);
104
cpu->isar.id_isar6 = u;
105
106
-#ifdef CONFIG_USER_ONLY
107
- /* We don't set these in system emulation mode for the moment,
108
- * since we don't correctly set the ID registers to advertise them,
109
- * and in some cases they're only available in AArch64 and not AArch32,
110
- * whereas the architecture requires them to be present in both if
111
- * present in either.
112
+ /*
113
+ * FIXME: We do not yet support ARMv8.2-fp16 for AArch32,
114
+ * so do not set MVFR1.FPHP. Strictly speaking this is not legal,
115
+ * but it is also not legal to enable SVE without support for FP16,
116
+ * and enabling SVE in system mode is more useful in the short term.
117
*/
118
- set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
119
+
120
+#ifdef CONFIG_USER_ONLY
121
/* For usermode -cpu max we can use a larger and more efficient DCZ
122
* blocksize since we don't have to follow what the hardware does.
123
*/
124
diff --git a/target/arm/helper.c b/target/arm/helper.c
125
index XXXXXXX..XXXXXXX 100644
126
--- a/target/arm/helper.c
127
+++ b/target/arm/helper.c
128
@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
129
uint32_t changed;
130
131
/* When ARMv8.2-FP16 is not supported, FZ16 is RES0. */
132
- if (!arm_feature(env, ARM_FEATURE_V8_FP16)) {
133
+ if (!cpu_isar_feature(aa64_fp16, arm_env_get_cpu(env))) {
134
val &= ~FPCR_FZ16;
135
}
136
137
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
31
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
138
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
139
--- a/target/arm/translate-a64.c
33
--- a/target/arm/translate-a64.c
140
+++ b/target/arm/translate-a64.c
34
+++ b/target/arm/translate-a64.c
141
@@ -XXX,XX +XXX,XX @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
35
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode,
142
break;
36
143
case 3:
37
switch (opcode) {
144
size = MO_16;
38
case 0x3c: /* URECPE */
145
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
39
- gen_helper_recpe_u32(tcg_res, tcg_op, fpst);
146
+ if (dc_isar_feature(aa64_fp16, s)) {
40
+ gen_helper_recpe_u32(tcg_res, tcg_op);
147
break;
41
break;
148
}
42
case 0x3d: /* FRECPE */
149
/* fallthru */
43
gen_helper_recpe_f32(tcg_res, tcg_op, fpst);
150
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
44
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
151
break;
152
case 3:
153
size = MO_16;
154
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
155
+ if (dc_isar_feature(aa64_fp16, s)) {
156
break;
157
}
158
/* fallthru */
159
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
160
break;
161
case 3:
162
sz = MO_16;
163
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
164
+ if (dc_isar_feature(aa64_fp16, s)) {
165
break;
166
}
167
/* fallthru */
168
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
169
handle_fp_1src_double(s, opcode, rd, rn);
170
break;
171
case 3:
172
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
173
+ if (!dc_isar_feature(aa64_fp16, s)) {
174
unallocated_encoding(s);
45
unallocated_encoding(s);
175
return;
46
return;
176
}
47
}
177
@@ -XXX,XX +XXX,XX @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
48
- need_fpstatus = true;
178
handle_fp_2src_double(s, opcode, rd, rn, rm);
179
break;
180
case 3:
181
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
182
+ if (!dc_isar_feature(aa64_fp16, s)) {
183
unallocated_encoding(s);
184
return;
185
}
186
@@ -XXX,XX +XXX,XX @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
187
handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
188
break;
189
case 3:
190
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
191
+ if (!dc_isar_feature(aa64_fp16, s)) {
192
unallocated_encoding(s);
193
return;
194
}
195
@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
196
break;
197
case 3:
198
sz = MO_16;
199
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
200
+ if (dc_isar_feature(aa64_fp16, s)) {
201
break;
49
break;
202
}
50
case 0x1e: /* FRINT32Z */
203
/* fallthru */
51
case 0x1f: /* FRINT64Z */
204
@@ -XXX,XX +XXX,XX @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
52
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
205
case 1: /* float64 */
53
gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus);
206
break;
54
break;
207
case 3: /* float16 */
55
case 0x7c: /* URSQRTE */
208
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
56
- gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus);
209
+ if (dc_isar_feature(aa64_fp16, s)) {
57
+ gen_helper_rsqrte_u32(tcg_res, tcg_op);
210
break;
58
break;
211
}
59
case 0x1e: /* FRINT32Z */
212
/* fallthru */
60
case 0x5e: /* FRINT32X */
213
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
214
break;
215
case 0x6: /* 16-bit float, 32-bit int */
216
case 0xe: /* 16-bit float, 64-bit int */
217
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
218
+ if (dc_isar_feature(aa64_fp16, s)) {
219
break;
220
}
221
/* fallthru */
222
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
223
case 1: /* float64 */
224
break;
225
case 3: /* float16 */
226
- if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
227
+ if (dc_isar_feature(aa64_fp16, s)) {
228
break;
229
}
230
/* fallthru */
231
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
232
*/
233
is_min = extract32(size, 1, 1);
234
is_fp = true;
235
- if (!is_u && arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
236
+ if (!is_u && dc_isar_feature(aa64_fp16, s)) {
237
size = 1;
238
} else if (!is_u || !is_q || extract32(size, 0, 1)) {
239
unallocated_encoding(s);
240
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
241
242
if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
243
/* Check for FMOV (vector, immediate) - half-precision */
244
- if (!(arm_dc_feature(s, ARM_FEATURE_V8_FP16) && o2 && cmode == 0xf)) {
245
+ if (!(dc_isar_feature(aa64_fp16, s) && o2 && cmode == 0xf)) {
246
unallocated_encoding(s);
247
return;
248
}
249
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
250
case 0x2f: /* FMINP */
251
/* FP op, size[0] is 32 or 64 bit*/
252
if (!u) {
253
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
254
+ if (!dc_isar_feature(aa64_fp16, s)) {
255
unallocated_encoding(s);
256
return;
257
} else {
258
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
259
size = MO_32;
260
} else if (immh & 2) {
261
size = MO_16;
262
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
263
+ if (!dc_isar_feature(aa64_fp16, s)) {
264
unallocated_encoding(s);
265
return;
266
}
267
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
268
size = MO_32;
269
} else if (immh & 0x2) {
270
size = MO_16;
271
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
272
+ if (!dc_isar_feature(aa64_fp16, s)) {
273
unallocated_encoding(s);
274
return;
275
}
276
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
277
return;
278
}
279
280
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
281
+ if (!dc_isar_feature(aa64_fp16, s)) {
282
unallocated_encoding(s);
283
}
284
285
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
286
TCGv_ptr fpst;
287
bool pairwise = false;
288
289
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
290
+ if (!dc_isar_feature(aa64_fp16, s)) {
291
unallocated_encoding(s);
292
return;
293
}
294
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
295
case 0x1c: /* FCADD, #90 */
296
case 0x1e: /* FCADD, #270 */
297
if (size == 0
298
- || (size == 1 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))
299
+ || (size == 1 && !dc_isar_feature(aa64_fp16, s))
300
|| (size == 3 && !is_q)) {
301
unallocated_encoding(s);
302
return;
303
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
304
bool need_fpst = true;
305
int rmode;
306
307
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
308
+ if (!dc_isar_feature(aa64_fp16, s)) {
309
unallocated_encoding(s);
310
return;
311
}
312
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
313
}
314
break;
315
}
316
- if (is_fp16 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
317
+ if (is_fp16 && !dc_isar_feature(aa64_fp16, s)) {
318
unallocated_encoding(s);
319
return;
320
}
321
diff --git a/target/arm/translate.c b/target/arm/translate.c
61
diff --git a/target/arm/translate.c b/target/arm/translate.c
322
index XXXXXXX..XXXXXXX 100644
62
index XXXXXXX..XXXXXXX 100644
323
--- a/target/arm/translate.c
63
--- a/target/arm/translate.c
324
+++ b/target/arm/translate.c
64
+++ b/target/arm/translate.c
325
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
65
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
326
int size = extract32(insn, 20, 1);
66
break;
327
data = extract32(insn, 23, 2); /* rot */
67
}
328
if (!dc_isar_feature(aa32_vcma, s)
68
case NEON_2RM_VRECPE:
329
- || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
69
- {
330
+ || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
70
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
331
return 1;
71
- gen_helper_recpe_u32(tmp, tmp, fpstatus);
332
}
72
- tcg_temp_free_ptr(fpstatus);
333
fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
73
+ gen_helper_recpe_u32(tmp, tmp);
334
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
74
break;
335
int size = extract32(insn, 20, 1);
75
- }
336
data = extract32(insn, 24, 1); /* rot */
76
case NEON_2RM_VRSQRTE:
337
if (!dc_isar_feature(aa32_vcma, s)
77
- {
338
- || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
78
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
339
+ || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
79
- gen_helper_rsqrte_u32(tmp, tmp, fpstatus);
340
return 1;
80
- tcg_temp_free_ptr(fpstatus);
341
}
81
+ gen_helper_rsqrte_u32(tmp, tmp);
342
fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
82
break;
343
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
83
- }
344
return 1;
84
case NEON_2RM_VRECPE_F:
345
}
85
{
346
if (size == 0) {
86
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
347
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
87
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
348
+ if (!dc_isar_feature(aa32_fp16_arith, s)) {
88
index XXXXXXX..XXXXXXX 100644
349
return 1;
89
--- a/target/arm/vfp_helper.c
350
}
90
+++ b/target/arm/vfp_helper.c
351
/* For fp16, rm is just Vm, and index is M. */
91
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
92
return make_float64(val);
93
}
94
95
-uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
96
+uint32_t HELPER(recpe_u32)(uint32_t a)
97
{
98
- /* float_status *s = fpstp; */
99
int input, estimate;
100
101
if ((a & 0x80000000) == 0) {
102
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
103
return deposit32(0, (32 - 9), 9, estimate);
104
}
105
106
-uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp)
107
+uint32_t HELPER(rsqrte_u32)(uint32_t a)
108
{
109
int estimate;
110
352
--
111
--
353
2.19.1
112
2.20.1
354
113
355
114
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Move mla_op and mls_op expanders from translate-a64.c.
3
Provide a functional interface for the vector expansion.
4
This fits better with the existing set of helpers that
5
we provide for other operations.
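
As a rough sketch of what "functional interface" means here: the caller
passes the element size and register offsets to one function, and the
choice of per-size helper is hidden behind it. Everything below is a
stand-alone stand-in, not QEMU code: expand_fn, expand_s16, expand_s32
and demo_gvec_sqrdmlah are hypothetical names, and the helpers just
return a tag so the dispatch can be observed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an out-of-line vector helper. */
typedef int expand_fn(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs);

static int expand_s16(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs)
{
    (void)rd_ofs; (void)rn_ofs; (void)rm_ofs;
    return 16;      /* tag: the 16-bit element helper was chosen */
}

static int expand_s32(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs)
{
    (void)rd_ofs; (void)rn_ofs; (void)rm_ofs;
    return 32;      /* tag: the 32-bit element helper was chosen */
}

/*
 * Functional interface: vece is log2 of the element size in bytes.
 * Only 16-bit (vece == 1) and 32-bit (vece == 2) elements are valid,
 * so the table has two entries and is indexed by vece - 1.
 */
static int demo_gvec_sqrdmlah(unsigned vece, uint32_t rd_ofs,
                              uint32_t rn_ofs, uint32_t rm_ofs)
{
    static expand_fn * const fns[2] = { expand_s16, expand_s32 };

    assert(vece >= 1 && vece <= 2);
    return fns[vece - 1](rd_ofs, rn_ofs, rm_ofs);
}
```

With this shape a call site no longer needs its own switch on the element
size; it hands vece to the expander and lets the table do the selection.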
4
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181011205206.3552-16-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
11
---
10
target/arm/translate.h | 2 +
12
target/arm/translate.h | 5 ++++
11
target/arm/translate-a64.c | 106 -----------------------------
13
target/arm/translate-a64.c | 34 ++----------------------
12
target/arm/translate.c | 134 ++++++++++++++++++++++++++++++++-----
14
target/arm/translate.c | 54 +++++++++++++++++++-------------------
13
3 files changed, 120 insertions(+), 122 deletions(-)
15
3 files changed, 34 insertions(+), 59 deletions(-)
14
16
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
17
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
19
--- a/target/arm/translate.h
18
+++ b/target/arm/translate.h
20
+++ b/target/arm/translate.h
19
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
21
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
20
extern const GVecGen3 bsl_op;
22
void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
21
extern const GVecGen3 bit_op;
23
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
22
extern const GVecGen3 bif_op;
24
23
+extern const GVecGen3 mla_op[4];
25
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
24
+extern const GVecGen3 mls_op[4];
26
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
extern const GVecGen2i ssra_op[4];
27
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
26
extern const GVecGen2i usra_op[4];
28
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
27
extern const GVecGen2i sri_op[4];
29
+
30
/*
31
* Forward to the isar_feature_* tests given a DisasContext pointer.
32
*/
28
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
33
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
29
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-a64.c
35
--- a/target/arm/translate-a64.c
31
+++ b/target/arm/translate-a64.c
36
+++ b/target/arm/translate-a64.c
32
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
37
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
33
}
38
is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
34
}
39
}
35
40
36
-static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
41
-/* Expand a 3-operand + env pointer operation using
42
- * an out-of-line helper.
43
- */
44
-static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
45
- int rn, int rm, gen_helper_gvec_3_ptr *fn)
37
-{
46
-{
38
- gen_helper_neon_mul_u8(a, a, b);
47
- tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
39
- gen_helper_neon_add_u8(d, d, a);
48
- vec_full_reg_offset(s, rn),
49
- vec_full_reg_offset(s, rm), cpu_env,
50
- is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
40
-}
51
-}
41
-
52
-
42
-static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
53
/* Expand a 3-operand + fpstatus pointer + simd data value operation using
43
-{
54
* an out-of-line helper.
44
- gen_helper_neon_mul_u16(a, a, b);
55
*/
45
- gen_helper_neon_add_u16(d, d, a);
56
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
46
-}
57
47
-
58
switch (opcode) {
48
-static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
59
case 0x0: /* SQRDMLAH (vector) */
49
-{
60
- switch (size) {
50
- tcg_gen_mul_i32(a, a, b);
61
- case 1:
51
- tcg_gen_add_i32(d, d, a);
62
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
52
-}
63
- break;
53
-
64
- case 2:
54
-static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
65
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
55
-{
66
- break;
56
- tcg_gen_mul_i64(a, a, b);
67
- default:
57
- tcg_gen_add_i64(d, d, a);
68
- g_assert_not_reached();
58
-}
69
- }
59
-
70
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size);
60
-static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
71
return;
61
-{
72
62
- tcg_gen_mul_vec(vece, a, a, b);
73
case 0x1: /* SQRDMLSH (vector) */
63
- tcg_gen_add_vec(vece, d, d, a);
74
- switch (size) {
64
-}
75
- case 1:
65
-
76
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
66
-static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
77
- break;
67
-{
78
- case 2:
68
- gen_helper_neon_mul_u8(a, a, b);
79
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
69
- gen_helper_neon_sub_u8(d, d, a);
80
- break;
70
-}
81
- default:
71
-
82
- g_assert_not_reached();
72
-static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
83
- }
73
-{
84
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size);
74
- gen_helper_neon_mul_u16(a, a, b);
85
return;
75
- gen_helper_neon_sub_u16(d, d, a);
86
76
-}
87
case 0x2: /* SDOT / UDOT */
77
-
78
-static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
79
-{
80
- tcg_gen_mul_i32(a, a, b);
81
- tcg_gen_sub_i32(d, d, a);
82
-}
83
-
84
-static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
85
-{
86
- tcg_gen_mul_i64(a, a, b);
87
- tcg_gen_sub_i64(d, d, a);
88
-}
89
-
90
-static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
91
-{
92
- tcg_gen_mul_vec(vece, a, a, b);
93
- tcg_gen_sub_vec(vece, d, d, a);
94
-}
95
-
96
/* Integer op subgroup of C3.6.16. */
97
static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
98
{
99
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
100
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
101
.vece = MO_64 },
102
};
103
- static const GVecGen3 mla_op[4] = {
104
- { .fni4 = gen_mla8_i32,
105
- .fniv = gen_mla_vec,
106
- .opc = INDEX_op_mul_vec,
107
- .load_dest = true,
108
- .vece = MO_8 },
109
- { .fni4 = gen_mla16_i32,
110
- .fniv = gen_mla_vec,
111
- .opc = INDEX_op_mul_vec,
112
- .load_dest = true,
113
- .vece = MO_16 },
114
- { .fni4 = gen_mla32_i32,
115
- .fniv = gen_mla_vec,
116
- .opc = INDEX_op_mul_vec,
117
- .load_dest = true,
118
- .vece = MO_32 },
119
- { .fni8 = gen_mla64_i64,
120
- .fniv = gen_mla_vec,
121
- .opc = INDEX_op_mul_vec,
122
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
123
- .load_dest = true,
124
- .vece = MO_64 },
125
- };
126
- static const GVecGen3 mls_op[4] = {
127
- { .fni4 = gen_mls8_i32,
128
- .fniv = gen_mls_vec,
129
- .opc = INDEX_op_mul_vec,
130
- .load_dest = true,
131
- .vece = MO_8 },
132
- { .fni4 = gen_mls16_i32,
133
- .fniv = gen_mls_vec,
134
- .opc = INDEX_op_mul_vec,
135
- .load_dest = true,
136
- .vece = MO_16 },
137
- { .fni4 = gen_mls32_i32,
138
- .fniv = gen_mls_vec,
139
- .opc = INDEX_op_mul_vec,
140
- .load_dest = true,
141
- .vece = MO_32 },
142
- { .fni8 = gen_mls64_i64,
143
- .fniv = gen_mls_vec,
144
- .opc = INDEX_op_mul_vec,
145
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
146
- .load_dest = true,
147
- .vece = MO_64 },
148
- };
149
150
int is_q = extract32(insn, 30, 1);
151
int u = extract32(insn, 29, 1);
152
diff --git a/target/arm/translate.c b/target/arm/translate.c
88
diff --git a/target/arm/translate.c b/target/arm/translate.c
153
index XXXXXXX..XXXXXXX 100644
89
index XXXXXXX..XXXXXXX 100644
154
--- a/target/arm/translate.c
90
--- a/target/arm/translate.c
155
+++ b/target/arm/translate.c
91
+++ b/target/arm/translate.c
156
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
92
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
157
#define NEON_3R_VABA 15
93
[NEON_2RM_VCVT_UF] = 0x4,
158
#define NEON_3R_VADD_VSUB 16
159
#define NEON_3R_VTST_VCEQ 17
160
-#define NEON_3R_VML 18 /* VMLA, VMLAL, VMLS, VMLSL */
161
+#define NEON_3R_VML 18 /* VMLA, VMLS */
162
#define NEON_3R_VMUL 19
163
#define NEON_3R_VPMAX 20
164
#define NEON_3R_VPMIN 21
165
@@ -XXX,XX +XXX,XX @@ const GVecGen2i sli_op[4] = {
166
.vece = MO_64 },
167
};
94
};
168
95
169
+static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
96
-
170
+{
97
-/* Expand v8.1 simd helper. */
171
+ gen_helper_neon_mul_u8(a, a, b);
98
-static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
172
+ gen_helper_neon_add_u8(d, d, a);
99
- int q, int rd, int rn, int rm)
100
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
101
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
102
{
103
- if (dc_isar_feature(aa32_rdm, s)) {
104
- int opr_sz = (1 + q) * 8;
105
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
106
- vfp_reg_offset(1, rn),
107
- vfp_reg_offset(1, rm), cpu_env,
108
- opr_sz, opr_sz, 0, fn);
109
- return 0;
110
- }
111
- return 1;
112
+ static gen_helper_gvec_3_ptr * const fns[2] = {
113
+ gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
114
+ };
115
+ tcg_debug_assert(vece >= 1 && vece <= 2);
116
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
117
+ opr_sz, max_sz, 0, fns[vece - 1]);
173
+}
118
+}
174
+
119
+
175
+static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
120
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
121
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
176
+{
122
+{
177
+ gen_helper_neon_mul_u8(a, a, b);
123
+ static gen_helper_gvec_3_ptr * const fns[2] = {
178
+ gen_helper_neon_sub_u8(d, d, a);
124
+ gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
179
+}
125
+ };
180
+
126
+ tcg_debug_assert(vece >= 1 && vece <= 2);
181
+static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
127
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
182
+{
128
+ opr_sz, max_sz, 0, fns[vece - 1]);
183
+ gen_helper_neon_mul_u16(a, a, b);
129
}
184
+ gen_helper_neon_add_u16(d, d, a);
130
185
+}
131
#define GEN_CMP0(NAME, COND) \
186
+
187
+static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
188
+{
189
+ gen_helper_neon_mul_u16(a, a, b);
190
+ gen_helper_neon_sub_u16(d, d, a);
191
+}
192
+
193
+static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
194
+{
195
+ tcg_gen_mul_i32(a, a, b);
196
+ tcg_gen_add_i32(d, d, a);
197
+}
198
+
199
+static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
200
+{
201
+ tcg_gen_mul_i32(a, a, b);
202
+ tcg_gen_sub_i32(d, d, a);
203
+}
204
+
205
+static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
206
+{
207
+ tcg_gen_mul_i64(a, a, b);
208
+ tcg_gen_add_i64(d, d, a);
209
+}
210
+
211
+static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
212
+{
213
+ tcg_gen_mul_i64(a, a, b);
214
+ tcg_gen_sub_i64(d, d, a);
215
+}
216
+
217
+static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
218
+{
219
+ tcg_gen_mul_vec(vece, a, a, b);
220
+ tcg_gen_add_vec(vece, d, d, a);
221
+}
222
+
223
+static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
224
+{
225
+ tcg_gen_mul_vec(vece, a, a, b);
226
+ tcg_gen_sub_vec(vece, d, d, a);
227
+}
228
+
229
+/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
230
+ * these tables are shared with AArch64 which does support them.
231
+ */
232
+const GVecGen3 mla_op[4] = {
233
+ { .fni4 = gen_mla8_i32,
234
+ .fniv = gen_mla_vec,
235
+ .opc = INDEX_op_mul_vec,
236
+ .load_dest = true,
237
+ .vece = MO_8 },
238
+ { .fni4 = gen_mla16_i32,
239
+ .fniv = gen_mla_vec,
240
+ .opc = INDEX_op_mul_vec,
241
+ .load_dest = true,
242
+ .vece = MO_16 },
243
+ { .fni4 = gen_mla32_i32,
244
+ .fniv = gen_mla_vec,
245
+ .opc = INDEX_op_mul_vec,
246
+ .load_dest = true,
247
+ .vece = MO_32 },
248
+ { .fni8 = gen_mla64_i64,
249
+ .fniv = gen_mla_vec,
250
+ .opc = INDEX_op_mul_vec,
251
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
252
+ .load_dest = true,
253
+ .vece = MO_64 },
254
+};
255
+
256
+const GVecGen3 mls_op[4] = {
257
+ { .fni4 = gen_mls8_i32,
258
+ .fniv = gen_mls_vec,
259
+ .opc = INDEX_op_mul_vec,
260
+ .load_dest = true,
261
+ .vece = MO_8 },
262
+ { .fni4 = gen_mls16_i32,
263
+ .fniv = gen_mls_vec,
264
+ .opc = INDEX_op_mul_vec,
265
+ .load_dest = true,
266
+ .vece = MO_16 },
267
+ { .fni4 = gen_mls32_i32,
268
+ .fniv = gen_mls_vec,
269
+ .opc = INDEX_op_mul_vec,
270
+ .load_dest = true,
271
+ .vece = MO_32 },
272
+ { .fni8 = gen_mls64_i64,
273
+ .fniv = gen_mls_vec,
274
+ .opc = INDEX_op_mul_vec,
275
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
276
+ .load_dest = true,
277
+ .vece = MO_64 },
278
+};
279
+
280
/* Translate a NEON data processing instruction. Return nonzero if the
281
instruction is invalid.
282
We process data in a mixture of 32-bit and 64-bit chunks.
283
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
132
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
284
return 0;
133
break; /* VPADD */
285
}
134
}
286
break;
135
/* VQRDMLAH */
287
+
136
- switch (size) {
288
+ case NEON_3R_VML: /* VMLA, VMLS */
137
- case 1:
289
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
138
- return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
290
+ u ? &mls_op[size] : &mla_op[size]);
139
- q, rd, rn, rm);
291
+ return 0;
140
- case 2:
292
}
141
- return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
293
+
142
- q, rd, rn, rm);
294
if (size == 3) {
143
+ if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
295
/* 64-bit element instructions. */
144
+ gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
296
for (pass = 0; pass < (q ? 2 : 1); pass++) {
145
+ vec_size, vec_size);
146
+ return 0;
147
}
148
return 1;
149
297
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
150
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
298
}
151
break;
299
}
152
}
300
break;
153
/* VQRDMLSH */
301
- case NEON_3R_VML: /* VMLA, VMLAL, VMLS,VMLSL */
302
- switch (size) {
154
- switch (size) {
303
- case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
155
- case 1:
304
- case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
156
- return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
305
- case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
157
- q, rd, rn, rm);
306
- default: abort();
158
- case 2:
307
- }
159
- return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
308
- tcg_temp_free_i32(tmp2);
160
- q, rd, rn, rm);
309
- tmp2 = neon_load_reg(rd, pass);
161
+ if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
310
- if (u) { /* VMLS */
162
+ gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
311
- gen_neon_rsb(size, tmp, tmp2);
163
+ vec_size, vec_size);
312
- } else { /* VMLA */
164
+ return 0;
313
- gen_neon_add(size, tmp, tmp2);
165
}
314
- }
166
return 1;
315
- break;
167
316
case NEON_3R_VMUL:
317
/* VMUL.P8; other cases already eliminated. */
318
gen_helper_neon_mul_p8(tmp, tmp, tmp2);
319
--
168
--
320
2.19.1
169
2.20.1
321
170
322
171
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Also introduces neon_element_offset to find the env offset
3
Pass a pointer directly to env->vfp.qc[0], rather than env.
4
of a specific element within a neon register.
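The offset arithmetic described here can be sketched in isolation. This is only an illustration: the name element_offset_sketch and the runtime host_big_endian flag are hypothetical (the patch selects the big-endian path at compile time), and the result is the offset within one register rather than within env.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical standalone sketch, not the QEMU function.
 * Returns the byte offset of element number `element`, of size
 * (1 << log2_size) bytes, inside a register stored as host-endian
 * 64-bit units, with element 0 at the least significant end.
 */
static size_t element_offset_sketch(int element, int log2_size,
                                    int host_big_endian)
{
    assert(element >= 0 && log2_size >= 0 && log2_size <= 3);

    size_t element_size = (size_t)1 << log2_size;
    /* Offset assuming a fully little-endian layout. */
    size_t ofs = (size_t)element * element_size;

    if (host_big_endian && element_size < 8) {
        /* Flip the position within each 8-byte unit. */
        ofs ^= 8 - element_size;
    }
    return ofs;
}
```

On a little-endian host the offset is simply element * size. On a big-endian host, byte element 0 lands at offset 7 of the first 8-byte unit, and byte element 8 at offset 15 of the second.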
4
This will allow SVE2, which does not modify QC, to pass a
5
5
pointer to dummy storage.
6
7
Change the return type of inl_qrdml.h_s16 to match the
8
sense of the operation: signed.
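As a rough illustration of the new calling convention, the 16-bit operation can be written as a standalone function that reports saturation through a caller-supplied pointer and never touches the CPU state. The name qrdmlah_s16_sketch is hypothetical; the arithmetic follows the simplified form used in the helper, with an explicit range check in place of the cast comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone sketch, not the QEMU helper.
 * Signed saturating rounding doubling multiply-accumulate, high half:
 *   ((src3 << 15) + src1 * src2 + (1 << 14)) >> 15
 * Saturation is recorded through `sat`, which may point at the real
 * QC flag or at dummy storage.
 */
static int16_t qrdmlah_s16_sketch(int16_t src1, int16_t src2,
                                  int16_t src3, uint32_t *sat)
{
    assert(sat != NULL);

    int32_t ret = (int32_t)src1 * src2;
    /* Multiply instead of shifting so a negative src3 is well defined. */
    ret = (int32_t)src3 * (1 << 15) + ret + (1 << 14);
    /* Assumes an arithmetic right shift for negative values. */
    ret >>= 15;

    if (ret < INT16_MIN || ret > INT16_MAX) {
        *sat = 1;
        ret = (ret < 0 ? INT16_MIN : INT16_MAX);
    }
    return (int16_t)ret;
}
```

A caller that wants the architectural behaviour passes the address of the QC word; a caller that must not modify QC passes the address of a scratch variable and ignores it afterwards.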
9
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20181011205206.3552-7-richard.henderson@linaro.org
12
Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
14
---
11
target/arm/translate.c | 63 ++++++++++++++++++++++++------------------
15
target/arm/translate.c | 18 ++++++++---
12
1 file changed, 36 insertions(+), 27 deletions(-)
16
target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
17
2 files changed, 54 insertions(+), 34 deletions(-)
13
18
14
diff --git a/target/arm/translate.c b/target/arm/translate.c
19
diff --git a/target/arm/translate.c b/target/arm/translate.c
15
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/translate.c
21
--- a/target/arm/translate.c
17
+++ b/target/arm/translate.c
22
+++ b/target/arm/translate.c
18
@@ -XXX,XX +XXX,XX @@ neon_reg_offset (int reg, int n)
23
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
19
return vfp_reg_offset(0, sreg);
24
[NEON_2RM_VCVT_UF] = 0x4,
20
}
25
};
21
26
22
+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
27
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
23
+ * where 0 is the least significant end of the register.
28
+ uint32_t opr_sz, uint32_t max_sz,
24
+ */
29
+ gen_helper_gvec_3_ptr *fn)
25
+static inline long
26
+neon_element_offset(int reg, int element, TCGMemOp size)
27
+{
30
+{
28
+ int element_size = 1 << size;
31
+ TCGv_ptr qc_ptr = tcg_temp_new_ptr();
29
+ int ofs = element * element_size;
32
+
30
+#ifdef HOST_WORDS_BIGENDIAN
33
+ tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
31
+ /* Calculate the offset assuming fully little-endian,
34
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
32
+ * then XOR to account for the order of the 8-byte units.
35
+ opr_sz, max_sz, 0, fn);
33
+ */
36
+ tcg_temp_free_ptr(qc_ptr);
34
+ if (element_size < 8) {
35
+ ofs ^= 8 - element_size;
36
+ }
37
+#endif
38
+ return neon_reg_offset(reg, 0) + ofs;
39
+}
37
+}
40
+
38
+
41
static TCGv_i32 neon_load_reg(int reg, int pass)
39
void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
42
{
40
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
43
TCGv_i32 tmp = tcg_temp_new_i32();
41
{
44
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
42
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
45
tmp = load_reg(s, rd);
43
gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
46
if (insn & (1 << 23)) {
44
};
47
/* VDUP */
45
tcg_debug_assert(vece >= 1 && vece <= 2);
48
- if (size == 0) {
46
- tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
49
- gen_neon_dup_u8(tmp, 0);
47
- opr_sz, max_sz, 0, fns[vece - 1]);
50
- } else if (size == 1) {
48
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
51
- gen_neon_dup_low16(tmp);
49
}
52
- }
50
53
- for (n = 0; n <= pass * 2; n++) {
51
void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
54
- tmp2 = tcg_temp_new_i32();
52
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
55
- tcg_gen_mov_i32(tmp2, tmp);
53
gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
56
- neon_store_reg(rn, n, tmp2);
54
};
57
- }
55
tcg_debug_assert(vece >= 1 && vece <= 2);
58
- neon_store_reg(rn, n, tmp);
56
- tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
59
+ int vec_size = pass ? 16 : 8;
57
- opr_sz, max_sz, 0, fns[vece - 1]);
60
+ tcg_gen_gvec_dup_i32(size, neon_reg_offset(rn, 0),
58
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
61
+ vec_size, vec_size, tmp);
59
}
62
+ tcg_temp_free_i32(tmp);
60
63
} else {
61
#define GEN_CMP0(NAME, COND) \
64
/* VMOV */
62
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
65
switch (size) {
63
index XXXXXXX..XXXXXXX 100644
66
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
64
--- a/target/arm/vec_helper.c
67
tcg_temp_free_i32(tmp);
65
+++ b/target/arm/vec_helper.c
68
} else if ((insn & 0x380) == 0) {
66
@@ -XXX,XX +XXX,XX @@
69
/* VDUP */
67
#define H4(x) (x)
70
+ int element;
68
#endif
71
+ TCGMemOp size;
69
72
+
70
-#define SET_QC() env->vfp.qc[0] = 1
73
if ((insn & (7 << 16)) == 0 || (q && (rd & 1))) {
71
-
74
return 1;
72
static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
75
}
73
{
76
- if (insn & (1 << 19)) {
74
uint64_t *d = vd + opr_sz;
77
- tmp = neon_load_reg(rm, 1);
75
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
78
- } else {
76
}
79
- tmp = neon_load_reg(rm, 0);
77
80
- }
78
/* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
81
if (insn & (1 << 16)) {
79
-static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
82
- gen_neon_dup_u8(tmp, ((insn >> 17) & 3) * 8);
80
- int16_t src2, int16_t src3)
83
+ size = MO_8;
81
+static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
84
+ element = (insn >> 17) & 7;
82
+ int16_t src3, uint32_t *sat)
85
} else if (insn & (1 << 17)) {
83
{
86
- if ((insn >> 18) & 1)
84
/* Simplify:
87
- gen_neon_dup_high16(tmp);
85
* = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
88
- else
86
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
89
- gen_neon_dup_low16(tmp);
87
ret = ((int32_t)src3 << 15) + ret + (1 << 14);
90
+ size = MO_16;
88
ret >>= 15;
91
+ element = (insn >> 18) & 3;
89
if (ret != (int16_t)ret) {
92
+ } else {
90
- SET_QC();
93
+ size = MO_32;
91
+ *sat = 1;
94
+ element = (insn >> 19) & 1;
92
ret = (ret < 0 ? -0x8000 : 0x7fff);
95
}
93
}
96
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
94
return ret;
97
- tmp2 = tcg_temp_new_i32();
95
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
98
- tcg_gen_mov_i32(tmp2, tmp);
96
uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
99
- neon_store_reg(rd, pass, tmp2);
97
uint32_t src2, uint32_t src3)
100
- }
98
{
101
- tcg_temp_free_i32(tmp);
99
- uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
102
+ tcg_gen_gvec_dup_mem(size, neon_reg_offset(rd, 0),
100
- uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
103
+ neon_element_offset(rm, element, size),
101
+ uint32_t *sat = &env->vfp.qc[0];
104
+ q ? 16 : 8, q ? 16 : 8);
102
+ uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
105
} else {
103
+ uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
106
return 1;
104
return deposit32(e1, 16, 16, e2);
107
}
105
}
106
107
void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
108
- void *ve, uint32_t desc)
109
+ void *vq, uint32_t desc)
110
{
111
uintptr_t opr_sz = simd_oprsz(desc);
112
int16_t *d = vd;
113
int16_t *n = vn;
114
int16_t *m = vm;
115
- CPUARMState *env = ve;
116
uintptr_t i;
117
118
for (i = 0; i < opr_sz / 2; ++i) {
119
- d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
120
+ d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
121
}
122
clear_tail(d, opr_sz, simd_maxsz(desc));
123
}
124
125
/* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
126
-static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
127
- int16_t src2, int16_t src3)
128
+static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
129
+ int16_t src3, uint32_t *sat)
130
{
131
/* Similarly, using subtraction:
132
* = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
133
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
134
ret = ((int32_t)src3 << 15) - ret + (1 << 14);
135
ret >>= 15;
136
if (ret != (int16_t)ret) {
137
- SET_QC();
138
+ *sat = 1;
139
ret = (ret < 0 ? -0x8000 : 0x7fff);
140
}
141
return ret;
142
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
143
uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
144
uint32_t src2, uint32_t src3)
145
{
146
- uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
147
- uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
148
+ uint32_t *sat = &env->vfp.qc[0];
149
+ uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
150
+ uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
151
return deposit32(e1, 16, 16, e2);
152
}
153
154
void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
155
- void *ve, uint32_t desc)
156
+ void *vq, uint32_t desc)
157
{
158
uintptr_t opr_sz = simd_oprsz(desc);
159
int16_t *d = vd;
160
int16_t *n = vn;
161
int16_t *m = vm;
162
- CPUARMState *env = ve;
163
uintptr_t i;
164
165
for (i = 0; i < opr_sz / 2; ++i) {
166
- d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
167
+ d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
168
}
169
clear_tail(d, opr_sz, simd_maxsz(desc));
170
}
171
172
/* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
173
-uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
174
- int32_t src2, int32_t src3)
175
+static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
176
+ int32_t src3, uint32_t *sat)
177
{
178
/* Simplify similarly to int_qrdmlah_s16 above. */
179
int64_t ret = (int64_t)src1 * src2;
180
ret = ((int64_t)src3 << 31) + ret + (1 << 30);
181
ret >>= 31;
182
if (ret != (int32_t)ret) {
183
- SET_QC();
184
+ *sat = 1;
185
ret = (ret < 0 ? INT32_MIN : INT32_MAX);
186
}
187
return ret;
188
}
189
190
+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
191
+ int32_t src2, int32_t src3)
192
+{
193
+ uint32_t *sat = &env->vfp.qc[0];
194
+ return inl_qrdmlah_s32(src1, src2, src3, sat);
195
+}
196
+
197
void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
198
- void *ve, uint32_t desc)
199
+ void *vq, uint32_t desc)
200
{
201
uintptr_t opr_sz = simd_oprsz(desc);
202
int32_t *d = vd;
203
int32_t *n = vn;
204
int32_t *m = vm;
205
- CPUARMState *env = ve;
206
uintptr_t i;
207
208
for (i = 0; i < opr_sz / 4; ++i) {
209
- d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
210
+ d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
211
}
212
clear_tail(d, opr_sz, simd_maxsz(desc));
213
}
214
215
/* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
216
-uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
217
- int32_t src2, int32_t src3)
218
+static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
219
+ int32_t src3, uint32_t *sat)
220
{
221
/* Simplify similarly to int_qrdmlsh_s16 above. */
222
int64_t ret = (int64_t)src1 * src2;
223
ret = ((int64_t)src3 << 31) - ret + (1 << 30);
224
ret >>= 31;
225
if (ret != (int32_t)ret) {
226
- SET_QC();
227
+ *sat = 1;
228
ret = (ret < 0 ? INT32_MIN : INT32_MAX);
229
}
230
return ret;
231
}
232
233
+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
234
+ int32_t src2, int32_t src3)
235
+{
236
+ uint32_t *sat = &env->vfp.qc[0];
237
+ return inl_qrdmlsh_s32(src1, src2, src3, sat);
238
+}
239
+
240
void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
241
- void *ve, uint32_t desc)
242
+ void *vq, uint32_t desc)
243
{
244
uintptr_t opr_sz = simd_oprsz(desc);
245
int32_t *d = vd;
246
int32_t *n = vn;
247
int32_t *m = vm;
248
- CPUARMState *env = ve;
249
uintptr_t i;
250
251
for (i = 0; i < opr_sz / 4; ++i) {
252
- d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
253
+ d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
254
}
255
clear_tail(d, opr_sz, simd_maxsz(desc));
256
}
108
--
257
--
109
2.19.1
258
2.20.1
110
259
111
260
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Create struct ARMISARegisters, to be accessed during translation.
3
Must clear the tail for AdvSIMD when SVE is enabled.
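A minimal sketch of what clearing the tail means, assuming the operation writes opr_sz bytes of a destination that is max_sz bytes wide. The names clear_tail_sketch and add_u16_sketch are hypothetical stand-ins, and memset is used here only for brevity; this is not the code from vec_helper.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: zero the bytes in [opr_sz, max_sz) of vd. */
static void clear_tail_sketch(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
{
    assert(opr_sz <= max_sz);
    memset((uint8_t *)vd + opr_sz, 0, max_sz - opr_sz);
}

/*
 * Hypothetical element-wise helper: operate on the first opr_sz bytes,
 * then clear the rest so no stale data survives above the AdvSIMD
 * portion of a wider register.
 */
static void add_u16_sketch(uint16_t *d, const uint16_t *n,
                           const uint16_t *m,
                           uintptr_t opr_sz, uintptr_t max_sz)
{
    for (uintptr_t i = 0; i < opr_sz / 2; ++i) {
        d[i] = (uint16_t)(n[i] + m[i]);
    }
    clear_tail_sketch(d, opr_sz, max_sz);
}
```

Without the final call, the bytes of the destination above opr_sz would keep whatever they held before the operation.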
4
4
5
Fixes: ca40a6e6e39
6
Cc: qemu-stable@nongnu.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181016223115.24100-2-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
11
---
10
target/arm/cpu.h | 32 ++++----
12
target/arm/vec_helper.c | 2 ++
11
hw/intc/armv7m_nvic.c | 12 +--
13
1 file changed, 2 insertions(+)
12
target/arm/cpu.c | 178 +++++++++++++++++++++---------------------
13
target/arm/cpu64.c | 70 ++++++++---------
14
target/arm/helper.c | 28 +++----
15
5 files changed, 162 insertions(+), 158 deletions(-)
16
14
17
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
15
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
18
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/cpu.h
17
--- a/target/arm/vec_helper.c
20
+++ b/target/arm/cpu.h
18
+++ b/target/arm/vec_helper.c
21
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
19
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
22
* ARMv7AR ARM Architecture Reference Manual. A reset_ prefix
20
d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
23
* is used for reset values of non-constant registers; no reset_
21
} \
24
* prefix means a constant register.
22
} \
25
+ * Some of these registers are split out into a substructure that
23
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
26
+ * is shared with the translators to control the ISA.
27
*/
28
+ struct ARMISARegisters {
29
+ uint32_t id_isar0;
30
+ uint32_t id_isar1;
31
+ uint32_t id_isar2;
32
+ uint32_t id_isar3;
33
+ uint32_t id_isar4;
34
+ uint32_t id_isar5;
35
+ uint32_t id_isar6;
36
+ uint32_t mvfr0;
37
+ uint32_t mvfr1;
38
+ uint32_t mvfr2;
39
+ uint64_t id_aa64isar0;
40
+ uint64_t id_aa64isar1;
41
+ uint64_t id_aa64pfr0;
42
+ uint64_t id_aa64pfr1;
43
+ } isar;
44
uint32_t midr;
45
uint32_t revidr;
46
uint32_t reset_fpsid;
47
- uint32_t mvfr0;
48
- uint32_t mvfr1;
49
- uint32_t mvfr2;
50
uint32_t ctr;
51
uint32_t reset_sctlr;
52
uint32_t id_pfr0;
53
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
54
uint32_t id_mmfr2;
55
uint32_t id_mmfr3;
56
uint32_t id_mmfr4;
57
- uint32_t id_isar0;
58
- uint32_t id_isar1;
59
- uint32_t id_isar2;
60
- uint32_t id_isar3;
61
- uint32_t id_isar4;
62
- uint32_t id_isar5;
63
- uint32_t id_isar6;
64
- uint64_t id_aa64pfr0;
65
- uint64_t id_aa64pfr1;
66
uint64_t id_aa64dfr0;
67
uint64_t id_aa64dfr1;
68
uint64_t id_aa64afr0;
69
uint64_t id_aa64afr1;
70
- uint64_t id_aa64isar0;
71
- uint64_t id_aa64isar1;
72
uint64_t id_aa64mmfr0;
73
uint64_t id_aa64mmfr1;
74
uint32_t dbgdidr;
75
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
76
index XXXXXXX..XXXXXXX 100644
77
--- a/hw/intc/armv7m_nvic.c
78
+++ b/hw/intc/armv7m_nvic.c
79
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
80
case 0xd5c: /* MMFR3. */
81
return cpu->id_mmfr3;
82
case 0xd60: /* ISAR0. */
83
- return cpu->id_isar0;
84
+ return cpu->isar.id_isar0;
85
case 0xd64: /* ISAR1. */
86
- return cpu->id_isar1;
87
+ return cpu->isar.id_isar1;
88
case 0xd68: /* ISAR2. */
89
- return cpu->id_isar2;
90
+ return cpu->isar.id_isar2;
91
case 0xd6c: /* ISAR3. */
92
- return cpu->id_isar3;
93
+ return cpu->isar.id_isar3;
94
case 0xd70: /* ISAR4. */
95
- return cpu->id_isar4;
96
+ return cpu->isar.id_isar4;
97
case 0xd74: /* ISAR5. */
98
- return cpu->id_isar5;
99
+ return cpu->isar.id_isar5;
100
case 0xd78: /* CLIDR */
101
return cpu->clidr;
102
case 0xd7c: /* CTR */
103
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
104
index XXXXXXX..XXXXXXX 100644
105
--- a/target/arm/cpu.c
106
+++ b/target/arm/cpu.c
107
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
108
g_hash_table_foreach(cpu->cp_regs, cp_reg_check_reset, cpu);
109
110
env->vfp.xregs[ARM_VFP_FPSID] = cpu->reset_fpsid;
111
- env->vfp.xregs[ARM_VFP_MVFR0] = cpu->mvfr0;
112
- env->vfp.xregs[ARM_VFP_MVFR1] = cpu->mvfr1;
113
- env->vfp.xregs[ARM_VFP_MVFR2] = cpu->mvfr2;
114
+ env->vfp.xregs[ARM_VFP_MVFR0] = cpu->isar.mvfr0;
115
+ env->vfp.xregs[ARM_VFP_MVFR1] = cpu->isar.mvfr1;
116
+ env->vfp.xregs[ARM_VFP_MVFR2] = cpu->isar.mvfr2;
117
118
cpu->power_state = cpu->start_powered_off ? PSCI_OFF : PSCI_ON;
119
s->halted = cpu->start_powered_off;
120
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
121
* registers as well. These are id_pfr1[7:4] and id_aa64pfr0[15:12].
122
*/
123
cpu->id_pfr1 &= ~0xf0;
124
- cpu->id_aa64pfr0 &= ~0xf000;
125
+ cpu->isar.id_aa64pfr0 &= ~0xf000;
126
}
127
128
if (!cpu->has_el2) {
129
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
130
* registers if we don't have EL2. These are id_pfr1[15:12] and
131
* id_aa64pfr0_el1[11:8].
132
*/
133
- cpu->id_aa64pfr0 &= ~0xf00;
134
+ cpu->isar.id_aa64pfr0 &= ~0xf00;
135
cpu->id_pfr1 &= ~0xf000;
136
}
137
138
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
139
set_feature(&cpu->env, ARM_FEATURE_CACHE_BLOCK_OPS);
140
cpu->midr = 0x4107b362;
141
cpu->reset_fpsid = 0x410120b4;
142
- cpu->mvfr0 = 0x11111111;
143
- cpu->mvfr1 = 0x00000000;
144
+ cpu->isar.mvfr0 = 0x11111111;
145
+ cpu->isar.mvfr1 = 0x00000000;
146
cpu->ctr = 0x1dd20d2;
147
cpu->reset_sctlr = 0x00050078;
148
cpu->id_pfr0 = 0x111;
149
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
150
cpu->id_mmfr0 = 0x01130003;
151
cpu->id_mmfr1 = 0x10030302;
152
cpu->id_mmfr2 = 0x01222110;
153
- cpu->id_isar0 = 0x00140011;
154
- cpu->id_isar1 = 0x12002111;
155
- cpu->id_isar2 = 0x11231111;
156
- cpu->id_isar3 = 0x01102131;
157
- cpu->id_isar4 = 0x141;
158
+ cpu->isar.id_isar0 = 0x00140011;
159
+ cpu->isar.id_isar1 = 0x12002111;
160
+ cpu->isar.id_isar2 = 0x11231111;
161
+ cpu->isar.id_isar3 = 0x01102131;
162
+ cpu->isar.id_isar4 = 0x141;
163
cpu->reset_auxcr = 7;
164
}
24
}
165
25
166
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
26
DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
167
set_feature(&cpu->env, ARM_FEATURE_CACHE_BLOCK_OPS);
27
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
168
cpu->midr = 0x4117b363;
28
mm, a[i + j], 0, stat); \
169
cpu->reset_fpsid = 0x410120b4;
29
} \
170
- cpu->mvfr0 = 0x11111111;
30
} \
171
- cpu->mvfr1 = 0x00000000;
31
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
172
+ cpu->isar.mvfr0 = 0x11111111;
173
+ cpu->isar.mvfr1 = 0x00000000;
174
cpu->ctr = 0x1dd20d2;
175
cpu->reset_sctlr = 0x00050078;
176
cpu->id_pfr0 = 0x111;
177
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
178
cpu->id_mmfr0 = 0x01130003;
179
cpu->id_mmfr1 = 0x10030302;
180
cpu->id_mmfr2 = 0x01222110;
181
- cpu->id_isar0 = 0x00140011;
182
- cpu->id_isar1 = 0x12002111;
183
- cpu->id_isar2 = 0x11231111;
184
- cpu->id_isar3 = 0x01102131;
185
- cpu->id_isar4 = 0x141;
186
+ cpu->isar.id_isar0 = 0x00140011;
187
+ cpu->isar.id_isar1 = 0x12002111;
188
+ cpu->isar.id_isar2 = 0x11231111;
189
+ cpu->isar.id_isar3 = 0x01102131;
190
+ cpu->isar.id_isar4 = 0x141;
191
cpu->reset_auxcr = 7;
192
}
32
}
193
33
194
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
34
DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
195
set_feature(&cpu->env, ARM_FEATURE_EL3);
196
cpu->midr = 0x410fb767;
197
cpu->reset_fpsid = 0x410120b5;
198
- cpu->mvfr0 = 0x11111111;
199
- cpu->mvfr1 = 0x00000000;
200
+ cpu->isar.mvfr0 = 0x11111111;
201
+ cpu->isar.mvfr1 = 0x00000000;
202
cpu->ctr = 0x1dd20d2;
203
cpu->reset_sctlr = 0x00050078;
204
cpu->id_pfr0 = 0x111;
205
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
206
cpu->id_mmfr0 = 0x01130003;
207
cpu->id_mmfr1 = 0x10030302;
208
cpu->id_mmfr2 = 0x01222100;
209
- cpu->id_isar0 = 0x0140011;
210
- cpu->id_isar1 = 0x12002111;
211
- cpu->id_isar2 = 0x11231121;
212
- cpu->id_isar3 = 0x01102131;
213
- cpu->id_isar4 = 0x01141;
214
+ cpu->isar.id_isar0 = 0x0140011;
215
+ cpu->isar.id_isar1 = 0x12002111;
216
+ cpu->isar.id_isar2 = 0x11231121;
217
+ cpu->isar.id_isar3 = 0x01102131;
218
+ cpu->isar.id_isar4 = 0x01141;
219
cpu->reset_auxcr = 7;
220
}
221
222
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
223
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
224
cpu->midr = 0x410fb022;
225
cpu->reset_fpsid = 0x410120b4;
226
- cpu->mvfr0 = 0x11111111;
227
- cpu->mvfr1 = 0x00000000;
228
+ cpu->isar.mvfr0 = 0x11111111;
229
+ cpu->isar.mvfr1 = 0x00000000;
230
cpu->ctr = 0x1d192992; /* 32K icache 32K dcache */
231
cpu->id_pfr0 = 0x111;
232
cpu->id_pfr1 = 0x1;
233
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
234
cpu->id_mmfr0 = 0x01100103;
235
cpu->id_mmfr1 = 0x10020302;
236
cpu->id_mmfr2 = 0x01222000;
237
- cpu->id_isar0 = 0x00100011;
238
- cpu->id_isar1 = 0x12002111;
239
- cpu->id_isar2 = 0x11221011;
240
- cpu->id_isar3 = 0x01102131;
241
- cpu->id_isar4 = 0x141;
242
+ cpu->isar.id_isar0 = 0x00100011;
243
+ cpu->isar.id_isar1 = 0x12002111;
244
+ cpu->isar.id_isar2 = 0x11221011;
245
+ cpu->isar.id_isar3 = 0x01102131;
246
+ cpu->isar.id_isar4 = 0x141;
247
cpu->reset_auxcr = 1;
248
}
249
250
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
251
cpu->id_mmfr1 = 0x00000000;
252
cpu->id_mmfr2 = 0x00000000;
253
cpu->id_mmfr3 = 0x00000000;
254
- cpu->id_isar0 = 0x01141110;
255
- cpu->id_isar1 = 0x02111000;
256
- cpu->id_isar2 = 0x21112231;
257
- cpu->id_isar3 = 0x01111110;
258
- cpu->id_isar4 = 0x01310102;
259
- cpu->id_isar5 = 0x00000000;
260
- cpu->id_isar6 = 0x00000000;
261
+ cpu->isar.id_isar0 = 0x01141110;
262
+ cpu->isar.id_isar1 = 0x02111000;
263
+ cpu->isar.id_isar2 = 0x21112231;
264
+ cpu->isar.id_isar3 = 0x01111110;
265
+ cpu->isar.id_isar4 = 0x01310102;
266
+ cpu->isar.id_isar5 = 0x00000000;
267
+ cpu->isar.id_isar6 = 0x00000000;
268
}
269
270
static void cortex_m4_initfn(Object *obj)
271
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
272
cpu->id_mmfr1 = 0x00000000;
273
cpu->id_mmfr2 = 0x00000000;
274
cpu->id_mmfr3 = 0x00000000;
275
- cpu->id_isar0 = 0x01141110;
276
- cpu->id_isar1 = 0x02111000;
277
- cpu->id_isar2 = 0x21112231;
278
- cpu->id_isar3 = 0x01111110;
279
- cpu->id_isar4 = 0x01310102;
280
- cpu->id_isar5 = 0x00000000;
281
- cpu->id_isar6 = 0x00000000;
282
+ cpu->isar.id_isar0 = 0x01141110;
283
+ cpu->isar.id_isar1 = 0x02111000;
284
+ cpu->isar.id_isar2 = 0x21112231;
285
+ cpu->isar.id_isar3 = 0x01111110;
286
+ cpu->isar.id_isar4 = 0x01310102;
287
+ cpu->isar.id_isar5 = 0x00000000;
288
+ cpu->isar.id_isar6 = 0x00000000;
289
}
290
291
static void cortex_m33_initfn(Object *obj)
292
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
293
cpu->id_mmfr1 = 0x00000000;
294
cpu->id_mmfr2 = 0x01000000;
295
cpu->id_mmfr3 = 0x00000000;
296
- cpu->id_isar0 = 0x01101110;
297
- cpu->id_isar1 = 0x02212000;
298
- cpu->id_isar2 = 0x20232232;
299
- cpu->id_isar3 = 0x01111131;
300
- cpu->id_isar4 = 0x01310132;
301
- cpu->id_isar5 = 0x00000000;
302
- cpu->id_isar6 = 0x00000000;
303
+ cpu->isar.id_isar0 = 0x01101110;
304
+ cpu->isar.id_isar1 = 0x02212000;
305
+ cpu->isar.id_isar2 = 0x20232232;
306
+ cpu->isar.id_isar3 = 0x01111131;
307
+ cpu->isar.id_isar4 = 0x01310132;
308
+ cpu->isar.id_isar5 = 0x00000000;
309
+ cpu->isar.id_isar6 = 0x00000000;
310
cpu->clidr = 0x00000000;
311
cpu->ctr = 0x8000c000;
312
}
313
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
314
cpu->id_mmfr1 = 0x00000000;
315
cpu->id_mmfr2 = 0x01200000;
316
cpu->id_mmfr3 = 0x0211;
317
- cpu->id_isar0 = 0x02101111;
318
- cpu->id_isar1 = 0x13112111;
319
- cpu->id_isar2 = 0x21232141;
320
- cpu->id_isar3 = 0x01112131;
321
- cpu->id_isar4 = 0x0010142;
322
- cpu->id_isar5 = 0x0;
323
- cpu->id_isar6 = 0x0;
324
+ cpu->isar.id_isar0 = 0x02101111;
325
+ cpu->isar.id_isar1 = 0x13112111;
326
+ cpu->isar.id_isar2 = 0x21232141;
327
+ cpu->isar.id_isar3 = 0x01112131;
328
+ cpu->isar.id_isar4 = 0x0010142;
329
+ cpu->isar.id_isar5 = 0x0;
330
+ cpu->isar.id_isar6 = 0x0;
331
cpu->mp_is_up = true;
332
cpu->pmsav7_dregion = 16;
333
define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
334
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
335
set_feature(&cpu->env, ARM_FEATURE_EL3);
336
cpu->midr = 0x410fc080;
337
cpu->reset_fpsid = 0x410330c0;
338
- cpu->mvfr0 = 0x11110222;
339
- cpu->mvfr1 = 0x00011111;
340
+ cpu->isar.mvfr0 = 0x11110222;
341
+ cpu->isar.mvfr1 = 0x00011111;
342
cpu->ctr = 0x82048004;
343
cpu->reset_sctlr = 0x00c50078;
344
cpu->id_pfr0 = 0x1031;
345
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
346
cpu->id_mmfr1 = 0x20000000;
347
cpu->id_mmfr2 = 0x01202000;
348
cpu->id_mmfr3 = 0x11;
349
- cpu->id_isar0 = 0x00101111;
350
- cpu->id_isar1 = 0x12112111;
351
- cpu->id_isar2 = 0x21232031;
352
- cpu->id_isar3 = 0x11112131;
353
- cpu->id_isar4 = 0x00111142;
354
+ cpu->isar.id_isar0 = 0x00101111;
355
+ cpu->isar.id_isar1 = 0x12112111;
356
+ cpu->isar.id_isar2 = 0x21232031;
357
+ cpu->isar.id_isar3 = 0x11112131;
358
+ cpu->isar.id_isar4 = 0x00111142;
359
cpu->dbgdidr = 0x15141000;
360
cpu->clidr = (1 << 27) | (2 << 24) | 3;
361
cpu->ccsidr[0] = 0xe007e01a; /* 16k L1 dcache. */
362
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
363
set_feature(&cpu->env, ARM_FEATURE_CBAR);
364
cpu->midr = 0x410fc090;
365
cpu->reset_fpsid = 0x41033090;
366
- cpu->mvfr0 = 0x11110222;
367
- cpu->mvfr1 = 0x01111111;
368
+ cpu->isar.mvfr0 = 0x11110222;
369
+ cpu->isar.mvfr1 = 0x01111111;
370
cpu->ctr = 0x80038003;
371
cpu->reset_sctlr = 0x00c50078;
372
cpu->id_pfr0 = 0x1031;
373
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
374
cpu->id_mmfr1 = 0x20000000;
375
cpu->id_mmfr2 = 0x01230000;
376
cpu->id_mmfr3 = 0x00002111;
377
- cpu->id_isar0 = 0x00101111;
378
- cpu->id_isar1 = 0x13112111;
379
- cpu->id_isar2 = 0x21232041;
380
- cpu->id_isar3 = 0x11112131;
381
- cpu->id_isar4 = 0x00111142;
382
+ cpu->isar.id_isar0 = 0x00101111;
383
+ cpu->isar.id_isar1 = 0x13112111;
384
+ cpu->isar.id_isar2 = 0x21232041;
385
+ cpu->isar.id_isar3 = 0x11112131;
386
+ cpu->isar.id_isar4 = 0x00111142;
387
cpu->dbgdidr = 0x35141000;
388
cpu->clidr = (1 << 27) | (1 << 24) | 3;
389
cpu->ccsidr[0] = 0xe00fe019; /* 16k L1 dcache. */
390
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
391
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7;
392
cpu->midr = 0x410fc075;
393
cpu->reset_fpsid = 0x41023075;
394
- cpu->mvfr0 = 0x10110222;
395
- cpu->mvfr1 = 0x11111111;
396
+ cpu->isar.mvfr0 = 0x10110222;
397
+ cpu->isar.mvfr1 = 0x11111111;
398
cpu->ctr = 0x84448003;
399
cpu->reset_sctlr = 0x00c50078;
400
cpu->id_pfr0 = 0x00001131;
401
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
402
/* a7_mpcore_r0p5_trm, page 4-4 gives 0x01101110; but
403
* table 4-41 gives 0x02101110, which includes the arm div insns.
404
*/
405
- cpu->id_isar0 = 0x02101110;
406
- cpu->id_isar1 = 0x13112111;
407
- cpu->id_isar2 = 0x21232041;
408
- cpu->id_isar3 = 0x11112131;
409
- cpu->id_isar4 = 0x10011142;
410
+ cpu->isar.id_isar0 = 0x02101110;
411
+ cpu->isar.id_isar1 = 0x13112111;
412
+ cpu->isar.id_isar2 = 0x21232041;
413
+ cpu->isar.id_isar3 = 0x11112131;
414
+ cpu->isar.id_isar4 = 0x10011142;
415
cpu->dbgdidr = 0x3515f005;
416
cpu->clidr = 0x0a200023;
417
cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
418
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
419
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A15;
420
cpu->midr = 0x412fc0f1;
421
cpu->reset_fpsid = 0x410430f0;
422
- cpu->mvfr0 = 0x10110222;
423
- cpu->mvfr1 = 0x11111111;
424
+ cpu->isar.mvfr0 = 0x10110222;
425
+ cpu->isar.mvfr1 = 0x11111111;
426
cpu->ctr = 0x8444c004;
427
cpu->reset_sctlr = 0x00c50078;
428
cpu->id_pfr0 = 0x00001131;
429
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
430
cpu->id_mmfr1 = 0x20000000;
431
cpu->id_mmfr2 = 0x01240000;
432
cpu->id_mmfr3 = 0x02102211;
433
- cpu->id_isar0 = 0x02101110;
434
- cpu->id_isar1 = 0x13112111;
435
- cpu->id_isar2 = 0x21232041;
436
- cpu->id_isar3 = 0x11112131;
437
- cpu->id_isar4 = 0x10011142;
438
+ cpu->isar.id_isar0 = 0x02101110;
439
+ cpu->isar.id_isar1 = 0x13112111;
440
+ cpu->isar.id_isar2 = 0x21232041;
441
+ cpu->isar.id_isar3 = 0x11112131;
442
+ cpu->isar.id_isar4 = 0x10011142;
443
cpu->dbgdidr = 0x3515f021;
444
cpu->clidr = 0x0a200023;
445
cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
446
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
447
index XXXXXXX..XXXXXXX 100644
448
--- a/target/arm/cpu64.c
449
+++ b/target/arm/cpu64.c
450
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
451
cpu->midr = 0x411fd070;
452
cpu->revidr = 0x00000000;
453
cpu->reset_fpsid = 0x41034070;
454
- cpu->mvfr0 = 0x10110222;
455
- cpu->mvfr1 = 0x12111111;
456
- cpu->mvfr2 = 0x00000043;
457
+ cpu->isar.mvfr0 = 0x10110222;
458
+ cpu->isar.mvfr1 = 0x12111111;
459
+ cpu->isar.mvfr2 = 0x00000043;
460
cpu->ctr = 0x8444c004;
461
cpu->reset_sctlr = 0x00c50838;
462
cpu->id_pfr0 = 0x00000131;
463
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
464
cpu->id_mmfr1 = 0x40000000;
465
cpu->id_mmfr2 = 0x01260000;
466
cpu->id_mmfr3 = 0x02102211;
467
- cpu->id_isar0 = 0x02101110;
468
- cpu->id_isar1 = 0x13112111;
469
- cpu->id_isar2 = 0x21232042;
470
- cpu->id_isar3 = 0x01112131;
471
- cpu->id_isar4 = 0x00011142;
472
- cpu->id_isar5 = 0x00011121;
473
- cpu->id_isar6 = 0;
474
- cpu->id_aa64pfr0 = 0x00002222;
475
+ cpu->isar.id_isar0 = 0x02101110;
476
+ cpu->isar.id_isar1 = 0x13112111;
477
+ cpu->isar.id_isar2 = 0x21232042;
478
+ cpu->isar.id_isar3 = 0x01112131;
479
+ cpu->isar.id_isar4 = 0x00011142;
480
+ cpu->isar.id_isar5 = 0x00011121;
481
+ cpu->isar.id_isar6 = 0;
482
+ cpu->isar.id_aa64pfr0 = 0x00002222;
483
cpu->id_aa64dfr0 = 0x10305106;
484
cpu->pmceid0 = 0x00000000;
485
cpu->pmceid1 = 0x00000000;
486
- cpu->id_aa64isar0 = 0x00011120;
487
+ cpu->isar.id_aa64isar0 = 0x00011120;
488
cpu->id_aa64mmfr0 = 0x00001124;
489
cpu->dbgdidr = 0x3516d000;
490
cpu->clidr = 0x0a200023;
491
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
492
cpu->midr = 0x410fd034;
493
cpu->revidr = 0x00000000;
494
cpu->reset_fpsid = 0x41034070;
495
- cpu->mvfr0 = 0x10110222;
496
- cpu->mvfr1 = 0x12111111;
497
- cpu->mvfr2 = 0x00000043;
498
+ cpu->isar.mvfr0 = 0x10110222;
499
+ cpu->isar.mvfr1 = 0x12111111;
500
+ cpu->isar.mvfr2 = 0x00000043;
501
cpu->ctr = 0x84448004; /* L1Ip = VIPT */
502
cpu->reset_sctlr = 0x00c50838;
503
cpu->id_pfr0 = 0x00000131;
504
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
505
cpu->id_mmfr1 = 0x40000000;
506
cpu->id_mmfr2 = 0x01260000;
507
cpu->id_mmfr3 = 0x02102211;
508
- cpu->id_isar0 = 0x02101110;
509
- cpu->id_isar1 = 0x13112111;
510
- cpu->id_isar2 = 0x21232042;
511
- cpu->id_isar3 = 0x01112131;
512
- cpu->id_isar4 = 0x00011142;
513
- cpu->id_isar5 = 0x00011121;
514
- cpu->id_isar6 = 0;
515
- cpu->id_aa64pfr0 = 0x00002222;
516
+ cpu->isar.id_isar0 = 0x02101110;
517
+ cpu->isar.id_isar1 = 0x13112111;
518
+ cpu->isar.id_isar2 = 0x21232042;
519
+ cpu->isar.id_isar3 = 0x01112131;
520
+ cpu->isar.id_isar4 = 0x00011142;
521
+ cpu->isar.id_isar5 = 0x00011121;
522
+ cpu->isar.id_isar6 = 0;
523
+ cpu->isar.id_aa64pfr0 = 0x00002222;
524
cpu->id_aa64dfr0 = 0x10305106;
525
- cpu->id_aa64isar0 = 0x00011120;
526
+ cpu->isar.id_aa64isar0 = 0x00011120;
527
cpu->id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
528
cpu->dbgdidr = 0x3516d000;
529
cpu->clidr = 0x0a200023;
530
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
531
cpu->midr = 0x410fd083;
532
cpu->revidr = 0x00000000;
533
cpu->reset_fpsid = 0x41034080;
534
- cpu->mvfr0 = 0x10110222;
535
- cpu->mvfr1 = 0x12111111;
536
- cpu->mvfr2 = 0x00000043;
537
+ cpu->isar.mvfr0 = 0x10110222;
538
+ cpu->isar.mvfr1 = 0x12111111;
539
+ cpu->isar.mvfr2 = 0x00000043;
540
cpu->ctr = 0x8444c004;
541
cpu->reset_sctlr = 0x00c50838;
542
cpu->id_pfr0 = 0x00000131;
543
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
544
cpu->id_mmfr1 = 0x40000000;
545
cpu->id_mmfr2 = 0x01260000;
546
cpu->id_mmfr3 = 0x02102211;
547
- cpu->id_isar0 = 0x02101110;
548
- cpu->id_isar1 = 0x13112111;
549
- cpu->id_isar2 = 0x21232042;
550
- cpu->id_isar3 = 0x01112131;
551
- cpu->id_isar4 = 0x00011142;
552
- cpu->id_isar5 = 0x00011121;
553
- cpu->id_aa64pfr0 = 0x00002222;
554
+ cpu->isar.id_isar0 = 0x02101110;
555
+ cpu->isar.id_isar1 = 0x13112111;
556
+ cpu->isar.id_isar2 = 0x21232042;
557
+ cpu->isar.id_isar3 = 0x01112131;
558
+ cpu->isar.id_isar4 = 0x00011142;
559
+ cpu->isar.id_isar5 = 0x00011121;
560
+ cpu->isar.id_aa64pfr0 = 0x00002222;
561
cpu->id_aa64dfr0 = 0x10305106;
562
cpu->pmceid0 = 0x00000000;
563
cpu->pmceid1 = 0x00000000;
564
- cpu->id_aa64isar0 = 0x00011120;
565
+ cpu->isar.id_aa64isar0 = 0x00011120;
566
cpu->id_aa64mmfr0 = 0x00001124;
567
cpu->dbgdidr = 0x3516d000;
568
cpu->clidr = 0x0a200023;
569
diff --git a/target/arm/helper.c b/target/arm/helper.c
570
index XXXXXXX..XXXXXXX 100644
571
--- a/target/arm/helper.c
572
+++ b/target/arm/helper.c
573
@@ -XXX,XX +XXX,XX @@ static uint64_t id_pfr1_read(CPUARMState *env, const ARMCPRegInfo *ri)
574
static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
575
{
576
ARMCPU *cpu = arm_env_get_cpu(env);
577
- uint64_t pfr0 = cpu->id_aa64pfr0;
578
+ uint64_t pfr0 = cpu->isar.id_aa64pfr0;
579
580
if (env->gicv3state) {
581
pfr0 |= 1 << 24;
582
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
583
{ .name = "ID_ISAR0", .state = ARM_CP_STATE_BOTH,
584
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
585
.access = PL1_R, .type = ARM_CP_CONST,
586
- .resetvalue = cpu->id_isar0 },
587
+ .resetvalue = cpu->isar.id_isar0 },
588
{ .name = "ID_ISAR1", .state = ARM_CP_STATE_BOTH,
589
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 1,
590
.access = PL1_R, .type = ARM_CP_CONST,
591
- .resetvalue = cpu->id_isar1 },
592
+ .resetvalue = cpu->isar.id_isar1 },
593
{ .name = "ID_ISAR2", .state = ARM_CP_STATE_BOTH,
594
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 2,
595
.access = PL1_R, .type = ARM_CP_CONST,
596
- .resetvalue = cpu->id_isar2 },
597
+ .resetvalue = cpu->isar.id_isar2 },
598
{ .name = "ID_ISAR3", .state = ARM_CP_STATE_BOTH,
599
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 3,
600
.access = PL1_R, .type = ARM_CP_CONST,
601
- .resetvalue = cpu->id_isar3 },
602
+ .resetvalue = cpu->isar.id_isar3 },
603
{ .name = "ID_ISAR4", .state = ARM_CP_STATE_BOTH,
604
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 4,
605
.access = PL1_R, .type = ARM_CP_CONST,
606
- .resetvalue = cpu->id_isar4 },
607
+ .resetvalue = cpu->isar.id_isar4 },
608
{ .name = "ID_ISAR5", .state = ARM_CP_STATE_BOTH,
609
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 5,
610
.access = PL1_R, .type = ARM_CP_CONST,
611
- .resetvalue = cpu->id_isar5 },
612
+ .resetvalue = cpu->isar.id_isar5 },
613
{ .name = "ID_MMFR4", .state = ARM_CP_STATE_BOTH,
614
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6,
615
.access = PL1_R, .type = ARM_CP_CONST,
616
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
617
{ .name = "ID_ISAR6", .state = ARM_CP_STATE_BOTH,
618
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 7,
619
.access = PL1_R, .type = ARM_CP_CONST,
620
- .resetvalue = cpu->id_isar6 },
621
+ .resetvalue = cpu->isar.id_isar6 },
622
REGINFO_SENTINEL
623
};
624
define_arm_cp_regs(cpu, v6_idregs);
625
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
626
{ .name = "ID_AA64PFR1_EL1", .state = ARM_CP_STATE_AA64,
627
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 1,
628
.access = PL1_R, .type = ARM_CP_CONST,
629
- .resetvalue = cpu->id_aa64pfr1},
630
+ .resetvalue = cpu->isar.id_aa64pfr1},
631
{ .name = "ID_AA64PFR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
632
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 2,
633
.access = PL1_R, .type = ARM_CP_CONST,
634
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
635
{ .name = "ID_AA64ISAR0_EL1", .state = ARM_CP_STATE_AA64,
636
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 0,
637
.access = PL1_R, .type = ARM_CP_CONST,
638
- .resetvalue = cpu->id_aa64isar0 },
639
+ .resetvalue = cpu->isar.id_aa64isar0 },
640
{ .name = "ID_AA64ISAR1_EL1", .state = ARM_CP_STATE_AA64,
641
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 1,
642
.access = PL1_R, .type = ARM_CP_CONST,
643
- .resetvalue = cpu->id_aa64isar1 },
644
+ .resetvalue = cpu->isar.id_aa64isar1 },
645
{ .name = "ID_AA64ISAR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
646
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 2,
647
.access = PL1_R, .type = ARM_CP_CONST,
648
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
649
{ .name = "MVFR0_EL1", .state = ARM_CP_STATE_AA64,
650
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 0,
651
.access = PL1_R, .type = ARM_CP_CONST,
652
- .resetvalue = cpu->mvfr0 },
653
+ .resetvalue = cpu->isar.mvfr0 },
654
{ .name = "MVFR1_EL1", .state = ARM_CP_STATE_AA64,
655
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 1,
656
.access = PL1_R, .type = ARM_CP_CONST,
657
- .resetvalue = cpu->mvfr1 },
658
+ .resetvalue = cpu->isar.mvfr1 },
659
{ .name = "MVFR2_EL1", .state = ARM_CP_STATE_AA64,
660
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
661
.access = PL1_R, .type = ARM_CP_CONST,
662
- .resetvalue = cpu->mvfr2 },
663
+ .resetvalue = cpu->isar.mvfr2 },
664
{ .name = "MVFR3_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
665
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 3,
666
.access = PL1_R, .type = ARM_CP_CONST,
667
--
35
--
668
2.19.1
36
2.20.1
669
37
670
38
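The comment in the cortex_a7_initfn hunk above notes that the ID_ISAR0 value 0x02101110 "includes the arm div insns" while 0x01101110 does not. The two values differ only in the Divide field, bits [27:24] of ID_ISAR0. Below is a minimal sketch of how a feature test can be derived from the register value rather than from a separate feature flag. The names field32, isar0_divide, has_thumb_div and has_arm_div are hypothetical stand-ins for illustration, not QEMU's actual helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a bitfield extract: returns 'length' bits
 * of 'value' starting at bit 'start'. Requires 1 <= length <= 32.
 */
static uint32_t field32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/*
 * ID_ISAR0.Divide, bits [27:24]:
 *   0 = no hardware divide
 *   1 = SDIV/UDIV in the Thumb instruction set only
 *   2 = SDIV/UDIV in both the Thumb and ARM instruction sets
 */
static uint32_t isar0_divide(uint32_t id_isar0)
{
    return field32(id_isar0, 24, 4);
}

/* Illustrative feature tests built on the field value. */
static int has_thumb_div(uint32_t id_isar0)
{
    return isar0_divide(id_isar0) >= 1;
}

static int has_arm_div(uint32_t id_isar0)
{
    return isar0_divide(id_isar0) >= 2;
}
```

With the values from the hunks above, has_arm_div(0x02101110) is true for the Cortex-A7 setting, while the Cortex-A8 value 0x00101111 reports no hardware divide at all.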
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Move expanders for VBSL, VBIT, and VBIF from translate-a64.c.
3
Include 64-bit element size in preparation for SVE2.
4
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181011205206.3552-9-richard.henderson@linaro.org
7
Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
9
---
10
target/arm/translate.h | 6 ++
10
target/arm/helper.h | 10 +++
11
target/arm/translate-a64.c | 61 --------------
11
target/arm/translate.h | 5 ++
12
target/arm/translate.c | 162 +++++++++++++++++++++++++++----------
12
target/arm/translate-a64.c | 8 ++-
13
3 files changed, 124 insertions(+), 105 deletions(-)
13
target/arm/translate.c | 133 ++++++++++++++++++++++++++++++++++++-
14
14
target/arm/vec_helper.c | 24 +++++++
15
5 files changed, 176 insertions(+), 4 deletions(-)
16
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.h
20
+++ b/target/arm/helper.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
22
DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
23
DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
24
25
+DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+
35
#ifdef TARGET_AARCH64
36
#include "helper-a64.h"
37
#include "helper-sve.h"
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
38
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
index XXXXXXX..XXXXXXX 100644
39
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
40
--- a/target/arm/translate.h
18
+++ b/target/arm/translate.h
41
+++ b/target/arm/translate.h
19
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
42
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
20
return ret;
43
void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
21
}
44
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
22
45
23
+
46
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
24
+/* Vector operations shared between ARM and AArch64. */
47
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
+extern const GVecGen3 bsl_op;
48
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
26
+extern const GVecGen3 bit_op;
49
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
27
+extern const GVecGen3 bif_op;
28
+
50
+
29
/*
51
/*
30
* Forward to the isar_feature_* tests given a DisasContext pointer.
52
* Forward to the isar_feature_* tests given a DisasContext pointer.
31
*/
53
*/
32
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
54
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
33
index XXXXXXX..XXXXXXX 100644
55
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate-a64.c
56
--- a/target/arm/translate-a64.c
35
+++ b/target/arm/translate-a64.c
57
+++ b/target/arm/translate-a64.c
36
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
58
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
37
}
59
gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
38
}
60
}
39
61
return;
40
-static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
62
+ case 0xe: /* SABD, UABD */
41
-{
63
+ if (u) {
42
- tcg_gen_xor_i64(rn, rn, rm);
64
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
43
- tcg_gen_and_i64(rn, rn, rd);
65
+ } else {
44
- tcg_gen_xor_i64(rd, rm, rn);
66
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
45
-}
67
+ }
46
-
68
+ return;
47
-static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
69
case 0x10: /* ADD, SUB */
48
-{
70
if (u) {
49
- tcg_gen_xor_i64(rn, rn, rd);
71
gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
50
- tcg_gen_and_i64(rn, rn, rm);
72
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
51
- tcg_gen_xor_i64(rd, rd, rn);
73
genenvfn = fns[size][u];
52
-}
74
break;
53
-
75
}
54
-static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
76
- case 0xe: /* SABD, UABD */
55
-{
77
case 0xf: /* SABA, UABA */
56
- tcg_gen_xor_i64(rn, rn, rd);
78
{
57
- tcg_gen_andc_i64(rn, rn, rm);
79
static NeonGenTwoOpFn * const fns[3][2] = {
58
- tcg_gen_xor_i64(rd, rd, rn);
59
-}
60
-
61
-static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
62
-{
63
- tcg_gen_xor_vec(vece, rn, rn, rm);
64
- tcg_gen_and_vec(vece, rn, rn, rd);
65
- tcg_gen_xor_vec(vece, rd, rm, rn);
66
-}
67
-
68
-static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
69
-{
70
- tcg_gen_xor_vec(vece, rn, rn, rd);
71
- tcg_gen_and_vec(vece, rn, rn, rm);
72
- tcg_gen_xor_vec(vece, rd, rd, rn);
73
-}
74
-
75
-static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
76
-{
77
- tcg_gen_xor_vec(vece, rn, rn, rd);
78
- tcg_gen_andc_vec(vece, rn, rn, rm);
79
- tcg_gen_xor_vec(vece, rd, rd, rn);
80
-}
81
-
82
/* Logic op (opcode == 3) subgroup of C3.6.16. */
83
static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
84
{
85
- static const GVecGen3 bsl_op = {
86
- .fni8 = gen_bsl_i64,
87
- .fniv = gen_bsl_vec,
88
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
89
- .load_dest = true
90
- };
91
- static const GVecGen3 bit_op = {
92
- .fni8 = gen_bit_i64,
93
- .fniv = gen_bit_vec,
94
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
95
- .load_dest = true
96
- };
97
- static const GVecGen3 bif_op = {
98
- .fni8 = gen_bif_i64,
99
- .fniv = gen_bif_vec,
100
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
101
- .load_dest = true
102
- };
103
-
104
int rd = extract32(insn, 0, 5);
105
int rn = extract32(insn, 5, 5);
106
int rm = extract32(insn, 16, 5);
107
diff --git a/target/arm/translate.c b/target/arm/translate.c
80
diff --git a/target/arm/translate.c b/target/arm/translate.c
108
index XXXXXXX..XXXXXXX 100644
81
index XXXXXXX..XXXXXXX 100644
109
--- a/target/arm/translate.c
82
--- a/target/arm/translate.c
110
+++ b/target/arm/translate.c
83
+++ b/target/arm/translate.c
111
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
84
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
112
return 0;
85
rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
113
}
86
}
114
87
115
-/* Bitwise select. dest = c ? t : f. Clobbers T and F. */
88
+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
116
-static void gen_neon_bsl(TCGv_i32 dest, TCGv_i32 t, TCGv_i32 f, TCGv_i32 c)
89
+{
117
-{
90
+ TCGv_i32 t = tcg_temp_new_i32();
118
- tcg_gen_and_i32(t, t, c);
91
+
119
- tcg_gen_andc_i32(f, f, c);
92
+ tcg_gen_sub_i32(t, a, b);
120
- tcg_gen_or_i32(dest, t, f);
93
+ tcg_gen_sub_i32(d, b, a);
121
-}
94
+ tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
122
-
95
+ tcg_temp_free_i32(t);
123
static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
96
+}
124
{
97
+
125
switch (size) {
98
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
126
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
99
+{
127
return 1;
100
+ TCGv_i64 t = tcg_temp_new_i64();
128
}
101
+
129
102
+ tcg_gen_sub_i64(t, a, b);
130
+/*
103
+ tcg_gen_sub_i64(d, b, a);
131
+ * Expanders for VBitOps_VBIF, VBIT, VBSL.
104
+ tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
132
+ */
105
+ tcg_temp_free_i64(t);
133
+static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
106
+}
134
+{
107
+
135
+ tcg_gen_xor_i64(rn, rn, rm);
108
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
136
+ tcg_gen_and_i64(rn, rn, rd);
109
+{
137
+ tcg_gen_xor_i64(rd, rm, rn);
110
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
138
+}
111
+
139
+
112
+ tcg_gen_smin_vec(vece, t, a, b);
140
+static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
113
+ tcg_gen_smax_vec(vece, d, a, b);
141
+{
114
+ tcg_gen_sub_vec(vece, d, d, t);
142
+ tcg_gen_xor_i64(rn, rn, rd);
115
+ tcg_temp_free_vec(t);
143
+ tcg_gen_and_i64(rn, rn, rm);
116
+}
144
+ tcg_gen_xor_i64(rd, rd, rn);
117
+
145
+}
118
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
146
+
119
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
147
+static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
120
+{
148
+{
121
+ static const TCGOpcode vecop_list[] = {
149
+ tcg_gen_xor_i64(rn, rn, rd);
122
+ INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
150
+ tcg_gen_andc_i64(rn, rn, rm);
123
+ };
151
+ tcg_gen_xor_i64(rd, rd, rn);
124
+ static const GVecGen3 ops[4] = {
152
+}
125
+ { .fniv = gen_sabd_vec,
153
+
126
+ .fno = gen_helper_gvec_sabd_b,
154
+static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
127
+ .opt_opc = vecop_list,
155
+{
128
+ .vece = MO_8 },
156
+ tcg_gen_xor_vec(vece, rn, rn, rm);
129
+ { .fniv = gen_sabd_vec,
157
+ tcg_gen_and_vec(vece, rn, rn, rd);
130
+ .fno = gen_helper_gvec_sabd_h,
158
+ tcg_gen_xor_vec(vece, rd, rm, rn);
131
+ .opt_opc = vecop_list,
159
+}
132
+ .vece = MO_16 },
160
+
133
+ { .fni4 = gen_sabd_i32,
161
+static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
134
+ .fniv = gen_sabd_vec,
162
+{
135
+ .fno = gen_helper_gvec_sabd_s,
163
+ tcg_gen_xor_vec(vece, rn, rn, rd);
136
+ .opt_opc = vecop_list,
164
+ tcg_gen_and_vec(vece, rn, rn, rm);
137
+ .vece = MO_32 },
165
+ tcg_gen_xor_vec(vece, rd, rd, rn);
138
+ { .fni8 = gen_sabd_i64,
166
+}
139
+ .fniv = gen_sabd_vec,
167
+
140
+ .fno = gen_helper_gvec_sabd_d,
168
+static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
141
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
169
+{
142
+ .opt_opc = vecop_list,
170
+ tcg_gen_xor_vec(vece, rn, rn, rd);
143
+ .vece = MO_64 },
171
+ tcg_gen_andc_vec(vece, rn, rn, rm);
144
+ };
172
+ tcg_gen_xor_vec(vece, rd, rd, rn);
145
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
173
+}
146
+}
174
+
147
+
175
+const GVecGen3 bsl_op = {
148
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
176
+ .fni8 = gen_bsl_i64,
149
+{
177
+ .fniv = gen_bsl_vec,
150
+ TCGv_i32 t = tcg_temp_new_i32();
178
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
151
+
179
+ .load_dest = true
152
+ tcg_gen_sub_i32(t, a, b);
180
+};
153
+ tcg_gen_sub_i32(d, b, a);
181
+
154
+ tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
182
+const GVecGen3 bit_op = {
155
+ tcg_temp_free_i32(t);
183
+ .fni8 = gen_bit_i64,
156
+}
184
+ .fniv = gen_bit_vec,
157
+
185
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
158
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
186
+ .load_dest = true
159
+{
187
+};
160
+ TCGv_i64 t = tcg_temp_new_i64();
188
+
161
+
189
+const GVecGen3 bif_op = {
162
+ tcg_gen_sub_i64(t, a, b);
190
+ .fni8 = gen_bif_i64,
163
+ tcg_gen_sub_i64(d, b, a);
191
+ .fniv = gen_bif_vec,
164
+ tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
192
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
165
+ tcg_temp_free_i64(t);
193
+ .load_dest = true
166
+}
194
+};
167
+
195
+
168
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
169
+{
170
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
171
+
172
+ tcg_gen_umin_vec(vece, t, a, b);
173
+ tcg_gen_umax_vec(vece, d, a, b);
174
+ tcg_gen_sub_vec(vece, d, d, t);
175
+ tcg_temp_free_vec(t);
176
+}
177
+
178
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
179
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
180
+{
181
+ static const TCGOpcode vecop_list[] = {
182
+ INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
183
+ };
184
+ static const GVecGen3 ops[4] = {
185
+ { .fniv = gen_uabd_vec,
186
+ .fno = gen_helper_gvec_uabd_b,
187
+ .opt_opc = vecop_list,
188
+ .vece = MO_8 },
189
+ { .fniv = gen_uabd_vec,
190
+ .fno = gen_helper_gvec_uabd_h,
191
+ .opt_opc = vecop_list,
192
+ .vece = MO_16 },
193
+ { .fni4 = gen_uabd_i32,
194
+ .fniv = gen_uabd_vec,
195
+ .fno = gen_helper_gvec_uabd_s,
196
+ .opt_opc = vecop_list,
197
+ .vece = MO_32 },
198
+ { .fni8 = gen_uabd_i64,
199
+ .fniv = gen_uabd_vec,
200
+ .fno = gen_helper_gvec_uabd_d,
201
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
202
+ .opt_opc = vecop_list,
203
+ .vece = MO_64 },
204
+ };
205
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
206
+}
196
+
207
+
197
/* Translate a NEON data processing instruction. Return nonzero if the
208
/* Translate a NEON data processing instruction. Return nonzero if the
198
instruction is invalid.
209
instruction is invalid.
199
We process data in a mixture of 32-bit and 64-bit chunks.
210
We process data in a mixture of 32-bit and 64-bit chunks.
200
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
211
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
201
{
202
int op;
203
int q;
204
- int rd, rn, rm;
205
+ int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
206
int size;
207
int shift;
208
int pass;
209
int count;
210
int pairwise;
211
int u;
212
+ int vec_size;
213
uint32_t imm, mask;
214
TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
215
TCGv_ptr ptr1, ptr2, ptr3;
216
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
217
VFP_DREG_N(rn, insn);
218
VFP_DREG_M(rm, insn);
219
size = (insn >> 20) & 3;
220
+ vec_size = q ? 16 : 8;
221
+ rd_ofs = neon_reg_offset(rd, 0);
222
+ rn_ofs = neon_reg_offset(rn, 0);
223
+ rm_ofs = neon_reg_offset(rm, 0);
224
+
225
if ((insn & (1 << 23)) == 0) {
226
/* Three register same length. */
227
op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
228
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
229
q, rd, rn, rm);
230
}
212
}
231
return 1;
213
return 1;
232
+
214
233
+ case NEON_3R_LOGIC: /* Logic ops. */
215
+ case NEON_3R_VABD:
234
+ switch ((u << 2) | size) {
216
+ if (u) {
235
+ case 0: /* VAND */
217
+ gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
236
+ tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
218
+ vec_size, vec_size);
237
+ vec_size, vec_size);
219
+ } else {
238
+ break;
220
+ gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
239
+ case 1: /* VBIC */
221
+ vec_size, vec_size);
240
+ tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
241
+ vec_size, vec_size);
242
+ break;
243
+ case 2:
244
+ if (rn == rm) {
245
+ /* VMOV */
246
+ tcg_gen_gvec_mov(0, rd_ofs, rn_ofs, vec_size, vec_size);
247
+ } else {
248
+ /* VORR */
249
+ tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
250
+ vec_size, vec_size);
251
+ }
252
+ break;
253
+ case 3: /* VORN */
254
+ tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
255
+ vec_size, vec_size);
256
+ break;
257
+ case 4: /* VEOR */
258
+ tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
259
+ vec_size, vec_size);
260
+ break;
261
+ case 5: /* VBSL */
262
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
263
+ vec_size, vec_size, &bsl_op);
264
+ break;
265
+ case 6: /* VBIT */
266
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
267
+ vec_size, vec_size, &bit_op);
268
+ break;
269
+ case 7: /* VBIF */
270
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
271
+ vec_size, vec_size, &bif_op);
272
+ break;
273
+ }
222
+ }
274
+ return 0;
223
+ return 0;
275
}
224
+
276
- if (size == 3 && op != NEON_3R_LOGIC) {
225
case NEON_3R_VADD_VSUB:
277
+ if (size == 3) {
226
case NEON_3R_LOGIC:
278
/* 64-bit element instructions. */
227
case NEON_3R_VMAX:
279
for (pass = 0; pass < (q ? 2 : 1); pass++) {
280
neon_load_reg64(cpu_V0, rn + pass);
281
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
228
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
282
case NEON_3R_VRHADD:
229
case NEON_3R_VQRSHL:
283
GEN_NEON_INTEGER_OP(rhadd);
230
GEN_NEON_INTEGER_OP_ENV(qrshl);
284
break;
231
break;
285
- case NEON_3R_LOGIC: /* Logic ops. */
232
- case NEON_3R_VABD:
286
- switch ((u << 2) | size) {
233
- GEN_NEON_INTEGER_OP(abd);
287
- case 0: /* VAND */
288
- tcg_gen_and_i32(tmp, tmp, tmp2);
289
- break;
290
- case 1: /* BIC */
291
- tcg_gen_andc_i32(tmp, tmp, tmp2);
292
- break;
293
- case 2: /* VORR */
294
- tcg_gen_or_i32(tmp, tmp, tmp2);
295
- break;
296
- case 3: /* VORN */
297
- tcg_gen_orc_i32(tmp, tmp, tmp2);
298
- break;
299
- case 4: /* VEOR */
300
- tcg_gen_xor_i32(tmp, tmp, tmp2);
301
- break;
302
- case 5: /* VBSL */
303
- tmp3 = neon_load_reg(rd, pass);
304
- gen_neon_bsl(tmp, tmp, tmp2, tmp3);
305
- tcg_temp_free_i32(tmp3);
306
- break;
307
- case 6: /* VBIT */
308
- tmp3 = neon_load_reg(rd, pass);
309
- gen_neon_bsl(tmp, tmp, tmp3, tmp2);
310
- tcg_temp_free_i32(tmp3);
311
- break;
312
- case 7: /* VBIF */
313
- tmp3 = neon_load_reg(rd, pass);
314
- gen_neon_bsl(tmp, tmp3, tmp, tmp2);
315
- tcg_temp_free_i32(tmp3);
316
- break;
317
- }
318
- break;
234
- break;
319
case NEON_3R_VHSUB:
235
case NEON_3R_VABA:
320
GEN_NEON_INTEGER_OP(hsub);
236
GEN_NEON_INTEGER_OP(abd);
321
break;
237
tcg_temp_free_i32(tmp2);
238
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
239
index XXXXXXX..XXXXXXX 100644
240
--- a/target/arm/vec_helper.c
241
+++ b/target/arm/vec_helper.c
242
@@ -XXX,XX +XXX,XX @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
243
DO_CMP0(gvec_cge0_h, int16_t, >=)
244
245
#undef DO_CMP0
246
+
247
+#define DO_ABD(NAME, TYPE) \
248
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
249
+{ \
250
+ intptr_t i, opr_sz = simd_oprsz(desc); \
251
+ TYPE *d = vd, *n = vn, *m = vm; \
252
+ \
253
+ for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \
254
+ d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \
255
+ } \
256
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
257
+}
258
+
259
+DO_ABD(gvec_sabd_b, int8_t)
260
+DO_ABD(gvec_sabd_h, int16_t)
261
+DO_ABD(gvec_sabd_s, int32_t)
262
+DO_ABD(gvec_sabd_d, int64_t)
263
+
264
+DO_ABD(gvec_uabd_b, uint8_t)
265
+DO_ABD(gvec_uabd_h, uint16_t)
266
+DO_ABD(gvec_uabd_s, uint32_t)
267
+DO_ABD(gvec_uabd_d, uint64_t)
268
+
269
+#undef DO_ABD
322
--
270
--
323
2.19.1
271
2.20.1
324
272
325
273
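The SABD/UABD patch above computes the absolute difference in two ways: the out-of-line DO_ABD helper selects between the two possible subtractions, while gen_sabd_vec and gen_uabd_vec use max(a, b) - min(a, b). A minimal scalar sketch, assuming 8-bit elements, suggests why the two forms agree. The function names are hypothetical, and the results are returned as uint8_t (unlike the signed helper, which stores into the signed element type) purely to keep the C conversions well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Select form, mirroring DO_ABD: choose the non-negative difference. */
static uint8_t sabd8_select(int8_t n, int8_t m)
{
    /* Operands promote to int, so the subtraction itself cannot overflow. */
    return (uint8_t)(n < m ? m - n : n - m);
}

/* Min/max form, mirroring gen_sabd_vec: max(a, b) - min(a, b). */
static uint8_t sabd8_minmax(int8_t a, int8_t b)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    return (uint8_t)(hi - lo);
}

/* Unsigned variants: only the comparison changes signedness. */
static uint8_t uabd8_select(uint8_t n, uint8_t m)
{
    return (uint8_t)(n < m ? m - n : n - m);
}

static uint8_t uabd8_minmax(uint8_t a, uint8_t b)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    return (uint8_t)(hi - lo);
}

/* Exhaustively compare both forms over every pair of 8-bit inputs. */
static void check_abd8(void)
{
    for (int a = -128; a <= 127; a++) {
        for (int b = -128; b <= 127; b++) {
            assert(sabd8_select((int8_t)a, (int8_t)b) ==
                   sabd8_minmax((int8_t)a, (int8_t)b));
        }
    }
    for (int a = 0; a <= 255; a++) {
        for (int b = 0; b <= 255; b++) {
            assert(uabd8_select((uint8_t)a, (uint8_t)b) ==
                   uabd8_minmax((uint8_t)a, (uint8_t)b));
        }
    }
}
```

Calling check_abd8() walks all 65536 input pairs for each signedness. The extreme signed case, sabd8_select(-128, 127), yields 255, which only fits because the result is read as an unsigned 8-bit value.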
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Move cmtst_op expanders from translate-a64.c.
3
Include 64-bit element size in preparation for SVE2.
4
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181011205206.3552-17-richard.henderson@linaro.org
7
Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
9
---
10
target/arm/translate.h | 2 +
10
target/arm/helper.h | 17 +++--
11
target/arm/translate-a64.c | 38 ------------------
11
target/arm/translate.h | 5 ++
12
target/arm/translate.c | 81 +++++++++++++++++++++++++++-----------
12
target/arm/neon_helper.c | 10 ---
13
3 files changed, 60 insertions(+), 61 deletions(-)
13
target/arm/translate-a64.c | 17 ++---
14
14
target/arm/translate.c | 134 +++++++++++++++++++++++++++++++++++--
15
target/arm/vec_helper.c | 24 +++++++
16
6 files changed, 174 insertions(+), 33 deletions(-)
17
18
diff --git a/target/arm/helper.h b/target/arm/helper.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper.h
21
+++ b/target/arm/helper.h
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
23
DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
24
DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
25
26
-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
27
-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
28
-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
29
-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
30
-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
31
-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
32
-
33
DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
34
DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
35
DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
36
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
40
+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
41
+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
+
45
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
49
+
50
#ifdef TARGET_AARCH64
51
#include "helper-a64.h"
52
#include "helper-sve.h"
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
53
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
index XXXXXXX..XXXXXXX 100644
54
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
55
--- a/target/arm/translate.h
18
+++ b/target/arm/translate.h
56
+++ b/target/arm/translate.h
19
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 bit_op;
57
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
20
extern const GVecGen3 bif_op;
58
void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
21
extern const GVecGen3 mla_op[4];
59
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
22
extern const GVecGen3 mls_op[4];
60
23
+extern const GVecGen3 cmtst_op[4];
61
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
24
extern const GVecGen2i ssra_op[4];
62
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
extern const GVecGen2i usra_op[4];
63
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
26
extern const GVecGen2i sri_op[4];
64
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
27
extern const GVecGen2i sli_op[4];
65
+
28
+void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
29
30
/*
66
/*
31
* Forward to the isar_feature_* tests given a DisasContext pointer.
67
* Forward to the isar_feature_* tests given a DisasContext pointer.
68
*/
69
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/neon_helper.c
72
+++ b/target/arm/neon_helper.c
73
@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
74
NEON_POP(pmax_u16, neon_u16, 2)
75
#undef NEON_FN
76
77
-#define NEON_FN(dest, src1, src2) \
78
- dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
79
-NEON_VOP(abd_s8, neon_s8, 4)
80
-NEON_VOP(abd_u8, neon_u8, 4)
81
-NEON_VOP(abd_s16, neon_s16, 2)
82
-NEON_VOP(abd_u16, neon_u16, 2)
83
-NEON_VOP(abd_s32, neon_s32, 1)
84
-NEON_VOP(abd_u32, neon_u32, 1)
85
-#undef NEON_FN
86
-
87
#define NEON_FN(dest, src1, src2) do { \
88
int8_t tmp; \
89
tmp = (int8_t)src2; \
32
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
90
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
33
index XXXXXXX..XXXXXXX 100644
91
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate-a64.c
92
--- a/target/arm/translate-a64.c
35
+++ b/target/arm/translate-a64.c
93
+++ b/target/arm/translate-a64.c
36
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
94
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
37
}
95
gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
38
}
96
}
39
97
return;
40
-/* CMTST : test is "if (X & Y != 0)". */
98
+ case 0xf: /* SABA, UABA */
41
-static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
99
+ if (u) {
42
-{
100
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
43
- tcg_gen_and_i32(d, a, b);
101
+ } else {
44
- tcg_gen_setcondi_i32(TCG_COND_NE, d, d, 0);
102
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
45
- tcg_gen_neg_i32(d, d);
103
+ }
46
-}
104
+ return;
47
-
105
case 0x10: /* ADD, SUB */
48
-static void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
106
if (u) {
49
-{
107
gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
50
- tcg_gen_and_i64(d, a, b);
108
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
51
- tcg_gen_setcondi_i64(TCG_COND_NE, d, d, 0);
109
genenvfn = fns[size][u];
52
- tcg_gen_neg_i64(d, d);
110
break;
53
-}
111
}
54
-
112
- case 0xf: /* SABA, UABA */
55
-static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
113
- {
56
-{
114
- static NeonGenTwoOpFn * const fns[3][2] = {
57
- tcg_gen_and_vec(vece, d, a, b);
115
- { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
58
- tcg_gen_dupi_vec(vece, a, 0);
116
- { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
59
- tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
117
- { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
60
-}
118
- };
61
-
119
- genfn = fns[size][u];
62
static void handle_3same_64(DisasContext *s, int opcode, bool u,
120
- break;
63
TCGv_i64 tcg_rd, TCGv_i64 tcg_rn, TCGv_i64 tcg_rm)
121
- }
64
{
122
case 0x16: /* SQDMULH, SQRDMULH */
65
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
123
{
66
/* Integer op subgroup of C3.6.16. */
124
static NeonGenTwoOpEnvFn * const fns[2][2] = {
67
static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
68
{
69
- static const GVecGen3 cmtst_op[4] = {
70
- { .fni4 = gen_helper_neon_tst_u8,
71
- .fniv = gen_cmtst_vec,
72
- .vece = MO_8 },
73
- { .fni4 = gen_helper_neon_tst_u16,
74
- .fniv = gen_cmtst_vec,
75
- .vece = MO_16 },
76
- { .fni4 = gen_cmtst_i32,
77
- .fniv = gen_cmtst_vec,
78
- .vece = MO_32 },
79
- { .fni8 = gen_cmtst_i64,
80
- .fniv = gen_cmtst_vec,
81
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
82
- .vece = MO_64 },
83
- };
84
-
85
int is_q = extract32(insn, 30, 1);
86
int u = extract32(insn, 29, 1);
87
int size = extract32(insn, 22, 2);
88
diff --git a/target/arm/translate.c b/target/arm/translate.c
125
diff --git a/target/arm/translate.c b/target/arm/translate.c
89
index XXXXXXX..XXXXXXX 100644
126
index XXXXXXX..XXXXXXX 100644
90
--- a/target/arm/translate.c
127
--- a/target/arm/translate.c
91
+++ b/target/arm/translate.c
128
+++ b/target/arm/translate.c
92
@@ -XXX,XX +XXX,XX @@ const GVecGen3 mls_op[4] = {
129
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
93
.vece = MO_64 },
130
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
94
};
131
}
95
132
96
+/* CMTST : test is "if (X & Y != 0)". */
133
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
97
+static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
134
+{
98
+{
135
+ TCGv_i32 t = tcg_temp_new_i32();
99
+ tcg_gen_and_i32(d, a, b);
136
+ gen_sabd_i32(t, a, b);
100
+ tcg_gen_setcondi_i32(TCG_COND_NE, d, d, 0);
137
+ tcg_gen_add_i32(d, d, t);
101
+ tcg_gen_neg_i32(d, d);
138
+ tcg_temp_free_i32(t);
102
+}
139
+}
103
+
140
+
104
+void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
141
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
105
+{
142
+{
106
+ tcg_gen_and_i64(d, a, b);
143
+ TCGv_i64 t = tcg_temp_new_i64();
107
+ tcg_gen_setcondi_i64(TCG_COND_NE, d, d, 0);
144
+ gen_sabd_i64(t, a, b);
108
+ tcg_gen_neg_i64(d, d);
145
+ tcg_gen_add_i64(d, d, t);
109
+}
146
+ tcg_temp_free_i64(t);
110
+
147
+}
111
+static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
148
+
112
+{
149
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
113
+ tcg_gen_and_vec(vece, d, a, b);
150
+{
114
+ tcg_gen_dupi_vec(vece, a, 0);
151
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
115
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
152
+ gen_sabd_vec(vece, t, a, b);
116
+}
153
+ tcg_gen_add_vec(vece, d, d, t);
117
+
154
+ tcg_temp_free_vec(t);
118
+const GVecGen3 cmtst_op[4] = {
155
+}
119
+ { .fni4 = gen_helper_neon_tst_u8,
156
+
120
+ .fniv = gen_cmtst_vec,
157
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
121
+ .vece = MO_8 },
158
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
122
+ { .fni4 = gen_helper_neon_tst_u16,
159
+{
123
+ .fniv = gen_cmtst_vec,
160
+ static const TCGOpcode vecop_list[] = {
124
+ .vece = MO_16 },
161
+ INDEX_op_sub_vec, INDEX_op_add_vec,
125
+ { .fni4 = gen_cmtst_i32,
162
+ INDEX_op_smin_vec, INDEX_op_smax_vec, 0
126
+ .fniv = gen_cmtst_vec,
163
+ };
127
+ .vece = MO_32 },
164
+ static const GVecGen3 ops[4] = {
128
+ { .fni8 = gen_cmtst_i64,
165
+ { .fniv = gen_saba_vec,
129
+ .fniv = gen_cmtst_vec,
166
+ .fno = gen_helper_gvec_saba_b,
130
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
167
+ .opt_opc = vecop_list,
131
+ .vece = MO_64 },
168
+ .load_dest = true,
132
+};
169
+ .vece = MO_8 },
170
+ { .fniv = gen_saba_vec,
171
+ .fno = gen_helper_gvec_saba_h,
172
+ .opt_opc = vecop_list,
173
+ .load_dest = true,
174
+ .vece = MO_16 },
175
+ { .fni4 = gen_saba_i32,
176
+ .fniv = gen_saba_vec,
177
+ .fno = gen_helper_gvec_saba_s,
178
+ .opt_opc = vecop_list,
179
+ .load_dest = true,
180
+ .vece = MO_32 },
181
+ { .fni8 = gen_saba_i64,
182
+ .fniv = gen_saba_vec,
183
+ .fno = gen_helper_gvec_saba_d,
184
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
185
+ .opt_opc = vecop_list,
186
+ .load_dest = true,
187
+ .vece = MO_64 },
188
+ };
189
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
190
+}
191
+
192
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
193
+{
194
+ TCGv_i32 t = tcg_temp_new_i32();
195
+ gen_uabd_i32(t, a, b);
196
+ tcg_gen_add_i32(d, d, t);
197
+ tcg_temp_free_i32(t);
198
+}
199
+
200
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
201
+{
202
+ TCGv_i64 t = tcg_temp_new_i64();
203
+ gen_uabd_i64(t, a, b);
204
+ tcg_gen_add_i64(d, d, t);
205
+ tcg_temp_free_i64(t);
206
+}
207
+
208
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
209
+{
210
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
211
+ gen_uabd_vec(vece, t, a, b);
212
+ tcg_gen_add_vec(vece, d, d, t);
213
+ tcg_temp_free_vec(t);
214
+}
215
+
216
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
217
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
218
+{
219
+ static const TCGOpcode vecop_list[] = {
220
+ INDEX_op_sub_vec, INDEX_op_add_vec,
221
+ INDEX_op_umin_vec, INDEX_op_umax_vec, 0
222
+ };
223
+ static const GVecGen3 ops[4] = {
224
+ { .fniv = gen_uaba_vec,
225
+ .fno = gen_helper_gvec_uaba_b,
226
+ .opt_opc = vecop_list,
227
+ .load_dest = true,
228
+ .vece = MO_8 },
229
+ { .fniv = gen_uaba_vec,
230
+ .fno = gen_helper_gvec_uaba_h,
231
+ .opt_opc = vecop_list,
232
+ .load_dest = true,
233
+ .vece = MO_16 },
234
+ { .fni4 = gen_uaba_i32,
235
+ .fniv = gen_uaba_vec,
236
+ .fno = gen_helper_gvec_uaba_s,
237
+ .opt_opc = vecop_list,
238
+ .load_dest = true,
239
+ .vece = MO_32 },
240
+ { .fni8 = gen_uaba_i64,
241
+ .fniv = gen_uaba_vec,
242
+ .fno = gen_helper_gvec_uaba_d,
243
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
244
+ .opt_opc = vecop_list,
245
+ .load_dest = true,
246
+ .vece = MO_64 },
247
+ };
248
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
249
+}
133
+
250
+
134
/* Translate a NEON data processing instruction. Return nonzero if the
251
/* Translate a NEON data processing instruction. Return nonzero if the
135
instruction is invalid.
252
instruction is invalid.
136
We process data in a mixture of 32-bit and 64-bit chunks.
253
We process data in a mixture of 32-bit and 64-bit chunks.
137
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
254
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
138
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
255
}
139
u ? &mls_op[size] : &mla_op[size]);
140
return 0;
256
return 0;
141
+
257
142
+ case NEON_3R_VTST_VCEQ:
258
+ case NEON_3R_VABA:
143
+ if (u) { /* VCEQ */
259
+ if (u) {
144
+ tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs,
260
+ gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
145
+ vec_size, vec_size);
261
+ vec_size, vec_size);
146
+ } else { /* VTST */
262
+ } else {
147
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
263
+ gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
148
+ vec_size, vec_size, &cmtst_op[size]);
264
+ vec_size, vec_size);
149
+ }
265
+ }
150
+ return 0;
266
+ return 0;
151
+
267
+
152
+ case NEON_3R_VCGT:
268
case NEON_3R_VADD_VSUB:
153
+ tcg_gen_gvec_cmp(u ? TCG_COND_GTU : TCG_COND_GT, size,
269
case NEON_3R_LOGIC:
154
+ rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
270
case NEON_3R_VMAX:
155
+ return 0;
156
+
157
+ case NEON_3R_VCGE:
158
+ tcg_gen_gvec_cmp(u ? TCG_COND_GEU : TCG_COND_GE, size,
159
+ rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
160
+ return 0;
161
}
162
163
if (size == 3) {
164
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
271
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
165
case NEON_3R_VQSUB:
272
case NEON_3R_VQRSHL:
166
GEN_NEON_INTEGER_OP_ENV(qsub);
273
GEN_NEON_INTEGER_OP_ENV(qrshl);
167
break;
274
break;
168
- case NEON_3R_VCGT:
275
- case NEON_3R_VABA:
169
- GEN_NEON_INTEGER_OP(cgt);
276
- GEN_NEON_INTEGER_OP(abd);
277
- tcg_temp_free_i32(tmp2);
278
- tmp2 = neon_load_reg(rd, pass);
279
- gen_neon_add(size, tmp, tmp2);
170
- break;
280
- break;
171
- case NEON_3R_VCGE:
281
case NEON_3R_VPMAX:
172
- GEN_NEON_INTEGER_OP(cge);
282
GEN_NEON_INTEGER_OP(pmax);
173
- break;
174
case NEON_3R_VSHL:
175
GEN_NEON_INTEGER_OP(shl);
176
break;
283
break;
177
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
284
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
178
tmp2 = neon_load_reg(rd, pass);
285
index XXXXXXX..XXXXXXX 100644
179
gen_neon_add(size, tmp, tmp2);
286
--- a/target/arm/vec_helper.c
180
break;
287
+++ b/target/arm/vec_helper.c
181
- case NEON_3R_VTST_VCEQ:
288
@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
182
- if (!u) { /* VTST */
289
DO_ABD(gvec_uabd_d, uint64_t)
183
- switch (size) {
290
184
- case 0: gen_helper_neon_tst_u8(tmp, tmp, tmp2); break;
291
#undef DO_ABD
185
- case 1: gen_helper_neon_tst_u16(tmp, tmp, tmp2); break;
292
+
186
- case 2: gen_helper_neon_tst_u32(tmp, tmp, tmp2); break;
293
+#define DO_ABA(NAME, TYPE) \
187
- default: abort();
294
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
188
- }
295
+{ \
189
- } else { /* VCEQ */
296
+ intptr_t i, opr_sz = simd_oprsz(desc); \
190
- switch (size) {
297
+ TYPE *d = vd, *n = vn, *m = vm; \
191
- case 0: gen_helper_neon_ceq_u8(tmp, tmp, tmp2); break;
298
+ \
192
- case 1: gen_helper_neon_ceq_u16(tmp, tmp, tmp2); break;
299
+ for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \
193
- case 2: gen_helper_neon_ceq_u32(tmp, tmp, tmp2); break;
300
+ d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \
194
- default: abort();
301
+ } \
195
- }
302
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
196
- }
303
+}
197
- break;
304
+
198
case NEON_3R_VMUL:
305
+DO_ABA(gvec_saba_b, int8_t)
199
/* VMUL.P8; other cases already eliminated. */
306
+DO_ABA(gvec_saba_h, int16_t)
200
gen_helper_neon_mul_p8(tmp, tmp, tmp2);
307
+DO_ABA(gvec_saba_s, int32_t)
308
+DO_ABA(gvec_saba_d, int64_t)
309
+
310
+DO_ABA(gvec_uaba_b, uint8_t)
311
+DO_ABA(gvec_uaba_h, uint16_t)
312
+DO_ABA(gvec_uaba_s, uint32_t)
313
+DO_ABA(gvec_uaba_d, uint64_t)
314
+
315
+#undef DO_ABA
201
--
316
--
202
2.19.1
317
2.20.1
203
318
204
319
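Two operations appear in the versions of the patch above: CMTST (the older side) and SABA/UABA (the newer side). The sketch below models both on plain scalars, as an illustration under stated assumptions rather than the QEMU implementation. The function names are hypothetical, the accumulate case is shown only for unsigned 16-bit lanes, and the CMTST model follows the and / setcond / neg sequence of gen_cmtst_i32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar model of UABA on 16-bit lanes:
 * d[i] += |n[i] - m[i]|, wrapping modulo 2^16.
 * The destination is read as well as written, which is why the
 * GVecGen3 entries in the patch set .load_dest = true.
 */
static void uaba_u16(uint16_t *d, const uint16_t *n, const uint16_t *m,
                     size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        uint16_t diff = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];
        d[i] = (uint16_t)(d[i] + diff);
    }
}

/*
 * Hypothetical scalar model of CMTST on one 32-bit element:
 * all ones if (a & b) is nonzero, otherwise zero.
 */
static uint32_t cmtst_u32(uint32_t a, uint32_t b)
{
    uint32_t t = a & b;     /* like tcg_gen_and_i32 */
    t = (t != 0);           /* like setcondi with TCG_COND_NE against 0 */
    return 0u - t;          /* like tcg_gen_neg_i32: 1 becomes 0xffffffff */
}

/* Small usage example for both models. */
static void aba_cmtst_selfcheck(void)
{
    uint16_t d[3] = { 1, 2, 65535 };
    const uint16_t n[3] = { 5, 3, 1 };
    const uint16_t m[3] = { 2, 9, 0 };

    uaba_u16(d, n, m, 3);
    assert(d[0] == 4);
    assert(d[1] == 8);
    assert(d[2] == 0);      /* 65535 + 1 wraps to 0 */

    assert(cmtst_u32(0x0f, 0xf0) == 0);
    assert(cmtst_u32(0x18, 0x08) == 0xffffffffu);
}
```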
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Patrick Williams <patrick@stwcx.xyz>
2
2
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3
Sonora Pass is a 2 socket x86 motherboard designed by Facebook
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
and supported by OpenBMC. Strapping configuration was obtained
5
Message-id: 20181016223115.24100-7-richard.henderson@linaro.org
5
from hardware and i2c configuration is based on dts found at:
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
7
https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts
8
9
Booted a test image of http://github.com/facebook/openbmc to login
10
prompt.
11
12
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
13
Reviewed-by: Amithash Prasad <amithash@fb.com>
14
Reviewed-by: Cédric Le Goater <clg@kaod.org>
15
[PMM: fixed block comment style nit]
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
17
---
9
target/arm/cpu.h | 6 +++++-
18
hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
10
linux-user/elfload.c | 2 +-
19
1 file changed, 78 insertions(+)
11
target/arm/cpu.c | 4 ----
12
target/arm/helper.c | 2 +-
13
target/arm/machine.c | 3 +--
14
5 files changed, 8 insertions(+), 9 deletions(-)
15
20
16
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
21
diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
17
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/cpu.h
23
--- a/hw/arm/aspeed.c
19
+++ b/target/arm/cpu.h
24
+++ b/hw/arm/aspeed.c
20
@@ -XXX,XX +XXX,XX @@ enum arm_features {
25
@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
21
ARM_FEATURE_NEON,
26
SCU_AST2500_HW_STRAP_ACPI_ENABLE | \
22
ARM_FEATURE_M, /* Microcontroller profile. */
27
SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
23
ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling. */
28
24
- ARM_FEATURE_THUMB2EE,
29
+/* Sonorapass hardware value: 0xF100D216 */
25
ARM_FEATURE_V7MP, /* v7 Multiprocessing Extensions */
30
+#define SONORAPASS_BMC_HW_STRAP1 ( \
26
ARM_FEATURE_V7VE, /* v7 Virtualization Extensions (non-EL2 parts) */
31
+ SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE | \
27
ARM_FEATURE_V4T,
32
+ SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE | \
28
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_jazelle(const ARMISARegisters *id)
33
+ SCU_AST2500_HW_STRAP_UART_DEBUG | \
29
return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
34
+ SCU_AST2500_HW_STRAP_RESERVED28 | \
35
+ SCU_AST2500_HW_STRAP_DDR4_ENABLE | \
36
+ SCU_HW_STRAP_VGA_CLASS_CODE | \
37
+ SCU_HW_STRAP_LPC_RESET_PIN | \
38
+ SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) | \
39
+ SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) | \
40
+ SCU_HW_STRAP_VGA_BIOS_ROM | \
41
+ SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) | \
42
+ SCU_AST2500_HW_STRAP_RESERVED1)
43
+
44
/* Swift hardware value: 0xF11AD206 */
45
#define SWIFT_BMC_HW_STRAP1 ( \
46
AST2500_HW_STRAP1_DEFAULTS | \
47
@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
48
i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
30
}
49
}
31
50
32
+static inline bool isar_feature_t32ee(const ARMISARegisters *id)
51
+static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
33
+{
52
+{
34
+ return FIELD_EX32(id->id_isar3, ID_ISAR3, T32EE) != 0;
53
+ AspeedSoCState *soc = &bmc->soc;
54
+
55
+ /* bus 2 : */
56
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
57
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
58
+ /* bus 2 : pca9546 @ 0x73 */
59
+
60
+ /* bus 3 : pca9548 @ 0x70 */
61
+
62
+ /* bus 4 : */
63
+ uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
64
+ smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
65
+ eeprom4_54);
66
+ /* PCA9539 @ 0x76, but PCA9552 is compatible */
67
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
68
+ /* PCA9539 @ 0x77, but PCA9552 is compatible */
69
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
70
+
71
+ /* bus 6 : */
72
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
73
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
74
+ /* bus 6 : pca9546 @ 0x73 */
75
+
76
+ /* bus 8 : */
77
+ uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
78
+ smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
79
+ eeprom8_56);
80
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
81
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
82
+ /* bus 8 : adc128d818 @ 0x1d */
83
+ /* bus 8 : adc128d818 @ 0x1f */
84
+
85
+ /*
86
+ * bus 13 : pca9548 @ 0x71
87
+ * - channel 3:
88
+ * - tmm421 @ 0x4c
89
+ * - tmp421 @ 0x4e
90
+ * - tmp421 @ 0x4f
91
+ */
92
+
35
+}
93
+}
36
+
94
+
37
static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
95
static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
38
{
96
{
39
return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
97
AspeedSoCState *soc = &bmc->soc;
40
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
98
@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
41
index XXXXXXX..XXXXXXX 100644
99
mc->default_ram_size = 512 * MiB;
42
--- a/linux-user/elfload.c
100
};
43
+++ b/linux-user/elfload.c
101
44
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
102
+static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
45
GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
103
+{
46
GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
104
+ MachineClass *mc = MACHINE_CLASS(oc);
47
GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
105
+ AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
48
- GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
106
+
49
+ GET_FEATURE_ID(t32ee, ARM_HWCAP_ARM_THUMBEE);
107
+ mc->desc = "OCP SonoraPass BMC (ARM1176)";
50
GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
108
+ amc->soc_name = "ast2500-a1";
51
GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
109
+ amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
52
GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
110
+ amc->fmc_model = "mx66l1g45g";
53
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
111
+ amc->spi_model = "mx66l1g45g";
54
index XXXXXXX..XXXXXXX 100644
112
+ amc->num_cs = 2;
55
--- a/target/arm/cpu.c
113
+ amc->i2c_init = sonorapass_bmc_i2c_init;
56
+++ b/target/arm/cpu.c
114
+ mc->default_ram_size = 512 * MiB;
57
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
115
+};
58
set_feature(&cpu->env, ARM_FEATURE_V7);
116
+
59
set_feature(&cpu->env, ARM_FEATURE_VFP3);
117
static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
60
set_feature(&cpu->env, ARM_FEATURE_NEON);
61
- set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
62
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
63
set_feature(&cpu->env, ARM_FEATURE_EL3);
64
cpu->midr = 0x410fc080;
65
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
66
set_feature(&cpu->env, ARM_FEATURE_VFP3);
67
set_feature(&cpu->env, ARM_FEATURE_VFP_FP16);
68
set_feature(&cpu->env, ARM_FEATURE_NEON);
69
- set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
70
set_feature(&cpu->env, ARM_FEATURE_EL3);
71
/* Note that A9 supports the MP extensions even for
72
* A9UP and single-core A9MP (which are both different
73
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
74
set_feature(&cpu->env, ARM_FEATURE_V7VE);
75
set_feature(&cpu->env, ARM_FEATURE_VFP4);
76
set_feature(&cpu->env, ARM_FEATURE_NEON);
77
- set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
78
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
79
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
80
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
81
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
82
set_feature(&cpu->env, ARM_FEATURE_V7VE);
83
set_feature(&cpu->env, ARM_FEATURE_VFP4);
84
set_feature(&cpu->env, ARM_FEATURE_NEON);
85
- set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
86
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
87
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
88
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
89
diff --git a/target/arm/helper.c b/target/arm/helper.c
90
index XXXXXXX..XXXXXXX 100644
91
--- a/target/arm/helper.c
92
+++ b/target/arm/helper.c
93
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
94
define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
95
define_arm_cp_regs(cpu, vmsa_cp_reginfo);
96
}
97
- if (arm_feature(env, ARM_FEATURE_THUMB2EE)) {
98
+ if (cpu_isar_feature(t32ee, cpu)) {
99
define_arm_cp_regs(cpu, t2ee_cp_reginfo);
100
}
101
if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
102
diff --git a/target/arm/machine.c b/target/arm/machine.c
103
index XXXXXXX..XXXXXXX 100644
104
--- a/target/arm/machine.c
105
+++ b/target/arm/machine.c
106
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
107
static bool thumb2ee_needed(void *opaque)
108
{
118
{
109
ARMCPU *cpu = opaque;
119
MachineClass *mc = MACHINE_CLASS(oc);
110
- CPUARMState *env = &cpu->env;
120
@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
111
121
.name = MACHINE_TYPE_NAME("swift-bmc"),
112
- return arm_feature(env, ARM_FEATURE_THUMB2EE);
122
.parent = TYPE_ASPEED_MACHINE,
113
+ return cpu_isar_feature(t32ee, cpu);
123
.class_init = aspeed_machine_swift_class_init,
114
}
124
+ }, {
115
125
+ .name = MACHINE_TYPE_NAME("sonorapass-bmc"),
116
static const VMStateDescription vmstate_thumb2ee = {
126
+ .parent = TYPE_ASPEED_MACHINE,
127
+ .class_init = aspeed_machine_sonorapass_class_init,
128
}, {
129
.name = MACHINE_TYPE_NAME("witherspoon-bmc"),
130
.parent = TYPE_ASPEED_MACHINE,
117
--
131
--
118
2.19.1
132
2.20.1
119
133
120
134
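The older side of the diff above replaces the ARM_FEATURE_THUMB2EE flag with a test on the ID_ISAR3 register (isar_feature_t32ee). A minimal sketch of that kind of ID-register test follows. It assumes the T32EE field occupies bits [31:28] of ID_ISAR3, which is not stated in the patch text shown here, and it reimplements the bitfield extraction locally instead of using QEMU's FIELD_EX32 macro; all names in the block are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the T32EE field within ID_ISAR3. */
#define ID_ISAR3_T32EE_SHIFT  28
#define ID_ISAR3_T32EE_LENGTH 4

/*
 * Extract 'length' bits starting at bit 'start'.
 * Valid for 1 <= length <= 32 and start + length <= 32.
 */
static uint32_t extract_bits32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/* The feature is present when the field is nonzero. */
static bool has_t32ee(uint32_t id_isar3)
{
    return extract_bits32(id_isar3, ID_ISAR3_T32EE_SHIFT,
                          ID_ISAR3_T32EE_LENGTH) != 0;
}

/* Small usage example with two illustrative register values. */
static void t32ee_selfcheck(void)
{
    assert(has_t32ee(0x11112131u));   /* top nibble is 1 */
    assert(!has_t32ee(0x01112131u));  /* top nibble is 0 */
}
```

Deriving the answer from the ID register value means a CPU model only has to set the register correctly; no separate feature flag can drift out of sync with it.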
1
From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
Announce 64bit addressing support.
3
The little-endian UUID is used in many places, so make
4
NVDIMM_UUID_LE a common macro that converts the UUID
5
to a little-endian byte array.
4
6
5
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
7
Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
6
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
8
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
7
Message-id: 20181017213932.19973-3-edgar.iglesias@gmail.com
9
Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
12
---
11
hw/net/cadence_gem.c | 3 ++-
13
include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
12
1 file changed, 2 insertions(+), 1 deletion(-)
14
hw/acpi/nvdimm.c | 10 +++-------
15
2 files changed, 30 insertions(+), 7 deletions(-)
13
16
14
diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
17
diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
15
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
16
--- a/hw/net/cadence_gem.c
19
--- a/include/qemu/uuid.h
17
+++ b/hw/net/cadence_gem.c
20
+++ b/include/qemu/uuid.h
21
@@ -XXX,XX +XXX,XX @@ typedef struct {
22
};
23
} QemuUUID;
24
25
+/**
26
+ * UUID_LE - converts the fields of UUID to little-endian array,
27
+ * each parameter is a field of the UUID.
28
+ *
29
+ * @time_low: The low field of the timestamp
30
+ * @time_mid: The middle field of the timestamp
31
+ * @time_hi_and_version: The high field of the timestamp
32
+ * multiplexed with the version number
33
+ * @clock_seq_hi_and_reserved: The high field of the clock
34
+ * sequence multiplexed with the variant
35
+ * @clock_seq_low: The low field of the clock sequence
36
+ * @node0: The spatially unique node0 identifier
37
+ * @node1: The spatially unique node1 identifier
38
+ * @node2: The spatially unique node2 identifier
39
+ * @node3: The spatially unique node3 identifier
40
+ * @node4: The spatially unique node4 identifier
41
+ * @node5: The spatially unique node5 identifier
42
+ */
43
+#define UUID_LE(time_low, time_mid, time_hi_and_version, \
44
+ clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2, \
45
+ node3, node4, node5) \
46
+ { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
47
+ ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
48
+ (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff, \
49
+ (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
50
+ (node3), (node4), (node5) }
51
+
52
#define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
53
"%02hhx%02hhx-%02hhx%02hhx-" \
54
"%02hhx%02hhx-" \
55
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
56
index XXXXXXX..XXXXXXX 100644
57
--- a/hw/acpi/nvdimm.c
58
+++ b/hw/acpi/nvdimm.c
18
@@ -XXX,XX +XXX,XX @@
59
@@ -XXX,XX +XXX,XX @@
19
#define GEM_DESCONF4 (0x0000028C/4)
60
*/
20
#define GEM_DESCONF5 (0x00000290/4)
61
21
#define GEM_DESCONF6 (0x00000294/4)
62
#include "qemu/osdep.h"
22
+#define GEM_DESCONF6_64B_MASK (1U << 23)
63
+#include "qemu/uuid.h"
23
#define GEM_DESCONF7 (0x00000298/4)
64
#include "hw/acpi/acpi.h"
24
65
#include "hw/acpi/aml-build.h"
25
#define GEM_INT_Q1_STATUS (0x00000400 / 4)
66
#include "hw/acpi/bios-linker-loader.h"
26
@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
67
@@ -XXX,XX +XXX,XX @@
27
s->regs[GEM_DESCONF] = 0x02500111;
68
#include "hw/mem/nvdimm.h"
28
s->regs[GEM_DESCONF2] = 0x2ab13fff;
69
#include "qemu/nvdimm-utils.h"
29
s->regs[GEM_DESCONF5] = 0x002f2045;
70
30
- s->regs[GEM_DESCONF6] = 0x0;
71
-#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7) \
31
+ s->regs[GEM_DESCONF6] = GEM_DESCONF6_64B_MASK;
72
- { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
32
73
- (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff, \
33
if (s->num_priority_queues > 1) {
74
- (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
34
queues_mask = MAKE_64BIT_MASK(1, s->num_priority_queues - 1);
75
-
76
/*
77
* define Byte Addressable Persistent Memory (PM) Region according to
78
* ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
79
*/
80
static const uint8_t nvdimm_nfit_spa_uuid[] =
81
- NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
82
- 0x18, 0xb7, 0x8c, 0xdb);
83
+ UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
84
+ 0x18, 0xb7, 0x8c, 0xdb);
85
86
/*
87
* NVDIMM Firmware Interface Table
35
--
88
--
36
2.19.1
89
2.20.1
37
90
38
91
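To make the byte order produced by UUID_LE concrete, the sketch below copies the macro from the patch above into a standalone file and applies it to the NVDIMM SPA UUID used in hw/acpi/nvdimm.c. The array name spa_uuid_le and the self-check function are hypothetical additions for illustration. The first three fields are stored least significant byte first, while the clock sequence and node bytes keep their order.

```c
#include <assert.h>
#include <stdint.h>

/* Macro body as it appears in the patch to include/qemu/uuid.h. */
#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
  node3, node4, node5)                                                      \
  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
    (node3), (node4), (node5) }

/* UUID 66f0d379-b4f3-4074-ac43-0d3318b78cdb as a 16-byte array. */
static const uint8_t spa_uuid_le[] =
    UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
            0x18, 0xb7, 0x8c, 0xdb);

/* Small usage example: check the resulting byte layout. */
static void uuid_le_selfcheck(void)
{
    assert(sizeof(spa_uuid_le) == 16);
    /* time_low 0x66f0d379, least significant byte first */
    assert(spa_uuid_le[0] == 0x79 && spa_uuid_le[3] == 0x66);
    /* time_mid 0xb4f3 and time_hi_and_version 0x4074, also byte-swapped */
    assert(spa_uuid_le[4] == 0xf3 && spa_uuid_le[5] == 0xb4);
    assert(spa_uuid_le[6] == 0x74 && spa_uuid_le[7] == 0x40);
    /* remaining bytes are copied in the order given */
    assert(spa_uuid_le[8] == 0xac && spa_uuid_le[15] == 0xdb);
}
```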
1
From: Markus Armbruster <armbru@redhat.com>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
Device models aren't supposed to go on fishing expeditions for
3
RAS Virtualization feature is not supported now, so
4
backends. They should expose suitable properties for the user to set.
4
add a RAS machine option and disable it by default.
5
For onboard devices, board code sets them.
6
5
7
Device ssi-sd picks up its block backend in its init() method with
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
drive_get_next() instead. This mistake is already marked FIXME since
7
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
9
commit af9e40a.
8
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
10
9
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
11
Unset user_creatable to remove the mistake from our external
10
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
12
interface. Since the SSI bus doesn't support hotplug, only -device
11
Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
13
can be affected. Only certain ARM machines have ssi-sd and provide an
14
SSI bus for it; this patch breaks -device ssi-sd for these machines.
15
No actual use of -device ssi-sd is known.
16
17
Signed-off-by: Markus Armbruster <armbru@redhat.com>
18
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
19
Acked-by: Thomas Huth <thuth@redhat.com>
20
Message-id: 20181009060835.4608-1-armbru@redhat.com
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
---
13
---
23
hw/sd/ssi-sd.c | 2 ++
14
include/hw/arm/virt.h | 1 +
24
1 file changed, 2 insertions(+)
15
hw/arm/virt.c | 23 +++++++++++++++++++++++
16
2 files changed, 24 insertions(+)
25
17
26
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
18
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
27
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
28
--- a/hw/sd/ssi-sd.c
20
--- a/include/hw/arm/virt.h
29
+++ b/hw/sd/ssi-sd.c
21
+++ b/include/hw/arm/virt.h
30
@@ -XXX,XX +XXX,XX @@ static void ssi_sd_class_init(ObjectClass *klass, void *data)
22
@@ -XXX,XX +XXX,XX @@ typedef struct {
31
k->cs_polarity = SSI_CS_LOW;
23
bool highmem_ecam;
32
dc->vmsd = &vmstate_ssi_sd;
24
bool its;
33
dc->reset = ssi_sd_reset;
25
bool virt;
34
+ /* Reason: init() method uses drive_get_next() */
26
+ bool ras;
35
+ dc->user_creatable = false;
27
OnOffAuto acpi;
28
VirtGICType gic_version;
29
VirtIOMMUType iommu;
30
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/hw/arm/virt.c
33
+++ b/hw/arm/virt.c
34
@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
35
visit_type_OnOffAuto(v, name, &vms->acpi, errp);
36
}
36
}
37
37
38
static const TypeInfo ssi_sd_info = {
38
+static bool virt_get_ras(Object *obj, Error **errp)
39
+{
40
+ VirtMachineState *vms = VIRT_MACHINE(obj);
41
+
42
+ return vms->ras;
43
+}
44
+
45
+static void virt_set_ras(Object *obj, bool value, Error **errp)
46
+{
47
+ VirtMachineState *vms = VIRT_MACHINE(obj);
48
+
49
+ vms->ras = value;
50
+}
51
+
52
static char *virt_get_gic_version(Object *obj, Error **errp)
53
{
54
VirtMachineState *vms = VIRT_MACHINE(obj);
55
@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
56
"Valid values are none and smmuv3",
57
NULL);
58
59
+ /* Default disallows RAS instantiation */
60
+ vms->ras = false;
61
+ object_property_add_bool(obj, "ras", virt_get_ras,
62
+ virt_set_ras, NULL);
63
+ object_property_set_description(obj, "ras",
64
+ "Set on/off to enable/disable reporting host memory errors "
65
+ "to a KVM guest using ACPI and guest external abort exceptions",
66
+ NULL);
67
+
68
vms->irqmap = a15irqmap;
69
70
virt_flash_create(vms);
39
--
71
--
40
2.19.1
72
2.20.1
41
73
42
74
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
The EL3 version of this register does not include an ASID,
3
Add APEI/GHES detailed design document
4
and so the tlb_flush performed by vmsa_ttbr_write is not needed.
5
4
6
Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>
5
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
9
Message-id: 20181019015617.22583-2-richard.henderson@linaro.org
8
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
9
Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
11
---
12
target/arm/helper.c | 2 +-
12
docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
13
1 file changed, 1 insertion(+), 1 deletion(-)
13
docs/specs/index.rst | 1 +
14
2 files changed, 111 insertions(+)
15
create mode 100644 docs/specs/acpi_hest_ghes.rst
14
16
15
diff --git a/target/arm/helper.c b/target/arm/helper.c
17
diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
18
new file mode 100644
19
index XXXXXXX..XXXXXXX
20
--- /dev/null
21
+++ b/docs/specs/acpi_hest_ghes.rst
22
@@ -XXX,XX +XXX,XX @@
23
+APEI tables generating and CPER record
24
+======================================
25
+
26
+..
27
+ Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
28
+
29
+ This work is licensed under the terms of the GNU GPL, version 2 or later.
30
+ See the COPYING file in the top-level directory.
31
+
32
+Design Details
33
+--------------
34
+
35
+::
36
+
37
+ etc/acpi/tables etc/hardware_errors
38
+ ==================== ===============================
39
+ + +--------------------------+ +----------------------------+
40
+ | | HEST | +--------->| error_block_address1 |------+
41
+ | +--------------------------+ | +----------------------------+ |
42
+ | | GHES1 | | +------->| error_block_address2 |------+-+
43
+ | +--------------------------+ | | +----------------------------+ | |
44
+ | | ................. | | | | .............. | | |
45
+ | | error_status_address-----+-+ | -----------------------------+ | |
46
+ | | ................. | | +--->| error_block_addressN |------+-+---+
47
+ | | read_ack_register--------+-+ | | +----------------------------+ | | |
48
+ | | read_ack_preserve | +-+---+--->| read_ack_register1 | | | |
49
+ | | read_ack_write | | | +----------------------------+ | | |
50
+ + +--------------------------+ | +-+--->| read_ack_register2 | | | |
51
+ | | GHES2 | | | | +----------------------------+ | | |
52
+ + +--------------------------+ | | | | ............. | | | |
53
+ | | ................. | | | | +----------------------------+ | | |
54
+ | | error_status_address-----+---+ | | +->| read_ack_registerN | | | |
55
+ | | ................. | | | | +----------------------------+ | | |
56
+ | | read_ack_register--------+-----+ | | |Generic Error Status Block 1|<-----+ | |
57
+ | | read_ack_preserve | | | |-+------------------------+-+ | |
58
+ | | read_ack_write | | | | | CPER | | | |
59
+ + +--------------------------| | | | | CPER | | | |
60
+ | | ............... | | | | | .... | | | |
61
+ + +--------------------------+ | | | | CPER | | | |
62
+ | | GHESN | | | |-+------------------------+-| | |
63
+ + +--------------------------+ | | |Generic Error Status Block 2|<-------+ |
64
+ | | ................. | | | |-+------------------------+-+ |
65
+ | | error_status_address-----+-------+ | | | CPER | | |
66
+ | | ................. | | | | CPER | | |
67
+ | | read_ack_register--------+---------+ | | .... | | |
68
+ | | read_ack_preserve | | | CPER | | |
69
+ | | read_ack_write | +-+------------------------+-+ |
70
+ + +--------------------------+ | .......... | |
71
+ |----------------------------+ |
72
+ |Generic Error Status Block N |<----------+
73
+ |-+-------------------------+-+
74
+ | | CPER | |
75
+ | | CPER | |
76
+ | | .... | |
77
+ | | CPER | |
78
+ +-+-------------------------+-+
79
+
80
+
81
+(1) QEMU generates the ACPI HEST table. This table goes in the current
82
+ "etc/acpi/tables" fw_cfg blob. Each error source has different
83
+ notification types.
84
+
85
+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
86
+ also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
87
+ contains an address registers table and an Error Status Data Block table.
88
+
89
+(3) The address registers table contains N Error Block Address entries
90
+ and N Read Ack Register entries. Each entry is 8 bytes in size.
91
+ The Error Status Data Block table contains N Error Status Data Block
92
+ entries. The size for each entry is 4096(0x1000) bytes. The total size
93
+ for the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
94
+ N is the number of the kinds of hardware error sources.
95
+
96
+(4) QEMU generates the ACPI linker/loader script for the firmware. The
97
+ firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
98
+ and copies blob contents there.
99
+
100
+(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
101
+ "error_status_address" fields of the HEST table with a pointer to the
102
+ corresponding "address registers" in the "etc/hardware_errors" blob.
103
+
104
+(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
105
+ "read_ack_register" fields of the HEST table with a pointer to the
106
+ corresponding "read_ack_register" within the "etc/hardware_errors" blob.
107
+
108
+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
109
+ addresses in the "error_block_address" fields with a pointer to the
110
+ respective "Error Status Data Block" in the "etc/hardware_errors" blob.
111
+
112
+(8) QEMU defines a third and write-only fw_cfg blob which is called
113
+ "etc/hardware_errors_addr". Through that blob, the firmware can send back
114
+ the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
115
+ blob contains an 8-byte entry. QEMU generates a single WRITE_POINTER command
116
+ for the firmware. The firmware will write back the start address of
117
+ "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
118
+
119
+(9) When QEMU gets a SIGBUS from the kernel, QEMU writes a CPER record into the
120
+ corresponding "Error Status Data Block" in guest memory, and then injects a
121
+ platform-specific interrupt (on the arm/virt machine, a Synchronous External
122
+ Abort) to notify the guest.
123
+
124
+(10) This notification (in virtual hardware) is handled by the guest
125
+ kernel. On receiving it, the guest APEI driver can read the CPER error
126
+ record and take appropriate action.
127
+
128
+(11) kvm_arch_on_sigbus_vcpu() uses source_id as an index into "etc/hardware_errors" to
129
+ find the "Error Status Data Block" entry corresponding to the error source. Supported
130
+ source_id values should therefore be assigned here and not changed afterwards, to make
131
+ sure that the guest will write the error into the expected "Error Status Data Block"
132
+ even if the guest was migrated to a newer QEMU.
133
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
16
index XXXXXXX..XXXXXXX 100644
134
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.c
135
--- a/docs/specs/index.rst
18
+++ b/target/arm/helper.c
136
+++ b/docs/specs/index.rst
19
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
137
@@ -XXX,XX +XXX,XX @@ Contents:
20
.fieldoffset = offsetof(CPUARMState, cp15.mvbar) },
138
ppc-spapr-xive
21
{ .name = "TTBR0_EL3", .state = ARM_CP_STATE_AA64,
139
acpi_hw_reduced_hotplug
22
.opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 0,
140
tpm
23
- .access = PL3_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
141
+ acpi_hest_ghes
24
+ .access = PL3_RW, .resetvalue = 0,
25
.fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[3]) },
26
{ .name = "TCR_EL3", .state = ARM_CP_STATE_AA64,
27
.opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 2,
28
--
142
--
29
2.19.1
143
2.20.1
30
144
31
145
diff view generated by jsdifflib
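The size formula in step (3) of the document above can be checked with a short calculation. This is only a sketch: sketch_hardware_errors_size() is a hypothetical helper, not a QEMU function, and the block size is passed in as a parameter rather than fixed. The document states 4096 bytes per Error Status Data Block, while the hw/acpi/ghes.c patch in this series reserves ACPI_GHES_MAX_RAW_DATA_LENGTH (1 KiB) per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): total size in bytes of the
 * "etc/hardware_errors" blob for n_sources error sources, following
 * the formula N * 8 * 2 + N * block_size from the design document.
 */
static size_t sketch_hardware_errors_size(size_t n_sources, size_t block_size)
{
    /* Each address-table entry is one 64-bit value. */
    const size_t entry_size = sizeof(uint64_t);

    assert(n_sources > 0);

    /* N error_block_address entries plus N read_ack_register entries,
     * followed by N Error Status Data Blocks. */
    return n_sources * entry_size * 2 + n_sources * block_size;
}
```

With one error source and the 4096-byte block size given in the document, this yields 4112 bytes. With the 1 KiB block size used by the code, it yields 1040 bytes.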
1
From: Richard Henderson <rth@twiddle.net>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
This can reduce the number of opcodes required for certain
3
This patch builds error_block_address and read_ack_register fields
4
complex forms of load-multiple (e.g. ld4.16b).
4
in hardware errors table, the error_block_address points to Generic
5
5
Error Status Block(GESB) via bios_linker. The max size for one GESB
6
Signed-off-by: Richard Henderson <rth@twiddle.net>
6
is 1 KiB. For more detailed information, please refer to
7
Message-id: 20181011205206.3552-2-richard.henderson@linaro.org
7
document: docs/specs/acpi_hest_ghes.rst
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
9
Currently we support only one error source; if necessary, we can extend to
10
support more.
11
12
Suggested-by: Laszlo Ersek <lersek@redhat.com>
13
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
14
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
15
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
16
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
17
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
18
Message-id: 20200512030609.19593-5-gengdongjiu@huawei.com
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
20
---
11
target/arm/translate-a64.c | 12 ++++++++----
21
default-configs/arm-softmmu.mak | 1 +
12
1 file changed, 8 insertions(+), 4 deletions(-)
22
include/hw/acpi/aml-build.h | 1 +
13
23
include/hw/acpi/ghes.h | 28 +++++++++++
14
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
24
hw/acpi/aml-build.c | 2 +
15
index XXXXXXX..XXXXXXX 100644
25
hw/acpi/ghes.c | 89 +++++++++++++++++++++++++++++++++
16
--- a/target/arm/translate-a64.c
26
hw/arm/virt-acpi-build.c | 5 ++
17
+++ b/target/arm/translate-a64.c
27
hw/acpi/Kconfig | 4 ++
18
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
28
hw/acpi/Makefile.objs | 1 +
19
bool is_store = !extract32(insn, 22, 1);
29
8 files changed, 131 insertions(+)
20
bool is_postidx = extract32(insn, 23, 1);
30
create mode 100644 include/hw/acpi/ghes.h
21
bool is_q = extract32(insn, 30, 1);
31
create mode 100644 hw/acpi/ghes.c
22
- TCGv_i64 tcg_addr, tcg_rn;
32
23
+ TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
33
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
24
34
index XXXXXXX..XXXXXXX 100644
25
int ebytes = 1 << size;
35
--- a/default-configs/arm-softmmu.mak
26
int elements = (is_q ? 128 : 64) / (8 << size);
36
+++ b/default-configs/arm-softmmu.mak
27
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
37
@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
28
tcg_rn = cpu_reg_sp(s, rn);
38
CONFIG_FSL_IMX6UL=y
29
tcg_addr = tcg_temp_new_i64();
39
CONFIG_SEMIHOSTING=y
30
tcg_gen_mov_i64(tcg_addr, tcg_rn);
40
CONFIG_ALLWINNER_H3=y
31
+ tcg_ebytes = tcg_const_i64(ebytes);
41
+CONFIG_ACPI_APEI=y
32
42
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
33
for (r = 0; r < rpt; r++) {
43
index XXXXXXX..XXXXXXX 100644
34
int e;
44
--- a/include/hw/acpi/aml-build.h
35
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
45
+++ b/include/hw/acpi/aml-build.h
36
clear_vec_high(s, is_q, tt);
46
@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
37
}
47
GArray *rsdp;
38
}
48
GArray *tcpalog;
39
- tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
49
GArray *vmgenid;
40
+ tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
50
+ GArray *hardware_errors;
41
tt = (tt + 1) % 32;
51
BIOSLinker *linker;
42
}
52
} AcpiBuildTables;
43
}
53
44
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
54
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
45
tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
55
new file mode 100644
46
}
56
index XXXXXXX..XXXXXXX
47
}
57
--- /dev/null
48
+ tcg_temp_free_i64(tcg_ebytes);
58
+++ b/include/hw/acpi/ghes.h
49
tcg_temp_free_i64(tcg_addr);
59
@@ -XXX,XX +XXX,XX @@
60
+/*
61
+ * Support for generating APEI tables and recording CPER for Guests
62
+ *
63
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
64
+ *
65
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
66
+ *
67
+ * This program is free software; you can redistribute it and/or modify
68
+ * it under the terms of the GNU General Public License as published by
69
+ * the Free Software Foundation; either version 2 of the License, or
70
+ * (at your option) any later version.
71
+
72
+ * This program is distributed in the hope that it will be useful,
73
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
74
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
75
+ * GNU General Public License for more details.
76
+
77
+ * You should have received a copy of the GNU General Public License along
78
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
79
+ */
80
+
81
+#ifndef ACPI_GHES_H
82
+#define ACPI_GHES_H
83
+
84
+#include "hw/acpi/bios-linker-loader.h"
85
+
86
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
87
+#endif
88
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
89
index XXXXXXX..XXXXXXX 100644
90
--- a/hw/acpi/aml-build.c
91
+++ b/hw/acpi/aml-build.c
92
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
93
tables->table_data = g_array_new(false, true /* clear */, 1);
94
tables->tcpalog = g_array_new(false, true /* clear */, 1);
95
tables->vmgenid = g_array_new(false, true /* clear */, 1);
96
+ tables->hardware_errors = g_array_new(false, true /* clear */, 1);
97
tables->linker = bios_linker_loader_init();
50
}
98
}
51
99
52
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
100
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
53
bool replicate = false;
101
g_array_free(tables->table_data, true);
54
int index = is_q << 3 | S << 2 | size;
102
g_array_free(tables->tcpalog, mfre);
55
int ebytes, xs;
103
g_array_free(tables->vmgenid, mfre);
56
- TCGv_i64 tcg_addr, tcg_rn;
104
+ g_array_free(tables->hardware_errors, mfre);
57
+ TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
58
59
switch (scale) {
60
case 3:
61
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
62
tcg_rn = cpu_reg_sp(s, rn);
63
tcg_addr = tcg_temp_new_i64();
64
tcg_gen_mov_i64(tcg_addr, tcg_rn);
65
+ tcg_ebytes = tcg_const_i64(ebytes);
66
67
for (xs = 0; xs < selem; xs++) {
68
if (replicate) {
69
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
70
do_vec_st(s, rt, index, tcg_addr, scale);
71
}
72
}
73
- tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
74
+ tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
75
rt = (rt + 1) % 32;
76
}
77
78
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
79
tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
80
}
81
}
82
+ tcg_temp_free_i64(tcg_ebytes);
83
tcg_temp_free_i64(tcg_addr);
84
}
105
}
85
106
107
/*
108
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
109
new file mode 100644
110
index XXXXXXX..XXXXXXX
111
--- /dev/null
112
+++ b/hw/acpi/ghes.c
113
@@ -XXX,XX +XXX,XX @@
114
+/*
115
+ * Support for generating APEI tables and recording CPER for Guests
116
+ *
117
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
118
+ *
119
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
120
+ *
121
+ * This program is free software; you can redistribute it and/or modify
122
+ * it under the terms of the GNU General Public License as published by
123
+ * the Free Software Foundation; either version 2 of the License, or
124
+ * (at your option) any later version.
125
+
126
+ * This program is distributed in the hope that it will be useful,
127
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
128
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
129
+ * GNU General Public License for more details.
130
+
131
+ * You should have received a copy of the GNU General Public License along
132
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
133
+ */
134
+
135
+#include "qemu/osdep.h"
136
+#include "qemu/units.h"
137
+#include "hw/acpi/ghes.h"
138
+#include "hw/acpi/aml-build.h"
139
+
140
+#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
141
+#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
142
+
143
+/* The max size in bytes for one error block */
144
+#define ACPI_GHES_MAX_RAW_DATA_LENGTH (1 * KiB)
145
+
146
+/* Now only support ARMv8 SEA notification type error source */
147
+#define ACPI_GHES_ERROR_SOURCE_COUNT 1
148
+
149
+/*
150
+ * Build table for the hardware error fw_cfg blob.
151
+ * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
152
+ * See docs/specs/acpi_hest_ghes.rst for blobs format.
153
+ */
154
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
155
+{
156
+ int i, error_status_block_offset;
157
+
158
+ /* Build error_block_address */
159
+ for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
160
+ build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
161
+ }
162
+
163
+ /* Build read_ack_register */
164
+ for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
165
+ /*
166
+ * Initialize the value of read_ack_register to 1, so GHES can be
167
+ * writeable after (re)boot.
168
+ * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
169
+ * (GHESv2 - Type 10)
170
+ */
171
+ build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
172
+ }
173
+
174
+ /* Generic Error Status Block offset in the hardware error fw_cfg blob */
175
+ error_status_block_offset = hardware_errors->len;
176
+
177
+ /* Reserve space for Error Status Data Block */
178
+ acpi_data_push(hardware_errors,
179
+ ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
180
+
181
+ /* Tell guest firmware to place hardware_errors blob into RAM */
182
+ bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
183
+ hardware_errors, sizeof(uint64_t), false);
184
+
185
+ for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
186
+ /*
187
+ * Tell firmware to patch error_block_address entries to point to
188
+ * corresponding "Generic Error Status Block"
189
+ */
190
+ bios_linker_loader_add_pointer(linker,
191
+ ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
192
+ sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
193
+ error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
194
+ }
195
+
196
+ /*
197
+ * tell firmware to write hardware_errors GPA into
198
+ * hardware_errors_addr fw_cfg, once the former has been initialized.
199
+ */
200
+ bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
201
+ 0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
202
+}
203
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
204
index XXXXXXX..XXXXXXX 100644
205
--- a/hw/arm/virt-acpi-build.c
206
+++ b/hw/arm/virt-acpi-build.c
207
@@ -XXX,XX +XXX,XX @@
208
#include "sysemu/reset.h"
209
#include "kvm_arm.h"
210
#include "migration/vmstate.h"
211
+#include "hw/acpi/ghes.h"
212
213
#define ARM_SPI_BASE 32
214
215
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
216
acpi_add_table(table_offsets, tables_blob);
217
build_spcr(tables_blob, tables->linker, vms);
218
219
+ if (vms->ras) {
220
+ build_ghes_error_table(tables->hardware_errors, tables->linker);
221
+ }
222
+
223
if (ms->numa_state->num_nodes > 0) {
224
acpi_add_table(table_offsets, tables_blob);
225
build_srat(tables_blob, tables->linker, vms);
226
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
227
index XXXXXXX..XXXXXXX 100644
228
--- a/hw/acpi/Kconfig
229
+++ b/hw/acpi/Kconfig
230
@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
231
bool
232
depends on ACPI
233
234
+config ACPI_APEI
235
+ bool
236
+ depends on ACPI
237
+
238
config ACPI_PCI
239
bool
240
depends on ACPI && PCI
241
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
242
index XXXXXXX..XXXXXXX 100644
243
--- a/hw/acpi/Makefile.objs
244
+++ b/hw/acpi/Makefile.objs
245
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
246
common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
247
common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
248
common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
249
+common-obj-$(CONFIG_ACPI_APEI) += ghes.o
250
common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
251
common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
252
86
--
253
--
87
2.19.1
254
2.20.1
88
255
89
256
diff view generated by jsdifflib
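The blob layout that build_ghes_error_table() produces in the patch above can be summarised as a few offset calculations. This is a minimal sketch, assuming the constants from the patch (one error source, 8-byte entries, 1 KiB per status block). The SKETCH_* macros and sketch_* functions are hypothetical names chosen for illustration and do not exist in QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirrored from the patch; the names are illustrative only. */
#define SKETCH_SOURCE_COUNT 1u     /* ACPI_GHES_ERROR_SOURCE_COUNT */
#define SKETCH_BLOCK_SIZE   1024u  /* ACPI_GHES_MAX_RAW_DATA_LENGTH, 1 KiB */
#define SKETCH_ENTRY_SIZE   sizeof(uint64_t)

/* Offset of error_block_address[source_id]: the first table in the blob. */
static size_t sketch_error_block_address_offset(unsigned source_id)
{
    assert(source_id < SKETCH_SOURCE_COUNT);
    return source_id * SKETCH_ENTRY_SIZE;
}

/* Offset of read_ack_register[source_id]: follows all address entries. */
static size_t sketch_read_ack_register_offset(unsigned source_id)
{
    assert(source_id < SKETCH_SOURCE_COUNT);
    return (SKETCH_SOURCE_COUNT + source_id) * SKETCH_ENTRY_SIZE;
}

/* Offset of the status block for source_id: follows both 8-byte tables. */
static size_t sketch_status_block_offset(unsigned source_id)
{
    assert(source_id < SKETCH_SOURCE_COUNT);
    return 2 * SKETCH_SOURCE_COUNT * SKETCH_ENTRY_SIZE
           + source_id * SKETCH_BLOCK_SIZE;
}
```

For the single supported source (source_id 0) these give offsets 0, 8 and 16. The last value is the target offset that the ADD_POINTER command patches into error_block_address[0].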
1
From: Stewart Hildebrand <Stewart.Hildebrand@dornerworks.com>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
"The Image must be placed text_offset bytes from a 2MB aligned base
3
This patch builds Hardware Error Source Table(HEST) via fw_cfg blobs.
4
address anywhere in usable system RAM and called there."
4
Now it only supports ARMv8 SEA, a type of Generic Hardware Error
5
5
Source version 2(GHESv2) error source. Afterwards, we can extend
6
For the virt board, we write our startup bootloader at the very
6
the supported types if needed. For the CPER section, currently it
7
bottom of RAM, so that bit can't be used for the image. To avoid
7
is a memory section because the kernel mainly wants userspace to handle
8
overlap in case the image requests to be loaded at an offset
8
the memory errors.
9
smaller than our bootloader, we increment the load offset to the
9
10
next 2MB.
10
This patch follows the spec ACPI 6.2 to build the Hardware Error
11
11
Source table. For more detailed information, please refer to
12
This fixes a boot failure for Xen AArch64.
12
document: docs/specs/acpi_hest_ghes.rst
13
13
14
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
14
build_ghes_hw_error_notification() helper will help to add Hardware
15
Tested-by: Andre Przywara <andre.przywara@arm.com>
15
Error Notification to ACPI tables without using packed C structures
16
Message-id: b8a89518794b4436af0c151ed10de4fa@dornerworks.com
16
and avoids endianness issues, as the API doesn't need explicit conversion.
17
[PMM: Rephrased a comment a bit]
17
18
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
18
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
19
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
20
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
21
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
22
Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
---
24
---
21
hw/arm/boot.c | 18 ++++++++++++++++++
25
include/hw/acpi/ghes.h | 39 ++++++++++++
22
1 file changed, 18 insertions(+)
26
hw/acpi/ghes.c | 126 +++++++++++++++++++++++++++++++++++++++
23
27
hw/arm/virt-acpi-build.c | 2 +
24
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
28
3 files changed, 167 insertions(+)
29
30
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
25
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
26
--- a/hw/arm/boot.c
32
--- a/include/hw/acpi/ghes.h
27
+++ b/hw/arm/boot.c
33
+++ b/include/hw/acpi/ghes.h
28
@@ -XXX,XX +XXX,XX @@
34
@@ -XXX,XX +XXX,XX @@
29
#include "qemu/config-file.h"
35
30
#include "qemu/option.h"
36
#include "hw/acpi/bios-linker-loader.h"
31
#include "exec/address-spaces.h"
37
32
+#include "qemu/units.h"
38
+/*
33
39
+ * Values for Hardware Error Notification Type field
34
/* Kernel boot protocol is specified in the kernel docs
40
+ */
35
* Documentation/arm/Booting and Documentation/arm64/booting.txt
41
+enum AcpiGhesNotifyType {
42
+ /* Polled */
43
+ ACPI_GHES_NOTIFY_POLLED = 0,
44
+ /* External Interrupt */
45
+ ACPI_GHES_NOTIFY_EXTERNAL = 1,
46
+ /* Local Interrupt */
47
+ ACPI_GHES_NOTIFY_LOCAL = 2,
48
+ /* SCI */
49
+ ACPI_GHES_NOTIFY_SCI = 3,
50
+ /* NMI */
51
+ ACPI_GHES_NOTIFY_NMI = 4,
52
+ /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
53
+ ACPI_GHES_NOTIFY_CMCI = 5,
54
+ /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
55
+ ACPI_GHES_NOTIFY_MCE = 6,
56
+ /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
57
+ ACPI_GHES_NOTIFY_GPIO = 7,
58
+ /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
59
+ ACPI_GHES_NOTIFY_SEA = 8,
60
+ /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
61
+ ACPI_GHES_NOTIFY_SEI = 9,
62
+ /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
63
+ ACPI_GHES_NOTIFY_GSIV = 10,
64
+ /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
65
+ ACPI_GHES_NOTIFY_SDEI = 11,
66
+ /* 12 and greater are reserved */
67
+ ACPI_GHES_NOTIFY_RESERVED = 12
68
+};
69
+
70
+enum {
71
+ ACPI_HEST_SRC_ID_SEA = 0,
72
+ /* future ids go here */
73
+ ACPI_HEST_SRC_ID_RESERVED,
74
+};
75
+
76
void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
77
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
78
#endif
79
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
80
index XXXXXXX..XXXXXXX 100644
81
--- a/hw/acpi/ghes.c
82
+++ b/hw/acpi/ghes.c
36
@@ -XXX,XX +XXX,XX @@
83
@@ -XXX,XX +XXX,XX @@
37
#define ARM64_TEXT_OFFSET_OFFSET 8
84
#include "qemu/units.h"
38
#define ARM64_MAGIC_OFFSET 56
85
#include "hw/acpi/ghes.h"
39
86
#include "hw/acpi/aml-build.h"
40
+#define BOOTLOADER_MAX_SIZE (4 * KiB)
87
+#include "qemu/error-report.h"
41
+
88
42
AddressSpace *arm_boot_address_space(ARMCPU *cpu,
89
#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
43
const struct arm_boot_info *info)
90
#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
44
{
91
@@ -XXX,XX +XXX,XX @@
45
@@ -XXX,XX +XXX,XX @@ static void write_bootloader(const char *name, hwaddr addr,
92
/* Now only support ARMv8 SEA notification type error source */
46
code[i] = tswap32(insn);
93
#define ACPI_GHES_ERROR_SOURCE_COUNT 1
94
95
+/* Generic Hardware Error Source version 2 */
96
+#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2 10
97
+
98
+/* Address offset in Generic Address Structure(GAS) */
99
+#define GAS_ADDR_OFFSET 4
100
+
101
+/*
102
+ * Hardware Error Notification
103
+ * ACPI 4.0: 17.3.2.7 Hardware Error Notification
104
+ * Composes dummy Hardware Error Notification descriptor of specified type
105
+ */
106
+static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
107
+{
108
+ /* Type */
109
+ build_append_int_noprefix(table, type, 1);
110
+ /*
111
+ * Length:
112
+ * Total length of the structure in bytes
113
+ */
114
+ build_append_int_noprefix(table, 28, 1);
115
+ /* Configuration Write Enable */
116
+ build_append_int_noprefix(table, 0, 2);
117
+ /* Poll Interval */
118
+ build_append_int_noprefix(table, 0, 4);
119
+ /* Vector */
120
+ build_append_int_noprefix(table, 0, 4);
121
+ /* Switch To Polling Threshold Value */
122
+ build_append_int_noprefix(table, 0, 4);
123
+ /* Switch To Polling Threshold Window */
124
+ build_append_int_noprefix(table, 0, 4);
125
+ /* Error Threshold Value */
126
+ build_append_int_noprefix(table, 0, 4);
127
+ /* Error Threshold Window */
128
+ build_append_int_noprefix(table, 0, 4);
129
+}
130
+
131
/*
132
* Build table for the hardware error fw_cfg blob.
133
* Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
134
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
135
bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
136
0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
137
}
138
+
139
+/* Build Generic Hardware Error Source version 2 (GHESv2) */
140
+static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
141
+{
142
+ uint64_t address_offset;
143
+ /*
144
+ * Type:
145
+ * Generic Hardware Error Source version 2(GHESv2 - Type 10)
146
+ */
147
+ build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
148
+ /* Source Id */
149
+ build_append_int_noprefix(table_data, source_id, 2);
150
+ /* Related Source Id */
151
+ build_append_int_noprefix(table_data, 0xffff, 2);
152
+ /* Flags */
153
+ build_append_int_noprefix(table_data, 0, 1);
154
+ /* Enabled */
155
+ build_append_int_noprefix(table_data, 1, 1);
156
+
157
+ /* Number of Records To Pre-allocate */
158
+ build_append_int_noprefix(table_data, 1, 4);
159
+ /* Max Sections Per Record */
160
+ build_append_int_noprefix(table_data, 1, 4);
161
+ /* Max Raw Data Length */
162
+ build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
163
+
164
+ address_offset = table_data->len;
165
+ /* Error Status Address */
166
+ build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
167
+ 4 /* QWord access */, 0);
168
+ bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
169
+ address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
170
+ ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
171
+
172
+ switch (source_id) {
173
+ case ACPI_HEST_SRC_ID_SEA:
174
+ /*
175
+ * Notification Structure
176
+ * Now only enable ARMv8 SEA notification type
177
+ */
178
+ build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
179
+ break;
180
+ default:
181
+ error_report("Not support this error source");
182
+ abort();
183
+ }
184
+
185
+ /* Error Status Block Length */
186
+ build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
187
+
188
+ /*
189
+ * Read Ack Register
190
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
191
+ * version 2 (GHESv2 - Type 10)
192
+ */
193
+ address_offset = table_data->len;
194
+ build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
195
+ 4 /* QWord access */, 0);
196
+ bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
197
+ address_offset + GAS_ADDR_OFFSET,
198
+ sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
199
+ (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
200
+
201
+ /*
202
+ * Read Ack Preserve field
203
+ * We only provide the first bit in Read Ack Register to OSPM to write
204
+ * while the other bits are preserved.
205
+ */
206
+ build_append_int_noprefix(table_data, ~0x1ULL, 8);
207
+ /* Read Ack Write */
208
+ build_append_int_noprefix(table_data, 0x1, 8);
209
+}
210
+
211
+/* Build Hardware Error Source Table */
212
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
213
+{
214
+ uint64_t hest_start = table_data->len;
215
+
216
+ /* Hardware Error Source Table header*/
217
+ acpi_data_push(table_data, sizeof(AcpiTableHeader));
218
+
219
+ /* Error Source Count */
220
+ build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
221
+
222
+ build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
223
+
224
+ build_header(linker, table_data, (void *)(table_data->data + hest_start),
225
+ "HEST", table_data->len - hest_start, 1, NULL, NULL);
226
+}
227
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
228
index XXXXXXX..XXXXXXX 100644
229
--- a/hw/arm/virt-acpi-build.c
230
+++ b/hw/arm/virt-acpi-build.c
231
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
232
233
if (vms->ras) {
234
build_ghes_error_table(tables->hardware_errors, tables->linker);
235
+ acpi_add_table(table_offsets, tables_blob);
236
+ acpi_build_hest(tables_blob, tables->linker);
47
}
237
}
48
238
49
+ assert((len * sizeof(uint32_t)) < BOOTLOADER_MAX_SIZE);
239
if (ms->numa_state->num_nodes > 0) {
50
+
51
rom_add_blob_fixed_as(name, code, len * sizeof(uint32_t), addr, as);
52
53
g_free(code);
54
@@ -XXX,XX +XXX,XX @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
55
memcpy(&hdrvals, buffer + ARM64_TEXT_OFFSET_OFFSET, sizeof(hdrvals));
56
if (hdrvals[1] != 0) {
57
kernel_load_offset = le64_to_cpu(hdrvals[0]);
58
+
59
+ /*
60
+ * We write our startup "bootloader" at the very bottom of RAM,
61
+ * so that bit can't be used for the image. Luckily the Image
62
+ * format specification is that the image requests only an offset
63
+ * from a 2MB boundary, not an absolute load address. So if the
64
+ * image requests an offset that might mean it overlaps with the
65
+ * bootloader, we can just load it starting at 2MB+offset rather
66
+ * than 0MB + offset.
67
+ */
68
+ if (kernel_load_offset < BOOTLOADER_MAX_SIZE) {
69
+ kernel_load_offset += 2 * MiB;
70
+ }
71
}
72
}
73
74
--
240
--
75
2.19.1
241
2.20.1
76
242
77
243
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
This patch extends the qemu-kvm state sync logic with support for
3
Record the GHEB address via the fw_cfg file. When recording
4
KVM_GET/SET_VCPU_EVENTS, giving access to the previously missing SError exception state.
4
an error to CPER, QEMU will use this address to find the
5
It also supports migration of the exception state.
5
Generic Error Data Entries and write the error.
6
6
7
The SError exception states include SError pending state and ESR value,
7
In order to avoid migration failure, make the hardware
8
kvm_put/get_vcpu_events() will be called when setting or getting system
8
error table address a part of the GED device instead
9
registers. When migrating, if the source machine has an SError pending,
9
of a global variable, so that this address is migrated
10
QEMU will do this migration regardless of whether the target machine supports
10
to the target QEMU.
11
specifying the guest ESR value, because if the target machine does not support that,
12
it can also inject the SError with a zero ESR value.
13
11
12
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
14
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
13
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
15
Reviewed-by: Andrew Jones <drjones@redhat.com>
14
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
16
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
17
Message-id: 1538067351-23931-3-git-send-email-gengdongjiu@huawei.com
16
Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
---
18
---
20
target/arm/cpu.h | 7 ++++++
19
include/hw/acpi/generic_event_device.h | 2 ++
21
target/arm/kvm_arm.h | 24 ++++++++++++++++++
20
include/hw/acpi/ghes.h | 6 ++++++
22
target/arm/kvm.c | 60 ++++++++++++++++++++++++++++++++++++++++++++
21
hw/acpi/generic_event_device.c | 19 +++++++++++++++++++
23
target/arm/kvm32.c | 13 ++++++++++
22
hw/acpi/ghes.c | 14 ++++++++++++++
24
target/arm/kvm64.c | 13 ++++++++++
23
hw/arm/virt-acpi-build.c | 8 ++++++++
25
target/arm/machine.c | 22 ++++++++++++++++
24
5 files changed, 49 insertions(+)
26
6 files changed, 139 insertions(+)
27
25
28
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
26
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
29
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/cpu.h
28
--- a/include/hw/acpi/generic_event_device.h
31
+++ b/target/arm/cpu.h
29
+++ b/include/hw/acpi/generic_event_device.h
32
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
30
@@ -XXX,XX +XXX,XX @@
33
*/
31
34
} exception;
32
#include "hw/sysbus.h"
35
33
#include "hw/acpi/memory_hotplug.h"
36
+ /* Information associated with an SError */
34
+#include "hw/acpi/ghes.h"
37
+ struct {
35
38
+ uint8_t pending;
36
#define ACPI_POWER_BUTTON_DEVICE "PWRB"
39
+ uint8_t has_esr;
37
40
+ uint64_t esr;
38
@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
41
+ } serror;
39
GEDState ged_state;
40
uint32_t ged_event_bitmap;
41
qemu_irq irq;
42
+ AcpiGhesState ghes_state;
43
} AcpiGedState;
44
45
void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
46
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
47
index XXXXXXX..XXXXXXX 100644
48
--- a/include/hw/acpi/ghes.h
49
+++ b/include/hw/acpi/ghes.h
50
@@ -XXX,XX +XXX,XX @@ enum {
51
ACPI_HEST_SRC_ID_RESERVED,
52
};
53
54
+typedef struct AcpiGhesState {
55
+ uint64_t ghes_addr_le;
56
+} AcpiGhesState;
42
+
57
+
43
/* Thumb-2 EE state. */
58
void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
44
uint32_t teecr;
59
void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
45
uint32_t teehbr;
60
+void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
46
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
61
+ GArray *hardware_errors);
62
#endif
63
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
47
index XXXXXXX..XXXXXXX 100644
64
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/kvm_arm.h
65
--- a/hw/acpi/generic_event_device.c
49
+++ b/target/arm/kvm_arm.h
66
+++ b/hw/acpi/generic_event_device.c
50
@@ -XXX,XX +XXX,XX @@ bool write_kvmstate_to_list(ARMCPU *cpu);
67
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
51
*/
68
}
52
void kvm_arm_reset_vcpu(ARMCPU *cpu);
53
54
+/**
55
+ * kvm_arm_init_serror_injection:
56
+ * @cs: CPUState
57
+ *
58
+ * Check whether KVM can set guest SError syndrome.
59
+ */
60
+void kvm_arm_init_serror_injection(CPUState *cs);
61
+
62
+/**
63
+ * kvm_get_vcpu_events:
64
+ * @cpu: ARMCPU
65
+ *
66
+ * Get VCPU related state from kvm.
67
+ */
68
+int kvm_get_vcpu_events(ARMCPU *cpu);
69
+
70
+/**
71
+ * kvm_put_vcpu_events:
72
+ * @cpu: ARMCPU
73
+ *
74
+ * Put VCPU related state to kvm.
75
+ */
76
+int kvm_put_vcpu_events(ARMCPU *cpu);
77
+
78
#ifdef CONFIG_KVM
79
/**
80
* kvm_arm_create_scratch_host_vcpu:
81
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
82
index XXXXXXX..XXXXXXX 100644
83
--- a/target/arm/kvm.c
84
+++ b/target/arm/kvm.c
85
@@ -XXX,XX +XXX,XX @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
86
};
69
};
87
70
88
static bool cap_has_mp_state;
71
+static bool ghes_needed(void *opaque)
89
+static bool cap_has_inject_serror_esr;
90
91
static ARMHostCPUFeatures arm_host_cpu_features;
92
93
@@ -XXX,XX +XXX,XX @@ int kvm_arm_vcpu_init(CPUState *cs)
94
return kvm_vcpu_ioctl(cs, KVM_ARM_VCPU_INIT, &init);
95
}
96
97
+void kvm_arm_init_serror_injection(CPUState *cs)
98
+{
72
+{
99
+ cap_has_inject_serror_esr = kvm_check_extension(cs->kvm_state,
73
+ AcpiGedState *s = opaque;
100
+ KVM_CAP_ARM_INJECT_SERROR_ESR);
74
+ return s->ghes_state.ghes_addr_le;
101
+}
75
+}
102
+
76
+
103
bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
77
+static const VMStateDescription vmstate_ghes_state = {
104
int *fdarray,
78
+ .name = "acpi-ged/ghes",
105
struct kvm_vcpu_init *init)
106
@@ -XXX,XX +XXX,XX @@ int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu)
107
return 0;
108
}
109
110
+int kvm_put_vcpu_events(ARMCPU *cpu)
111
+{
112
+ CPUARMState *env = &cpu->env;
113
+ struct kvm_vcpu_events events;
114
+ int ret;
115
+
116
+ if (!kvm_has_vcpu_events()) {
117
+ return 0;
118
+ }
119
+
120
+ memset(&events, 0, sizeof(events));
121
+ events.exception.serror_pending = env->serror.pending;
122
+
123
+ /* Inject SError to guest with specified syndrome if host kernel
124
+ * supports it, otherwise inject SError without syndrome.
125
+ */
126
+ if (cap_has_inject_serror_esr) {
127
+ events.exception.serror_has_esr = env->serror.has_esr;
128
+ events.exception.serror_esr = env->serror.esr;
129
+ }
130
+
131
+ ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
132
+ if (ret) {
133
+ error_report("failed to put vcpu events");
134
+ }
135
+
136
+ return ret;
137
+}
138
+
139
+int kvm_get_vcpu_events(ARMCPU *cpu)
140
+{
141
+ CPUARMState *env = &cpu->env;
142
+ struct kvm_vcpu_events events;
143
+ int ret;
144
+
145
+ if (!kvm_has_vcpu_events()) {
146
+ return 0;
147
+ }
148
+
149
+ memset(&events, 0, sizeof(events));
150
+ ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_VCPU_EVENTS, &events);
151
+ if (ret) {
152
+ error_report("failed to get vcpu events");
153
+ return ret;
154
+ }
155
+
156
+ env->serror.pending = events.exception.serror_pending;
157
+ env->serror.has_esr = events.exception.serror_has_esr;
158
+ env->serror.esr = events.exception.serror_esr;
159
+
160
+ return 0;
161
+}
162
+
163
void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
164
{
165
}
166
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
167
index XXXXXXX..XXXXXXX 100644
168
--- a/target/arm/kvm32.c
169
+++ b/target/arm/kvm32.c
170
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init_vcpu(CPUState *cs)
171
}
172
cpu->mp_affinity = mpidr & ARM32_AFFINITY_MASK;
173
174
+ /* Check whether userspace can specify guest syndrome value */
175
+ kvm_arm_init_serror_injection(cs);
176
+
177
return kvm_arm_init_cpreg_list(cpu);
178
}
179
180
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
181
return ret;
182
}
183
184
+ ret = kvm_put_vcpu_events(cpu);
185
+ if (ret) {
186
+ return ret;
187
+ }
188
+
189
/* Note that we do not call write_cpustate_to_list()
190
* here, so we are only writing the tuple list back to
191
* KVM. This is safe because nothing can change the
192
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
193
}
194
vfp_set_fpscr(env, fpscr);
195
196
+ ret = kvm_get_vcpu_events(cpu);
197
+ if (ret) {
198
+ return ret;
199
+ }
200
+
201
if (!write_kvmstate_to_list(cpu)) {
202
return EINVAL;
203
}
204
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
205
index XXXXXXX..XXXXXXX 100644
206
--- a/target/arm/kvm64.c
207
+++ b/target/arm/kvm64.c
208
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init_vcpu(CPUState *cs)
209
210
kvm_arm_init_debug(cs);
211
212
+ /* Check whether user space can specify guest syndrome value */
213
+ kvm_arm_init_serror_injection(cs);
214
+
215
return kvm_arm_init_cpreg_list(cpu);
216
}
217
218
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
219
return ret;
220
}
221
222
+ ret = kvm_put_vcpu_events(cpu);
223
+ if (ret) {
224
+ return ret;
225
+ }
226
+
227
if (!write_list_to_kvmstate(cpu, level)) {
228
return EINVAL;
229
}
230
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
231
}
232
vfp_set_fpcr(env, fpr);
233
234
+ ret = kvm_get_vcpu_events(cpu);
235
+ if (ret) {
236
+ return ret;
237
+ }
238
+
239
if (!write_kvmstate_to_list(cpu)) {
240
return EINVAL;
241
}
242
diff --git a/target/arm/machine.c b/target/arm/machine.c
243
index XXXXXXX..XXXXXXX 100644
244
--- a/target/arm/machine.c
245
+++ b/target/arm/machine.c
246
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_sve = {
247
};
248
#endif /* AARCH64 */
249
250
+static bool serror_needed(void *opaque)
251
+{
252
+ ARMCPU *cpu = opaque;
253
+ CPUARMState *env = &cpu->env;
254
+
255
+ return env->serror.pending != 0;
256
+}
257
+
258
+static const VMStateDescription vmstate_serror = {
259
+ .name = "cpu/serror",
260
+ .version_id = 1,
79
+ .version_id = 1,
261
+ .minimum_version_id = 1,
80
+ .minimum_version_id = 1,
262
+ .needed = serror_needed,
81
+ .needed = ghes_needed,
263
+ .fields = (VMStateField[]) {
82
+ .fields = (VMStateField[]) {
264
+ VMSTATE_UINT8(env.serror.pending, ARMCPU),
83
+ VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
265
+ VMSTATE_UINT8(env.serror.has_esr, ARMCPU),
84
+ vmstate_ghes_state, AcpiGhesState),
266
+ VMSTATE_UINT64(env.serror.esr, ARMCPU),
267
+ VMSTATE_END_OF_LIST()
85
+ VMSTATE_END_OF_LIST()
268
+ }
86
+ }
269
+};
87
+};
270
+
88
+
271
static bool m_needed(void *opaque)
89
static const VMStateDescription vmstate_acpi_ged = {
272
{
90
.name = "acpi-ged",
273
ARMCPU *cpu = opaque;
91
.version_id = 1,
274
@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_arm_cpu = {
92
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
275
#ifdef TARGET_AARCH64
93
},
276
&vmstate_sve,
94
.subsections = (const VMStateDescription * []) {
277
#endif
95
&vmstate_memhp_state,
278
+ &vmstate_serror,
96
+ &vmstate_ghes_state,
279
NULL
97
NULL
280
}
98
}
281
};
99
};
100
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/hw/acpi/ghes.c
103
+++ b/hw/acpi/ghes.c
104
@@ -XXX,XX +XXX,XX @@
105
#include "hw/acpi/ghes.h"
106
#include "hw/acpi/aml-build.h"
107
#include "qemu/error-report.h"
108
+#include "hw/acpi/generic_event_device.h"
109
+#include "hw/nvram/fw_cfg.h"
110
111
#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
112
#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
113
@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
114
build_header(linker, table_data, (void *)(table_data->data + hest_start),
115
"HEST", table_data->len - hest_start, 1, NULL, NULL);
116
}
117
+
118
+void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
119
+ GArray *hardware_error)
120
+{
121
+ /* Create a read-only fw_cfg file for GHES */
122
+ fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
123
+ hardware_error->len);
124
+
125
+ /* Create a read-write fw_cfg file for Address */
126
+ fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
127
+ NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
128
+}
129
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
130
index XXXXXXX..XXXXXXX 100644
131
--- a/hw/arm/virt-acpi-build.c
132
+++ b/hw/arm/virt-acpi-build.c
133
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
134
{
135
AcpiBuildTables tables;
136
AcpiBuildState *build_state;
137
+ AcpiGedState *acpi_ged_state;
138
139
if (!vms->fw_cfg) {
140
trace_virt_acpi_setup();
141
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
142
fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
143
acpi_data_len(tables.tcpalog));
144
145
+ if (vms->ras) {
146
+ assert(vms->acpi_dev);
147
+ acpi_ged_state = ACPI_GED(vms->acpi_dev);
148
+ acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
149
+ vms->fw_cfg, tables.hardware_errors);
150
+ }
151
+
152
build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
153
build_state, tables.rsdp,
154
ACPI_BUILD_RSDP_FILE, 0);
282
--
155
--
283
2.19.1
156
2.20.1
284
157
285
158
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
Instead of shifts and masks, use direct loads and stores from the neon
3
kvm_hwpoison_page_add() and kvm_unpoison_all() will both
4
register file. Mirror the iteration structure of the ARM pseudocode
4
be used by X86 and ARM platforms, so move them into
5
more closely. Correct the parameters of the VLD2 A2 insn.
5
"accel/kvm/kvm-all.c" to avoid duplicate code.
6
6
7
Note that this includes a bugfix for handling of the insn
7
For architectures that don't use the poison-list functionality
8
"VLD2 (multiple 2-element structures)" -- we were using an
8
the reset handler will harmlessly do nothing, so let's register
9
incorrect stride value.
9
the kvm_unpoison_all() function in the generic kvm_init() function.
10
10
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20181011205206.3552-19-richard.henderson@linaro.org
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
13
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
14
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
15
Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
17
---
16
target/arm/translate.c | 170 ++++++++++++++++++-----------------------
18
include/sysemu/kvm_int.h | 12 ++++++++++++
17
1 file changed, 74 insertions(+), 96 deletions(-)
19
accel/kvm/kvm-all.c | 36 ++++++++++++++++++++++++++++++++++++
20
target/i386/kvm.c | 36 ------------------------------------
21
3 files changed, 48 insertions(+), 36 deletions(-)
18
22
19
diff --git a/target/arm/translate.c b/target/arm/translate.c
23
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
20
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/translate.c
25
--- a/include/sysemu/kvm_int.h
22
+++ b/target/arm/translate.c
26
+++ b/include/sysemu/kvm_int.h
23
@@ -XXX,XX +XXX,XX @@ static TCGv_i32 neon_load_reg(int reg, int pass)
27
@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
24
return tmp;
28
AddressSpace *as, int as_id);
29
30
void kvm_set_max_memslot_size(hwaddr max_slot_size);
31
+
32
+/**
33
+ * kvm_hwpoison_page_add:
34
+ *
35
+ * Parameters:
36
+ * @ram_addr: the address in the RAM for the poisoned page
37
+ *
38
+ * Add a poisoned page to the list
39
+ *
40
+ * Return: None.
41
+ */
42
+void kvm_hwpoison_page_add(ram_addr_t ram_addr);
43
#endif
44
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
45
index XXXXXXX..XXXXXXX 100644
46
--- a/accel/kvm/kvm-all.c
47
+++ b/accel/kvm/kvm-all.c
48
@@ -XXX,XX +XXX,XX @@
49
#include "qapi/visitor.h"
50
#include "qapi/qapi-types-common.h"
51
#include "qapi/qapi-visit-common.h"
52
+#include "sysemu/reset.h"
53
54
#include "hw/boards.h"
55
56
@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
57
return ret;
25
}
58
}
26
59
27
+static void neon_load_element64(TCGv_i64 var, int reg, int ele, TCGMemOp mop)
60
+typedef struct HWPoisonPage {
61
+ ram_addr_t ram_addr;
62
+ QLIST_ENTRY(HWPoisonPage) list;
63
+} HWPoisonPage;
64
+
65
+static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
66
+ QLIST_HEAD_INITIALIZER(hwpoison_page_list);
67
+
68
+static void kvm_unpoison_all(void *param)
28
+{
69
+{
29
+ long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
70
+ HWPoisonPage *page, *next_page;
30
+
71
+
31
+ switch (mop) {
72
+ QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
32
+ case MO_UB:
73
+ QLIST_REMOVE(page, list);
33
+ tcg_gen_ld8u_i64(var, cpu_env, offset);
74
+ qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
34
+ break;
75
+ g_free(page);
35
+ case MO_UW:
36
+ tcg_gen_ld16u_i64(var, cpu_env, offset);
37
+ break;
38
+ case MO_UL:
39
+ tcg_gen_ld32u_i64(var, cpu_env, offset);
40
+ break;
41
+ case MO_Q:
42
+ tcg_gen_ld_i64(var, cpu_env, offset);
43
+ break;
44
+ default:
45
+ g_assert_not_reached();
46
+ }
76
+ }
47
+}
77
+}
48
+
78
+
49
static void neon_store_reg(int reg, int pass, TCGv_i32 var)
79
+void kvm_hwpoison_page_add(ram_addr_t ram_addr)
50
{
51
tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
52
tcg_temp_free_i32(var);
53
}
54
55
+static void neon_store_element64(int reg, int ele, TCGMemOp size, TCGv_i64 var)
56
+{
80
+{
57
+ long offset = neon_element_offset(reg, ele, size);
81
+ HWPoisonPage *page;
58
+
82
+
59
+ switch (size) {
83
+ QLIST_FOREACH(page, &hwpoison_page_list, list) {
60
+ case MO_8:
84
+ if (page->ram_addr == ram_addr) {
61
+ tcg_gen_st8_i64(var, cpu_env, offset);
85
+ return;
62
+ break;
86
+ }
63
+ case MO_16:
64
+ tcg_gen_st16_i64(var, cpu_env, offset);
65
+ break;
66
+ case MO_32:
67
+ tcg_gen_st32_i64(var, cpu_env, offset);
68
+ break;
69
+ case MO_64:
70
+ tcg_gen_st_i64(var, cpu_env, offset);
71
+ break;
72
+ default:
73
+ g_assert_not_reached();
74
+ }
87
+ }
88
+ page = g_new(HWPoisonPage, 1);
89
+ page->ram_addr = ram_addr;
90
+ QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
75
+}
91
+}
76
+
92
+
77
static inline void neon_load_reg64(TCGv_i64 var, int reg)
93
static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
78
{
94
{
79
tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
95
#if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
80
@@ -XXX,XX +XXX,XX @@ static struct {
96
@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
81
int interleave;
97
s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
82
int spacing;
98
}
83
} const neon_ls_element_type[11] = {
99
84
- {4, 4, 1},
100
+ qemu_register_reset(kvm_unpoison_all, NULL);
85
- {4, 4, 2},
86
+ {1, 4, 1},
87
+ {1, 4, 2},
88
{4, 1, 1},
89
- {4, 2, 1},
90
- {3, 3, 1},
91
- {3, 3, 2},
92
+ {2, 2, 2},
93
+ {1, 3, 1},
94
+ {1, 3, 2},
95
{3, 1, 1},
96
{1, 1, 1},
97
- {2, 2, 1},
98
- {2, 2, 2},
99
+ {1, 2, 1},
100
+ {1, 2, 2},
101
{2, 1, 1}
102
};
103
104
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
105
int shift;
106
int n;
107
int vec_size;
108
+ int mmu_idx;
109
+ TCGMemOp endian;
110
TCGv_i32 addr;
111
TCGv_i32 tmp;
112
TCGv_i32 tmp2;
113
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
114
rn = (insn >> 16) & 0xf;
115
rm = insn & 0xf;
116
load = (insn & (1 << 21)) != 0;
117
+ endian = s->be_data;
118
+ mmu_idx = get_mem_index(s);
119
if ((insn & (1 << 23)) == 0) {
120
/* Load store all elements. */
121
op = (insn >> 8) & 0xf;
122
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
123
nregs = neon_ls_element_type[op].nregs;
124
interleave = neon_ls_element_type[op].interleave;
125
spacing = neon_ls_element_type[op].spacing;
126
- if (size == 3 && (interleave | spacing) != 1)
127
+ if (size == 3 && (interleave | spacing) != 1) {
128
return 1;
129
+ }
130
+ tmp64 = tcg_temp_new_i64();
131
addr = tcg_temp_new_i32();
132
+ tmp2 = tcg_const_i32(1 << size);
133
load_reg_var(s, addr, rn);
134
- stride = (1 << size) * interleave;
135
for (reg = 0; reg < nregs; reg++) {
136
- if (interleave > 2 || (interleave == 2 && nregs == 2)) {
137
- load_reg_var(s, addr, rn);
138
- tcg_gen_addi_i32(addr, addr, (1 << size) * reg);
139
- } else if (interleave == 2 && nregs == 4 && reg == 2) {
140
- load_reg_var(s, addr, rn);
141
- tcg_gen_addi_i32(addr, addr, 1 << size);
142
- }
143
- if (size == 3) {
144
- tmp64 = tcg_temp_new_i64();
145
- if (load) {
146
- gen_aa32_ld64(s, tmp64, addr, get_mem_index(s));
147
- neon_store_reg64(tmp64, rd);
148
- } else {
149
- neon_load_reg64(tmp64, rd);
150
- gen_aa32_st64(s, tmp64, addr, get_mem_index(s));
151
- }
152
- tcg_temp_free_i64(tmp64);
153
- tcg_gen_addi_i32(addr, addr, stride);
154
- } else {
155
- for (pass = 0; pass < 2; pass++) {
156
- if (size == 2) {
157
- if (load) {
158
- tmp = tcg_temp_new_i32();
159
- gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
160
- neon_store_reg(rd, pass, tmp);
161
- } else {
162
- tmp = neon_load_reg(rd, pass);
163
- gen_aa32_st32(s, tmp, addr, get_mem_index(s));
164
- tcg_temp_free_i32(tmp);
165
- }
166
- tcg_gen_addi_i32(addr, addr, stride);
167
- } else if (size == 1) {
168
- if (load) {
169
- tmp = tcg_temp_new_i32();
170
- gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
171
- tcg_gen_addi_i32(addr, addr, stride);
172
- tmp2 = tcg_temp_new_i32();
173
- gen_aa32_ld16u(s, tmp2, addr, get_mem_index(s));
174
- tcg_gen_addi_i32(addr, addr, stride);
175
- tcg_gen_shli_i32(tmp2, tmp2, 16);
176
- tcg_gen_or_i32(tmp, tmp, tmp2);
177
- tcg_temp_free_i32(tmp2);
178
- neon_store_reg(rd, pass, tmp);
179
- } else {
180
- tmp = neon_load_reg(rd, pass);
181
- tmp2 = tcg_temp_new_i32();
182
- tcg_gen_shri_i32(tmp2, tmp, 16);
183
- gen_aa32_st16(s, tmp, addr, get_mem_index(s));
184
- tcg_temp_free_i32(tmp);
185
- tcg_gen_addi_i32(addr, addr, stride);
186
- gen_aa32_st16(s, tmp2, addr, get_mem_index(s));
187
- tcg_temp_free_i32(tmp2);
188
- tcg_gen_addi_i32(addr, addr, stride);
189
- }
190
- } else /* size == 0 */ {
191
- if (load) {
192
- tmp2 = NULL;
193
- for (n = 0; n < 4; n++) {
194
- tmp = tcg_temp_new_i32();
195
- gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
196
- tcg_gen_addi_i32(addr, addr, stride);
197
- if (n == 0) {
198
- tmp2 = tmp;
199
- } else {
200
- tcg_gen_shli_i32(tmp, tmp, n * 8);
201
- tcg_gen_or_i32(tmp2, tmp2, tmp);
202
- tcg_temp_free_i32(tmp);
203
- }
204
- }
205
- neon_store_reg(rd, pass, tmp2);
206
- } else {
207
- tmp2 = neon_load_reg(rd, pass);
208
- for (n = 0; n < 4; n++) {
209
- tmp = tcg_temp_new_i32();
210
- if (n == 0) {
211
- tcg_gen_mov_i32(tmp, tmp2);
212
- } else {
213
- tcg_gen_shri_i32(tmp, tmp2, n * 8);
214
- }
215
- gen_aa32_st8(s, tmp, addr, get_mem_index(s));
216
- tcg_temp_free_i32(tmp);
217
- tcg_gen_addi_i32(addr, addr, stride);
218
- }
219
- tcg_temp_free_i32(tmp2);
220
- }
221
+ for (n = 0; n < 8 >> size; n++) {
222
+ int xs;
223
+ for (xs = 0; xs < interleave; xs++) {
224
+ int tt = rd + reg + spacing * xs;
225
+
101
+
226
+ if (load) {
102
if (s->kernel_irqchip_allowed) {
227
+ gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
103
kvm_irqchip_create(s);
228
+ neon_store_element64(tt, n, size, tmp64);
104
}
229
+ } else {
105
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
230
+ neon_load_element64(tmp64, tt, n, size);
106
index XXXXXXX..XXXXXXX 100644
231
+ gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
107
--- a/target/i386/kvm.c
232
}
108
+++ b/target/i386/kvm.c
233
+ tcg_gen_add_i32(addr, addr, tmp2);
109
@@ -XXX,XX +XXX,XX @@
234
}
110
#include "sysemu/sysemu.h"
235
}
111
#include "sysemu/hw_accel.h"
236
- rd += spacing;
112
#include "sysemu/kvm_int.h"
237
}
113
-#include "sysemu/reset.h"
238
tcg_temp_free_i32(addr);
114
#include "sysemu/runstate.h"
239
- stride = nregs * 8;
115
#include "kvm_i386.h"
240
+ tcg_temp_free_i32(tmp2);
116
#include "hyperv.h"
241
+ tcg_temp_free_i64(tmp64);
117
@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
242
+ stride = nregs * interleave * 8;
118
}
243
} else {
119
}
244
size = (insn >> 10) & 3;
120
245
if (size == 3) {
121
-
122
-typedef struct HWPoisonPage {
123
- ram_addr_t ram_addr;
124
- QLIST_ENTRY(HWPoisonPage) list;
125
-} HWPoisonPage;
126
-
127
-static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
128
- QLIST_HEAD_INITIALIZER(hwpoison_page_list);
129
-
130
-static void kvm_unpoison_all(void *param)
131
-{
132
- HWPoisonPage *page, *next_page;
133
-
134
- QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
135
- QLIST_REMOVE(page, list);
136
- qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
137
- g_free(page);
138
- }
139
-}
140
-
141
-static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
142
-{
143
- HWPoisonPage *page;
144
-
145
- QLIST_FOREACH(page, &hwpoison_page_list, list) {
146
- if (page->ram_addr == ram_addr) {
147
- return;
148
- }
149
- }
150
- page = g_new(HWPoisonPage, 1);
151
- page->ram_addr = ram_addr;
152
- QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
153
-}
154
-
155
static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
156
int *max_banks)
157
{
158
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
159
fprintf(stderr, "e820_add_entry() table is full\n");
160
return ret;
161
}
162
- qemu_register_reset(kvm_unpoison_all, NULL);
163
164
shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
165
if (shadow_mem != -1) {
246
--
166
--
247
2.19.1
167
2.20.1
248
168
249
169
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
3
kvm_arch_on_sigbus_vcpu() error injection uses source_id as
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
index in etc/hardware_errors to find out Error Status Data
5
Message-id: 20181016223115.24100-8-richard.henderson@linaro.org
5
Block entry corresponding to error source. So supported source_id
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
values should be assigned here and not be changed afterwards to
7
make sure that guest will write error into expected Error Status
8
Data Block.
9
10
Before QEMU writes a new error to ACPI table, it will check whether
11
previous error has been acknowledged. If not acknowledged, the new
12
errors will be ignored and not recorded. For the error section
13
type, QEMU simulates a memory section error.
14
15
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
16
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
17
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
18
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
19
Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
21
---
9
target/arm/cpu.h | 16 +++++++++++++++-
22
include/hw/acpi/ghes.h | 1 +
10
linux-user/aarch64/signal.c | 4 ++--
23
hw/acpi/ghes.c | 219 +++++++++++++++++++++++++++++++++++++++++
11
linux-user/elfload.c | 2 +-
24
2 files changed, 220 insertions(+)
12
linux-user/syscall.c | 10 ++++++----
25
13
target/arm/cpu64.c | 5 ++++-
26
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
14
target/arm/helper.c | 9 ++++++---
15
target/arm/machine.c | 3 +--
16
target/arm/translate-a64.c | 4 ++--
17
8 files changed, 37 insertions(+), 16 deletions(-)
18
19
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
20
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/cpu.h
28
--- a/include/hw/acpi/ghes.h
22
+++ b/target/arm/cpu.h
29
+++ b/include/hw/acpi/ghes.h
23
@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64ISAR1, FRINTTS, 32, 4)
30
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
24
FIELD(ID_AA64ISAR1, SB, 36, 4)
31
void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
25
FIELD(ID_AA64ISAR1, SPECRES, 40, 4)
32
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
26
33
GArray *hardware_errors);
27
+FIELD(ID_AA64PFR0, EL0, 0, 4)
34
+int acpi_ghes_record_errors(uint8_t source_id, uint64_t error_physical_addr);
28
+FIELD(ID_AA64PFR0, EL1, 4, 4)
35
#endif
29
+FIELD(ID_AA64PFR0, EL2, 8, 4)
36
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
30
+FIELD(ID_AA64PFR0, EL3, 12, 4)
37
index XXXXXXX..XXXXXXX 100644
31
+FIELD(ID_AA64PFR0, FP, 16, 4)
38
--- a/hw/acpi/ghes.c
32
+FIELD(ID_AA64PFR0, ADVSIMD, 20, 4)
39
+++ b/hw/acpi/ghes.c
33
+FIELD(ID_AA64PFR0, GIC, 24, 4)
40
@@ -XXX,XX +XXX,XX @@
34
+FIELD(ID_AA64PFR0, RAS, 28, 4)
41
#include "qemu/error-report.h"
35
+FIELD(ID_AA64PFR0, SVE, 32, 4)
42
#include "hw/acpi/generic_event_device.h"
36
+
43
#include "hw/nvram/fw_cfg.h"
37
QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
44
+#include "qemu/uuid.h"
38
45
39
/* If adding a feature bit which corresponds to a Linux ELF
46
#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
40
@@ -XXX,XX +XXX,XX @@ enum arm_features {
47
#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
41
ARM_FEATURE_PMU, /* has PMU support */
48
@@ -XXX,XX +XXX,XX @@
42
ARM_FEATURE_VBAR, /* has cp15 VBAR */
49
/* Address offset in Generic Address Structure(GAS) */
43
ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
50
#define GAS_ADDR_OFFSET 4
44
- ARM_FEATURE_SVE, /* has Scalable Vector Extension */
51
45
ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
52
+/*
46
ARM_FEATURE_M_MAIN, /* M profile Main Extension */
53
+ * The total size of Generic Error Data Entry
47
};
54
+ * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
48
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
55
+ * Table 18-343 Generic Error Data Entry
49
return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
56
+ */
57
+#define ACPI_GHES_DATA_LENGTH 72
58
+
59
+/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
60
+#define ACPI_GHES_MEM_CPER_LENGTH 80
61
+
62
+/* Masks for block_status flags */
63
+#define ACPI_GEBS_UNCORRECTABLE 1
64
+
65
+/*
66
+ * Total size for Generic Error Status Block except Generic Error Data Entries
67
+ * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
68
+ * Table 18-380 Generic Error Status Block
69
+ */
70
+#define ACPI_GHES_GESB_SIZE 20
71
+
72
+/*
73
+ * Values for error_severity field
74
+ */
75
+enum AcpiGenericErrorSeverity {
76
+ ACPI_CPER_SEV_RECOVERABLE = 0,
77
+ ACPI_CPER_SEV_FATAL = 1,
78
+ ACPI_CPER_SEV_CORRECTED = 2,
79
+ ACPI_CPER_SEV_NONE = 3,
80
+};
81
+
82
/*
83
* Hardware Error Notification
84
* ACPI 4.0: 17.3.2.7 Hardware Error Notification
85
@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
86
build_append_int_noprefix(table, 0, 4);
50
}
87
}
51
88
52
+static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
89
+/*
53
+{
90
+ * Generic Error Data Entry
54
+ return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
91
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
92
+ */
93
+static void acpi_ghes_generic_error_data(GArray *table,
94
+ const uint8_t *section_type, uint32_t error_severity,
95
+ uint8_t validation_bits, uint8_t flags,
96
+ uint32_t error_data_length, QemuUUID fru_id,
97
+ uint64_t time_stamp)
98
+{
99
+ const uint8_t fru_text[20] = {0};
100
+
101
+ /* Section Type */
102
+ g_array_append_vals(table, section_type, 16);
103
+
104
+ /* Error Severity */
105
+ build_append_int_noprefix(table, error_severity, 4);
106
+ /* Revision */
107
+ build_append_int_noprefix(table, 0x300, 2);
108
+ /* Validation Bits */
109
+ build_append_int_noprefix(table, validation_bits, 1);
110
+ /* Flags */
111
+ build_append_int_noprefix(table, flags, 1);
112
+ /* Error Data Length */
113
+ build_append_int_noprefix(table, error_data_length, 4);
114
+
115
+ /* FRU Id */
116
+ g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
117
+
118
+ /* FRU Text */
119
+ g_array_append_vals(table, fru_text, sizeof(fru_text));
120
+
121
+ /* Timestamp */
122
+ build_append_int_noprefix(table, time_stamp, 8);
123
+}
124
+
125
+/*
126
+ * Generic Error Status Block
127
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
128
+ */
129
+static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
130
+ uint32_t raw_data_offset, uint32_t raw_data_length,
131
+ uint32_t data_length, uint32_t error_severity)
132
+{
133
+ /* Block Status */
134
+ build_append_int_noprefix(table, block_status, 4);
135
+ /* Raw Data Offset */
136
+ build_append_int_noprefix(table, raw_data_offset, 4);
137
+ /* Raw Data Length */
138
+ build_append_int_noprefix(table, raw_data_length, 4);
139
+ /* Data Length */
140
+ build_append_int_noprefix(table, data_length, 4);
141
+ /* Error Severity */
142
+ build_append_int_noprefix(table, error_severity, 4);
143
+}
144
+
145
+/* UEFI 2.6: N.2.5 Memory Error Section */
146
+static void acpi_ghes_build_append_mem_cper(GArray *table,
147
+ uint64_t error_physical_addr)
148
+{
149
+ /*
150
+ * Memory Error Record
151
+ */
152
+
153
+ /* Validation Bits */
154
+ build_append_int_noprefix(table,
155
+ (1ULL << 14) | /* Type Valid */
156
+ (1ULL << 1) /* Physical Address Valid */,
157
+ 8);
158
+ /* Error Status */
159
+ build_append_int_noprefix(table, 0, 8);
160
+ /* Physical Address */
161
+ build_append_int_noprefix(table, error_physical_addr, 8);
162
+ /* Skip all the detailed information normally found in such a record */
163
+ build_append_int_noprefix(table, 0, 48);
164
+ /* Memory Error Type */
165
+ build_append_int_noprefix(table, 0 /* Unknown error */, 1);
166
+ /* Skip all the detailed information normally found in such a record */
167
+ build_append_int_noprefix(table, 0, 7);
168
+}
169
+
170
+static int acpi_ghes_record_mem_error(uint64_t error_block_address,
171
+ uint64_t error_physical_addr)
172
+{
173
+ GArray *block;
174
+
175
+ /* Memory Error Section Type */
176
+ const uint8_t uefi_cper_mem_sec[] =
177
+ UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
178
+ 0xED, 0x7C, 0x83, 0xB1);
179
+
180
+ /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
181
+ * Table 17-13 Generic Error Data Entry
182
+ */
183
+ QemuUUID fru_id = {};
184
+ uint32_t data_length;
185
+
186
+ block = g_array_new(false, true /* clear */, 1);
187
+
188
+ /* This is the length if adding a new generic error data entry */
189
+ data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
190
+
191
+ /*
192
+ * Check whether it will run out of the preallocated memory if adding a new
193
+ * generic error data entry
194
+ */
195
+ if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
196
+ error_report("Not enough memory to record new CPER!!!");
197
+ g_array_free(block, true);
198
+ return -1;
199
+ }
200
+
201
+ /* Build the new generic error status block header */
202
+ acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
203
+ 0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
204
+
205
+ /* Build this new generic error data entry header */
206
+ acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
207
+ ACPI_CPER_SEV_RECOVERABLE, 0, 0,
208
+ ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
209
+
210
+ /* Build the memory section CPER for above new generic error data entry */
211
+ acpi_ghes_build_append_mem_cper(block, error_physical_addr);
212
+
213
+ /* Write the generic error data entry into guest memory */
214
+ cpu_physical_memory_write(error_block_address, block->data, block->len);
215
+
216
+ g_array_free(block, true);
217
+
218
+ return 0;
55
+}
219
+}
56
+
220
+
57
/*
221
/*
58
* Forward to the above feature tests given an ARMCPU pointer.
222
* Build table for the hardware error fw_cfg blob.
59
*/
223
* Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
60
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
224
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
61
index XXXXXXX..XXXXXXX 100644
225
fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
62
--- a/linux-user/aarch64/signal.c
226
NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
63
+++ b/linux-user/aarch64/signal.c
64
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
65
break;
66
67
case TARGET_SVE_MAGIC:
68
- if (arm_feature(env, ARM_FEATURE_SVE)) {
69
+ if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(env))) {
70
vq = (env->vfp.zcr_el[1] & 0xf) + 1;
71
sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
72
if (!sve && size == sve_size) {
73
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
74
&layout);
75
76
/* SVE state needs saving only if it exists. */
77
- if (arm_feature(env, ARM_FEATURE_SVE)) {
78
+ if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(env))) {
79
vq = (env->vfp.zcr_el[1] & 0xf) + 1;
80
sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
81
sve_ofs = alloc_sigframe_space(sve_size, &layout);
82
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
83
index XXXXXXX..XXXXXXX 100644
84
--- a/linux-user/elfload.c
85
+++ b/linux-user/elfload.c
86
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
87
GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
88
GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
89
GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
90
- GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
91
+ GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
92
93
#undef GET_FEATURE
94
#undef GET_FEATURE_ID
95
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
96
index XXXXXXX..XXXXXXX 100644
97
--- a/linux-user/syscall.c
98
+++ b/linux-user/syscall.c
99
@@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
100
* even though the current architectural maximum is VQ=16.
101
*/
102
ret = -TARGET_EINVAL;
103
- if (arm_feature(cpu_env, ARM_FEATURE_SVE)
104
+ if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(cpu_env))
105
&& arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
106
CPUARMState *env = cpu_env;
107
ARMCPU *cpu = arm_env_get_cpu(env);
108
@@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
109
return ret;
110
case TARGET_PR_SVE_GET_VL:
111
ret = -TARGET_EINVAL;
112
- if (arm_feature(cpu_env, ARM_FEATURE_SVE)) {
113
- CPUARMState *env = cpu_env;
114
- ret = ((env->vfp.zcr_el[1] & 0xf) + 1) * 16;
115
+ {
116
+ ARMCPU *cpu = arm_env_get_cpu(cpu_env);
117
+ if (cpu_isar_feature(aa64_sve, cpu)) {
118
+ ret = ((cpu->env.vfp.zcr_el[1] & 0xf) + 1) * 16;
119
+ }
120
}
121
return ret;
122
#endif /* AARCH64 */
123
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
124
index XXXXXXX..XXXXXXX 100644
125
--- a/target/arm/cpu64.c
126
+++ b/target/arm/cpu64.c
127
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
128
t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
129
cpu->isar.id_aa64isar1 = t;
130
131
+ t = cpu->isar.id_aa64pfr0;
132
+ t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
133
+ cpu->isar.id_aa64pfr0 = t;
134
+
135
/* Replicate the same data to the 32-bit id registers. */
136
u = cpu->isar.id_isar5;
137
u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
138
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
139
* present in either.
140
*/
141
set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
142
- set_feature(&cpu->env, ARM_FEATURE_SVE);
143
/* For usermode -cpu max we can use a larger and more efficient DCZ
144
* blocksize since we don't have to follow what the hardware does.
145
*/
146
diff --git a/target/arm/helper.c b/target/arm/helper.c
147
index XXXXXXX..XXXXXXX 100644
148
--- a/target/arm/helper.c
149
+++ b/target/arm/helper.c
150
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
151
define_one_arm_cp_reg(cpu, &sctlr);
152
}
153
154
- if (arm_feature(env, ARM_FEATURE_SVE)) {
155
+ if (cpu_isar_feature(aa64_sve, cpu)) {
156
define_one_arm_cp_reg(cpu, &zcr_el1_reginfo);
157
if (arm_feature(env, ARM_FEATURE_EL2)) {
158
define_one_arm_cp_reg(cpu, &zcr_el2_reginfo);
159
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
160
uint32_t flags;
161
162
if (is_a64(env)) {
163
+ ARMCPU *cpu = arm_env_get_cpu(env);
164
+
165
*pc = env->pc;
166
flags = ARM_TBFLAG_AARCH64_STATE_MASK;
167
/* Get control bits for tagged addresses */
168
flags |= (arm_regime_tbi0(env, mmu_idx) << ARM_TBFLAG_TBI0_SHIFT);
169
flags |= (arm_regime_tbi1(env, mmu_idx) << ARM_TBFLAG_TBI1_SHIFT);
170
171
- if (arm_feature(env, ARM_FEATURE_SVE)) {
172
+ if (cpu_isar_feature(aa64_sve, cpu)) {
173
int sve_el = sve_exception_el(env, current_el);
174
uint32_t zcr_len;
175
176
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
177
void aarch64_sve_change_el(CPUARMState *env, int old_el,
178
int new_el, bool el0_a64)
179
{
180
+ ARMCPU *cpu = arm_env_get_cpu(env);
181
int old_len, new_len;
182
bool old_a64, new_a64;
183
184
/* Nothing to do if no SVE. */
185
- if (!arm_feature(env, ARM_FEATURE_SVE)) {
186
+ if (!cpu_isar_feature(aa64_sve, cpu)) {
187
return;
188
}
189
190
diff --git a/target/arm/machine.c b/target/arm/machine.c
191
index XXXXXXX..XXXXXXX 100644
192
--- a/target/arm/machine.c
193
+++ b/target/arm/machine.c
194
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_iwmmxt = {
195
static bool sve_needed(void *opaque)
196
{
197
ARMCPU *cpu = opaque;
198
- CPUARMState *env = &cpu->env;
199
200
- return arm_feature(env, ARM_FEATURE_SVE);
201
+ return cpu_isar_feature(aa64_sve, cpu);
202
}
227
}
203
228
+
204
/* The first two words of each Zreg is stored in VFP state. */
229
+int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
205
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
230
+{
206
index XXXXXXX..XXXXXXX 100644
231
+ uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
207
--- a/target/arm/translate-a64.c
232
+ uint64_t start_addr;
208
+++ b/target/arm/translate-a64.c
233
+ int ret = -1;
209
@@ -XXX,XX +XXX,XX @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
234
+ AcpiGedState *acpi_ged_state;
210
cpu_fprintf(f, " FPCR=%08x FPSR=%08x\n",
235
+ AcpiGhesState *ags;
211
vfp_get_fpcr(env), vfp_get_fpsr(env));
236
+
212
237
+ assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
213
- if (arm_feature(env, ARM_FEATURE_SVE) && sve_exception_el(env, el) == 0) {
238
+
214
+ if (cpu_isar_feature(aa64_sve, cpu) && sve_exception_el(env, el) == 0) {
239
+ acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
215
int j, zcr_len = sve_zcr_len_for_el(env, el);
240
+ NULL));
216
241
+ g_assert(acpi_ged_state);
217
for (i = 0; i <= FFR_PRED_NUM; i++) {
242
+ ags = &acpi_ged_state->ghes_state;
218
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
243
+
219
unallocated_encoding(s);
244
+ start_addr = le64_to_cpu(ags->ghes_addr_le);
220
break;
245
+
221
case 0x2:
246
+ if (physical_address) {
222
- if (!arm_dc_feature(s, ARM_FEATURE_SVE) || !disas_sve(s, insn)) {
247
+
223
+ if (!dc_isar_feature(aa64_sve, s) || !disas_sve(s, insn)) {
248
+ if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
224
unallocated_encoding(s);
249
+ start_addr += source_id * sizeof(uint64_t);
225
}
250
+ }
226
break;
251
+
252
+ cpu_physical_memory_read(start_addr, &error_block_addr,
253
+ sizeof(error_block_addr));
254
+
255
+ error_block_addr = le64_to_cpu(error_block_addr);
256
+
257
+ read_ack_register_addr = start_addr +
258
+ ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
259
+
260
+ cpu_physical_memory_read(read_ack_register_addr,
261
+ &read_ack_register, sizeof(read_ack_register));
262
+
263
+ /* zero means OSPM does not acknowledge the error */
264
+ if (!read_ack_register) {
265
+ error_report("OSPM does not acknowledge previous error,"
266
+ " so can not record CPER for current error anymore");
267
+ } else if (error_block_addr) {
268
+ read_ack_register = cpu_to_le64(0);
269
+ /*
270
+ * Clear the Read Ack Register, OSPM will write it to 1 when
271
+ * it acknowledges this error.
272
+ */
273
+ cpu_physical_memory_write(read_ack_register_addr,
274
+ &read_ack_register, sizeof(uint64_t));
275
+
276
+ ret = acpi_ghes_record_mem_error(error_block_addr,
277
+ physical_address);
278
+ } else
279
+ error_report("can not find Generic Error Status Block");
280
+ }
281
+
282
+ return ret;
283
+}
227
--
284
--
228
2.19.1
285
2.20.1
229
286
230
287
1
Create and use a utility function to extract the EC field
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
from a syndrome, rather than open-coding the shift.
2
3
3
Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
4
translates the host VA delivered by host to guest PA, then fills this PA
5
to guest APEI GHES memory, then notifies guest according to the SIGBUS
6
type.
7
8
When guest accesses the poisoned memory, it will generate a Synchronous
9
External Abort(SEA). Then host kernel gets an APEI notification and calls
10
memory_failure() to unmap the affected page in stage 2, finally
11
returns to guest.
12
13
If the guest continues to access the PG_hwpoison page, it will trap to KVM as a
14
stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
15
QEMU. QEMU then records this error address into guest APEI GHES memory and
16
notifies the guest using Synchronous-External-Abort(SEA).
17
18
In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
19
in which we can setup the type of exception and the syndrome information.
20
When switching to guest, the target vcpu will jump to the synchronous
21
external abort vector table entry.
22
23
The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
24
ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
25
not valid and holds an UNKNOWN value. These values will be set to KVM
26
register structures through KVM_SET_ONE_REG IOCTL.
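To make the syndrome value concrete, here is a small standalone sketch of how the ESR for the injected abort is put together. It is not the QEMU helper itself: the SKETCH_* names and both functions are hypothetical, and the bit positions (EC in bits [31:26], IL at bit 25, FnV at bit 10) are taken from the architecture as I understand it, with DFSC 0x10 as stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values follow the ESR_ELx layout assumed above. */
#define SKETCH_EC_SHIFT     26                  /* EC is bits [31:26] */
#define SKETCH_IL           (UINT32_C(1) << 25) /* 32-bit instruction */
#define SKETCH_FNV          (UINT32_C(1) << 10) /* FAR not valid */
#define SKETCH_EC_DATAABORT UINT32_C(0x24)      /* +1 if taken from same EL */
#define SKETCH_DFSC_SEA     UINT32_C(0x10)      /* synchronous external abort */

/* Build the syndrome for a synchronous external data abort with FnV set. */
static uint32_t sketch_sea_syndrome(int same_el)
{
    assert(same_el == 0 || same_el == 1);

    /* The "same EL" variant of the exception class is the next EC value. */
    uint32_t ec = SKETCH_EC_DATAABORT + (uint32_t)same_el;

    return (ec << SKETCH_EC_SHIFT) | SKETCH_IL | SKETCH_FNV | SKETCH_DFSC_SEA;
}

/* Extract the exception class back out of a syndrome value. */
static uint32_t sketch_get_ec(uint32_t syn)
{
    return syn >> SKETCH_EC_SHIFT;
}
```

With these assumptions an abort taken from a lower EL encodes as 0x92000410 and one taken from the same EL as 0x96000410; the guest sees FnV set and therefore must not trust FAR_ELx.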
27
28
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
29
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
30
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
31
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
32
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
33
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
34
Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
35
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181012144235.19646-9-peter.maydell@linaro.org
7
---
36
---
8
target/arm/internals.h | 5 +++++
37
include/sysemu/kvm.h | 3 +-
9
target/arm/helper.c | 4 ++--
38
target/arm/cpu.h | 4 +++
10
target/arm/kvm64.c | 2 +-
39
target/arm/internals.h | 5 +--
11
target/arm/op_helper.c | 2 +-
40
target/i386/cpu.h | 2 ++
12
4 files changed, 9 insertions(+), 4 deletions(-)
41
target/arm/helper.c | 2 +-
13
42
target/arm/kvm64.c | 77 +++++++++++++++++++++++++++++++++++++++++
43
target/arm/tlb_helper.c | 2 +-
44
7 files changed, 89 insertions(+), 6 deletions(-)
45
46
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
47
index XXXXXXX..XXXXXXX 100644
48
--- a/include/sysemu/kvm.h
49
+++ b/include/sysemu/kvm.h
50
@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
51
/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
52
unsigned long kvm_arch_vcpu_id(CPUState *cpu);
53
54
-#ifdef TARGET_I386
55
-#define KVM_HAVE_MCE_INJECTION 1
56
+#ifdef KVM_HAVE_MCE_INJECTION
57
void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
58
#endif
59
60
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/cpu.h
63
+++ b/target/arm/cpu.h
64
@@ -XXX,XX +XXX,XX @@
65
/* ARM processors have a weak memory model */
66
#define TCG_GUEST_DEFAULT_MO (0)
67
68
+#ifdef TARGET_AARCH64
69
+#define KVM_HAVE_MCE_INJECTION 1
70
+#endif
71
+
72
#define EXCP_UDEF 1 /* undefined instruction */
73
#define EXCP_SWI 2 /* software interrupt */
74
#define EXCP_PREFETCH_ABORT 3
14
diff --git a/target/arm/internals.h b/target/arm/internals.h
75
diff --git a/target/arm/internals.h b/target/arm/internals.h
15
index XXXXXXX..XXXXXXX 100644
76
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/internals.h
77
--- a/target/arm/internals.h
17
+++ b/target/arm/internals.h
78
+++ b/target/arm/internals.h
18
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
79
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
19
#define ARM_EL_IL (1 << ARM_EL_IL_SHIFT)
80
| ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
20
#define ARM_EL_ISV (1 << ARM_EL_ISV_SHIFT)
81
}
21
82
22
+static inline uint32_t syn_get_ec(uint32_t syn)
83
-static inline uint32_t syn_data_abort_no_iss(int same_el,
23
+{
84
+static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
24
+ return syn >> ARM_EL_EC_SHIFT;
85
int ea, int cm, int s1ptw,
25
+}
86
int wnr, int fsc)
26
+
87
{
27
/* Utility functions for constructing various kinds of syndrome value.
88
return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
28
* Note that in general we follow the AArch64 syndrome values; in a
89
| ARM_EL_IL
29
* few cases the value in HSR for exceptions taken to AArch32 Hyp
90
- | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
91
+ | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
92
+ | (wnr << 6) | fsc;
93
}
94
95
static inline uint32_t syn_data_abort_with_iss(int same_el,
96
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
97
index XXXXXXX..XXXXXXX 100644
98
--- a/target/i386/cpu.h
99
+++ b/target/i386/cpu.h
100
@@ -XXX,XX +XXX,XX @@
101
/* The x86 has a strong memory model with some store-after-load re-ordering */
102
#define TCG_GUEST_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
103
104
+#define KVM_HAVE_MCE_INJECTION 1
105
+
106
/* Maximum instruction code size */
107
#define TARGET_MAX_INSN_SIZE 16
108
30
diff --git a/target/arm/helper.c b/target/arm/helper.c
109
diff --git a/target/arm/helper.c b/target/arm/helper.c
31
index XXXXXXX..XXXXXXX 100644
110
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/helper.c
111
--- a/target/arm/helper.c
33
+++ b/target/arm/helper.c
112
+++ b/target/arm/helper.c
34
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
113
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
35
uint32_t moe;
114
* Report exception with ESR indicating a fault due to a
36
115
* translation table walk for a cache maintenance instruction.
37
/* If this is a debug exception we must update the DBGDSCR.MOE bits */
116
*/
38
- switch (env->exception.syndrome >> ARM_EL_EC_SHIFT) {
117
- syn = syn_data_abort_no_iss(current_el == target_el,
39
+ switch (syn_get_ec(env->exception.syndrome)) {
118
+ syn = syn_data_abort_no_iss(current_el == target_el, 0,
40
case EC_BREAKPOINT:
119
fi.ea, 1, fi.s1ptw, 1, fsc);
41
case EC_BREAKPOINT_SAME_EL:
120
env->exception.vaddress = value;
42
moe = 1;
121
env->exception.fsr = fsr;
43
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_interrupt(CPUState *cs)
44
if (qemu_loglevel_mask(CPU_LOG_INT)
45
&& !excp_is_internal(cs->exception_index)) {
46
qemu_log_mask(CPU_LOG_INT, "...with ESR 0x%x/0x%" PRIx32 "\n",
47
- env->exception.syndrome >> ARM_EL_EC_SHIFT,
48
+ syn_get_ec(env->exception.syndrome),
49
env->exception.syndrome);
50
}
51
52
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
122
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
53
index XXXXXXX..XXXXXXX 100644
123
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/kvm64.c
124
--- a/target/arm/kvm64.c
55
+++ b/target/arm/kvm64.c
125
+++ b/target/arm/kvm64.c
56
@@ -XXX,XX +XXX,XX @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
126
@@ -XXX,XX +XXX,XX @@
57
127
#include "sysemu/kvm_int.h"
58
bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit)
128
#include "kvm_arm.h"
59
{
129
#include "internals.h"
60
- int hsr_ec = debug_exit->hsr >> ARM_EL_EC_SHIFT;
130
+#include "hw/acpi/acpi.h"
61
+ int hsr_ec = syn_get_ec(debug_exit->hsr);
131
+#include "hw/acpi/ghes.h"
62
ARMCPU *cpu = ARM_CPU(cs);
132
+#include "hw/arm/virt.h"
63
CPUClass *cc = CPU_GET_CLASS(cs);
133
64
CPUARMState *env = &cpu->env;
134
static bool have_guest_debug;
65
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
135
66
index XXXXXXX..XXXXXXX 100644
136
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
67
--- a/target/arm/op_helper.c
137
return KVM_PUT_RUNTIME_STATE;
68
+++ b/target/arm/op_helper.c
138
}
69
@@ -XXX,XX +XXX,XX @@ void raise_exception(CPUARMState *env, uint32_t excp,
139
70
* (see DDI0478C.a D1.10.4)
140
+/* Callers must hold the iothread mutex lock */
71
*/
141
+static void kvm_inject_arm_sea(CPUState *c)
72
target_el = 2;
142
+{
73
- if (syndrome >> ARM_EL_EC_SHIFT == EC_ADVSIMDFPACCESSTRAP) {
143
+ ARMCPU *cpu = ARM_CPU(c);
74
+ if (syn_get_ec(syndrome) == EC_ADVSIMDFPACCESSTRAP) {
144
+ CPUARMState *env = &cpu->env;
75
syndrome = syn_uncategorized();
145
+ CPUClass *cc = CPU_GET_CLASS(c);
76
}
146
+ uint32_t esr;
77
}
147
+ bool same_el;
148
+
149
+ c->exception_index = EXCP_DATA_ABORT;
150
+ env->exception.target_el = 1;
151
+
152
+ /*
153
+ * Set the DFSC to synchronous external abort and set FnV to not valid,
154
+ * this will tell guest the FAR_ELx is UNKNOWN for this abort.
155
+ */
156
+ same_el = arm_current_el(env) == env->exception.target_el;
157
+ esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
158
+
159
+ env->exception.syndrome = esr;
160
+
161
+ cc->do_interrupt(c);
162
+}
163
+
164
#define AARCH64_CORE_REG(x) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
165
KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
166
167
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
168
return ret;
169
}
170
171
+void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
172
+{
173
+ ram_addr_t ram_addr;
174
+ hwaddr paddr;
175
+ Object *obj = qdev_get_machine();
176
+ VirtMachineState *vms = VIRT_MACHINE(obj);
177
+ bool acpi_enabled = virt_is_acpi_enabled(vms);
178
+
179
+ assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
180
+
181
+ if (acpi_enabled && addr &&
182
+ object_property_get_bool(obj, "ras", NULL)) {
183
+ ram_addr = qemu_ram_addr_from_host(addr);
184
+ if (ram_addr != RAM_ADDR_INVALID &&
185
+ kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
186
+ kvm_hwpoison_page_add(ram_addr);
187
+ /*
188
+ * If this is a BUS_MCEERR_AR, we know we have been called
189
+ * synchronously from the vCPU thread, so we can easily
190
+ * synchronize the state and inject an error.
191
+ *
192
+ * TODO: we currently don't tell the guest at all about
193
+ * BUS_MCEERR_AO. In that case we might either be being
194
+ * called synchronously from the vCPU thread, or a bit
195
+ * later from the main thread, so doing the injection of
196
+ * the error would be more complicated.
197
+ */
198
+ if (code == BUS_MCEERR_AR) {
199
+ kvm_cpu_synchronize_state(c);
200
+ if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
201
+ kvm_inject_arm_sea(c);
202
+ } else {
203
+ error_report("failed to record the error");
204
+ abort();
205
+ }
206
+ }
207
+ return;
208
+ }
209
+ if (code == BUS_MCEERR_AO) {
210
+ error_report("Hardware memory error at addr %p for memory used by "
211
+ "QEMU itself instead of guest system!", addr);
212
+ }
213
+ }
214
+
215
+ if (code == BUS_MCEERR_AR) {
216
+ error_report("Hardware memory error!");
217
+ exit(1);
218
+ }
219
+}
220
+
221
/* C6.6.29 BRK instruction */
222
static const uint32_t brk_insn = 0xd4200000;
223
224
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
225
index XXXXXXX..XXXXXXX 100644
226
--- a/target/arm/tlb_helper.c
227
+++ b/target/arm/tlb_helper.c
228
@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
229
* ISV field.
230
*/
231
if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
232
- syn = syn_data_abort_no_iss(same_el,
233
+ syn = syn_data_abort_no_iss(same_el, 0,
234
ea, 0, s1ptw, is_write, fsc);
235
} else {
236
/*
78
--
237
--
79
2.19.1
238
2.20.1
80
239
81
240
1
From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
2
2
3
Announce the availability of the various priority queues.
3
Xiang and I are willing to review the APEI-related patches and
4
This fixes an issue where guest kernels would fail to
4
volunteer as the reviewers for the HEST/GHES part.
5
configure secondary queues due to improper feature bits.
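As a rough illustration of the feature bits involved, the queue mask computed in the cadence_gem change can be reproduced standalone as below. The function names are hypothetical, and sketch_make_mask() is assumed to behave like QEMU's MAKE_64BIT_MASK(shift, length), i.e. it sets length bits starting at bit shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical equivalent of MAKE_64BIT_MASK(shift, length):
 * 'length' consecutive bits set, starting at bit 'shift'.
 */
static uint64_t sketch_make_mask(unsigned shift, unsigned length)
{
    assert(length >= 1 && length <= 64 && shift + length <= 64);
    return (~UINT64_C(0) >> (64 - length)) << shift;
}

/*
 * Value for the DESCONF6 register in this sketch: one bit per extra
 * priority queue, starting at bit 1. A single-queue device reports 0.
 */
static uint32_t sketch_desconf6(unsigned num_priority_queues)
{
    uint32_t reg = 0;

    if (num_priority_queues > 1) {
        reg |= (uint32_t)sketch_make_mask(1, num_priority_queues - 1);
    }
    return reg;
}
```

For example, two queues give 0x2 and eight queues give 0xfe, where the previous hard-coded reset value was 0x200 regardless of the configured queue count.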
6
5
7
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
6
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
8
Message-id: 20181017213932.19973-2-edgar.iglesias@gmail.com
7
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
9
Acked-by: Michael S. Tsirkin <mst@redhat.com>
10
Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
---
12
hw/net/cadence_gem.c | 8 +++++++-
13
MAINTAINERS | 9 +++++++++
13
1 file changed, 7 insertions(+), 1 deletion(-)
14
1 file changed, 9 insertions(+)
14
15
15
diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
16
diff --git a/MAINTAINERS b/MAINTAINERS
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/net/cadence_gem.c
18
--- a/MAINTAINERS
18
+++ b/hw/net/cadence_gem.c
19
+++ b/MAINTAINERS
19
@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
20
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
20
int i;
21
F: tests/qtest/acpi-utils.[hc]
21
CadenceGEMState *s = CADENCE_GEM(d);
22
F: tests/data/acpi/
22
const uint8_t *a;
23
23
+ uint32_t queues_mask = 0;
24
+ACPI/HEST/GHES
24
25
+R: Dongjiu Geng <gengdongjiu@huawei.com>
25
DB_PRINT("\n");
26
+R: Xiang Zheng <zhengxiang9@huawei.com>
26
27
+L: qemu-arm@nongnu.org
27
@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
28
+S: Maintained
28
s->regs[GEM_DESCONF] = 0x02500111;
29
+F: hw/acpi/ghes.c
29
s->regs[GEM_DESCONF2] = 0x2ab13fff;
30
+F: include/hw/acpi/ghes.h
30
s->regs[GEM_DESCONF5] = 0x002f2045;
31
+F: docs/specs/acpi_hest_ghes.rst
31
- s->regs[GEM_DESCONF6] = 0x00000200;
32
+ s->regs[GEM_DESCONF6] = 0x0;
33
+
32
+
34
+ if (s->num_priority_queues > 1) {
33
ppc4xx
35
+ queues_mask = MAKE_64BIT_MASK(1, s->num_priority_queues - 1);
34
M: David Gibson <david@gibson.dropbear.id.au>
36
+ s->regs[GEM_DESCONF6] |= queues_mask;
35
L: qemu-ppc@nongnu.org
37
+ }
38
39
/* Set MAC address */
40
a = &s->conf.macaddr.a[0];
41
--
36
--
42
2.19.1
37
2.20.1
43
38
44
39
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
2
to decodetree. These don't use do_3same() because they want to
3
operate on VFP double registers, whose offsets are different from the
4
neon_reg_offset() calculations do_3same does.
2
5
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-8-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
7
---
9
---
8
target/arm/translate.c | 67 ++++++++++++++++++++++++------------------
10
target/arm/neon-dp.decode | 3 +++
9
1 file changed, 39 insertions(+), 28 deletions(-)
11
target/arm/translate-neon.inc.c | 15 +++++++++++++++
12
target/arm/translate.c | 14 ++------------
13
3 files changed, 20 insertions(+), 12 deletions(-)
10
14
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/neon-dp.decode
18
+++ b/target/arm/neon-dp.decode
19
@@ -XXX,XX +XXX,XX @@ VMLS_3s 1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
20
21
VMUL_3s 1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
22
VMUL_p_3s 1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
23
+
24
+VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
25
+VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
26
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/translate-neon.inc.c
29
+++ b/target/arm/translate-neon.inc.c
30
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
31
}
32
return do_3same(s, a, gen_VMUL_p_3s);
33
}
34
+
35
+#define DO_VQRDMLAH(INSN, FUNC) \
36
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
37
+ { \
38
+ if (!dc_isar_feature(aa32_rdm, s)) { \
39
+ return false; \
40
+ } \
41
+ if (a->size != 1 && a->size != 2) { \
42
+ return false; \
43
+ } \
44
+ return do_3same(s, a, FUNC); \
45
+ }
46
+
47
+DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
48
+DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
11
diff --git a/target/arm/translate.c b/target/arm/translate.c
49
diff --git a/target/arm/translate.c b/target/arm/translate.c
12
index XXXXXXX..XXXXXXX 100644
50
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate.c
51
--- a/target/arm/translate.c
14
+++ b/target/arm/translate.c
52
+++ b/target/arm/translate.c
15
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
53
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
16
return 1;
54
if (!u) {
55
break; /* VPADD */
17
}
56
}
18
} else { /* (insn & 0x00380080) == 0 */
57
- /* VQRDMLAH */
19
- int invert;
58
- if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
20
+ int invert, reg_ofs, vec_size;
59
- gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
21
+
60
- vec_size, vec_size);
22
if (q && (rd & 1)) {
61
- return 0;
23
return 1;
62
- }
24
}
63
+ /* VQRDMLAH : handled by decodetree */
64
return 1;
65
66
case NEON_3R_VFM_VQRDMLSH:
25
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
67
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
26
break;
68
}
27
case 14:
28
imm |= (imm << 8) | (imm << 16) | (imm << 24);
29
- if (invert)
30
+ if (invert) {
31
imm = ~imm;
32
+ }
33
break;
34
case 15:
35
if (invert) {
36
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
37
| ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
38
break;
69
break;
39
}
70
}
40
- if (invert)
71
- /* VQRDMLSH */
41
+ if (invert) {
72
- if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
42
imm = ~imm;
73
- gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
43
+ }
74
- vec_size, vec_size);
44
75
- return 0;
45
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
76
- }
46
- if (op & 1 && op < 12) {
77
+ /* VQRDMLSH : handled by decodetree */
47
- tmp = neon_load_reg(rd, pass);
78
return 1;
48
- if (invert) {
79
49
- /* The immediate value has already been inverted, so
80
case NEON_3R_VABD:
50
- BIC becomes AND. */
51
- tcg_gen_andi_i32(tmp, tmp, imm);
52
- } else {
53
- tcg_gen_ori_i32(tmp, tmp, imm);
54
- }
55
+ reg_ofs = neon_reg_offset(rd, 0);
56
+ vec_size = q ? 16 : 8;
57
+
58
+ if (op & 1 && op < 12) {
59
+ if (invert) {
60
+ /* The immediate value has already been inverted,
61
+ * so BIC becomes AND.
62
+ */
63
+ tcg_gen_gvec_andi(MO_32, reg_ofs, reg_ofs, imm,
64
+ vec_size, vec_size);
65
} else {
66
- /* VMOV, VMVN. */
67
- tmp = tcg_temp_new_i32();
68
- if (op == 14 && invert) {
69
- int n;
70
- uint32_t val;
71
- val = 0;
72
- for (n = 0; n < 4; n++) {
73
- if (imm & (1 << (n + (pass & 1) * 4)))
74
- val |= 0xff << (n * 8);
75
- }
76
- tcg_gen_movi_i32(tmp, val);
77
- } else {
78
- tcg_gen_movi_i32(tmp, imm);
79
- }
80
+ tcg_gen_gvec_ori(MO_32, reg_ofs, reg_ofs, imm,
81
+ vec_size, vec_size);
82
+ }
83
+ } else {
84
+ /* VMOV, VMVN. */
85
+ if (op == 14 && invert) {
86
+ TCGv_i64 t64 = tcg_temp_new_i64();
87
+
88
+ for (pass = 0; pass <= q; ++pass) {
89
+ uint64_t val = 0;
90
+ int n;
91
+
92
+ for (n = 0; n < 8; n++) {
93
+ if (imm & (1 << (n + pass * 8))) {
94
+ val |= 0xffull << (n * 8);
95
+ }
96
+ }
97
+ tcg_gen_movi_i64(t64, val);
98
+ neon_store_reg64(t64, rd + pass);
99
+ }
100
+ tcg_temp_free_i64(t64);
101
+ } else {
102
+ tcg_gen_gvec_dup32i(reg_ofs, vec_size, vec_size, imm);
103
}
104
- neon_store_reg(rd, pass, tmp);
105
}
106
}
107
} else { /* (insn & 0x00800010 == 0x00800000) */
108
--
81
--
109
2.19.1
82
2.20.1
110
83
111
84
diff view generated by jsdifflib
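The op == 14 inverted case in the translate.c hunk above expands each bit of the immediate into a full byte of the 64-bit result. A minimal sketch of that expansion, assuming only the low 8 bits of the (already replicated) immediate matter; expand_bits_to_bytes() is a hypothetical name, not a QEMU function:

```c
#include <stdint.h>

/*
 * Expand an 8-bit immediate into a 64-bit byte mask:
 * byte n of the result is 0xff if bit n of imm8 is set, else 0x00.
 */
uint64_t expand_bits_to_bytes(uint8_t imm8)
{
    uint64_t val = 0;

    for (int n = 0; n < 8; n++) {
        if (imm8 & (1u << n)) {
            val |= 0xffull << (n * 8);
        }
    }
    return val;
}
```

The 0xffull suffix matters: with a plain int constant the shift for n >= 4 would overflow a 32-bit int.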
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon SHA instructions in the 3-reg-same group
2
2
to decodetree.
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
5
Message-id: 20181011205206.3552-6-richard.henderson@linaro.org
6
[PMM: drop change to now-deleted cpu_mode_names array]
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
9
---
7
---
10
target/arm/translate.c | 4 ++--
8
target/arm/neon-dp.decode | 10 +++
11
1 file changed, 2 insertions(+), 2 deletions(-)
9
target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
12
10
target/arm/translate.c | 46 +----------
11
3 files changed, 151 insertions(+), 44 deletions(-)
12
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/neon-dp.decode
16
+++ b/target/arm/neon-dp.decode
17
@@ -XXX,XX +XXX,XX @@ VMUL_3s 1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
18
VMUL_p_3s 1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
19
20
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
21
+
22
+SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
23
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
24
+SHA256H_3s 1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
25
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
26
+SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
27
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
28
+SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
29
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
30
+
31
VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
32
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate-neon.inc.c
35
+++ b/target/arm/translate-neon.inc.c
36
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
37
38
DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
39
DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
40
+
41
+static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
42
+{
43
+ TCGv_ptr ptr1, ptr2, ptr3;
44
+ TCGv_i32 tmp;
45
+
46
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
47
+ !dc_isar_feature(aa32_sha1, s)) {
48
+ return false;
49
+ }
50
+
51
+ /* UNDEF accesses to D16-D31 if they don't exist. */
52
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
53
+ ((a->vd | a->vn | a->vm) & 0x10)) {
54
+ return false;
55
+ }
56
+
57
+ if ((a->vn | a->vm | a->vd) & 1) {
58
+ return false;
59
+ }
60
+
61
+ if (!vfp_access_check(s)) {
62
+ return true;
63
+ }
64
+
65
+ ptr1 = vfp_reg_ptr(true, a->vd);
66
+ ptr2 = vfp_reg_ptr(true, a->vn);
67
+ ptr3 = vfp_reg_ptr(true, a->vm);
68
+ tmp = tcg_const_i32(a->optype);
69
+ gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
70
+ tcg_temp_free_i32(tmp);
71
+ tcg_temp_free_ptr(ptr1);
72
+ tcg_temp_free_ptr(ptr2);
73
+ tcg_temp_free_ptr(ptr3);
74
+
75
+ return true;
76
+}
77
+
78
+static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
79
+{
80
+ TCGv_ptr ptr1, ptr2, ptr3;
81
+
82
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
83
+ !dc_isar_feature(aa32_sha2, s)) {
84
+ return false;
85
+ }
86
+
87
+ /* UNDEF accesses to D16-D31 if they don't exist. */
88
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
89
+ ((a->vd | a->vn | a->vm) & 0x10)) {
90
+ return false;
91
+ }
92
+
93
+ if ((a->vn | a->vm | a->vd) & 1) {
94
+ return false;
95
+ }
96
+
97
+ if (!vfp_access_check(s)) {
98
+ return true;
99
+ }
100
+
101
+ ptr1 = vfp_reg_ptr(true, a->vd);
102
+ ptr2 = vfp_reg_ptr(true, a->vn);
103
+ ptr3 = vfp_reg_ptr(true, a->vm);
104
+ gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
105
+ tcg_temp_free_ptr(ptr1);
106
+ tcg_temp_free_ptr(ptr2);
107
+ tcg_temp_free_ptr(ptr3);
108
+
109
+ return true;
110
+}
111
+
112
+static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
113
+{
114
+ TCGv_ptr ptr1, ptr2, ptr3;
115
+
116
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
117
+ !dc_isar_feature(aa32_sha2, s)) {
118
+ return false;
119
+ }
120
+
121
+ /* UNDEF accesses to D16-D31 if they don't exist. */
122
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
123
+ ((a->vd | a->vn | a->vm) & 0x10)) {
124
+ return false;
125
+ }
126
+
127
+ if ((a->vn | a->vm | a->vd) & 1) {
128
+ return false;
129
+ }
130
+
131
+ if (!vfp_access_check(s)) {
132
+ return true;
133
+ }
134
+
135
+ ptr1 = vfp_reg_ptr(true, a->vd);
136
+ ptr2 = vfp_reg_ptr(true, a->vn);
137
+ ptr3 = vfp_reg_ptr(true, a->vm);
138
+ gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
139
+ tcg_temp_free_ptr(ptr1);
140
+ tcg_temp_free_ptr(ptr2);
141
+ tcg_temp_free_ptr(ptr3);
142
+
143
+ return true;
144
+}
145
+
146
+static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
147
+{
148
+ TCGv_ptr ptr1, ptr2, ptr3;
149
+
150
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
151
+ !dc_isar_feature(aa32_sha2, s)) {
152
+ return false;
153
+ }
154
+
155
+ /* UNDEF accesses to D16-D31 if they don't exist. */
156
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
157
+ ((a->vd | a->vn | a->vm) & 0x10)) {
158
+ return false;
159
+ }
160
+
161
+ if ((a->vn | a->vm | a->vd) & 1) {
162
+ return false;
163
+ }
164
+
165
+ if (!vfp_access_check(s)) {
166
+ return true;
167
+ }
168
+
169
+ ptr1 = vfp_reg_ptr(true, a->vd);
170
+ ptr2 = vfp_reg_ptr(true, a->vn);
171
+ ptr3 = vfp_reg_ptr(true, a->vm);
172
+ gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
173
+ tcg_temp_free_ptr(ptr1);
174
+ tcg_temp_free_ptr(ptr2);
175
+ tcg_temp_free_ptr(ptr3);
176
+
177
+ return true;
178
+}
13
diff --git a/target/arm/translate.c b/target/arm/translate.c
179
diff --git a/target/arm/translate.c b/target/arm/translate.c
14
index XXXXXXX..XXXXXXX 100644
180
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate.c
181
--- a/target/arm/translate.c
16
+++ b/target/arm/translate.c
182
+++ b/target/arm/translate.c
17
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 cpu_F0d, cpu_F1d;
183
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
18
184
int vec_size;
19
#include "exec/gen-icount.h"
185
uint32_t imm;
20
186
TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
21
-static const char *regnames[] =
187
- TCGv_ptr ptr1, ptr2, ptr3;
22
+static const char * const regnames[] =
188
+ TCGv_ptr ptr1, ptr2;
23
{ "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
189
TCGv_i64 tmp64;
24
"r8", "r9", "r10", "r11", "r12", "r13", "r14", "pc" };
190
25
191
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
26
@@ -XXX,XX +XXX,XX @@ static struct {
192
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
27
int nregs;
193
return 1;
28
int interleave;
194
}
29
int spacing;
195
switch (op) {
30
-} neon_ls_element_type[11] = {
196
- case NEON_3R_SHA:
31
+} const neon_ls_element_type[11] = {
197
- /* The SHA-1/SHA-256 3-register instructions require special
32
{4, 4, 1},
198
- * treatment here, as their size field is overloaded as an
33
{4, 4, 2},
199
- * op type selector, and they all consume their input in a
34
{4, 1, 1},
200
- * single pass.
201
- */
202
- if (!q) {
203
- return 1;
204
- }
205
- if (!u) { /* SHA-1 */
206
- if (!dc_isar_feature(aa32_sha1, s)) {
207
- return 1;
208
- }
209
- ptr1 = vfp_reg_ptr(true, rd);
210
- ptr2 = vfp_reg_ptr(true, rn);
211
- ptr3 = vfp_reg_ptr(true, rm);
212
- tmp4 = tcg_const_i32(size);
213
- gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
214
- tcg_temp_free_i32(tmp4);
215
- } else { /* SHA-256 */
216
- if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
217
- return 1;
218
- }
219
- ptr1 = vfp_reg_ptr(true, rd);
220
- ptr2 = vfp_reg_ptr(true, rn);
221
- ptr3 = vfp_reg_ptr(true, rm);
222
- switch (size) {
223
- case 0:
224
- gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
225
- break;
226
- case 1:
227
- gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
228
- break;
229
- case 2:
230
- gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
231
- break;
232
- }
233
- }
234
- tcg_temp_free_ptr(ptr1);
235
- tcg_temp_free_ptr(ptr2);
236
- tcg_temp_free_ptr(ptr3);
237
- return 0;
238
-
239
case NEON_3R_VPADD_VQRDMLAH:
240
if (!u) {
241
break; /* VPADD */
242
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
243
case NEON_3R_VMUL:
244
case NEON_3R_VML:
245
case NEON_3R_VSHL:
246
+ case NEON_3R_SHA:
247
/* Already handled by decodetree */
248
return 1;
249
}
35
--
250
--
36
2.19.1
251
2.20.1
37
252
38
253
diff view generated by jsdifflib
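The neon-dp.decode lines in the patch above describe each insn as 32 bit positions that are either fixed ('0' or '1') or don't-care ('.'). As a rough illustration of how such a pattern reduces to a mask/value comparison, here is a simplified sketch. It is not how the real decodetree script works internally, it handles no field syntax such as optype:2, and the FixedBits type and both function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;   /* 1 where the pattern fixes a bit */
    uint32_t value;  /* required value of the fixed bits */
} FixedBits;

/*
 * Parse a pattern of '0', '1' and '.' characters, most significant bit
 * first; spaces are ignored. Returns false unless the pattern describes
 * exactly 32 bit positions using only those characters.
 */
bool parse_fixed_bits(const char *pat, FixedBits *out)
{
    uint32_t mask = 0, value = 0;
    int nbits = 0;

    for (; *pat; pat++) {
        if (*pat == ' ') {
            continue;
        }
        if ((*pat != '0' && *pat != '1' && *pat != '.') || nbits == 32) {
            return false;
        }
        mask <<= 1;
        value <<= 1;
        if (*pat != '.') {
            mask |= 1;
            value |= (*pat == '1');
        }
        nbits++;
    }
    if (nbits != 32) {
        return false;
    }
    out->mask = mask;
    out->value = value;
    return true;
}

/* An insn matches when all of its fixed bits have the required value. */
bool insn_matches(const FixedBits *fb, uint32_t insn)
{
    return (insn & fb->mask) == fb->value;
}
```

Fed the SHA256H_3s pattern from the hunk, this distinguishes it from SHA256H2_3s, which differs only in bits [21:20].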
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the 64-bit element insns in the 3-reg-same group
2
to decodetree. This covers VQSHL, VRSHL and VQRSHL where
3
size==0b11.
2
4
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-13-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
7
---
8
---
8
target/arm/translate.c | 70 +++++++++++++++++++++++++++++-------------
9
target/arm/neon-dp.decode | 13 +++++++++++
9
1 file changed, 48 insertions(+), 22 deletions(-)
10
target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
11
target/arm/translate.c | 38 ++-------------------------------
12
3 files changed, 39 insertions(+), 36 deletions(-)
10
13
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
19
VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
20
VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
21
22
+# Insns operating on 64-bit elements (size!=0b11 handled elsewhere)
23
+# The _rev suffix indicates that Vn and Vm are reversed (as explained
24
+# by the comment for the @3same_rev format).
25
+@3same_64_rev .... ... . . . 11 .... .... .... . q:1 . . .... \
26
+ &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
27
+
28
+VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
29
+VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
30
+VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
31
+VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
32
+VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
33
+VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
34
+
35
VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
36
VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
37
VMIN_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
38
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate-neon.inc.c
41
+++ b/target/arm/translate-neon.inc.c
42
@@ -XXX,XX +XXX,XX @@ static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
43
44
return true;
45
}
46
+
47
+#define DO_3SAME_64(INSN, FUNC) \
48
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
49
+ uint32_t rn_ofs, uint32_t rm_ofs, \
50
+ uint32_t oprsz, uint32_t maxsz) \
51
+ { \
52
+ static const GVecGen3 op = { .fni8 = FUNC }; \
53
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &op); \
54
+ } \
55
+ DO_3SAME(INSN, gen_##INSN##_3s)
56
+
57
+#define DO_3SAME_64_ENV(INSN, FUNC) \
58
+ static void gen_##INSN##_elt(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m) \
59
+ { \
60
+ FUNC(d, cpu_env, n, m); \
61
+ } \
62
+ DO_3SAME_64(INSN, gen_##INSN##_elt)
63
+
64
+DO_3SAME_64(VRSHL_S64, gen_helper_neon_rshl_s64)
65
+DO_3SAME_64(VRSHL_U64, gen_helper_neon_rshl_u64)
66
+DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
67
+DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
68
+DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
69
+DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
11
diff --git a/target/arm/translate.c b/target/arm/translate.c
70
diff --git a/target/arm/translate.c b/target/arm/translate.c
12
index XXXXXXX..XXXXXXX 100644
71
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate.c
72
--- a/target/arm/translate.c
14
+++ b/target/arm/translate.c
73
+++ b/target/arm/translate.c
15
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
74
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
16
size--;
75
}
17
}
76
18
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
77
if (size == 3) {
19
- /* To avoid excessive duplication of ops we implement shift
78
- /* 64-bit element instructions. */
20
- by immediate using the variable shift operations. */
79
- for (pass = 0; pass < (q ? 2 : 1); pass++) {
21
if (op < 8) {
80
- neon_load_reg64(cpu_V0, rn + pass);
22
/* Shift by immediate:
81
- neon_load_reg64(cpu_V1, rm + pass);
23
VSHR, VSRA, VRSHR, VRSRA, VSRI, VSHL, VQSHL, VQSHLU. */
82
- switch (op) {
24
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
83
- case NEON_3R_VQSHL:
25
}
84
- if (u) {
26
/* Right shifts are encoded as N - shift, where N is the
85
- gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
27
element size in bits. */
86
- cpu_V1, cpu_V0);
28
- if (op <= 4)
87
- } else {
29
+ if (op <= 4) {
88
- gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
30
shift = shift - (1 << (size + 3));
89
- cpu_V1, cpu_V0);
31
+ }
90
- }
32
+
33
+ switch (op) {
34
+ case 0: /* VSHR */
35
+ /* Right shift comes here negative. */
36
+ shift = -shift;
37
+ /* Shifts larger than the element size are architecturally
38
+ * valid. Unsigned results in all zeros; signed results
39
+ * in all sign bits.
40
+ */
41
+ if (!u) {
42
+ tcg_gen_gvec_sari(size, rd_ofs, rm_ofs,
43
+ MIN(shift, (8 << size) - 1),
44
+ vec_size, vec_size);
45
+ } else if (shift >= 8 << size) {
46
+ tcg_gen_gvec_dup8i(rd_ofs, vec_size, vec_size, 0);
47
+ } else {
48
+ tcg_gen_gvec_shri(size, rd_ofs, rm_ofs, shift,
49
+ vec_size, vec_size);
50
+ }
51
+ return 0;
52
+
53
+ case 5: /* VSHL, VSLI */
54
+ if (!u) { /* VSHL */
55
+ /* Shifts larger than the element size are
56
+ * architecturally valid and results in zero.
57
+ */
58
+ if (shift >= 8 << size) {
59
+ tcg_gen_gvec_dup8i(rd_ofs, vec_size, vec_size, 0);
60
+ } else {
61
+ tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
62
+ vec_size, vec_size);
63
+ }
64
+ return 0;
65
+ }
66
+ break;
67
+ }
68
+
69
if (size == 3) {
70
count = q + 1;
71
} else {
72
count = q ? 4: 2;
73
}
74
- switch (size) {
75
- case 0:
76
- imm = (uint8_t) shift;
77
- imm |= imm << 8;
78
- imm |= imm << 16;
79
- break;
91
- break;
80
- case 1:
92
- case NEON_3R_VRSHL:
81
- imm = (uint16_t) shift;
93
- if (u) {
82
- imm |= imm << 16;
94
- gen_helper_neon_rshl_u64(cpu_V0, cpu_V1, cpu_V0);
95
- } else {
96
- gen_helper_neon_rshl_s64(cpu_V0, cpu_V1, cpu_V0);
97
- }
83
- break;
98
- break;
84
- case 2:
99
- case NEON_3R_VQRSHL:
85
- case 3:
100
- if (u) {
86
- imm = shift;
101
- gen_helper_neon_qrshl_u64(cpu_V0, cpu_env,
102
- cpu_V1, cpu_V0);
103
- } else {
104
- gen_helper_neon_qrshl_s64(cpu_V0, cpu_env,
105
- cpu_V1, cpu_V0);
106
- }
87
- break;
107
- break;
88
- default:
108
- default:
89
- abort();
109
- abort();
90
- }
110
- }
91
+
111
- neon_store_reg64(cpu_V0, rd + pass);
92
+ /* To avoid excessive duplication of ops we implement shift
112
- }
93
+ * by immediate using the variable shift operations.
113
- return 0;
94
+ */
114
+ /* 64-bit element instructions: handled by decodetree */
95
+ imm = dup_const(size, shift);
115
+ return 1;
96
116
}
97
for (pass = 0; pass < count; pass++) {
117
pairwise = 0;
98
if (size == 3) {
118
switch (op) {
99
neon_load_reg64(cpu_V0, rm + pass);
100
tcg_gen_movi_i64(cpu_V1, imm);
101
switch (op) {
102
- case 0: /* VSHR */
103
case 1: /* VSRA */
104
if (u)
105
gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
106
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
107
cpu_V0, cpu_V1);
108
}
109
break;
110
+ default:
111
+ g_assert_not_reached();
112
}
113
if (op == 1 || op == 3) {
114
/* Accumulate. */
115
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
116
tmp2 = tcg_temp_new_i32();
117
tcg_gen_movi_i32(tmp2, imm);
118
switch (op) {
119
- case 0: /* VSHR */
120
case 1: /* VSRA */
121
GEN_NEON_INTEGER_OP(shl);
122
break;
123
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
124
case 7: /* VQSHL */
125
GEN_NEON_INTEGER_OP_ENV(qshl);
126
break;
127
+ default:
128
+ g_assert_not_reached();
129
}
130
tcg_temp_free_i32(tmp2);
131
132
--
119
--
133
2.19.1
120
2.20.1
134
121
135
122
diff view generated by jsdifflib
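The VSHR/VSHL comments in the translate.c hunk above say that shift counts of the full element size are architecturally valid: unsigned right shifts and left shifts give zero, signed right shifts give all sign bits. A per-element sketch for 8-bit elements follows (the hunk uses 8 << size for the general element width). The function names are illustrative only, and the signed case assumes the compiler implements >> on negative values as an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Signed right shift: clamp the count to 7 so the result is all sign bits. */
int8_t sshr8(int8_t x, unsigned shift)
{
    if (shift > 7) {
        shift = 7;
    }
    return (int8_t)(x >> shift);  /* assumes arithmetic shift */
}

/* Unsigned right shift: a count of 8 or more yields zero. */
uint8_t ushr8(uint8_t x, unsigned shift)
{
    return shift >= 8 ? 0 : (uint8_t)(x >> shift);
}

/* Left shift: a count of 8 or more yields zero. */
uint8_t shl8(uint8_t x, unsigned shift)
{
    return shift >= 8 ? 0 : (uint8_t)(x << shift);
}
```

This mirrors the three-way split in the hunk: MIN(shift, esize - 1) for the signed case, and a dup of zero when shift >= esize otherwise.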
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon VHADD insns in the 3-reg-same group to decodetree.
2
2
3
For a sequence of loads or stores from a single register,
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
little-endian operations can be promoted to an 8-byte op.
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
This can reduce the number of operations by a factor of 8.
5
Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
6
---
7
target/arm/neon-dp.decode | 2 ++
8
target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
9
target/arm/translate.c | 4 +---
10
3 files changed, 27 insertions(+), 3 deletions(-)
6
11
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
8
Message-id: 20181011205206.3552-20-richard.henderson@linaro.org
13
index XXXXXXX..XXXXXXX 100644
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
14
--- a/target/arm/neon-dp.decode
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
+++ b/target/arm/neon-dp.decode
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
@@ -XXX,XX +XXX,XX @@
12
---
17
@3same .... ... . . . size:2 .... .... .... . q:1 . . .... \
13
target/arm/translate.c | 10 ++++++++++
18
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
14
1 file changed, 10 insertions(+)
19
15
20
+VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
21
+VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
22
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
23
VQADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
24
25
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/translate-neon.inc.c
28
+++ b/target/arm/translate-neon.inc.c
29
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
30
DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
31
DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
32
DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
33
+
34
+#define DO_3SAME_32(INSN, FUNC) \
35
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
36
+ uint32_t rn_ofs, uint32_t rm_ofs, \
37
+ uint32_t oprsz, uint32_t maxsz) \
38
+ { \
39
+ static const GVecGen3 ops[4] = { \
40
+ { .fni4 = gen_helper_neon_##FUNC##8 }, \
41
+ { .fni4 = gen_helper_neon_##FUNC##16 }, \
42
+ { .fni4 = gen_helper_neon_##FUNC##32 }, \
43
+ { 0 }, \
44
+ }; \
45
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
46
+ } \
47
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
48
+ { \
49
+ if (a->size > 2) { \
50
+ return false; \
51
+ } \
52
+ return do_3same(s, a, gen_##INSN##_3s); \
53
+ }
54
+
55
+DO_3SAME_32(VHADD_S, hadd_s)
56
+DO_3SAME_32(VHADD_U, hadd_u)
16
diff --git a/target/arm/translate.c b/target/arm/translate.c
57
diff --git a/target/arm/translate.c b/target/arm/translate.c
17
index XXXXXXX..XXXXXXX 100644
58
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/translate.c
59
--- a/target/arm/translate.c
19
+++ b/target/arm/translate.c
60
+++ b/target/arm/translate.c
20
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
61
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
21
if (size == 3 && (interleave | spacing) != 1) {
62
case NEON_3R_VML:
63
case NEON_3R_VSHL:
64
case NEON_3R_SHA:
65
+ case NEON_3R_VHADD:
66
/* Already handled by decodetree */
22
return 1;
67
return 1;
23
}
68
}
24
+ /* For our purposes, bytes are always little-endian. */
69
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
25
+ if (size == 0) {
70
tmp2 = neon_load_reg(rm, pass);
26
+ endian = MO_LE;
71
}
27
+ }
72
switch (op) {
28
+ /* Consecutive little-endian elements from a single register
73
- case NEON_3R_VHADD:
29
+ * can be promoted to a larger little-endian operation.
74
- GEN_NEON_INTEGER_OP(hadd);
30
+ */
75
- break;
31
+ if (interleave == 1 && endian == MO_LE) {
76
case NEON_3R_VRHADD:
32
+ size = 3;
77
GEN_NEON_INTEGER_OP(rhadd);
33
+ }
78
break;
34
tmp64 = tcg_temp_new_i64();
35
addr = tcg_temp_new_i32();
36
tmp2 = tcg_const_i32(1 << size);
37
--
79
--
38
2.19.1
80
2.20.1
39
81
40
82
diff view generated by jsdifflib
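The load/store promotion described in the commit message above relies on the fact that consecutive little-endian elements of one register produce the same memory image as a single 8-byte little-endian access. A small sketch that checks this on the host; all names here are hypothetical, and the code models a 64-bit D register as a uint64_t rather than using any QEMU API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Store the low `nbytes` bytes of v at p, least significant byte first. */
void store_le(uint8_t *p, uint64_t v, unsigned nbytes)
{
    for (unsigned i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Number of memory operations needed per 8-byte register. */
unsigned ops_per_dreg(unsigned size)
{
    assert(size <= 3);
    return 8u >> size;
}

/*
 * Store a 64-bit register element by element (element size 1 << size
 * bytes) and as one 8-byte store, then compare the two memory images.
 */
bool same_memory_image(uint64_t reg, unsigned size)
{
    uint8_t by_elt[8], whole[8];
    unsigned esz = 1u << size;

    assert(size <= 3);
    for (unsigned i = 0; i < 8 / esz; i++) {
        store_le(by_elt + i * esz, reg >> (8 * esz * i), esz);
    }
    store_le(whole, reg, 8);
    return memcmp(by_elt, whole, sizeof(whole)) == 0;
}
```

For byte elements (size == 0) this is the factor of 8 the commit message mentions: eight 1-byte operations become one 8-byte operation.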
1
If the HCR_EL2 PTW virtualization configuration register bit
1
Convert the Neon VABA and VABD insns in the 3-reg-same group to
2
is set, then this means that a stage 2 Permission fault must
2
decodetree.
3
be generated if a stage 1 translation table access is made
4
to an address that is mapped as Device memory in stage 2.
5
Implement this.
6
3
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20181012144235.19646-8-peter.maydell@linaro.org
6
Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
10
---
7
---
11
target/arm/helper.c | 21 ++++++++++++++++++++-
8
target/arm/neon-dp.decode | 6 ++++++
12
1 file changed, 20 insertions(+), 1 deletion(-)
9
target/arm/translate-neon.inc.c | 4 ++++
10
target/arm/translate.c | 22 ++--------------------
11
3 files changed, 12 insertions(+), 20 deletions(-)
13
12
14
diff --git a/target/arm/helper.c b/target/arm/helper.c
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.c
15
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/helper.c
16
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
17
@@ -XXX,XX +XXX,XX @@ VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
19
hwaddr s2pa;
18
VMIN_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
20
int s2prot;
19
VMIN_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
21
int ret;
20
22
+ ARMCacheAttrs cacheattrs = {};
21
+VABD_S_3s 1111 001 0 0 . .. .... .... 0111 . . . 0 .... @3same
23
+ ARMCacheAttrs *pcacheattrs = NULL;
22
+VABD_U_3s 1111 001 1 0 . .. .... .... 0111 . . . 0 .... @3same
24
+
23
+
25
+ if (env->cp15.hcr_el2 & HCR_PTW) {
24
+VABA_S_3s 1111 001 0 0 . .. .... .... 0111 . . . 1 .... @3same
26
+ /*
25
+VABA_U_3s 1111 001 1 0 . .. .... .... 0111 . . . 1 .... @3same
27
+ * PTW means we must fault if this S1 walk touches S2 Device
26
+
28
+ * memory; otherwise we don't care about the attributes and can
27
VADD_3s 1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
29
+ * save the S2 translation the effort of computing them.
28
VSUB_3s 1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
30
+ */
29
31
+ pcacheattrs = &cacheattrs;
30
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
32
+ }
31
index XXXXXXX..XXXXXXX 100644
33
32
--- a/target/arm/translate-neon.inc.c
34
ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_S2NS, &s2pa,
33
+++ b/target/arm/translate-neon.inc.c
35
- &txattrs, &s2prot, &s2size, fi, NULL);
34
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
36
+ &txattrs, &s2prot, &s2size, fi, pcacheattrs);
35
DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
37
if (ret) {
36
DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
38
assert(fi->type != ARMFault_None);
37
DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
39
fi->s2addr = addr;
38
+DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
40
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
39
+DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
41
fi->s1ptw = true;
40
+DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
42
return ~0;
41
+DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
42
43
#define DO_3SAME_CMP(INSN, COND) \
44
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
45
diff --git a/target/arm/translate.c b/target/arm/translate.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/translate.c
48
+++ b/target/arm/translate.c
49
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
50
/* VQRDMLSH : handled by decodetree */
51
return 1;
52
53
- case NEON_3R_VABD:
54
- if (u) {
55
- gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
56
- vec_size, vec_size);
57
- } else {
58
- gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
59
- vec_size, vec_size);
60
- }
61
- return 0;
62
-
63
- case NEON_3R_VABA:
64
- if (u) {
65
- gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
66
- vec_size, vec_size);
67
- } else {
68
- gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
69
- vec_size, vec_size);
70
- }
71
- return 0;
72
-
73
case NEON_3R_VADD_VSUB:
74
case NEON_3R_LOGIC:
75
case NEON_3R_VMAX:
76
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
77
case NEON_3R_VSHL:
78
case NEON_3R_SHA:
79
case NEON_3R_VHADD:
80
+ case NEON_3R_VABD:
81
+ case NEON_3R_VABA:
82
/* Already handled by decodetree */
83
return 1;
43
}
84
}
44
+ if (pcacheattrs && (pcacheattrs->attrs & 0xf0) == 0) {
45
+ /* Access was to Device memory: generate Permission fault */
46
+ fi->type = ARMFault_Permission;
47
+ fi->s2addr = addr;
48
+ fi->stage2 = true;
49
+ fi->s1ptw = true;
50
+ return ~0;
51
+ }
52
addr = s2pa;
53
}
54
return addr;
55
--
85
--
56
2.19.1
86
2.20.1
57
87
58
88
diff view generated by jsdifflib
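The HCR_EL2.PTW check in the helper.c hunk above boils down to one test on the stage 2 memory attributes. A standalone sketch of that decision, assuming the attributes are in the MAIR-style 8-bit encoding where a zero upper nibble means Device memory (which is what the (attrs & 0xf0) == 0 test in the hunk implies); both function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Device memory has Attr[7:4] == 0b0000 in the MAIR-style encoding. */
bool attrs_is_device(uint8_t attrs)
{
    return (attrs & 0xf0) == 0;
}

/*
 * A stage 1 translation table walk must take a stage 2 Permission
 * fault when PTW is set and the walk hits stage 2 Device memory.
 */
bool s1_ptw_must_fault(bool hcr_ptw, uint8_t s2_attrs)
{
    return hcr_ptw && attrs_is_device(s2_attrs);
}
```

When PTW is clear the attributes are irrelevant, which is why the hunk passes NULL for the cacheattrs pointer in that case and skips computing them.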
1
For traps of FP/SIMD instructions to AArch32 Hyp mode, the syndrome
1
Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
2
provided in HSR has more information than is reported to AArch64.
2
(These are all the other insns in 3-reg-same which were using
3
Specifically, there are extra fields TA and coproc which indicate
3
GEN_NEON_INTEGER_OP() and which are not pairwise or
4
whether the trapped instruction was FP or SIMD. Add this extra
4
reversed-operands.)
5
information to the syndromes we construct, and mask it out when
6
taking the exception to AArch64.
7
5
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20181012144235.19646-11-peter.maydell@linaro.org
8
Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
11
---
9
---
12
target/arm/internals.h | 14 +++++++++++++-
10
target/arm/neon-dp.decode | 6 ++++++
13
target/arm/helper.c | 9 +++++++++
11
target/arm/translate-neon.inc.c | 4 ++++
14
target/arm/translate.c | 8 ++++----
12
target/arm/translate.c | 8 ++------
15
3 files changed, 26 insertions(+), 5 deletions(-)
13
3 files changed, 12 insertions(+), 6 deletions(-)
16
14
17
diff --git a/target/arm/internals.h b/target/arm/internals.h
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
18
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/internals.h
17
--- a/target/arm/neon-dp.decode
20
+++ b/target/arm/internals.h
18
+++ b/target/arm/neon-dp.decode
21
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_get_ec(uint32_t syn)
19
@@ -XXX,XX +XXX,XX @@ VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
22
* few cases the value in HSR for exceptions taken to AArch32 Hyp
20
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
23
* mode differs slightly, and we fix this up when populating HSR in
21
VQADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
24
* arm_cpu_do_interrupt_aarch32_hyp().
22
25
+ * The exception is FP/SIMD access traps -- these report extra information
23
+VRHADD_S_3s 1111 001 0 0 . .. .... .... 0001 . . . 0 .... @3same
26
+ * when taking an exception to AArch32. For those we include the extra coproc
24
+VRHADD_U_3s 1111 001 1 0 . .. .... .... 0001 . . . 0 .... @3same
27
+ * and TA fields, and mask them out when taking the exception to AArch64.
28
*/
29
static inline uint32_t syn_uncategorized(void)
30
{
31
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_cp15_rrt_trap(int cv, int cond, int opc1, int crm,
32
33
static inline uint32_t syn_fp_access_trap(int cv, int cond, bool is_16bit)
34
{
35
+ /* AArch32 FP trap or any AArch64 FP/SIMD trap: TA == 0 coproc == 0xa */
36
return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
37
| (is_16bit ? 0 : ARM_EL_IL)
38
- | (cv << 24) | (cond << 20);
39
+ | (cv << 24) | (cond << 20) | 0xa;
40
+}
41
+
25
+
42
+static inline uint32_t syn_simd_access_trap(int cv, int cond, bool is_16bit)
26
@3same_logic .... ... . . . .. .... .... .... . q:1 .. .... \
43
+{
27
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
44
+ /* AArch32 SIMD trap: TA == 1 coproc == 0 */
28
45
+ return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
29
@@ -XXX,XX +XXX,XX @@ VBSL_3s 1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
46
+ | (is_16bit ? 0 : ARM_EL_IL)
30
VBIT_3s 1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
47
+ | (cv << 24) | (cond << 20) | (1 << 5);
31
VBIF_3s 1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
48
}
32
49
33
+VHSUB_S_3s 1111 001 0 0 . .. .... .... 0010 . . . 0 .... @3same
50
static inline uint32_t syn_sve_access_trap(void)
34
+VHSUB_U_3s 1111 001 1 0 . .. .... .... 0010 . . . 0 .... @3same
51
diff --git a/target/arm/helper.c b/target/arm/helper.c
35
+
36
VQSUB_S_3s 1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
37
VQSUB_U_3s 1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
38
39
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
52
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
53
--- a/target/arm/helper.c
41
--- a/target/arm/translate-neon.inc.c
54
+++ b/target/arm/helper.c
42
+++ b/target/arm/translate-neon.inc.c
55
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
43
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
56
case EXCP_HVC:
44
57
case EXCP_HYP_TRAP:
45
DO_3SAME_32(VHADD_S, hadd_s)
58
case EXCP_SMC:
46
DO_3SAME_32(VHADD_U, hadd_u)
59
+ if (syn_get_ec(env->exception.syndrome) == EC_ADVSIMDFPACCESSTRAP) {
47
+DO_3SAME_32(VHSUB_S, hsub_s)
60
+ /*
48
+DO_3SAME_32(VHSUB_U, hsub_u)
61
+ * QEMU internal FP/SIMD syndromes from AArch32 include the
49
+DO_3SAME_32(VRHADD_S, rhadd_s)
62
+ * TA and coproc fields which are only exposed if the exception
50
+DO_3SAME_32(VRHADD_U, rhadd_u)
63
+ * is taken to AArch32 Hyp mode. Mask them out to get a valid
64
+ * AArch64 format syndrome.
65
+ */
66
+ env->exception.syndrome &= ~MAKE_64BIT_MASK(0, 20);
67
+ }
68
env->cp15.esr_el[new_el] = env->exception.syndrome;
69
break;
70
case EXCP_IRQ:
71
diff --git a/target/arm/translate.c b/target/arm/translate.c
51
diff --git a/target/arm/translate.c b/target/arm/translate.c
72
index XXXXXXX..XXXXXXX 100644
52
index XXXXXXX..XXXXXXX 100644
73
--- a/target/arm/translate.c
53
--- a/target/arm/translate.c
74
+++ b/target/arm/translate.c
54
+++ b/target/arm/translate.c
75
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
76
*/
77
if (s->fp_excp_el) {
78
gen_exception_insn(s, 4, EXCP_UDEF,
79
- syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
80
+ syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
81
return 0;
82
}
83
84
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
55
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
85
*/
56
case NEON_3R_VSHL:
86
if (s->fp_excp_el) {
57
case NEON_3R_SHA:
87
gen_exception_insn(s, 4, EXCP_UDEF,
58
case NEON_3R_VHADD:
88
- syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
59
+ case NEON_3R_VRHADD:
89
+ syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
60
+ case NEON_3R_VHSUB:
90
return 0;
61
case NEON_3R_VABD:
91
}
62
case NEON_3R_VABA:
92
63
/* Already handled by decodetree */
93
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
64
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
94
65
tmp2 = neon_load_reg(rm, pass);
95
if (s->fp_excp_el) {
66
}
96
gen_exception_insn(s, 4, EXCP_UDEF,
67
switch (op) {
97
- syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
68
- case NEON_3R_VRHADD:
98
+ syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
69
- GEN_NEON_INTEGER_OP(rhadd);
99
return 0;
70
- break;
100
}
71
- case NEON_3R_VHSUB:
101
if (!s->vfp_enabled) {
72
- GEN_NEON_INTEGER_OP(hsub);
102
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
73
- break;
103
74
case NEON_3R_VQSHL:
104
if (s->fp_excp_el) {
75
GEN_NEON_INTEGER_OP_ENV(qshl);
105
gen_exception_insn(s, 4, EXCP_UDEF,
76
break;
106
- syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
107
+ syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
108
return 0;
109
}
110
if (!s->vfp_enabled) {
111
--
77
--
112
2.19.1
78
2.20.1
113
79
114
80
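The FP/SIMD syndrome handling in the older patch above can be tried out standalone. The sketch below copies the two syndrome constructors from the internals.h hunk and the masking step from arm_cpu_do_interrupt_aarch64(). The constant values (EC 0x07, EC field at bit 26, IL at bit 25) are assumptions taken from the architectural syndrome layout rather than from this patch, and syndrome_for_aarch64() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, following the architectural ESR/HSR layout. */
#define EC_ADVSIMDFPACCESSTRAP 0x07u
#define ARM_EL_EC_SHIFT 26
#define ARM_EL_IL (1u << 25)

/* AArch32 FP trap or any AArch64 FP/SIMD trap: TA == 0, coproc == 0xa */
static uint32_t syn_fp_access_trap(int cv, int cond, bool is_16bit)
{
    return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
        | (is_16bit ? 0 : ARM_EL_IL)
        | ((uint32_t)cv << 24) | ((uint32_t)cond << 20) | 0xa;
}

/* AArch32 SIMD trap: TA == 1 (bit 5), coproc == 0 */
static uint32_t syn_simd_access_trap(int cv, int cond, bool is_16bit)
{
    return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
        | (is_16bit ? 0 : ARM_EL_IL)
        | ((uint32_t)cv << 24) | ((uint32_t)cond << 20) | (1u << 5);
}

static uint32_t syn_get_ec(uint32_t syn)
{
    return syn >> ARM_EL_EC_SHIFT;
}

/*
 * Hypothetical helper: drop the AArch32-only TA and coproc fields
 * (everything in bits [19:0]) before reporting to AArch64.
 */
static uint32_t syndrome_for_aarch64(uint32_t syn)
{
    if (syn_get_ec(syn) == EC_ADVSIMDFPACCESSTRAP) {
        syn &= ~((1u << 20) - 1);
    }
    return syn;
}
```

With cv=1, cond=0xe and a 32-bit instruction, the FP and SIMD syndromes differ only in their low bits, so both collapse to the same value once masked for AArch64.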
1
For the v7 version of the Arm architecture, the IL bit in
1
Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
2
syndrome register values where the field is not valid was
2
group to decodetree. We have already implemented the size==0b11
3
defined to be UNK/SBZP. In v8 this is RES1, which is what
3
case of these insns; this commit handles the remaining sizes.
4
QEMU currently implements. Handle the desired v7 behaviour
5
by squashing the IL bit for the affected cases:
6
* EC == EC_UNCATEGORIZED
7
* prefetch aborts
8
* data aborts where ISV is 0
9
10
(The fourth case listed in the v8 Arm ARM DDI 0487C.a in
11
section G7.2.70, "illegal state exception", can't happen
12
on a v7 CPU.)
13
14
This deals with a corner case noted in a comment.
15
4
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
18
Message-id: 20181012144235.19646-10-peter.maydell@linaro.org
7
Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
19
---
8
---
20
target/arm/internals.h | 7 ++-----
9
target/arm/neon-dp.decode | 30 ++++++++++++++++++-----
21
target/arm/helper.c | 13 +++++++++++++
10
target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
22
2 files changed, 15 insertions(+), 5 deletions(-)
11
target/arm/translate.c | 22 +++--------------
12
3 files changed, 70 insertions(+), 25 deletions(-)
23
13
24
diff --git a/target/arm/internals.h b/target/arm/internals.h
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
25
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/internals.h
16
--- a/target/arm/neon-dp.decode
27
+++ b/target/arm/internals.h
17
+++ b/target/arm/neon-dp.decode
28
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_get_ec(uint32_t syn)
18
@@ -XXX,XX +XXX,XX @@ VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
29
/* Utility functions for constructing various kinds of syndrome value.
19
@3same_64_rev .... ... . . . 11 .... .... .... . q:1 . . .... \
30
* Note that in general we follow the AArch64 syndrome values; in a
20
&3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
31
* few cases the value in HSR for exceptions taken to AArch32 Hyp
21
32
- * mode differs slightly, so if we ever implemented Hyp mode then the
22
-VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
33
- * syndrome value would need some massaging on exception entry.
23
-VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
34
- * (One example of this is that AArch64 defaults to IL bit set for
24
-VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
35
- * exceptions which don't specifically indicate information about the
25
-VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
36
- * trapping instruction, whereas AArch32 defaults to IL bit clear.)
26
-VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
37
+ * mode differs slightly, and we fix this up when populating HSR in
27
-VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
38
+ * arm_cpu_do_interrupt_aarch32_hyp().
28
+{
39
*/
29
+ VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
40
static inline uint32_t syn_uncategorized(void)
30
+ VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
41
{
31
+}
42
diff --git a/target/arm/helper.c b/target/arm/helper.c
32
+{
33
+ VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
34
+ VQSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
35
+}
36
+{
37
+ VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
38
+ VRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
39
+}
40
+{
41
+ VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
42
+ VRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
43
+}
44
+{
45
+ VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
46
+ VQRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
47
+}
48
+{
49
+ VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
50
+ VQRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
51
+}
52
53
VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
54
VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
55
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
43
index XXXXXXX..XXXXXXX 100644
56
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/helper.c
57
--- a/target/arm/translate-neon.inc.c
45
+++ b/target/arm/helper.c
58
+++ b/target/arm/translate-neon.inc.c
46
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32_hyp(CPUState *cs)
59
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
60
return do_3same(s, a, gen_##INSN##_3s); \
47
}
61
}
48
62
49
if (cs->exception_index != EXCP_IRQ && cs->exception_index != EXCP_FIQ) {
63
+/*
50
+ if (!arm_feature(env, ARM_FEATURE_V8)) {
64
+ * Some helper functions need to be passed the cpu_env. In order
51
+ /*
65
+ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
52
+ * QEMU syndrome values are v8-style. v7 has the IL bit
66
+ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
53
+ * UNK/SBZP for "field not valid" cases, where v8 uses RES1.
67
+ * and which call a NeonGenTwoOpEnvFn().
54
+ * If this is a v7 CPU, squash the IL bit in those cases.
68
+ */
55
+ */
69
+#define WRAP_ENV_FN(WRAPNAME, FUNC) \
56
+ if (cs->exception_index == EXCP_PREFETCH_ABORT ||
70
+ static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m) \
57
+ (cs->exception_index == EXCP_DATA_ABORT &&
71
+ { \
58
+ !(env->exception.syndrome & ARM_EL_ISV)) ||
72
+ FUNC(d, cpu_env, n, m); \
59
+ syn_get_ec(env->exception.syndrome) == EC_UNCATEGORIZED) {
73
+ }
60
+ env->exception.syndrome &= ~ARM_EL_IL;
74
+
61
+ }
75
+#define DO_3SAME_32_ENV(INSN, FUNC) \
62
+ }
76
+ WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8); \
63
env->cp15.esr_el[2] = env->exception.syndrome;
77
+ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16); \
64
}
78
+ WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32); \
65
79
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
80
+ uint32_t rn_ofs, uint32_t rm_ofs, \
81
+ uint32_t oprsz, uint32_t maxsz) \
82
+ { \
83
+ static const GVecGen3 ops[4] = { \
84
+ { .fni4 = gen_##INSN##_tramp8 }, \
85
+ { .fni4 = gen_##INSN##_tramp16 }, \
86
+ { .fni4 = gen_##INSN##_tramp32 }, \
87
+ { 0 }, \
88
+ }; \
89
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
90
+ } \
91
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
92
+ { \
93
+ if (a->size > 2) { \
94
+ return false; \
95
+ } \
96
+ return do_3same(s, a, gen_##INSN##_3s); \
97
+ }
98
+
99
DO_3SAME_32(VHADD_S, hadd_s)
100
DO_3SAME_32(VHADD_U, hadd_u)
101
DO_3SAME_32(VHSUB_S, hsub_s)
102
DO_3SAME_32(VHSUB_U, hsub_u)
103
DO_3SAME_32(VRHADD_S, rhadd_s)
104
DO_3SAME_32(VRHADD_U, rhadd_u)
105
+DO_3SAME_32(VRSHL_S, rshl_s)
106
+DO_3SAME_32(VRSHL_U, rshl_u)
107
+
108
+DO_3SAME_32_ENV(VQSHL_S, qshl_s)
109
+DO_3SAME_32_ENV(VQSHL_U, qshl_u)
110
+DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
111
+DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
112
diff --git a/target/arm/translate.c b/target/arm/translate.c
113
index XXXXXXX..XXXXXXX 100644
114
--- a/target/arm/translate.c
115
+++ b/target/arm/translate.c
116
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
117
case NEON_3R_VHSUB:
118
case NEON_3R_VABD:
119
case NEON_3R_VABA:
120
+ case NEON_3R_VQSHL:
121
+ case NEON_3R_VRSHL:
122
+ case NEON_3R_VQRSHL:
123
/* Already handled by decodetree */
124
return 1;
125
}
126
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
127
}
128
pairwise = 0;
129
switch (op) {
130
- case NEON_3R_VQSHL:
131
- case NEON_3R_VRSHL:
132
- case NEON_3R_VQRSHL:
133
- {
134
- int rtmp;
135
- /* Shift instruction operands are reversed. */
136
- rtmp = rn;
137
- rn = rm;
138
- rm = rtmp;
139
- }
140
- break;
141
case NEON_3R_VPADD_VQRDMLAH:
142
case NEON_3R_VPMAX:
143
case NEON_3R_VPMIN:
144
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
145
tmp2 = neon_load_reg(rm, pass);
146
}
147
switch (op) {
148
- case NEON_3R_VQSHL:
149
- GEN_NEON_INTEGER_OP_ENV(qshl);
150
- break;
151
- case NEON_3R_VRSHL:
152
- GEN_NEON_INTEGER_OP(rshl);
153
- break;
154
- case NEON_3R_VQRSHL:
155
- GEN_NEON_INTEGER_OP_ENV(qrshl);
156
break;
157
case NEON_3R_VPMAX:
158
GEN_NEON_INTEGER_OP(pmax);
66
--
159
--
67
2.19.1
160
2.20.1
68
161
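The v7 IL-bit fixup described in the older commit message above reduces to a small pure function. This is a sketch under assumptions: ExcpKind is a hypothetical enum standing in for cs->exception_index, is_v8 stands in for arm_feature(env, ARM_FEATURE_V8), and the bit positions (EC at bit 26, IL at bit 25, ISV at bit 24) are assumed from the architectural syndrome layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, following the architectural syndrome layout. */
#define ARM_EL_EC_SHIFT 26
#define ARM_EL_IL  (1u << 25)
#define ARM_EL_ISV (1u << 24)
#define EC_UNCATEGORIZED 0x00u

/* Hypothetical stand-in for the exception index. */
typedef enum { KIND_OTHER, KIND_PREFETCH_ABORT, KIND_DATA_ABORT } ExcpKind;

/*
 * Convert a v8-style internal syndrome into the value written to HSR.
 * On a v7 CPU the IL bit is UNK/SBZP where the field is not valid,
 * so clear it for those cases; on v8 it stays RES1.
 */
static uint32_t hsr_value(uint32_t syn, ExcpKind kind, bool is_v8)
{
    if (!is_v8) {
        if (kind == KIND_PREFETCH_ABORT ||
            (kind == KIND_DATA_ABORT && !(syn & ARM_EL_ISV)) ||
            (syn >> ARM_EL_EC_SHIFT) == EC_UNCATEGORIZED) {
            syn &= ~ARM_EL_IL;
        }
    }
    return syn;
}
```

A data abort with ISV set keeps its IL bit even on v7, because in that case the bit carries real information about the instruction length.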
69
162
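The WRAP_ENV_FN() trampolines in the newer patch above are an adapter pattern: a consumer that only accepts two-operand callbacks is handed a generated wrapper that supplies the extra environment argument itself. The sketch below shows the same pattern on plain runtime values rather than TCG values. Env, the global cpu_env, qadd_s32() and apply_3() are all hypothetical names for illustration and are not QEMU APIs.

```c
#include <stdint.h>

/* Hypothetical CPU state holding a sticky saturation flag. */
typedef struct Env { int qc; } Env;

typedef int32_t TwoOpFn(int32_t n, int32_t m);

/* Hypothetical global, playing the role of QEMU's cpu_env. */
static Env cpu_env;

/* Generate a two-operand wrapper around a function that also needs env. */
#define WRAP_ENV_FN(WRAPNAME, FUNC)                 \
    static int32_t WRAPNAME(int32_t n, int32_t m)   \
    {                                               \
        return FUNC(&cpu_env, n, m);                \
    }

/* Example env-taking helper: saturating add that records saturation. */
static int32_t qadd_s32(Env *env, int32_t n, int32_t m)
{
    int64_t r = (int64_t)n + m;
    if (r > INT32_MAX) {
        env->qc = 1;
        return INT32_MAX;
    }
    if (r < INT32_MIN) {
        env->qc = 1;
        return INT32_MIN;
    }
    return (int32_t)r;
}

WRAP_ENV_FN(qadd_s32_tramp, qadd_s32)

/* Generic consumer that only knows the two-operand signature. */
static void apply_3(int32_t *d, const int32_t *n, const int32_t *m,
                    int count, TwoOpFn *fn)
{
    for (int i = 0; i < count; i++) {
        d[i] = fn(n[i], m[i]);
    }
}
```

The consumer never learns about Env, which is the point of the patch's approach: the generic vector expander can stay unchanged while saturating helpers still reach the CPU state.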
1
The HCR.DC virtualization configuration register bit has the
1
Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
2
following effects:
2
decodetree. These are 'pairwise' operations.
3
* SCTLR.M behaves as if it is 0 for all purposes except
4
direct reads of the bit
5
* HCR.VM behaves as if it is 1 for all purposes except
6
direct reads of the bit
7
* the memory type produced by the first stage of the EL1&EL0
8
translation regime is Normal Non-Shareable,
9
Inner Write-Back Read-Allocate Write-Allocate,
10
Outer Write-Back Read-Allocate Write-Allocate.
11
12
Implement this behaviour.
13
3
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
16
Message-id: 20181012144235.19646-5-peter.maydell@linaro.org
6
Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
17
---
7
---
18
target/arm/helper.c | 23 +++++++++++++++++++++--
8
target/arm/neon-dp.decode | 9 +++++
19
1 file changed, 21 insertions(+), 2 deletions(-)
9
target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
10
target/arm/translate.c | 17 +-------
11
3 files changed, 82 insertions(+), 15 deletions(-)
20
12
21
diff --git a/target/arm/helper.c b/target/arm/helper.c
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
22
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/helper.c
15
--- a/target/arm/neon-dp.decode
24
+++ b/target/arm/helper.c
16
+++ b/target/arm/neon-dp.decode
25
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
17
@@ -XXX,XX +XXX,XX @@
26
* * The Non-secure TTBCR.EAE bit is set to 1
18
@3same .... ... . . . size:2 .... .... .... . q:1 . . .... \
27
* * The implementation includes EL2, and the value of HCR.VM is 1
19
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
28
*
20
29
+ * (Note that HCR.DC makes HCR.VM behave as if it is 1.)
21
+@3same_q0 .... ... . . . size:2 .... .... .... . 0 . . .... \
30
+ *
22
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
31
* ATS1Hx always uses the 64bit format (not supported yet).
23
+
32
*/
24
VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
33
format64 = arm_s1_regime_using_lpae_format(env, mmu_idx);
25
VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
34
26
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
35
if (arm_feature(env, ARM_FEATURE_EL2)) {
27
@@ -XXX,XX +XXX,XX @@ VMLS_3s 1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
36
if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
28
VMUL_3s 1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
37
- format64 |= env->cp15.hcr_el2 & HCR_VM;
29
VMUL_p_3s 1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
38
+ format64 |= env->cp15.hcr_el2 & (HCR_VM | HCR_DC);
30
39
} else {
31
+VPMAX_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
40
format64 |= arm_current_el(env) == 2;
32
+VPMAX_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
41
}
33
+
42
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
34
+VPMIN_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
43
}
35
+VPMIN_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
44
36
+
45
if (mmu_idx == ARMMMUIdx_S2NS) {
37
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
46
- return (env->cp15.hcr_el2 & HCR_VM) == 0;
38
47
+ /* HCR.DC means HCR.VM behaves as 1 */
39
SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
48
+ return (env->cp15.hcr_el2 & (HCR_DC | HCR_VM)) == 0;
40
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
49
}
41
index XXXXXXX..XXXXXXX 100644
50
42
--- a/target/arm/translate-neon.inc.c
51
if (env->cp15.hcr_el2 & HCR_TGE) {
43
+++ b/target/arm/translate-neon.inc.c
52
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
44
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
53
}
45
DO_3SAME_32_ENV(VQSHL_U, qshl_u)
54
}
46
DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
55
47
DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
56
+ if ((env->cp15.hcr_el2 & HCR_DC) &&
48
+
57
+ (mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1)) {
49
+static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
58
+ /* HCR.DC means SCTLR_EL1.M behaves as 0 */
50
+{
51
+ /* Operations handled pairwise 32 bits at a time */
52
+ TCGv_i32 tmp, tmp2, tmp3;
53
+
54
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
55
+ return false;
56
+ }
57
+
58
+ /* UNDEF accesses to D16-D31 if they don't exist. */
59
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
60
+ ((a->vd | a->vn | a->vm) & 0x10)) {
61
+ return false;
62
+ }
63
+
64
+ if (a->size == 3) {
65
+ return false;
66
+ }
67
+
68
+ if (!vfp_access_check(s)) {
59
+ return true;
69
+ return true;
60
+ }
70
+ }
61
+
71
+
62
return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
72
+ assert(a->q == 0); /* enforced by decode patterns */
73
+
74
+ /*
75
+ * Note that we have to be careful not to clobber the source operands
76
+ * in the "vm == vd" case by storing the result of the first pass too
77
+ * early. Since Q is 0 there are always just two passes, so instead
78
+ * of a complicated loop over each pass we just unroll.
79
+ */
80
+ tmp = neon_load_reg(a->vn, 0);
81
+ tmp2 = neon_load_reg(a->vn, 1);
82
+ fn(tmp, tmp, tmp2);
83
+ tcg_temp_free_i32(tmp2);
84
+
85
+ tmp3 = neon_load_reg(a->vm, 0);
86
+ tmp2 = neon_load_reg(a->vm, 1);
87
+ fn(tmp3, tmp3, tmp2);
88
+ tcg_temp_free_i32(tmp2);
89
+
90
+ neon_store_reg(a->vd, 0, tmp);
91
+ neon_store_reg(a->vd, 1, tmp3);
92
+ return true;
93
+}
94
+
95
+#define DO_3SAME_PAIR(INSN, func) \
96
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
97
+ { \
98
+ static NeonGenTwoOpFn * const fns[] = { \
99
+ gen_helper_neon_##func##8, \
100
+ gen_helper_neon_##func##16, \
101
+ gen_helper_neon_##func##32, \
102
+ }; \
103
+ if (a->size > 2) { \
104
+ return false; \
105
+ } \
106
+ return do_3same_pair(s, a, fns[a->size]); \
107
+ }
108
+
109
+/* 32-bit pairwise ops end up the same as the elementwise versions. */
110
+#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
111
+#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
112
+#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
113
+#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
114
+
115
+DO_3SAME_PAIR(VPMAX_S, pmax_s)
116
+DO_3SAME_PAIR(VPMIN_S, pmin_s)
117
+DO_3SAME_PAIR(VPMAX_U, pmax_u)
118
+DO_3SAME_PAIR(VPMIN_U, pmin_u)
119
diff --git a/target/arm/translate.c b/target/arm/translate.c
120
index XXXXXXX..XXXXXXX 100644
121
--- a/target/arm/translate.c
122
+++ b/target/arm/translate.c
123
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
124
}
63
}
125
}
64
126
65
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr(CPUARMState *env, target_ulong address,
127
-/* 32-bit pairwise ops end up the same as the elementwise versions. */
66
128
-#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
67
/* Combine the S1 and S2 cache attributes, if needed */
129
-#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
68
if (!ret && cacheattrs != NULL) {
130
-#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
69
+ if (env->cp15.hcr_el2 & HCR_DC) {
131
-#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
70
+ /*
132
-
71
+ * HCR.DC forces the first stage attributes to
133
#define GEN_NEON_INTEGER_OP_ENV(name) do { \
72
+ * Normal Non-Shareable,
134
switch ((size << 1) | u) { \
73
+ * Inner Write-Back Read-Allocate Write-Allocate,
135
case 0: \
74
+ * Outer Write-Back Read-Allocate Write-Allocate.
136
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
75
+ */
137
case NEON_3R_VQSHL:
76
+ cacheattrs->attrs = 0xff;
138
case NEON_3R_VRSHL:
77
+ cacheattrs->shareability = 0;
139
case NEON_3R_VQRSHL:
78
+ }
140
+ case NEON_3R_VPMAX:
79
*cacheattrs = combine_cacheattrs(*cacheattrs, cacheattrs2);
141
+ case NEON_3R_VPMIN:
80
}
142
/* Already handled by decodetree */
81
143
return 1;
144
}
145
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
146
pairwise = 0;
147
switch (op) {
148
case NEON_3R_VPADD_VQRDMLAH:
149
- case NEON_3R_VPMAX:
150
- case NEON_3R_VPMIN:
151
pairwise = 1;
152
break;
153
case NEON_3R_FLOAT_ARITH:
154
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
155
tmp2 = neon_load_reg(rm, pass);
156
}
157
switch (op) {
158
- break;
159
- case NEON_3R_VPMAX:
160
- GEN_NEON_INTEGER_OP(pmax);
161
- break;
162
- case NEON_3R_VPMIN:
163
- GEN_NEON_INTEGER_OP(pmin);
164
- break;
165
case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high. */
166
if (!u) { /* VQDMULH */
167
switch (size) {
82
--
168
--
83
2.19.1
169
2.20.1
84
170
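The first two HCR.DC effects listed in the older commit message above can be captured in one predicate. This is a simplified sketch, not the real regime_translation_disabled(): Regime is a hypothetical two-value enum, the HCR.TGE handling present in the real function is omitted, and the bit positions (HCR.VM at bit 0, HCR.DC at bit 12, SCTLR.M at bit 0) are assumed from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed architectural bit positions. */
#define HCR_VM  (1ull << 0)
#define HCR_DC  (1ull << 12)
#define SCTLR_M (1u << 0)

/* Hypothetical, reduced set of translation regimes. */
typedef enum { REGIME_STAGE1_EL10, REGIME_STAGE2 } Regime;

static bool translation_disabled(Regime regime, uint64_t hcr_el2,
                                 uint32_t sctlr_el1)
{
    if (regime == REGIME_STAGE2) {
        /* HCR.DC means HCR.VM behaves as 1 */
        return (hcr_el2 & (HCR_DC | HCR_VM)) == 0;
    }
    if (hcr_el2 & HCR_DC) {
        /* HCR.DC means SCTLR_EL1.M behaves as 0 */
        return true;
    }
    return (sctlr_el1 & SCTLR_M) == 0;
}
```

With HCR.DC set, stage 1 is treated as off and stage 2 as on regardless of what the guest wrote to SCTLR.M or the hypervisor wrote to HCR.VM.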
85
171
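The ordering concern noted in the do_3same_pair() comment in the newer patch above (do not store the first result before the second pair of sources has been read) shows up clearly in a plain C model. In this sketch each D register is modelled as two 32-bit elements. The register file layout, pairwise_op() and smax32() are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef int32_t TwoOpFn(int32_t a, int32_t b);

static int32_t smax32(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/*
 * Pairwise operation with Q == 0: element 0 of the result combines the
 * two halves of vn, element 1 combines the two halves of vm.
 */
static void pairwise_op(int32_t regs[][2], int vd, int vn, int vm,
                        TwoOpFn *fn)
{
    /* Read and combine all sources first, so vd may alias vn or vm. */
    int32_t lo = fn(regs[vn][0], regs[vn][1]);
    int32_t hi = fn(regs[vm][0], regs[vm][1]);

    /* Only now is it safe to overwrite the destination. */
    regs[vd][0] = lo;
    regs[vd][1] = hi;
}
```

If the first result were stored immediately, a call with vd == vm would overwrite regs[vm][0] before it was read, and the second result would be computed from the wrong input.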
1
The A/I/F bits in ISR_EL1 should track the virtual interrupt
1
Convert the Neon integer VPADD 3-reg-same insns to decodetree. These
2
status, not the physical interrupt status, if the associated
2
are 'pairwise' operations. (Note that VQRDMLAH, which shares the
3
HCR_EL2.AMO/IMO/FMO bit is set. Implement this, rather than
3
same primary opcode but has U=1, has already been converted.)
4
always showing the physical interrupt status.
5
6
We don't currently implement anything to do with external
7
aborts, so this applies only to the I and F bits (though it
8
ought to be possible for the outer guest to present a virtual
9
external abort to the inner guest, even if QEMU doesn't
10
emulate physical external aborts, so there is missing
11
functionality in this area).
12
4
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Message-id: 20181012144235.19646-6-peter.maydell@linaro.org
7
Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
16
---
8
---
17
target/arm/helper.c | 22 ++++++++++++++++++----
9
target/arm/neon-dp.decode | 2 ++
18
1 file changed, 18 insertions(+), 4 deletions(-)
10
target/arm/translate-neon.inc.c | 2 ++
11
target/arm/translate.c | 19 +------------------
12
3 files changed, 5 insertions(+), 18 deletions(-)
19
13
20
diff --git a/target/arm/helper.c b/target/arm/helper.c
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
21
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/helper.c
16
--- a/target/arm/neon-dp.decode
23
+++ b/target/arm/helper.c
17
+++ b/target/arm/neon-dp.decode
24
@@ -XXX,XX +XXX,XX @@ static uint64_t isr_read(CPUARMState *env, const ARMCPRegInfo *ri)
18
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
25
CPUState *cs = ENV_GET_CPU(env);
19
VPMIN_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
26
uint64_t ret = 0;
20
VPMIN_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
27
21
28
- if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
22
+VPADD_3s 1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
29
- ret |= CPSR_I;
30
+ if (arm_hcr_el2_imo(env)) {
31
+ if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
32
+ ret |= CPSR_I;
33
+ }
34
+ } else {
35
+ if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
36
+ ret |= CPSR_I;
37
+ }
38
}
39
- if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
40
- ret |= CPSR_F;
41
+
23
+
42
+ if (arm_hcr_el2_fmo(env)) {
24
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
43
+ if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
25
44
+ ret |= CPSR_F;
26
SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
45
+ }
27
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
46
+ } else {
28
index XXXXXXX..XXXXXXX 100644
47
+ if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
29
--- a/target/arm/translate-neon.inc.c
48
+ ret |= CPSR_F;
30
+++ b/target/arm/translate-neon.inc.c
49
+ }
31
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
50
}
32
#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
51
+
33
#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
52
/* External aborts are not possible in QEMU so A bit is always clear */
34
#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
53
return ret;
35
+#define gen_helper_neon_padd_u32 tcg_gen_add_i32
54
}
36
37
DO_3SAME_PAIR(VPMAX_S, pmax_s)
38
DO_3SAME_PAIR(VPMIN_S, pmin_s)
39
DO_3SAME_PAIR(VPMAX_U, pmax_u)
40
DO_3SAME_PAIR(VPMIN_U, pmin_u)
41
+DO_3SAME_PAIR(VPADD, padd_u)
42
diff --git a/target/arm/translate.c b/target/arm/translate.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/translate.c
45
+++ b/target/arm/translate.c
46
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
47
return 1;
48
}
49
switch (op) {
50
- case NEON_3R_VPADD_VQRDMLAH:
51
- if (!u) {
52
- break; /* VPADD */
53
- }
54
- /* VQRDMLAH : handled by decodetree */
55
- return 1;
56
-
57
case NEON_3R_VFM_VQRDMLSH:
58
if (!u) {
59
/* VFM, VFMS */
60
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
61
case NEON_3R_VQRSHL:
62
case NEON_3R_VPMAX:
63
case NEON_3R_VPMIN:
64
+ case NEON_3R_VPADD_VQRDMLAH:
65
/* Already handled by decodetree */
66
return 1;
67
}
68
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
69
}
70
pairwise = 0;
71
switch (op) {
72
- case NEON_3R_VPADD_VQRDMLAH:
73
- pairwise = 1;
74
- break;
75
case NEON_3R_FLOAT_ARITH:
76
pairwise = (u && size < 2); /* if VPADD (float) */
77
break;
78
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
79
}
80
}
81
break;
82
- case NEON_3R_VPADD_VQRDMLAH:
83
- switch (size) {
84
- case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
85
- case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
86
- case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
87
- default: abort();
88
- }
89
- break;
90
case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
91
{
92
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
55
--
93
--
56
2.19.1
94
2.20.1
57
95
58
96
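The ISR_EL1 change in the older patch above selects between the physical and virtual pending bits depending on the HCR_EL2 routing controls. A minimal sketch follows, assuming the CPSR bit positions I = bit 7 and F = bit 6. The PENDING_* flag values are hypothetical and do not match QEMU's CPU_INTERRUPT_* constants, and the imo/fmo parameters stand in for arm_hcr_el2_imo() and arm_hcr_el2_fmo().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CPSR bit positions. */
#define CPSR_I (1u << 7)
#define CPSR_F (1u << 6)

/* Hypothetical pending-interrupt flags. */
enum {
    PENDING_HARD = 1 << 0,  /* physical IRQ */
    PENDING_FIQ  = 1 << 1,  /* physical FIQ */
    PENDING_VIRQ = 1 << 2,  /* virtual IRQ */
    PENDING_VFIQ = 1 << 3,  /* virtual FIQ */
};

static uint64_t isr_read(uint32_t pending, bool imo, bool fmo)
{
    uint64_t ret = 0;

    /* With IMO set, the I bit tracks the virtual IRQ, else the physical one. */
    if (pending & (imo ? PENDING_VIRQ : PENDING_HARD)) {
        ret |= CPSR_I;
    }
    /* Likewise FMO selects between the virtual and the physical FIQ. */
    if (pending & (fmo ? PENDING_VFIQ : PENDING_FIQ)) {
        ret |= CPSR_F;
    }
    /* The A bit is left clear, as in the patch. */
    return ret;
}
```

A pending physical IRQ is therefore invisible in ISR_EL1 once IMO routes interrupts to the hypervisor, which is the behaviour the commit message asks for.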
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
2
decodetree. These are the last integer operations in the
3
3-reg-same group.
2
4
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-10-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
7
---
8
---
8
target/arm/translate.c | 29 ++++++++++-------------------
9
target/arm/neon-dp.decode | 3 +++
9
1 file changed, 10 insertions(+), 19 deletions(-)
10
target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
11
target/arm/translate.c | 24 +-----------------------
12
3 files changed, 28 insertions(+), 23 deletions(-)
10
13
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
19
VPMIN_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
20
VPMIN_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
21
22
+VQDMULH_3s 1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
23
+VQRDMULH_3s 1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
24
+
25
VPADD_3s 1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
26
27
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
28
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-neon.inc.c
31
+++ b/target/arm/translate-neon.inc.c
32
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
33
DO_3SAME_PAIR(VPMAX_U, pmax_u)
34
DO_3SAME_PAIR(VPMIN_U, pmin_u)
35
DO_3SAME_PAIR(VPADD, padd_u)
36
+
37
+#define DO_3SAME_VQDMULH(INSN, FUNC) \
38
+ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
39
+ WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \
40
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
41
+ uint32_t rn_ofs, uint32_t rm_ofs, \
42
+ uint32_t oprsz, uint32_t maxsz) \
43
+ { \
44
+ static const GVecGen3 ops[2] = { \
45
+ { .fni4 = gen_##INSN##_tramp16 }, \
46
+ { .fni4 = gen_##INSN##_tramp32 }, \
47
+ }; \
48
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
49
+ } \
50
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
51
+ { \
52
+ if (a->size != 1 && a->size != 2) { \
53
+ return false; \
54
+ } \
55
+ return do_3same(s, a, gen_##INSN##_3s); \
56
+ }
57
+
58
+DO_3SAME_VQDMULH(VQDMULH, qdmulh)
59
+DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
11
diff --git a/target/arm/translate.c b/target/arm/translate.c
60
diff --git a/target/arm/translate.c b/target/arm/translate.c
12
index XXXXXXX..XXXXXXX 100644
61
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate.c
62
--- a/target/arm/translate.c
14
+++ b/target/arm/translate.c
63
+++ b/target/arm/translate.c
15
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
64
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
16
break;
65
case NEON_3R_VPMAX:
17
}
66
case NEON_3R_VPMIN:
18
return 0;
67
case NEON_3R_VPADD_VQRDMLAH:
19
+
68
+ case NEON_3R_VQDMULH_VQRDMULH:
20
+ case NEON_3R_VADD_VSUB:
69
/* Already handled by decodetree */
21
+ if (u) {
70
return 1;
22
+ tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
23
+ vec_size, vec_size);
24
+ } else {
25
+ tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
26
+ vec_size, vec_size);
27
+ }
28
+ return 0;
29
}
71
}
30
if (size == 3) {
31
/* 64-bit element instructions. */
32
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
72
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
33
cpu_V1, cpu_V0);
73
tmp2 = neon_load_reg(rm, pass);
34
}
74
}
35
break;
75
switch (op) {
36
- case NEON_3R_VADD_VSUB:
76
- case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high. */
37
- if (u) {
77
- if (!u) { /* VQDMULH */
38
- tcg_gen_sub_i64(CPU_V001);
78
- switch (size) {
39
- } else {
79
- case 1:
40
- tcg_gen_add_i64(CPU_V001);
80
- gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
41
- }
42
- break;
81
- break;
43
default:
82
- case 2:
44
abort();
83
- gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
45
}
84
- break;
46
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
85
- default: abort();
47
tmp2 = neon_load_reg(rd, pass);
86
- }
48
gen_neon_add(size, tmp, tmp2);
87
- } else { /* VQRDMULH */
49
break;
50
- case NEON_3R_VADD_VSUB:
51
- if (!u) { /* VADD */
52
- gen_neon_add(size, tmp, tmp2);
53
- } else { /* VSUB */
54
- switch (size) {
88
- switch (size) {
55
- case 0: gen_helper_neon_sub_u8(tmp, tmp, tmp2); break;
89
- case 1:
56
- case 1: gen_helper_neon_sub_u16(tmp, tmp, tmp2); break;
90
- gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
57
- case 2: tcg_gen_sub_i32(tmp, tmp, tmp2); break;
91
- break;
92
- case 2:
93
- gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
94
- break;
58
- default: abort();
95
- default: abort();
59
- }
96
- }
60
- }
97
- }
61
- break;
98
- break;
62
case NEON_3R_VTST_VCEQ:
99
case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
63
if (!u) { /* VTST */
100
{
64
switch (size) {
101
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
65
--
102
--
66
2.19.1
103
2.20.1
67
104
68
105
1
The HCR_EL2 VI and VF bits are supposed to track whether there is
1
Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
2
a pending virtual IRQ or virtual FIQ. For QEMU we store the
2
We already have gvec helpers for addition and subtraction, but must
3
pending VIRQ/VFIQ status in cs->interrupt_request, so this means:
3
add one for fabd.
4
* if the register is read we must get these bit values from
5
cs->interrupt_request
6
* if the register is written then we must write the bit
7
values back into cs->interrupt_request
8
4
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20181012144235.19646-7-peter.maydell@linaro.org
7
Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
12
---
8
---
13
target/arm/helper.c | 47 +++++++++++++++++++++++++++++++++++++++++----
9
target/arm/helper.h | 3 ++-
14
1 file changed, 43 insertions(+), 4 deletions(-)
10
target/arm/neon-dp.decode | 8 ++++++++
11
target/arm/neon_helper.c | 7 -------
12
target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
13
target/arm/translate.c | 10 +++-------
14
target/arm/vec_helper.c | 7 +++++++
15
6 files changed, 48 insertions(+), 15 deletions(-)
15
16
16
diff --git a/target/arm/helper.c b/target/arm/helper.c
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.c
19
--- a/target/arm/helper.h
19
+++ b/target/arm/helper.c
20
+++ b/target/arm/helper.h
20
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qneg_s16, TCG_CALL_NO_RWG, i32, env, i32)
21
static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
22
DEF_HELPER_FLAGS_2(neon_qneg_s32, TCG_CALL_NO_RWG, i32, env, i32)
22
{
23
DEF_HELPER_FLAGS_2(neon_qneg_s64, TCG_CALL_NO_RWG, i64, env, i64)
23
ARMCPU *cpu = arm_env_get_cpu(env);
24
24
+ CPUState *cs = ENV_GET_CPU(env);
25
-DEF_HELPER_3(neon_abd_f32, i32, i32, i32, ptr)
25
uint64_t valid_mask = HCR_MASK;
26
DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, ptr)
26
27
DEF_HELPER_3(neon_cge_f32, i32, i32, i32, ptr)
27
if (arm_feature(env, ARM_FEATURE_EL3)) {
28
DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
28
@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
29
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
/* Clear RES0 bits. */
30
DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
value &= valid_mask;
31
DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
32
32
+ /*
33
+DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
+ * VI and VF are kept in cs->interrupt_request. Modifying that
34
+
34
+ * requires that we have the iothread lock, which is done by
35
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
35
+ * marking the reginfo structs as ARM_CP_IO.
36
void, ptr, ptr, ptr, ptr, i32)
36
+ * Note that if a write to HCR pends a VIRQ or VFIQ it is never
37
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
37
+ * possible for it to be taken immediately, because VIRQ and
38
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
38
+ * VFIQ are masked unless running at EL0 or EL1, and HCR
39
index XXXXXXX..XXXXXXX 100644
39
+ * can only be written at EL2.
40
--- a/target/arm/neon-dp.decode
40
+ */
41
+++ b/target/arm/neon-dp.decode
41
+ g_assert(qemu_mutex_iothread_locked());
42
@@ -XXX,XX +XXX,XX @@
42
+ if (value & HCR_VI) {
43
@3same_q0 .... ... . . . size:2 .... .... .... . 0 . . .... \
43
+ cs->interrupt_request |= CPU_INTERRUPT_VIRQ;
44
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
44
+ } else {
45
45
+ cs->interrupt_request &= ~CPU_INTERRUPT_VIRQ;
46
+# For FP insns the high bit of 'size' is used as part of opcode decode
47
+@3same_fp .... ... . . . . size:1 .... .... .... . q:1 . . .... \
48
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
49
+
50
VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
51
VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
52
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
53
@@ -XXX,XX +XXX,XX @@ SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
54
vm=%vm_dp vn=%vn_dp vd=%vd_dp
55
56
VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
57
+
58
+VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
59
+VSUB_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
60
+VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
61
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/neon_helper.c
64
+++ b/target/arm/neon_helper.c
65
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qneg_s64)(CPUARMState *env, uint64_t x)
66
}
67
68
/* NEON Float helpers. */
69
-uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b, void *fpstp)
70
-{
71
- float_status *fpst = fpstp;
72
- float32 f0 = make_float32(a);
73
- float32 f1 = make_float32(b);
74
- return float32_val(float32_abs(float32_sub(f0, f1, fpst)));
75
-}
76
77
/* Floating point comparisons produce an integer result.
78
* Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
79
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
80
index XXXXXXX..XXXXXXX 100644
81
--- a/target/arm/translate-neon.inc.c
82
+++ b/target/arm/translate-neon.inc.c
83
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
84
85
DO_3SAME_VQDMULH(VQDMULH, qdmulh)
86
DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
87
+
88
+/*
89
+ * For all the functions using this macro, size == 1 means fp16,
90
+ * which is an architecture extension we don't implement yet.
91
+ */
92
+#define DO_3S_FP_GVEC(INSN,FUNC) \
93
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
94
+ uint32_t rn_ofs, uint32_t rm_ofs, \
95
+ uint32_t oprsz, uint32_t maxsz) \
96
+ { \
97
+ TCGv_ptr fpst = get_fpstatus_ptr(1); \
98
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpst, \
99
+ oprsz, maxsz, 0, FUNC); \
100
+ tcg_temp_free_ptr(fpst); \
101
+ } \
102
+ static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
103
+ { \
104
+ if (a->size != 0) { \
105
+ /* TODO fp16 support */ \
106
+ return false; \
107
+ } \
108
+ return do_3same(s, a, gen_##INSN##_3s); \
46
+ }
109
+ }
47
+ if (value & HCR_VF) {
48
+ cs->interrupt_request |= CPU_INTERRUPT_VFIQ;
49
+ } else {
50
+ cs->interrupt_request &= ~CPU_INTERRUPT_VFIQ;
51
+ }
52
+ value &= ~(HCR_VI | HCR_VF);
53
+
110
+
54
/* These bits change the MMU setup:
111
+
55
* HCR_VM enables stage 2 translation
112
+DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
56
* HCR_PTW forbids certain page-table setups
113
+DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
57
@@ -XXX,XX +XXX,XX @@ static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
114
+DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
58
hcr_write(env, NULL, value);
115
diff --git a/target/arm/translate.c b/target/arm/translate.c
116
index XXXXXXX..XXXXXXX 100644
117
--- a/target/arm/translate.c
118
+++ b/target/arm/translate.c
119
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
120
switch (op) {
121
case NEON_3R_FLOAT_ARITH:
122
pairwise = (u && size < 2); /* if VPADD (float) */
123
+ if (!pairwise) {
124
+ return 1; /* handled by decodetree */
125
+ }
126
break;
127
case NEON_3R_FLOAT_MINMAX:
128
pairwise = u; /* if VPMIN/VPMAX (float) */
129
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
130
{
131
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
132
switch ((u << 2) | size) {
133
- case 0: /* VADD */
134
case 4: /* VPADD */
135
gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
136
break;
137
- case 2: /* VSUB */
138
- gen_helper_vfp_subs(tmp, tmp, tmp2, fpstatus);
139
- break;
140
- case 6: /* VABD */
141
- gen_helper_neon_abd_f32(tmp, tmp, tmp2, fpstatus);
142
- break;
143
default:
144
abort();
145
}
146
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
147
index XXXXXXX..XXXXXXX 100644
148
--- a/target/arm/vec_helper.c
149
+++ b/target/arm/vec_helper.c
150
@@ -XXX,XX +XXX,XX @@ static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
151
return result;
59
}
152
}
60
153
61
+static uint64_t hcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
154
+static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
62
+{
155
+{
63
+ /* The VI and VF bits live in cs->interrupt_request */
156
+ return float32_abs(float32_sub(op1, op2, stat));
64
+ uint64_t ret = env->cp15.hcr_el2 & ~(HCR_VI | HCR_VF);
65
+ CPUState *cs = ENV_GET_CPU(env);
66
+
67
+ if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
68
+ ret |= HCR_VI;
69
+ }
70
+ if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
71
+ ret |= HCR_VF;
72
+ }
73
+ return ret;
74
+}
157
+}
75
+
158
+
76
static const ARMCPRegInfo el2_cp_reginfo[] = {
159
#define DO_3OP(NAME, FUNC, TYPE) \
77
{ .name = "HCR_EL2", .state = ARM_CP_STATE_AA64,
160
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
78
+ .type = ARM_CP_IO,
161
{ \
79
.opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
162
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
80
.access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
163
DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
81
- .writefn = hcr_write },
164
DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
82
+ .writefn = hcr_write, .readfn = hcr_read },
165
83
{ .name = "HCR", .state = ARM_CP_STATE_AA32,
166
+DO_3OP(gvec_fabd_s, float32_abd, float32)
84
- .type = ARM_CP_ALIAS,
167
+
85
+ .type = ARM_CP_ALIAS | ARM_CP_IO,
168
#ifdef TARGET_AARCH64
86
.cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
169
87
.access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
170
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
88
- .writefn = hcr_writelow },
89
+ .writefn = hcr_writelow, .readfn = hcr_read },
90
{ .name = "ELR_EL2", .state = ARM_CP_STATE_AA64,
91
.type = ARM_CP_ALIAS,
92
.opc0 = 3, .opc1 = 4, .crn = 4, .crm = 0, .opc2 = 1,
93
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
94
95
static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
96
{ .name = "HCR2", .state = ARM_CP_STATE_AA32,
97
- .type = ARM_CP_ALIAS,
98
+ .type = ARM_CP_ALIAS | ARM_CP_IO,
99
.cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
100
.access = PL2_RW,
101
.fieldoffset = offsetofhigh32(CPUARMState, cp15.hcr_el2),
102
--
171
--
103
2.19.1
172
2.20.1
104
173
105
174
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
2
2
decodetree. These are the only remaining 'pairwise' operations,
3
Having V6 alone imply jazelle was wrong for cortex-m0.
3
so we can delete the pairwise-specific bits of the old decoder's
4
Change to an assertion for V6 & !M.
4
for-each-element loop now.
5
5
6
This was harmless, because the only place we tested ARM_FEATURE_JAZELLE
7
was for 'bxj' in disas_arm(), which is unreachable for M-profile cores.
8
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20181016223115.24100-6-richard.henderson@linaro.org
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
14
---
9
---
15
target/arm/cpu.h | 6 +++++-
10
target/arm/neon-dp.decode | 5 +++
16
target/arm/cpu.c | 17 ++++++++++++++---
11
target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
17
target/arm/translate.c | 2 +-
12
target/arm/translate.c | 63 +++++----------------------------
18
3 files changed, 20 insertions(+), 5 deletions(-)
13
3 files changed, 76 insertions(+), 55 deletions(-)
19
14
20
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
21
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/cpu.h
17
--- a/target/arm/neon-dp.decode
23
+++ b/target/arm/cpu.h
18
+++ b/target/arm/neon-dp.decode
24
@@ -XXX,XX +XXX,XX @@ enum arm_features {
19
@@ -XXX,XX +XXX,XX @@
25
ARM_FEATURE_PMU, /* has PMU support */
20
# For FP insns the high bit of 'size' is used as part of opcode decode
26
ARM_FEATURE_VBAR, /* has cp15 VBAR */
21
@3same_fp .... ... . . . . size:1 .... .... .... . q:1 . . .... \
27
ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
22
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
28
- ARM_FEATURE_JAZELLE, /* has (trivial) Jazelle implementation */
23
+@3same_fp_q0 .... ... . . . . size:1 .... .... .... . 0 . . .... \
29
ARM_FEATURE_SVE, /* has Scalable Vector Extension */
24
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
30
ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
25
31
ARM_FEATURE_M_MAIN, /* M profile Main Extension */
26
VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
32
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_arm_div(const ARMISARegisters *id)
27
VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
33
return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
28
@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
34
}
29
35
30
VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
36
+static inline bool isar_feature_jazelle(const ARMISARegisters *id)
31
VSUB_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
32
+VPADD_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
33
VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
34
+VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
35
+VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
36
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/translate-neon.inc.c
39
+++ b/target/arm/translate-neon.inc.c
40
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
41
DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
42
DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
43
DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
44
+
45
+static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
37
+{
46
+{
38
+ return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
47
+ /* FP operations handled pairwise 32 bits at a time */
48
+ TCGv_i32 tmp, tmp2, tmp3;
49
+ TCGv_ptr fpstatus;
50
+
51
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
52
+ return false;
53
+ }
54
+
55
+ /* UNDEF accesses to D16-D31 if they don't exist. */
56
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
57
+ ((a->vd | a->vn | a->vm) & 0x10)) {
58
+ return false;
59
+ }
60
+
61
+ if (!vfp_access_check(s)) {
62
+ return true;
63
+ }
64
+
65
+ assert(a->q == 0); /* enforced by decode patterns */
66
+
67
+ /*
68
+ * Note that we have to be careful not to clobber the source operands
69
+ * in the "vm == vd" case by storing the result of the first pass too
70
+ * early. Since Q is 0 there are always just two passes, so instead
71
+ * of a complicated loop over each pass we just unroll.
72
+ */
73
+ fpstatus = get_fpstatus_ptr(1);
74
+ tmp = neon_load_reg(a->vn, 0);
75
+ tmp2 = neon_load_reg(a->vn, 1);
76
+ fn(tmp, tmp, tmp2, fpstatus);
77
+ tcg_temp_free_i32(tmp2);
78
+
79
+ tmp3 = neon_load_reg(a->vm, 0);
80
+ tmp2 = neon_load_reg(a->vm, 1);
81
+ fn(tmp3, tmp3, tmp2, fpstatus);
82
+ tcg_temp_free_i32(tmp2);
83
+ tcg_temp_free_ptr(fpstatus);
84
+
85
+ neon_store_reg(a->vd, 0, tmp);
86
+ neon_store_reg(a->vd, 1, tmp3);
87
+ return true;
39
+}
88
+}
40
+
89
+
41
static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
90
+/*
42
{
91
+ * For all the functions using this macro, size == 1 means fp16,
43
return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
92
+ * which is an architecture extension we don't implement yet.
44
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
93
+ */
45
index XXXXXXX..XXXXXXX 100644
94
+#define DO_3S_FP_PAIR(INSN,FUNC) \
46
--- a/target/arm/cpu.c
95
+ static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
47
+++ b/target/arm/cpu.c
96
+ { \
48
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
97
+ if (a->size != 0) { \
49
}
98
+ /* TODO fp16 support */ \
50
if (arm_feature(env, ARM_FEATURE_V6)) {
99
+ return false; \
51
set_feature(env, ARM_FEATURE_V5);
100
+ } \
52
- set_feature(env, ARM_FEATURE_JAZELLE);
101
+ return do_3same_fp_pair(s, a, FUNC); \
53
if (!arm_feature(env, ARM_FEATURE_M)) {
102
+ }
54
+ assert(cpu_isar_feature(jazelle, cpu));
103
+
55
set_feature(env, ARM_FEATURE_AUXCR);
104
+DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
56
}
105
+DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
57
}
106
+DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
58
@@ -XXX,XX +XXX,XX @@ static void arm926_initfn(Object *obj)
59
set_feature(&cpu->env, ARM_FEATURE_VFP);
60
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
61
set_feature(&cpu->env, ARM_FEATURE_CACHE_TEST_CLEAN);
62
- set_feature(&cpu->env, ARM_FEATURE_JAZELLE);
63
cpu->midr = 0x41069265;
64
cpu->reset_fpsid = 0x41011090;
65
cpu->ctr = 0x1dd20d2;
66
cpu->reset_sctlr = 0x00090078;
67
+
68
+ /*
69
+ * ARMv5 does not have the ID_ISAR registers, but we can still
70
+ * set the field to indicate Jazelle support within QEMU.
71
+ */
72
+ cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
73
}
74
75
static void arm946_initfn(Object *obj)
76
@@ -XXX,XX +XXX,XX @@ static void arm1026_initfn(Object *obj)
77
set_feature(&cpu->env, ARM_FEATURE_AUXCR);
78
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
79
set_feature(&cpu->env, ARM_FEATURE_CACHE_TEST_CLEAN);
80
- set_feature(&cpu->env, ARM_FEATURE_JAZELLE);
81
cpu->midr = 0x4106a262;
82
cpu->reset_fpsid = 0x410110a0;
83
cpu->ctr = 0x1dd20d2;
84
cpu->reset_sctlr = 0x00090078;
85
cpu->reset_auxcr = 1;
86
+
87
+ /*
88
+ * ARMv5 does not have the ID_ISAR registers, but we can still
89
+ * set the field to indicate Jazelle support within QEMU.
90
+ */
91
+ cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
92
+
93
{
94
/* The 1026 had an IFAR at c6,c0,0,1 rather than the ARMv6 c6,c0,0,2 */
95
ARMCPRegInfo ifar = {
96
diff --git a/target/arm/translate.c b/target/arm/translate.c
107
diff --git a/target/arm/translate.c b/target/arm/translate.c
97
index XXXXXXX..XXXXXXX 100644
108
index XXXXXXX..XXXXXXX 100644
98
--- a/target/arm/translate.c
109
--- a/target/arm/translate.c
99
+++ b/target/arm/translate.c
110
+++ b/target/arm/translate.c
100
@@ -XXX,XX +XXX,XX @@
111
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
101
#define ENABLE_ARCH_5 arm_dc_feature(s, ARM_FEATURE_V5)
112
int shift;
102
/* currently all emulated v5 cores are also v5TE, so don't bother */
113
int pass;
103
#define ENABLE_ARCH_5TE arm_dc_feature(s, ARM_FEATURE_V5)
114
int count;
104
-#define ENABLE_ARCH_5J arm_dc_feature(s, ARM_FEATURE_JAZELLE)
115
- int pairwise;
105
+#define ENABLE_ARCH_5J dc_isar_feature(jazelle, s)
116
int u;
106
#define ENABLE_ARCH_6 arm_dc_feature(s, ARM_FEATURE_V6)
117
int vec_size;
107
#define ENABLE_ARCH_6K arm_dc_feature(s, ARM_FEATURE_V6K)
118
uint32_t imm;
108
#define ENABLE_ARCH_6T2 arm_dc_feature(s, ARM_FEATURE_THUMB2)
119
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
120
case NEON_3R_VPMIN:
121
case NEON_3R_VPADD_VQRDMLAH:
122
case NEON_3R_VQDMULH_VQRDMULH:
123
+ case NEON_3R_FLOAT_ARITH:
124
/* Already handled by decodetree */
125
return 1;
126
}
127
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
128
/* 64-bit element instructions: handled by decodetree */
129
return 1;
130
}
131
- pairwise = 0;
132
switch (op) {
133
- case NEON_3R_FLOAT_ARITH:
134
- pairwise = (u && size < 2); /* if VPADD (float) */
135
- if (!pairwise) {
136
- return 1; /* handled by decodetree */
137
- }
138
- break;
139
case NEON_3R_FLOAT_MINMAX:
140
- pairwise = u; /* if VPMIN/VPMAX (float) */
141
+ if (u) {
142
+ return 1; /* VPMIN/VPMAX handled by decodetree */
143
+ }
144
break;
145
case NEON_3R_FLOAT_CMP:
146
if (!u && size) {
147
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
148
break;
149
}
150
151
- if (pairwise && q) {
152
- /* All the pairwise insns UNDEF if Q is set */
153
- return 1;
154
- }
155
-
156
for (pass = 0; pass < (q ? 4 : 2); pass++) {
157
158
- if (pairwise) {
159
- /* Pairwise. */
160
- if (pass < 1) {
161
- tmp = neon_load_reg(rn, 0);
162
- tmp2 = neon_load_reg(rn, 1);
163
- } else {
164
- tmp = neon_load_reg(rm, 0);
165
- tmp2 = neon_load_reg(rm, 1);
166
- }
167
- } else {
168
- /* Elementwise. */
169
- tmp = neon_load_reg(rn, pass);
170
- tmp2 = neon_load_reg(rm, pass);
171
- }
172
+ /* Elementwise. */
173
+ tmp = neon_load_reg(rn, pass);
174
+ tmp2 = neon_load_reg(rm, pass);
175
switch (op) {
176
- case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
177
- {
178
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
179
- switch ((u << 2) | size) {
180
- case 4: /* VPADD */
181
- gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
182
- break;
183
- default:
184
- abort();
185
- }
186
- tcg_temp_free_ptr(fpstatus);
187
- break;
188
- }
189
case NEON_3R_FLOAT_MULTIPLY:
190
{
191
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
192
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
193
}
194
tcg_temp_free_i32(tmp2);
195
196
- /* Save the result. For elementwise operations we can put it
197
- straight into the destination register. For pairwise operations
198
- we have to be careful to avoid clobbering the source operands. */
199
- if (pairwise && rd == rm) {
200
- neon_store_scratch(pass, tmp);
201
- } else {
202
- neon_store_reg(rd, pass, tmp);
203
- }
204
+ neon_store_reg(rd, pass, tmp);
205
206
} /* for pass */
207
- if (pairwise && rd == rm) {
208
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
209
- tmp = neon_load_scratch(pass);
210
- neon_store_reg(rd, pass, tmp);
211
- }
212
- }
213
/* End of 3 register same size operations. */
214
} else if (insn & (1 << 4)) {
215
if ((insn & 0x00380080) != 0) {
109
--
216
--
110
2.19.1
217
2.20.1
111
218
112
219
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon integer VMUL, VMLA, and VMLS 3-reg-same insns to
2
decodetree.
2
3
3
Both arm and thumb2 division are controlled by the same ISAR field,
4
We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
4
which takes care of the arm implies thumb case. Having M imply
5
need a loop function do_3same_fp(). This takes a reads_vd parameter
5
thumb2 division was wrong for cortex-m0, which is v6m and does not
6
which tells it to load the old value into vd before
6
have thumb2 at all, much less thumb2 division.
7
calling the callback function, in the same way that the do_vfp_3op_sp()
8
and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
9
only uses in this patch pass reads_vd == true, but later commits
10
will use reads_vd == false.)
7
11
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
This conversion fixes in passing an underdecoding for VMUL
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
(originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
10
Message-id: 20181016223115.24100-5-richard.henderson@linaro.org
14
of the 'size' field must be 0. The old decoder didn't enforce this,
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
but the decodetree pattern does.
16
17
The gen_VMLA_fp_reg() function performs the addition operation
18
with the operands in the opposite order to the old decoder:
19
since Neon sets 'default NaN mode' float32_add operations are
20
commutative so there is no behaviour difference, but putting
21
them this way around matches the Arm ARM pseudocode and the
22
required operation order for the subtraction in gen_VMLS_fp_reg().
23
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
25
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
26
Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
13
---
27
---
14
target/arm/cpu.h | 12 ++++++++++--
28
target/arm/neon-dp.decode | 3 ++
15
linux-user/elfload.c | 4 ++--
29
target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
16
target/arm/cpu.c | 10 +---------
30
target/arm/translate.c | 17 +------
17
target/arm/translate.c | 4 ++--
31
3 files changed, 85 insertions(+), 16 deletions(-)
18
4 files changed, 15 insertions(+), 15 deletions(-)
19
32
20
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
33
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
21
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/cpu.h
35
--- a/target/arm/neon-dp.decode
23
+++ b/target/arm/cpu.h
36
+++ b/target/arm/neon-dp.decode
24
@@ -XXX,XX +XXX,XX @@ enum arm_features {
37
@@ -XXX,XX +XXX,XX @@ VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
25
ARM_FEATURE_VFP3,
38
VSUB_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
26
ARM_FEATURE_VFP_FP16,
39
VPADD_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
27
ARM_FEATURE_NEON,
40
VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
28
- ARM_FEATURE_THUMB_DIV, /* divide supported in Thumb encoding */
41
+VMLA_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
29
ARM_FEATURE_M, /* Microcontroller profile. */
42
+VMLS_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
30
ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling. */
43
+VMUL_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
31
ARM_FEATURE_THUMB2EE,
44
VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
32
@@ -XXX,XX +XXX,XX @@ enum arm_features {
45
VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
33
ARM_FEATURE_V5,
46
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
34
ARM_FEATURE_STRONGARM,
47
index XXXXXXX..XXXXXXX 100644
35
ARM_FEATURE_VAPA, /* cp15 VA to PA lookups */
48
--- a/target/arm/translate-neon.inc.c
36
- ARM_FEATURE_ARM_DIV, /* divide supported in ARM encoding */
49
+++ b/target/arm/translate-neon.inc.c
37
ARM_FEATURE_VFP4, /* VFPv4 (implies that NEON is v2) */
50
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
38
ARM_FEATURE_GENERIC_TIMER,
51
DO_3SAME_VQDMULH(VQDMULH, qdmulh)
39
ARM_FEATURE_MVFR, /* Media and VFP Feature Registers 0 and 1 */
52
DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
40
@@ -XXX,XX +XXX,XX @@ extern const uint64_t pred_esz_masks[4];
53
41
/*
54
+static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
42
* 32-bit feature tests via id registers.
55
+ bool reads_vd)
43
*/
44
+static inline bool isar_feature_thumb_div(const ARMISARegisters *id)
45
+{
56
+{
46
+ return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
57
+ /*
58
+ * FP operations handled elementwise 32 bits at a time.
59
+ * If reads_vd is true then the old value of Vd will be
60
+ * loaded before calling the callback function. This is
61
+ * used for multiply-accumulate type operations.
62
+ */
63
+ TCGv_i32 tmp, tmp2;
64
+ int pass;
65
+
66
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
67
+ return false;
68
+ }
69
+
70
+ /* UNDEF accesses to D16-D31 if they don't exist. */
71
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
72
+ ((a->vd | a->vn | a->vm) & 0x10)) {
73
+ return false;
74
+ }
75
+
76
+ if ((a->vn | a->vm | a->vd) & a->q) {
77
+ return false;
78
+ }
79
+
80
+ if (!vfp_access_check(s)) {
81
+ return true;
82
+ }
83
+
84
+ TCGv_ptr fpstatus = get_fpstatus_ptr(1);
85
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
86
+ tmp = neon_load_reg(a->vn, pass);
87
+ tmp2 = neon_load_reg(a->vm, pass);
88
+ if (reads_vd) {
89
+ TCGv_i32 tmp_rd = neon_load_reg(a->vd, pass);
90
+ fn(tmp_rd, tmp, tmp2, fpstatus);
91
+ neon_store_reg(a->vd, pass, tmp_rd);
92
+ tcg_temp_free_i32(tmp);
93
+ } else {
94
+ fn(tmp, tmp, tmp2, fpstatus);
95
+ neon_store_reg(a->vd, pass, tmp);
96
+ }
97
+ tcg_temp_free_i32(tmp2);
98
+ }
99
+ tcg_temp_free_ptr(fpstatus);
100
+ return true;
47
+}
101
+}
48
+
102
+
49
+static inline bool isar_feature_arm_div(const ARMISARegisters *id)
103
/*
104
* For all the functions using this macro, size == 1 means fp16,
105
* which is an architecture extension we don't implement yet.
106
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
107
DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
108
DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
109
DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
110
+DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
111
+
112
+/*
113
+ * For all the functions using this macro, size == 1 means fp16,
114
+ * which is an architecture extension we don't implement yet.
115
+ */
116
+#define DO_3S_FP(INSN,FUNC,READS_VD) \
117
+ static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
118
+ { \
119
+ if (a->size != 0) { \
120
+ /* TODO fp16 support */ \
121
+ return false; \
122
+ } \
123
+ return do_3same_fp(s, a, FUNC, READS_VD); \
124
+ }
125
+
126
+static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
127
+ TCGv_ptr fpstatus)
50
+{
128
+{
51
+ return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
129
+ gen_helper_vfp_muls(vn, vn, vm, fpstatus);
130
+ gen_helper_vfp_adds(vd, vd, vn, fpstatus);
52
+}
131
+}
53
+
132
+
54
static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
133
+static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
134
+ TCGv_ptr fpstatus)
135
+{
136
+ gen_helper_vfp_muls(vn, vn, vm, fpstatus);
137
+ gen_helper_vfp_subs(vd, vd, vn, fpstatus);
138
+}
139
+
140
+DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
141
+DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
142
143
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
55
{
144
{
56
return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
57
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/linux-user/elfload.c
60
+++ b/linux-user/elfload.c
61
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
62
GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
63
GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
64
GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
65
- GET_FEATURE(ARM_FEATURE_ARM_DIV, ARM_HWCAP_ARM_IDIVA);
66
- GET_FEATURE(ARM_FEATURE_THUMB_DIV, ARM_HWCAP_ARM_IDIVT);
67
+ GET_FEATURE_ID(arm_div, ARM_HWCAP_ARM_IDIVA);
68
+ GET_FEATURE_ID(thumb_div, ARM_HWCAP_ARM_IDIVT);
69
/* All QEMU's VFPv3 CPUs have 32 registers, see VFP_DREG in translate.c.
70
* Note that the ARM_HWCAP_ARM_VFPv3D16 bit is always the inverse of
71
* ARM_HWCAP_ARM_VFPD32 (and so always clear for QEMU); it is unrelated
72
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
73
index XXXXXXX..XXXXXXX 100644
74
--- a/target/arm/cpu.c
75
+++ b/target/arm/cpu.c
76
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
77
* Presence of EL2 itself is ARM_FEATURE_EL2, and of the
78
* Security Extensions is ARM_FEATURE_EL3.
79
*/
80
- set_feature(env, ARM_FEATURE_ARM_DIV);
81
+ assert(cpu_isar_feature(arm_div, cpu));
82
set_feature(env, ARM_FEATURE_LPAE);
83
set_feature(env, ARM_FEATURE_V7);
84
}
85
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
86
if (arm_feature(env, ARM_FEATURE_V5)) {
87
set_feature(env, ARM_FEATURE_V4T);
88
}
89
- if (arm_feature(env, ARM_FEATURE_M)) {
90
- set_feature(env, ARM_FEATURE_THUMB_DIV);
91
- }
92
- if (arm_feature(env, ARM_FEATURE_ARM_DIV)) {
93
- set_feature(env, ARM_FEATURE_THUMB_DIV);
94
- }
95
if (arm_feature(env, ARM_FEATURE_VFP4)) {
96
set_feature(env, ARM_FEATURE_VFP3);
97
set_feature(env, ARM_FEATURE_VFP_FP16);
98
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
99
ARMCPU *cpu = ARM_CPU(obj);
100
101
set_feature(&cpu->env, ARM_FEATURE_V7);
102
- set_feature(&cpu->env, ARM_FEATURE_THUMB_DIV);
103
- set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
104
set_feature(&cpu->env, ARM_FEATURE_V7MP);
105
set_feature(&cpu->env, ARM_FEATURE_PMSA);
106
cpu->midr = 0x411fc153; /* r1p3 */
107
diff --git a/target/arm/translate.c b/target/arm/translate.c
145
diff --git a/target/arm/translate.c b/target/arm/translate.c
108
index XXXXXXX..XXXXXXX 100644
146
index XXXXXXX..XXXXXXX 100644
109
--- a/target/arm/translate.c
147
--- a/target/arm/translate.c
110
+++ b/target/arm/translate.c
148
+++ b/target/arm/translate.c
111
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
149
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
112
case 1:
150
case NEON_3R_VPADD_VQRDMLAH:
113
case 3:
151
case NEON_3R_VQDMULH_VQRDMULH:
114
/* SDIV, UDIV */
152
case NEON_3R_FLOAT_ARITH:
115
- if (!arm_dc_feature(s, ARM_FEATURE_ARM_DIV)) {
153
+ case NEON_3R_FLOAT_MULTIPLY:
116
+ if (!dc_isar_feature(arm_div, s)) {
154
/* Already handled by decodetree */
117
goto illegal_op;
155
return 1;
118
}
156
}
119
if (((insn >> 5) & 7) || (rd != 15)) {
157
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
120
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
158
tmp = neon_load_reg(rn, pass);
121
tmp2 = load_reg(s, rm);
159
tmp2 = neon_load_reg(rm, pass);
122
if ((op & 0x50) == 0x10) {
160
switch (op) {
123
/* sdiv, udiv */
161
- case NEON_3R_FLOAT_MULTIPLY:
124
- if (!arm_dc_feature(s, ARM_FEATURE_THUMB_DIV)) {
162
- {
125
+ if (!dc_isar_feature(thumb_div, s)) {
163
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
126
goto illegal_op;
164
- gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
127
}
165
- if (!u) {
128
if (op & 0x20)
166
- tcg_temp_free_i32(tmp2);
167
- tmp2 = neon_load_reg(rd, pass);
168
- if (size == 0) {
169
- gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
170
- } else {
171
- gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
172
- }
173
- }
174
- tcg_temp_free_ptr(fpstatus);
175
- break;
176
- }
177
case NEON_3R_FLOAT_CMP:
178
{
179
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
129
--
180
--
130
2.19.1
181
2.20.1
131
182
132
183
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon floating-point 3-reg-same compare insns VCGE, VCGT,
2
VCEQ, VACGE and VACGT to decodetree.
2
3
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20181011205206.3552-12-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
7
---
7
---
8
target/arm/translate.c | 31 +++++++++++++++----------------
8
target/arm/neon-dp.decode | 5 +++++
9
1 file changed, 15 insertions(+), 16 deletions(-)
9
target/arm/translate-neon.inc.c | 6 +++++
10
target/arm/translate.c | 39 ++-------------------------------
11
3 files changed, 13 insertions(+), 37 deletions(-)
10
12
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/neon-dp.decode
16
+++ b/target/arm/neon-dp.decode
17
@@ -XXX,XX +XXX,XX @@ VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
18
VMLA_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
19
VMLS_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
20
VMUL_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
21
+VCEQ_fp_3s 1111 001 0 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
22
+VCGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
23
+VACGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
24
+VCGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
25
+VACGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
26
VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
27
VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
28
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-neon.inc.c
31
+++ b/target/arm/translate-neon.inc.c
32
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
33
return do_3same_fp(s, a, FUNC, READS_VD); \
34
}
35
36
+DO_3S_FP(VCEQ, gen_helper_neon_ceq_f32, false)
37
+DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
38
+DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
39
+DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
40
+DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
41
+
42
static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
43
TCGv_ptr fpstatus)
44
{
11
diff --git a/target/arm/translate.c b/target/arm/translate.c
45
diff --git a/target/arm/translate.c b/target/arm/translate.c
12
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate.c
47
--- a/target/arm/translate.c
14
+++ b/target/arm/translate.c
48
+++ b/target/arm/translate.c
15
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
49
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
16
vec_size, vec_size);
50
case NEON_3R_VQDMULH_VQRDMULH:
17
}
51
case NEON_3R_FLOAT_ARITH:
18
return 0;
52
case NEON_3R_FLOAT_MULTIPLY:
19
+
53
+ case NEON_3R_FLOAT_CMP:
20
+ case NEON_3R_VMUL: /* VMUL */
54
+ case NEON_3R_FLOAT_ACMP:
21
+ if (u) {
55
/* Already handled by decodetree */
22
+ /* Polynomial case allows only P8 and is handled below. */
56
return 1;
23
+ if (size != 0) {
24
+ return 1;
25
+ }
26
+ } else {
27
+ tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
28
+ vec_size, vec_size);
29
+ return 0;
30
+ }
31
+ break;
32
}
57
}
33
if (size == 3) {
34
/* 64-bit element instructions. */
35
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
58
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
36
return 1;
59
return 1; /* VPMIN/VPMAX handled by decodetree */
37
}
60
}
38
break;
61
break;
39
- case NEON_3R_VMUL:
62
- case NEON_3R_FLOAT_CMP:
40
- if (u && (size != 0)) {
63
- if (!u && size) {
41
- /* UNDEF on invalid size for polynomial subcase */
64
- /* no encoding for U=0 C=1x */
42
- return 1;
65
- return 1;
43
- }
66
- }
44
- break;
67
- break;
45
case NEON_3R_VFM_VQRDMLSH:
68
- case NEON_3R_FLOAT_ACMP:
46
if (!arm_dc_feature(s, ARM_FEATURE_VFP4)) {
69
- if (!u) {
47
return 1;
70
- return 1;
71
- }
72
- break;
73
case NEON_3R_FLOAT_MISC:
74
/* VMAXNM/VMINNM in ARMv8 */
75
if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
48
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
76
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
49
}
77
tmp = neon_load_reg(rn, pass);
50
break;
78
tmp2 = neon_load_reg(rm, pass);
51
case NEON_3R_VMUL:
79
switch (op) {
52
- if (u) { /* polynomial */
80
- case NEON_3R_FLOAT_CMP:
53
- gen_helper_neon_mul_p8(tmp, tmp, tmp2);
81
- {
54
- } else { /* Integer */
82
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
55
- switch (size) {
83
- if (!u) {
56
- case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
84
- gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
57
- case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
85
- } else {
58
- case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
86
- if (size == 0) {
59
- default: abort();
87
- gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
88
- } else {
89
- gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
60
- }
90
- }
61
- }
91
- }
62
+ /* VMUL.P8; other cases already eliminated. */
92
- tcg_temp_free_ptr(fpstatus);
63
+ gen_helper_neon_mul_p8(tmp, tmp, tmp2);
93
- break;
64
break;
94
- }
65
case NEON_3R_VPMAX:
95
- case NEON_3R_FLOAT_ACMP:
66
GEN_NEON_INTEGER_OP(pmax);
96
- {
97
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
98
- if (size == 0) {
99
- gen_helper_neon_acge_f32(tmp, tmp, tmp2, fpstatus);
100
- } else {
101
- gen_helper_neon_acgt_f32(tmp, tmp, tmp2, fpstatus);
102
- }
103
- tcg_temp_free_ptr(fpstatus);
104
- break;
105
- }
106
case NEON_3R_FLOAT_MINMAX:
107
{
108
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
67
--
109
--
68
2.19.1
110
2.20.1
69
111
70
112
1
The switch_mode() function is defined in target/arm/helper.c and used
1
The usual location for the env argument in the argument list of a TCG helper
2
only in that file and nowhere else, so we can make it file-local
2
is immediately after the return-value argument. recps_f32 and rsqrts_f32
3
rather than global.
3
differ in that they put it at the end.
4
5
Move the env argument to its usual place; this will allow us to
6
more easily use these helper functions with the gvec APIs.
4
7
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20181012144235.19646-3-peter.maydell@linaro.org
10
Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
8
---
11
---
9
target/arm/internals.h | 1 -
12
target/arm/helper.h | 4 ++--
10
target/arm/helper.c | 6 ++++--
13
target/arm/translate.c | 4 ++--
11
2 files changed, 4 insertions(+), 3 deletions(-)
14
target/arm/vfp_helper.c | 4 ++--
15
3 files changed, 6 insertions(+), 6 deletions(-)
12
16
13
diff --git a/target/arm/internals.h b/target/arm/internals.h
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
14
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/internals.h
19
--- a/target/arm/helper.h
16
+++ b/target/arm/internals.h
20
+++ b/target/arm/helper.h
17
@@ -XXX,XX +XXX,XX @@ static inline int bank_number(int mode)
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
18
g_assert_not_reached();
22
DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
23
DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
24
25
-DEF_HELPER_3(recps_f32, f32, f32, f32, env)
26
-DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
27
+DEF_HELPER_3(recps_f32, f32, env, f32, f32)
28
+DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
29
DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
30
DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
31
DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
32
diff --git a/target/arm/translate.c b/target/arm/translate.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate.c
35
+++ b/target/arm/translate.c
36
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
37
tcg_temp_free_ptr(fpstatus);
38
} else {
39
if (size == 0) {
40
- gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
41
+ gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
42
} else {
43
- gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
44
+ gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
45
}
46
}
47
break;
48
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/vfp_helper.c
51
+++ b/target/arm/vfp_helper.c
52
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
53
#define float32_three make_float32(0x40400000)
54
#define float32_one_point_five make_float32(0x3fc00000)
55
56
-float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
57
+float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
58
{
59
float_status *s = &env->vfp.standard_fp_status;
60
if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
61
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
62
return float32_sub(float32_two, float32_mul(a, b, s), s);
19
}
63
}
20
64
21
-void switch_mode(CPUARMState *, int);
65
-float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
22
void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu);
66
+float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
23
void arm_translate_init(void);
24
25
diff --git a/target/arm/helper.c b/target/arm/helper.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/helper.c
28
+++ b/target/arm/helper.c
29
@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
30
V8M_SAttributes *sattrs);
31
#endif
32
33
+static void switch_mode(CPUARMState *env, int mode);
34
+
35
static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
36
{
67
{
37
int nregs;
68
float_status *s = &env->vfp.standard_fp_status;
38
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
69
float32 product;
39
return 0;
40
}
41
42
-void switch_mode(CPUARMState *env, int mode)
43
+static void switch_mode(CPUARMState *env, int mode)
44
{
45
ARMCPU *cpu = arm_env_get_cpu(env);
46
47
@@ -XXX,XX +XXX,XX @@ void aarch64_sync_64_to_32(CPUARMState *env)
48
49
#else
50
51
-void switch_mode(CPUARMState *env, int mode)
52
+static void switch_mode(CPUARMState *env, int mode)
53
{
54
int old_mode;
55
int i;
56
--
70
--
57
2.19.1
71
2.20.1
58
72
59
73
1
The HCR.FB virtualization configuration register bit requests that
1
Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
2
TLB maintenance, branch predictor invalidate-all and icache
2
insns to decodetree. (These are all the remaining non-accumulation
3
invalidate-all operations performed in NS EL1 should be upgraded
3
instructions in this group.)
4
from "local CPU only" to "broadcast within Inner Shareable domain".
5
For QEMU we NOP the branch predictor and icache operations, so
6
we only need to upgrade the TLB invalidates:
7
AArch32 TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID,
8
ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, TLBIMVAL, TLBIMVAAL
9
AArch64 TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1,
10
TLBI VALE1, TLBI VAALE1
11
4
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Message-id: 20181012144235.19646-4-peter.maydell@linaro.org
7
Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
15
---
8
---
16
target/arm/helper.c | 191 +++++++++++++++++++++++++++-----------------
9
target/arm/neon-dp.decode | 6 +++
17
1 file changed, 116 insertions(+), 75 deletions(-)
10
target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
11
target/arm/translate.c | 42 +-------------------
12
3 files changed, 78 insertions(+), 40 deletions(-)
18
13
19
diff --git a/target/arm/helper.c b/target/arm/helper.c
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
20
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.c
16
--- a/target/arm/neon-dp.decode
22
+++ b/target/arm/helper.c
17
+++ b/target/arm/neon-dp.decode
23
@@ -XXX,XX +XXX,XX @@ static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
18
@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
24
raw_write(env, ri, value);
19
VACGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
25
}
20
VCGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
26
21
VACGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
27
-static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
22
+VMAX_fp_3s 1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
28
- uint64_t value)
23
+VMIN_fp_3s 1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
29
-{
24
VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
30
- /* Invalidate all (TLBIALL) */
25
VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
31
- ARMCPU *cpu = arm_env_get_cpu(env);
26
+VRECPS_fp_3s 1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
32
-
27
+VRSQRTS_fp_3s 1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
33
- tlb_flush(CPU(cpu));
28
+VMAXNM_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
34
-}
29
+VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
35
-
30
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
36
-static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
31
index XXXXXXX..XXXXXXX 100644
37
- uint64_t value)
32
--- a/target/arm/translate-neon.inc.c
38
-{
33
+++ b/target/arm/translate-neon.inc.c
39
- /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
34
@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
40
- ARMCPU *cpu = arm_env_get_cpu(env);
35
DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
41
-
36
DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
42
- tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
37
DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
43
-}
38
+DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
44
-
39
+DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
45
-static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
40
46
- uint64_t value)
41
static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
47
-{
42
TCGv_ptr fpstatus)
48
- /* Invalidate by ASID (TLBIASID) */
43
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
49
- ARMCPU *cpu = arm_env_get_cpu(env);
44
DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
50
-
45
DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
51
- tlb_flush(CPU(cpu));
46
52
-}
47
+static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
53
-
54
-static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
55
- uint64_t value)
56
-{
57
- /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
58
- ARMCPU *cpu = arm_env_get_cpu(env);
59
-
60
- tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
61
-}
62
-
63
/* IS variants of TLB operations must affect all cores */
64
static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
65
uint64_t value)
66
@@ -XXX,XX +XXX,XX @@ static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
67
tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
68
}
69
70
+/*
71
+ * Non-IS variants of TLB operations are upgraded to
72
+ * IS versions if we are at NS EL1 and HCR_EL2.FB is set to
73
+ * force broadcast of these operations.
74
+ */
75
+static bool tlb_force_broadcast(CPUARMState *env)
76
+{
48
+{
77
+ return (env->cp15.hcr_el2 & HCR_FB) &&
49
+ if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
78
+ arm_current_el(env) == 1 && !arm_is_secure_below_el3(env);
50
+ return false;
51
+ }
52
+
53
+ if (a->size != 0) {
54
+ /* TODO fp16 support */
55
+ return false;
56
+ }
57
+
58
+ return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
79
+}
59
+}
80
+
60
+
81
+static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
61
+static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
82
+ uint64_t value)
83
+{
62
+{
84
+ /* Invalidate all (TLBIALL) */
63
+ if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
85
+ ARMCPU *cpu = arm_env_get_cpu(env);
64
+ return false;
86
+
87
+ if (tlb_force_broadcast(env)) {
88
+ tlbiall_is_write(env, NULL, value);
89
+ return;
90
+ }
65
+ }
91
+
66
+
92
+ tlb_flush(CPU(cpu));
67
+ if (a->size != 0) {
68
+ /* TODO fp16 support */
69
+ return false;
70
+ }
71
+
72
+ return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
93
+}
73
+}
94
+
74
+
95
+static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
75
+WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
96
+ uint64_t value)
76
+
77
+static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
78
+ uint32_t rn_ofs, uint32_t rm_ofs,
79
+ uint32_t oprsz, uint32_t maxsz)
97
+{
80
+{
98
+ /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
81
+ static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
99
+ ARMCPU *cpu = arm_env_get_cpu(env);
82
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
83
+}
100
+
84
+
101
+ if (tlb_force_broadcast(env)) {
85
+static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
102
+ tlbimva_is_write(env, NULL, value);
86
+{
103
+ return;
87
+ if (a->size != 0) {
88
+ /* TODO fp16 support */
89
+ return false;
104
+ }
90
+ }
105
+
91
+
106
+ tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
92
+ return do_3same(s, a, gen_VRECPS_fp_3s);
107
+}
93
+}
108
+
94
+
109
+static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
95
+WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
110
+ uint64_t value)
96
+
97
+static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
98
+ uint32_t rn_ofs, uint32_t rm_ofs,
99
+ uint32_t oprsz, uint32_t maxsz)
111
+{
100
+{
112
+ /* Invalidate by ASID (TLBIASID) */
101
+ static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
113
+ ARMCPU *cpu = arm_env_get_cpu(env);
102
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
103
+}
114
+
104
+
115
+ if (tlb_force_broadcast(env)) {
105
+static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
116
+ tlbiasid_is_write(env, NULL, value);
106
+{
117
+ return;
107
+ if (a->size != 0) {
108
+ /* TODO fp16 support */
109
+ return false;
118
+ }
110
+ }
119
+
111
+
120
+ tlb_flush(CPU(cpu));
112
+ return do_3same(s, a, gen_VRSQRTS_fp_3s);
121
+}
113
+}
122
+
114
+
123
+static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
115
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
124
+ uint64_t value)
125
+{
126
+ /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
127
+ ARMCPU *cpu = arm_env_get_cpu(env);
128
+
129
+ if (tlb_force_broadcast(env)) {
130
+ tlbimvaa_is_write(env, NULL, value);
131
+ return;
132
+ }
133
+
134
+ tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
135
+}
136
+
137
static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
138
uint64_t value)
139
{
116
{
140
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
117
/* FP operations handled pairwise 32 bits at a time */
141
* Page D4-1736 (DDI0487A.b)
118
diff --git a/target/arm/translate.c b/target/arm/translate.c
142
*/
119
index XXXXXXX..XXXXXXX 100644
143
120
--- a/target/arm/translate.c
144
-static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
121
+++ b/target/arm/translate.c
145
- uint64_t value)
122
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
146
-{
123
case NEON_3R_FLOAT_MULTIPLY:
147
- CPUState *cs = ENV_GET_CPU(env);
124
case NEON_3R_FLOAT_CMP:
148
-
125
case NEON_3R_FLOAT_ACMP:
149
- if (arm_is_secure_below_el3(env)) {
126
+ case NEON_3R_FLOAT_MINMAX:
150
- tlb_flush_by_mmuidx(cs,
127
+ case NEON_3R_FLOAT_MISC:
151
- ARMMMUIdxBit_S1SE1 |
128
/* Already handled by decodetree */
152
- ARMMMUIdxBit_S1SE0);
129
return 1;
153
- } else {
130
}
154
- tlb_flush_by_mmuidx(cs,
131
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
155
- ARMMMUIdxBit_S12NSE1 |
132
return 1;
156
- ARMMMUIdxBit_S12NSE0);
133
}
157
- }
134
switch (op) {
158
-}
135
- case NEON_3R_FLOAT_MINMAX:
159
-
136
- if (u) {
160
static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
137
- return 1; /* VPMIN/VPMAX handled by decodetree */
161
uint64_t value)
138
- }
162
{
139
- break;
163
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
140
- case NEON_3R_FLOAT_MISC:
164
}
141
- /* VMAXNM/VMINNM in ARMv8 */
165
}
142
- if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
166
143
- return 1;
167
+static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
144
- }
168
+ uint64_t value)
145
- break;
169
+{
146
case NEON_3R_VFM_VQRDMLSH:
170
+ CPUState *cs = ENV_GET_CPU(env);
147
if (!dc_isar_feature(aa32_simdfmac, s)) {
171
+
148
return 1;
172
+ if (tlb_force_broadcast(env)) {
149
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
173
+ tlbi_aa64_vmalle1is_write(env, NULL, value);
150
tmp = neon_load_reg(rn, pass);
174
+ return;
151
tmp2 = neon_load_reg(rm, pass);
175
+ }
152
switch (op) {
176
+
153
- case NEON_3R_FLOAT_MINMAX:
177
+ if (arm_is_secure_below_el3(env)) {
154
- {
178
+ tlb_flush_by_mmuidx(cs,
155
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
179
+ ARMMMUIdxBit_S1SE1 |
156
- if (size == 0) {
180
+ ARMMMUIdxBit_S1SE0);
157
- gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
181
+ } else {
158
- } else {
182
+ tlb_flush_by_mmuidx(cs,
159
- gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
183
+ ARMMMUIdxBit_S12NSE1 |
160
- }
184
+ ARMMMUIdxBit_S12NSE0);
161
- tcg_temp_free_ptr(fpstatus);
185
+ }
162
- break;
186
+}
163
- }
187
+
164
- case NEON_3R_FLOAT_MISC:
188
static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
165
- if (u) {
189
uint64_t value)
166
- /* VMAXNM/VMINNM */
190
{
167
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
191
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
168
- if (size == 0) {
192
tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_S1E3);
169
- gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
193
}
170
- } else {
194
171
- gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
195
-static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
172
- }
196
- uint64_t value)
173
- tcg_temp_free_ptr(fpstatus);
197
-{
174
- } else {
198
- /* Invalidate by VA, EL1&0 (AArch64 version).
175
- if (size == 0) {
199
- * Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
176
- gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
200
- * since we don't support flush-for-specific-ASID-only or
177
- } else {
201
- * flush-last-level-only.
178
- gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
202
- */
179
- }
203
- ARMCPU *cpu = arm_env_get_cpu(env);
180
- }
204
- CPUState *cs = CPU(cpu);
181
- break;
205
- uint64_t pageaddr = sextract64(value << 12, 0, 56);
182
case NEON_3R_VFM_VQRDMLSH:
206
-
183
{
207
- if (arm_is_secure_below_el3(env)) {
184
/* VFMA, VFMS: fused multiply-add */
208
- tlb_flush_page_by_mmuidx(cs, pageaddr,
209
- ARMMMUIdxBit_S1SE1 |
210
- ARMMMUIdxBit_S1SE0);
211
- } else {
212
- tlb_flush_page_by_mmuidx(cs, pageaddr,
213
- ARMMMUIdxBit_S12NSE1 |
214
- ARMMMUIdxBit_S12NSE0);
215
- }
216
-}
217
-
218
static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
219
uint64_t value)
220
{
221
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
222
}
223
}
224
225
+static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
226
+ uint64_t value)
227
+{
228
+ /* Invalidate by VA, EL1&0 (AArch64 version).
229
+ * Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
230
+ * since we don't support flush-for-specific-ASID-only or
231
+ * flush-last-level-only.
232
+ */
233
+ ARMCPU *cpu = arm_env_get_cpu(env);
234
+ CPUState *cs = CPU(cpu);
235
+ uint64_t pageaddr = sextract64(value << 12, 0, 56);
236
+
237
+ if (tlb_force_broadcast(env)) {
238
+ tlbi_aa64_vae1is_write(env, NULL, value);
239
+ return;
240
+ }
241
+
242
+ if (arm_is_secure_below_el3(env)) {
243
+ tlb_flush_page_by_mmuidx(cs, pageaddr,
244
+ ARMMMUIdxBit_S1SE1 |
245
+ ARMMMUIdxBit_S1SE0);
246
+ } else {
247
+ tlb_flush_page_by_mmuidx(cs, pageaddr,
248
+ ARMMMUIdxBit_S12NSE1 |
249
+ ARMMMUIdxBit_S12NSE0);
250
+ }
251
+}
252
+
253
static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
254
uint64_t value)
255
{
256
--
185
--
257
2.19.1
186
2.20.1
258
187
259
188
1
For AArch32, exception return happens through certain kinds
1
Convert the Neon floating point VFMA and VFMS insns to decodetree.
2
of CPSR write. We don't currently have any CPU_LOG_INT logging
2
These are the last insns in the 3-reg-same group so we can
3
of these events (unlike AArch64, where we log in the ERET
3
remove all the support/loop code from the old decoder.
4
instruction). Add some suitable logging.
5
6
This will log exception returns like this:
7
Exception return from AArch32 hyp to usr PC 0x80100374
8
9
paralleling the existing logging in the exception_return
10
helper for AArch64 exception returns:
11
Exception return from AArch64 EL2 to AArch64 EL0 PC 0x8003045c
12
Exception return from AArch64 EL2 to AArch32 EL0 PC 0x8003045c
13
14
(Note that an AArch32 exception return can only be
15
AArch32->AArch32, never to AArch64.)
16
4
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
19
Message-id: 20181012144235.19646-2-peter.maydell@linaro.org
7
Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
20
---
8
---
21
target/arm/internals.h | 18 ++++++++++++++++++
9
target/arm/neon-dp.decode | 3 +
22
target/arm/helper.c | 10 ++++++++++
10
target/arm/translate-neon.inc.c | 41 ++++++++
23
target/arm/translate.c | 7 +------
11
target/arm/translate.c | 176 +-------------------------------
24
3 files changed, 29 insertions(+), 6 deletions(-)
12
3 files changed, 46 insertions(+), 174 deletions(-)
25
13
26
diff --git a/target/arm/internals.h b/target/arm/internals.h
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
27
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/internals.h
16
--- a/target/arm/neon-dp.decode
29
+++ b/target/arm/internals.h
17
+++ b/target/arm/neon-dp.decode
30
@@ -XXX,XX +XXX,XX @@ static inline uint32_t v7m_sp_limit(CPUARMState *env)
18
@@ -XXX,XX +XXX,XX @@ SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
31
}
19
SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
20
vm=%vm_dp vn=%vn_dp vd=%vd_dp
21
22
+VFMA_fp_3s 1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
23
+VFMS_fp_3s 1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
24
+
25
VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
26
27
VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
28
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-neon.inc.c
31
+++ b/target/arm/translate-neon.inc.c
32
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
33
return do_3same(s, a, gen_VRSQRTS_fp_3s);
32
}
34
}
33
35
34
+/**
36
+static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
35
+ * aarch32_mode_name(): Return name of the AArch32 CPU mode
37
+ TCGv_ptr fpstatus)
36
+ * @psr: Program Status Register indicating CPU mode
38
+{
37
+ *
39
+ gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
38
+ * Returns, for debug logging purposes, a printable representation
40
+}
39
+ * of the AArch32 CPU mode ("svc", "usr", etc) as indicated by
41
+
40
+ * the low bits of the specified PSR.
42
+static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
41
+ */
43
+{
42
+static inline const char *aarch32_mode_name(uint32_t psr)
44
+ if (!dc_isar_feature(aa32_simdfmac, s)) {
43
+{
45
+ return false;
44
+ static const char cpu_mode_names[16][4] = {
46
+ }
45
+ "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
47
+
46
+ "???", "???", "hyp", "und", "???", "???", "???", "sys"
48
+ if (a->size != 0) {
47
+ };
49
+ /* TODO fp16 support */
48
+
50
+ return false;
49
+ return cpu_mode_names[psr & 0xf];
51
+ }
50
+}
52
+
51
+
53
+ return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
52
#endif
54
+}
53
diff --git a/target/arm/helper.c b/target/arm/helper.c
55
+
54
index XXXXXXX..XXXXXXX 100644
56
+static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
55
--- a/target/arm/helper.c
57
+ TCGv_ptr fpstatus)
56
+++ b/target/arm/helper.c
58
+{
57
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
59
+ gen_helper_vfp_negs(vn, vn);
58
mask |= CPSR_IL;
60
+ gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
59
val |= CPSR_IL;
61
+}
60
}
62
+
61
+ qemu_log_mask(LOG_GUEST_ERROR,
63
+static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
62
+ "Illegal AArch32 mode switch attempt from %s to %s\n",
64
+{
63
+ aarch32_mode_name(env->uncached_cpsr),
65
+ if (!dc_isar_feature(aa32_simdfmac, s)) {
64
+ aarch32_mode_name(val));
66
+ return false;
65
} else {
67
+ }
66
+ qemu_log_mask(CPU_LOG_INT, "%s %s to %s PC 0x%" PRIx32 "\n",
68
+
67
+ write_type == CPSRWriteExceptionReturn ?
69
+ if (a->size != 0) {
68
+ "Exception return from AArch32" :
70
+ /* TODO fp16 support */
69
+ "AArch32 mode switch from",
71
+ return false;
70
+ aarch32_mode_name(env->uncached_cpsr),
72
+ }
71
+ aarch32_mode_name(val), env->regs[15]);
73
+
72
switch_mode(env, val & CPSR_M);
74
+ return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
73
}
75
+}
74
}
76
+
77
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
78
{
79
/* FP operations handled pairwise 32 bits at a time */
75
diff --git a/target/arm/translate.c b/target/arm/translate.c
80
diff --git a/target/arm/translate.c b/target/arm/translate.c
76
index XXXXXXX..XXXXXXX 100644
81
index XXXXXXX..XXXXXXX 100644
77
--- a/target/arm/translate.c
82
--- a/target/arm/translate.c
78
+++ b/target/arm/translate.c
83
+++ b/target/arm/translate.c
79
@@ -XXX,XX +XXX,XX @@ void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
84
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
80
translator_loop(ops, &dc.base, cpu, tb);
85
}
81
}
86
}
82
87
83
-static const char *cpu_mode_names[16] = {
88
-/* Symbolic constants for op fields for Neon 3-register same-length.
84
- "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
89
- * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
85
- "???", "???", "hyp", "und", "???", "???", "???", "sys"
90
- * table A7-9.
91
- */
92
-#define NEON_3R_VHADD 0
93
-#define NEON_3R_VQADD 1
94
-#define NEON_3R_VRHADD 2
95
-#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
96
-#define NEON_3R_VHSUB 4
97
-#define NEON_3R_VQSUB 5
98
-#define NEON_3R_VCGT 6
99
-#define NEON_3R_VCGE 7
100
-#define NEON_3R_VSHL 8
101
-#define NEON_3R_VQSHL 9
102
-#define NEON_3R_VRSHL 10
103
-#define NEON_3R_VQRSHL 11
104
-#define NEON_3R_VMAX 12
105
-#define NEON_3R_VMIN 13
106
-#define NEON_3R_VABD 14
107
-#define NEON_3R_VABA 15
108
-#define NEON_3R_VADD_VSUB 16
109
-#define NEON_3R_VTST_VCEQ 17
110
-#define NEON_3R_VML 18 /* VMLA, VMLS */
111
-#define NEON_3R_VMUL 19
112
-#define NEON_3R_VPMAX 20
113
-#define NEON_3R_VPMIN 21
114
-#define NEON_3R_VQDMULH_VQRDMULH 22
115
-#define NEON_3R_VPADD_VQRDMLAH 23
116
-#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
117
-#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
118
-#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
119
-#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
120
-#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
121
-#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
122
-#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
123
-#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
124
-
125
-static const uint8_t neon_3r_sizes[] = {
126
- [NEON_3R_VHADD] = 0x7,
127
- [NEON_3R_VQADD] = 0xf,
128
- [NEON_3R_VRHADD] = 0x7,
129
- [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
130
- [NEON_3R_VHSUB] = 0x7,
131
- [NEON_3R_VQSUB] = 0xf,
132
- [NEON_3R_VCGT] = 0x7,
133
- [NEON_3R_VCGE] = 0x7,
134
- [NEON_3R_VSHL] = 0xf,
135
- [NEON_3R_VQSHL] = 0xf,
136
- [NEON_3R_VRSHL] = 0xf,
137
- [NEON_3R_VQRSHL] = 0xf,
138
- [NEON_3R_VMAX] = 0x7,
139
- [NEON_3R_VMIN] = 0x7,
140
- [NEON_3R_VABD] = 0x7,
141
- [NEON_3R_VABA] = 0x7,
142
- [NEON_3R_VADD_VSUB] = 0xf,
143
- [NEON_3R_VTST_VCEQ] = 0x7,
144
- [NEON_3R_VML] = 0x7,
145
- [NEON_3R_VMUL] = 0x7,
146
- [NEON_3R_VPMAX] = 0x7,
147
- [NEON_3R_VPMIN] = 0x7,
148
- [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
149
- [NEON_3R_VPADD_VQRDMLAH] = 0x7,
150
- [NEON_3R_SHA] = 0xf, /* size field encodes op type */
151
- [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
152
- [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
153
- [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
154
- [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
155
- [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
156
- [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
157
- [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
86
-};
158
-};
87
-
159
-
88
void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
160
/* Symbolic constants for op fields for Neon 2-register miscellaneous.
89
int flags)
161
* The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
90
{
162
* table A7-13.
91
@@ -XXX,XX +XXX,XX @@ void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
163
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
92
psr & CPSR_V ? 'V' : '-',
164
rm_ofs = neon_reg_offset(rm, 0);
93
psr & CPSR_T ? 'T' : 'A',
165
94
ns_status,
166
if ((insn & (1 << 23)) == 0) {
95
- cpu_mode_names[psr & 0xf], (psr & 0x10) ? 32 : 26);
167
- /* Three register same length. */
96
+ aarch32_mode_name(psr), (psr & 0x10) ? 32 : 26);
168
- op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
97
}
169
- /* Catch invalid op and bad size combinations: UNDEF */
98
170
- if ((neon_3r_sizes[op] & (1 << size)) == 0) {
99
if (flags & CPU_DUMP_FPU) {
171
- return 1;
172
- }
173
- /* All insns of this form UNDEF for either this condition or the
174
- * superset of cases "Q==1"; we catch the latter later.
175
- */
176
- if (q && ((rd | rn | rm) & 1)) {
177
- return 1;
178
- }
179
- switch (op) {
180
- case NEON_3R_VFM_VQRDMLSH:
181
- if (!u) {
182
- /* VFM, VFMS */
183
- if (size == 1) {
184
- return 1;
185
- }
186
- break;
187
- }
188
- /* VQRDMLSH : handled by decodetree */
189
- return 1;
190
-
191
- case NEON_3R_VADD_VSUB:
192
- case NEON_3R_LOGIC:
193
- case NEON_3R_VMAX:
194
- case NEON_3R_VMIN:
195
- case NEON_3R_VTST_VCEQ:
196
- case NEON_3R_VCGT:
197
- case NEON_3R_VCGE:
198
- case NEON_3R_VQADD:
199
- case NEON_3R_VQSUB:
200
- case NEON_3R_VMUL:
201
- case NEON_3R_VML:
202
- case NEON_3R_VSHL:
203
- case NEON_3R_SHA:
204
- case NEON_3R_VHADD:
205
- case NEON_3R_VRHADD:
206
- case NEON_3R_VHSUB:
207
- case NEON_3R_VABD:
208
- case NEON_3R_VABA:
209
- case NEON_3R_VQSHL:
210
- case NEON_3R_VRSHL:
211
- case NEON_3R_VQRSHL:
212
- case NEON_3R_VPMAX:
213
- case NEON_3R_VPMIN:
214
- case NEON_3R_VPADD_VQRDMLAH:
215
- case NEON_3R_VQDMULH_VQRDMULH:
216
- case NEON_3R_FLOAT_ARITH:
217
- case NEON_3R_FLOAT_MULTIPLY:
218
- case NEON_3R_FLOAT_CMP:
219
- case NEON_3R_FLOAT_ACMP:
220
- case NEON_3R_FLOAT_MINMAX:
221
- case NEON_3R_FLOAT_MISC:
222
- /* Already handled by decodetree */
223
- return 1;
224
- }
225
-
226
- if (size == 3) {
227
- /* 64-bit element instructions: handled by decodetree */
228
- return 1;
229
- }
230
- switch (op) {
231
- case NEON_3R_VFM_VQRDMLSH:
232
- if (!dc_isar_feature(aa32_simdfmac, s)) {
233
- return 1;
234
- }
235
- break;
236
- default:
237
- break;
238
- }
239
-
240
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
241
-
242
- /* Elementwise. */
243
- tmp = neon_load_reg(rn, pass);
244
- tmp2 = neon_load_reg(rm, pass);
245
- switch (op) {
246
- case NEON_3R_VFM_VQRDMLSH:
247
- {
248
- /* VFMA, VFMS: fused multiply-add */
249
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
250
- TCGv_i32 tmp3 = neon_load_reg(rd, pass);
251
- if (size) {
252
- /* VFMS */
253
- gen_helper_vfp_negs(tmp, tmp);
254
- }
255
- gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
256
- tcg_temp_free_i32(tmp3);
257
- tcg_temp_free_ptr(fpstatus);
258
- break;
259
- }
260
- default:
261
- abort();
262
- }
263
- tcg_temp_free_i32(tmp2);
264
-
265
- neon_store_reg(rd, pass, tmp);
266
-
267
- } /* for pass */
268
- /* End of 3 register same size operations. */
269
+ /* Three register same length: handled by decodetree */
270
+ return 1;
271
} else if (insn & (1 << 4)) {
272
if ((insn & 0x00380080) != 0) {
273
/* Two registers and shift. */
100
--
274
--
101
2.19.1
275
2.20.1
102
276
103
277