target-arm queue: the big things in here are SVE in system
emulation mode, and v8M stack limit checking; there are
also a handful of smaller fixes.

thanks
-- PMM

The following changes since commit 19591e9e0938ea5066984553c256a043bd5d822f:

  Merge remote-tracking branch 'remotes/rth/tags/pull-fpu-20181005' into staging (2018-10-08 12:44:35 +0100)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20181008

for you to fetch changes up to 74e2e59b8d0a68be0956310fc349179c89fd7be0:

  hw/display/bcm2835_fb: Silence Coverity warning about multiply overflow (2018-10-08 14:55:05 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: fix error in a code comment
 * virt: Suppress external aborts on virt-2.10 and earlier
 * target/arm: Correct condition for v8M callee stack push
 * target/arm: Don't read r4 from v8M exception stackframe twice
 * target/arm: Support SVE in system emulation mode
 * target/arm: Implement v8M hardware stack limit checking
 * hw/display/bcm2835_fb: Silence Coverity warning about multiply overflow

----------------------------------------------------------------
Dongjiu Geng (1):
      target/arm: fix code comments error

Peter Maydell (17):
      virt: Suppress external aborts on virt-2.10 and earlier
      target/arm: Correct condition for v8M callee stack push
      target/arm: Don't read r4 from v8M exception stackframe twice
      target/arm: Define new TBFLAG for v8M stack checking
      target/arm: Define new EXCP type for v8M stack overflows
      target/arm: Move v7m_using_psp() to internals.h
      target/arm: Add v8M stack checks on ADD/SUB/MOV of SP
      target/arm: Add some comments in Thumb decode
      target/arm: Add v8M stack checks on exception entry
      target/arm: Add v8M stack limit checks on NS function calls
      target/arm: Add v8M stack checks for LDRD/STRD (imm)
      target/arm: Add v8M stack checks for Thumb2 LDM/STM
      target/arm: Add v8M stack checks for T32 load/store single
      target/arm: Add v8M stack checks for Thumb push/pop
      target/arm: Add v8M stack checks for VLDM/VSTM
      target/arm: Add v8M stack checks for MSR to SP_NS
      hw/display/bcm2835_fb: Silence Coverity warning about multiply overflow

Richard Henderson (15):
      target/arm: Define ID_AA64ZFR0_EL1
      target/arm: Adjust sve_exception_el
      target/arm: Pass in current_el to fp and sve_exception_el
      target/arm: Handle SVE vector length changes in system mode
      target/arm: Adjust aarch64_cpu_dump_state for system mode SVE
      target/arm: Clear unused predicate bits for LD1RQ
      target/arm: Rewrite helper_sve_ld1*_r using pages
      target/arm: Rewrite helper_sve_ld[234]*_r
      target/arm: Rewrite helper_sve_st[1234]*_r
      target/arm: Split contiguous loads for endianness
      target/arm: Split contiguous stores for endianness
      target/arm: Rewrite vector gather loads
      target/arm: Rewrite vector gather stores
      target/arm: Rewrite vector gather first-fault loads
      target/arm: Pass TCGMemOpIdx to sve memory helpers

 target/arm/cpu.h           |   17 +
 target/arm/helper-sve.h    |  385 ++++++---
 target/arm/helper.h        |    2 +
 target/arm/internals.h     |   44 +
 target/arm/kvm_arm.h       |    4 +-
 target/arm/translate.h     |    1 +
 hw/arm/virt.c              |    2 +
 hw/display/bcm2835_fb.c    |    2 +-
 target/arm/cpu64.c         |   42 -
 target/arm/helper.c        |  345 +++++---
 target/arm/kvm.c           |    2 +-
 target/arm/op_helper.c     |   24 +-
 target/arm/sve_helper.c    | 1961 ++++++++++++++++++++++++++++++--------------
 target/arm/translate-a64.c |    8 +-
 target/arm/translate-sve.c |  670 ++++++++++-----
 target/arm/translate.c     |  198 ++++-
 16 files changed, 2611 insertions(+), 1096 deletions(-)


Nothing earth-shaking in here, just a lot of refactoring and cleanup
and a few bugfixes. I suspect I'll have another pullreq to come in
the early part of next week...

thanks
-- PMM

The following changes since commit 079911cb6e26898e16f5bb56ef4f9d33cf92d32d:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2020-08-27 16:59:02 +0100)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200828

for you to fetch changes up to ed78849d9711805bda37ee026018d6ee7a606d0e:

  target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd (2020-08-28 10:02:50 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: Cleanup and refactoring preparatory to SVE2
 * armsse: Define ARMSSEClass correctly
 * hw/misc/unimp: Improve information provided in log messages
 * hw/qdev-clock: Avoid calling qdev_connect_clock_in after DeviceRealize
 * hw/arm/xilinx_zynq: Call qdev_connect_clock_in() before DeviceRealize
 * hw/net/allwinner-sun8i-emac: Use AddressSpace for DMA transfers
 * hw/sd/allwinner-sdhost: Use AddressSpace for DMA transfers
 * target/arm: Fill in the WnR syndrome bit in mte_check_fail
 * target/arm: Clarify HCR_EL2 ARMCPRegInfo type
 * hw/arm/musicpal: Use AddressSpace for DMA transfers
 * hw/clock: Minor cleanups
 * hw/arm/sbsa-ref: fix typo breaking PCIe IRQs

----------------------------------------------------------------
Eduardo Habkost (1):
      armsse: Define ARMSSEClass correctly

Graeme Gregory (1):
      hw/arm/sbsa-ref: fix typo breaking PCIe IRQs

Philippe Mathieu-Daudé (14):
      hw/clock: Remove unused clock_init*() functions
      hw/clock: Let clock_set() return boolean value
      hw/clock: Only propagate clock changes if the clock is changed
      hw/arm/musicpal: Use AddressSpace for DMA transfers
      target/arm: Clarify HCR_EL2 ARMCPRegInfo type
      hw/sd/allwinner-sdhost: Use AddressSpace for DMA transfers
      hw/net/allwinner-sun8i-emac: Use AddressSpace for DMA transfers
      hw/arm/xilinx_zynq: Uninline cadence_uart_create()
      hw/arm/xilinx_zynq: Call qdev_connect_clock_in() before DeviceRealize
      hw/qdev-clock: Uninline qdev_connect_clock_in()
      hw/qdev-clock: Avoid calling qdev_connect_clock_in after DeviceRealize
      hw/misc/unimp: Display value after offset
      hw/misc/unimp: Display the value with width of the access size
      hw/misc/unimp: Display the offset with width of the region size

Richard Henderson (19):
      target/arm: Pass the entire mte descriptor to mte_check_fail
      target/arm: Fill in the WnR syndrome bit in mte_check_fail
      qemu/int128: Add int128_lshift
      target/arm: Split out gen_gvec_fn_zz
      target/arm: Split out gen_gvec_fn_zzz, do_zzz_fn
      target/arm: Rearrange {sve,fp}_check_access assert
      target/arm: Merge do_vector2_p into do_mov_p
      target/arm: Clean up 4-operand predicate expansion
      target/arm: Use tcg_gen_gvec_bitsel for trans_SEL_pppp
      target/arm: Split out gen_gvec_ool_zzzp
      target/arm: Merge helper_sve_clr_* and helper_sve_movz_*
      target/arm: Split out gen_gvec_ool_zzp
      target/arm: Split out gen_gvec_ool_zzz
      target/arm: Split out gen_gvec_ool_zz
      target/arm: Tidy SVE tszimm shift formats
      target/arm: Generalize inl_qrdmlah_* helper functions
      target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd
      target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd
      target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd

 include/hw/arm/armsse.h               |   2 +-
 include/hw/char/cadence_uart.h        |  17 --
 include/hw/clock.h                    |  30 +--
 include/hw/misc/unimp.h               |   1 +
 include/hw/net/allwinner-sun8i-emac.h |   6 +
 include/hw/qdev-clock.h               |   8 +-
 include/hw/sd/allwinner-sdhost.h      |   6 +
 include/qemu/int128.h                 |  16 ++
 target/arm/helper-sve.h               |   5 -
 target/arm/helper.h                   |  28 +++
 target/arm/translate.h                |   1 +
 target/arm/sve.decode                 |  35 ++-
 hw/arm/allwinner-a10.c                |   2 +
 hw/arm/allwinner-h3.c                 |   4 +
 hw/arm/armsse.c                       |   1 +
 hw/arm/musicpal.c                     |  45 ++--
 hw/arm/sbsa-ref.c                     |   2 +-
 hw/arm/xilinx_zynq.c                  |  24 +-
 hw/core/clock.c                       |   7 +-
 hw/core/qdev-clock.c                  |   6 +
 hw/misc/unimp.c                       |  14 +-
 hw/net/allwinner-sun8i-emac.c         |  46 ++--
 hw/sd/allwinner-sdhost.c              |  37 +++-
 target/arm/helper.c                   |   1 -
 target/arm/mte_helper.c               |  19 +-
 target/arm/sve_helper.c               |  70 ++----
 target/arm/translate-a64.c            | 110 ++++++--
 target/arm/translate-sve.c            | 399 ++++++++++++--------------
 target/arm/vec_helper.c               | 182 +++++++----
 29 files changed, 629 insertions(+), 495 deletions(-)

In v7m_exception_taken() we were incorrectly using a
"LR bit EXCRET.ES is 1" check when it should be 0
(compare the pseudocode ExceptionTaken() function).
This meant we didn't stack the callee-saved registers
when tailchaining from a NonSecure to a Secure exception.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002145940.30931-1-peter.maydell@linaro.org
---
 target/arm/helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
      * not already saved.
      */
     if (lr & R_V7M_EXCRET_DCRS_MASK &&
-        !(dotailchain && (lr & R_V7M_EXCRET_ES_MASK))) {
+        !(dotailchain && !(lr & R_V7M_EXCRET_ES_MASK))) {
        push_failed = v7m_push_callee_stack(cpu, lr, dotailchain,
                                            ignore_stackfaults);
    }
--
2.19.0


From: Graeme Gregory <graeme@nuviainc.com>

Fix a typo in a previous patch that translated an "i" to a 1,
thereby breaking the allocation of PCIe interrupts. This was
discovered when virtio-net-pci devices ceased to function correctly.

Cc: qemu-stable@nongnu.org
Fixes: 48ba18e6d3f3 ("hw/arm/sbsa-ref: Simplify by moving the gic in the machine state")
Signed-off-by: Graeme Gregory <graeme@nuviainc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200821083853.356490-1-graeme@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/sbsa-ref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ static void create_pcie(SBSAMachineState *sms)

     for (i = 0; i < GPEX_NUM_IRQS; i++) {
         sysbus_connect_irq(SYS_BUS_DEVICE(dev), i,
-                           qdev_get_gpio_in(sms->gic, irq + 1));
+                           qdev_get_gpio_in(sms->gic, irq + i));
         gpex_set_irq_num(GPEX_HOST(dev), i, irq + i);
     }

--
2.20.1

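[Note on the fix above: with the "irq + 1" typo, every iteration of the
create_pcie() loop wired its INTx output to the same GIC input line, so
interrupts the host bridge routed to the other lines were lost -- which
is consistent with the reported virtio-net-pci breakage. "irq + i"
restores one GIC line per INTx pin, matching the gpex_set_irq_num()
call on the next line.]
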
We're going to want v7m_using_psp() in op_helper.c in the
next patch, so move it from helper.c to internals.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-4-peter.maydell@linaro.org
---
 target/arm/internals.h | 16 ++++++++++++++++
 target/arm/helper.c    | 12 ------------
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t arm_debug_exception_fsr(CPUARMState *env)
  */
 #define MEMOPIDX_SHIFT 8

+/**
+ * v7m_using_psp: Return true if using process stack pointer
+ * Return true if the CPU is currently using the process stack
+ * pointer, or false if it is using the main stack pointer.
+ */
+static inline bool v7m_using_psp(CPUARMState *env)
+{
+    /* Handler mode always uses the main stack; for thread mode
+     * the CONTROL.SPSEL bit determines the answer.
+     * Note that in v7M it is not possible to be in Handler mode with
+     * CONTROL.SPSEL non-zero, but in v8M it is, so we must check both.
+     */
+    return !arm_v7m_is_handler_mode(env) &&
+        env->v7m.control[env->v7m.secure] & R_V7M_CONTROL_SPSEL_MASK;
+}
+
 #endif
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ pend_fault:
     return false;
 }

-/* Return true if we're using the process stack pointer (not the MSP) */
-static bool v7m_using_psp(CPUARMState *env)
-{
-    /* Handler mode always uses the main stack; for thread mode
-     * the CONTROL.SPSEL bit determines the answer.
-     * Note that in v7M it is not possible to be in Handler mode with
-     * CONTROL.SPSEL non-zero, but in v8M it is, so we must check both.
-     */
-    return !arm_v7m_is_handler_mode(env) &&
-        env->v7m.control[env->v7m.secure] & R_V7M_CONTROL_SPSEL_MASK;
-}
-
 /* Write to v7M CONTROL.SPSEL bit for the specified security bank.
  * This may change the current stack pointer between Main and Process
  * stack pointers if it is done for the CONTROL register for the current
--
2.19.0


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

The clock_init*() inlined functions are simple wrappers around
clock_set*() and are not used. Remove them in favor of clock_set*().

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200806123858.30058-2-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/clock.h | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/include/hw/clock.h b/include/hw/clock.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/clock.h
+++ b/include/hw/clock.h
@@ -XXX,XX +XXX,XX @@ static inline bool clock_is_enabled(const Clock *clk)
     return clock_get(clk) != 0;
 }

-static inline void clock_init(Clock *clk, uint64_t value)
-{
-    clock_set(clk, value);
-}
-static inline void clock_init_hz(Clock *clk, uint64_t value)
-{
-    clock_set_hz(clk, value);
-}
-static inline void clock_init_ns(Clock *clk, uint64_t value)
-{
-    clock_set_ns(clk, value);
-}
-
 #endif /* QEMU_HW_CLOCK_H */
--
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

The parameter of kvm_arm_init_cpreg_list() is ARMCPU instead of
CPUState, so correct the note to make it match the code.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Message-id: 1538069046-5757-1-git-send-email-gengdongjiu@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/kvm_arm.h | 4 ++--
 target/arm/kvm.c     | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid, uint64_t group,

 /**
  * kvm_arm_init_cpreg_list:
- * @cs: CPUState
+ * @cpu: ARMCPU
  *
- * Initialize the CPUState's cpreg list according to the kernel's
+ * Initialize the ARMCPU cpreg list according to the kernel's
  * definition of what CPU registers it knows about (and throw away
  * the previous TCG-created cpreg list).
  *
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ static int compare_u64(const void *a, const void *b)
     return 0;
 }

-/* Initialize the CPUState's cpreg list according to the kernel's
+/* Initialize the ARMCPU cpreg list according to the kernel's
  * definition of what CPU registers it knows about (and throw away
  * the previous TCG-created cpreg list).
  */
--
2.19.0


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Let clock_set() return a boolean value indicating whether the clock
has been updated or not.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200806123858.30058-3-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/clock.h | 12 +++++++-----
 hw/core/clock.c    |  7 ++++++-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/hw/clock.h b/include/hw/clock.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/clock.h
+++ b/include/hw/clock.h
@@ -XXX,XX +XXX,XX @@ void clock_set_source(Clock *clk, Clock *src);
 * @value: the clock's value, 0 means unclocked
 *
 * Set the local cached period value of @clk to @value.
+ *
+ * @return: true if the clock is changed.
 */
-void clock_set(Clock *clk, uint64_t value);
+bool clock_set(Clock *clk, uint64_t value);

-static inline void clock_set_hz(Clock *clk, unsigned hz)
+static inline bool clock_set_hz(Clock *clk, unsigned hz)
 {
-    clock_set(clk, CLOCK_PERIOD_FROM_HZ(hz));
+    return clock_set(clk, CLOCK_PERIOD_FROM_HZ(hz));
 }

-static inline void clock_set_ns(Clock *clk, unsigned ns)
+static inline bool clock_set_ns(Clock *clk, unsigned ns)
 {
-    clock_set(clk, CLOCK_PERIOD_FROM_NS(ns));
+    return clock_set(clk, CLOCK_PERIOD_FROM_NS(ns));
 }

 /**
diff --git a/hw/core/clock.c b/hw/core/clock.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/clock.c
+++ b/hw/core/clock.c
@@ -XXX,XX +XXX,XX @@ void clock_clear_callback(Clock *clk)
     clock_set_callback(clk, NULL, NULL);
 }

-void clock_set(Clock *clk, uint64_t period)
+bool clock_set(Clock *clk, uint64_t period)
 {
+    if (clk->period == period) {
+        return false;
+    }
     trace_clock_set(CLOCK_PATH(clk), CLOCK_PERIOD_TO_NS(clk->period),
                     CLOCK_PERIOD_TO_NS(period));
     clk->period = period;
+
+    return true;
 }

 static void clock_propagate_period(Clock *clk, bool call_callbacks)
--
2.20.1

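[Note: nothing consumes the new clock_set() return value yet; the next
patch in the series uses it to make clock_update() propagate a period
change only when one actually occurred.]
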
Updating the NS stack pointer via MSR to SP_NS should include
a check whether the new SP value is below the stack limit.
No other kinds of update to the various stack pointer and
limit registers via MSR should perform a check.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-14-peter.maydell@linaro.org
---
 target/arm/helper.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
      * currently in handler mode or not, using the NS CONTROL.SPSEL.
      */
     bool spsel = env->v7m.control[M_REG_NS] & R_V7M_CONTROL_SPSEL_MASK;
+    bool is_psp = !arm_v7m_is_handler_mode(env) && spsel;
+    uint32_t limit;

     if (!env->v7m.secure) {
         return;
     }
-    if (!arm_v7m_is_handler_mode(env) && spsel) {
+
+    limit = is_psp ? env->v7m.psplim[false] : env->v7m.msplim[false];
+
+    if (val < limit) {
+        CPUState *cs = CPU(arm_env_get_cpu(env));
+
+        cpu_restore_state(cs, GETPC(), true);
+        raise_exception(env, EXCP_STKOF, 0, 1);
+    }
+
+    if (is_psp) {
         env->v7m.other_ss_psp = val;
     } else {
         env->v7m.other_ss_msp = val;
--
2.19.0


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Avoid propagating the clock change when the clock does not change.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200806123858.30058-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/clock.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/hw/clock.h b/include/hw/clock.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/clock.h
+++ b/include/hw/clock.h
@@ -XXX,XX +XXX,XX @@ void clock_propagate(Clock *clk);
 */
 static inline void clock_update(Clock *clk, uint64_t value)
 {
-    clock_set(clk, value);
-    clock_propagate(clk);
+    if (clock_set(clk, value)) {
+        clock_propagate(clk);
+    }
 }

 static inline void clock_update_hz(Clock *clk, unsigned hz)
--
2.20.1

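[Note: combined with the previous patch, the observable effect is that
a no-op update no longer wakes up the downstream clock tree. A minimal
sketch of a hypothetical caller (clock_update_hz() is the real API
shown in the diff above):

    clock_update_hz(clk, 24000000); /* period changes: clock_set()
                                     * returns true, change propagates */
    clock_update_hz(clk, 24000000); /* same period: clock_set() returns
                                     * false, no propagation, downstream
                                     * callbacks stay quiet */
]
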
Add checks for breaches of the v8M stack limit when the
stack pointer is decremented to push the exception frame
for exception entry.

Note that the exception-entry case is unique in that the
stack pointer is updated to be the limit value if the limit
is hit (per rule R_ZLZG).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-7-peter.maydell@linaro.org
---
 target/arm/helper.c | 54 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     uint32_t frameptr;
     ARMMMUIdx mmu_idx;
     bool stacked_ok;
+    uint32_t limit;
+    bool want_psp;

     if (dotailchain) {
         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,

         mmu_idx = arm_v7m_mmu_idx_for_secstate_and_priv(env, M_REG_S, priv);
         frame_sp_p = get_v7m_sp_ptr(env, M_REG_S, mode,
                                     lr & R_V7M_EXCRET_SPSEL_MASK);
+        want_psp = mode && (lr & R_V7M_EXCRET_SPSEL_MASK);
+        if (want_psp) {
+            limit = env->v7m.psplim[M_REG_S];
+        } else {
+            limit = env->v7m.msplim[M_REG_S];
+        }
     } else {
         mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
         frame_sp_p = &env->regs[13];
+        limit = v7m_sp_limit(env);
     }

     frameptr = *frame_sp_p - 0x28;
+    if (frameptr < limit) {
+        /*
+         * Stack limit failure: set SP to the limit value, and generate
+         * STKOF UsageFault. Stack pushes below the limit must not be
+         * performed. It is IMPDEF whether pushes above the limit are
+         * performed; we choose not to.
+         */
+        qemu_log_mask(CPU_LOG_INT,
+                      "...STKOF during callee-saves register stacking\n");
+        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
+                                env->v7m.secure);
+        *frame_sp_p = limit;
+        return true;
+    }

     /* Write as much of the stack frame as we can. A write failure may
      * cause us to pend a derived exception.
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx,
                     ignore_faults);

-    /* Update SP regardless of whether any of the stack accesses failed.
-     * When we implement v8M stack limit checking then this attempt to
-     * update SP might also fail and result in a derived exception.
-     */
+    /* Update SP regardless of whether any of the stack accesses failed. */
     *frame_sp_p = frameptr;

     return !stacked_ok;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)

     frameptr -= 0x20;

+    if (arm_feature(env, ARM_FEATURE_V8)) {
+        uint32_t limit = v7m_sp_limit(env);
+
+        if (frameptr < limit) {
+            /*
+             * Stack limit failure: set SP to the limit value, and generate
+             * STKOF UsageFault. Stack pushes below the limit must not be
+             * performed. It is IMPDEF whether pushes above the limit are
+             * performed; we choose not to.
+             */
+            qemu_log_mask(CPU_LOG_INT,
+                          "...STKOF during stacking\n");
+            env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
+                                    env->v7m.secure);
+            env->regs[13] = limit;
+            return true;
+        }
+    }
+
     /* Write as much of the stack frame as we can. If we fail a stack
      * write this will result in a derived exception being pended
      * (which may be taken in preference to the one we started with
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
     v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
     v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);

-    /* Update SP regardless of whether any of the stack accesses failed.
-     * When we implement v8M stack limit checking then this attempt to
-     * update SP might also fail and result in a derived exception.
-     */
+    /* Update SP regardless of whether any of the stack accesses failed. */
     env->regs[13] = frameptr;

     return !stacked_ok;
--
2.19.0


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Allow the device to execute the DMA transfers in a different
AddressSpace.

We keep using the system_memory address space, but via the
proper dma_memory_access() API.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200814125533.4047-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/musicpal.c | 45 +++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/hw/arm/musicpal.c b/hw/arm/musicpal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/musicpal.c
+++ b/hw/arm/musicpal.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/audio/wm8750.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/runstate.h"
+#include "sysemu/dma.h"
 #include "exec/address-spaces.h"
 #include "ui/pixel_ops.h"
 #include "qemu/cutils.h"
@@ -XXX,XX +XXX,XX @@ typedef struct mv88w8618_eth_state {

     MemoryRegion iomem;
     qemu_irq irq;
+    MemoryRegion *dma_mr;
+    AddressSpace dma_as;
     uint32_t smir;
     uint32_t icr;
     uint32_t imr;
@@ -XXX,XX +XXX,XX @@ typedef struct mv88w8618_eth_state {
     NICConf conf;
 } mv88w8618_eth_state;

-static void eth_rx_desc_put(uint32_t addr, mv88w8618_rx_desc *desc)
+static void eth_rx_desc_put(AddressSpace *dma_as, uint32_t addr,
+                            mv88w8618_rx_desc *desc)
 {
     cpu_to_le32s(&desc->cmdstat);
     cpu_to_le16s(&desc->bytes);
     cpu_to_le16s(&desc->buffer_size);
     cpu_to_le32s(&desc->buffer);
     cpu_to_le32s(&desc->next);
-    cpu_physical_memory_write(addr, desc, sizeof(*desc));
+    dma_memory_write(dma_as, addr, desc, sizeof(*desc));
 }

-static void eth_rx_desc_get(uint32_t addr, mv88w8618_rx_desc *desc)
+static void eth_rx_desc_get(AddressSpace *dma_as, uint32_t addr,
+                            mv88w8618_rx_desc *desc)
 {
-    cpu_physical_memory_read(addr, desc, sizeof(*desc));
+    dma_memory_read(dma_as, addr, desc, sizeof(*desc));
     le32_to_cpus(&desc->cmdstat);
     le16_to_cpus(&desc->bytes);
     le16_to_cpus(&desc->buffer_size);
@@ -XXX,XX +XXX,XX @@ static ssize_t eth_receive(NetClientState *nc, const uint8_t *buf, size_t size)
             continue;
         }
         do {
-            eth_rx_desc_get(desc_addr, &desc);
+            eth_rx_desc_get(&s->dma_as, desc_addr, &desc);
             if ((desc.cmdstat & MP_ETH_RX_OWN) && desc.buffer_size >= size) {
-                cpu_physical_memory_write(desc.buffer + s->vlan_header,
+                dma_memory_write(&s->dma_as, desc.buffer + s->vlan_header,
                                           buf, size);
                 desc.bytes = size + s->vlan_header;
                 desc.cmdstat &= ~MP_ETH_RX_OWN;
@@ -XXX,XX +XXX,XX @@ static ssize_t eth_receive(NetClientState *nc, const uint8_t *buf, size_t size)
                 if (s->icr & s->imr) {
                     qemu_irq_raise(s->irq);
                 }
-                eth_rx_desc_put(desc_addr, &desc);
+                eth_rx_desc_put(&s->dma_as, desc_addr, &desc);
                 return size;
             }
             desc_addr = desc.next;
@@ -XXX,XX +XXX,XX @@ static ssize_t eth_receive(NetClientState *nc, const uint8_t *buf, size_t size)
     return size;
 }

-static void eth_tx_desc_put(uint32_t addr, mv88w8618_tx_desc *desc)
+static void eth_tx_desc_put(AddressSpace *dma_as, uint32_t addr,
+                            mv88w8618_tx_desc *desc)
 {
     cpu_to_le32s(&desc->cmdstat);
     cpu_to_le16s(&desc->res);
     cpu_to_le16s(&desc->bytes);
     cpu_to_le32s(&desc->buffer);
     cpu_to_le32s(&desc->next);
-    cpu_physical_memory_write(addr, desc, sizeof(*desc));
+    dma_memory_write(dma_as, addr, desc, sizeof(*desc));
 }

-static void eth_tx_desc_get(uint32_t addr, mv88w8618_tx_desc *desc)
+static void eth_tx_desc_get(AddressSpace *dma_as, uint32_t addr,
+                            mv88w8618_tx_desc *desc)
 {
-    cpu_physical_memory_read(addr, desc, sizeof(*desc));
+    dma_memory_read(dma_as, addr, desc, sizeof(*desc));
     le32_to_cpus(&desc->cmdstat);
     le16_to_cpus(&desc->res);
     le16_to_cpus(&desc->bytes);
@@ -XXX,XX +XXX,XX @@ static void eth_send(mv88w8618_eth_state *s, int queue_index)
     int len;

     do {
-        eth_tx_desc_get(desc_addr, &desc);
+        eth_tx_desc_get(&s->dma_as, desc_addr, &desc);
         next_desc = desc.next;
         if (desc.cmdstat & MP_ETH_TX_OWN) {
             len = desc.bytes;
             if (len < 2048) {
-                cpu_physical_memory_read(desc.buffer, buf, len);
+                dma_memory_read(&s->dma_as, desc.buffer, buf, len);
                 qemu_send_packet(qemu_get_queue(s->nic), buf, len);
             }
             desc.cmdstat &= ~MP_ETH_TX_OWN;
             s->icr |= 1 << (MP_ETH_IRQ_TXLO_BIT - queue_index);
-            eth_tx_desc_put(desc_addr, &desc);
+            eth_tx_desc_put(&s->dma_as, desc_addr, &desc);
         }
         desc_addr = next_desc;
     } while (desc_addr != s->tx_queue[queue_index]);
@@ -XXX,XX +XXX,XX @@ static void mv88w8618_eth_realize(DeviceState *dev, Error **errp)
 {
     mv88w8618_eth_state *s = MV88W8618_ETH(dev);

+    if (!s->dma_mr) {
+        error_setg(errp, TYPE_MV88W8618_ETH " 'dma-memory' link not set");
+        return;
+    }
+
+    address_space_init(&s->dma_as, s->dma_mr, "emac-dma");
     s->nic = qemu_new_nic(&net_mv88w8618_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->id, s);
 }
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription mv88w8618_eth_vmsd = {

 static Property mv88w8618_eth_properties[] = {
     DEFINE_NIC_PROPERTIES(mv88w8618_eth_state, conf),
+    DEFINE_PROP_LINK("dma-memory", mv88w8618_eth_state, dma_mr,
+                     TYPE_MEMORY_REGION, MemoryRegion *),
     DEFINE_PROP_END_OF_LIST(),
 };

@@ -XXX,XX +XXX,XX @@ static void musicpal_init(MachineState *machine)
     qemu_check_nic_model(&nd_table[0], "mv88w8618");
     dev = qdev_new(TYPE_MV88W8618_ETH);
     qdev_set_nic_properties(dev, &nd_table[0]);
+    object_property_set_link(OBJECT(dev), "dma-memory",
+                             OBJECT(get_system_memory()), &error_fatal);
     sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
     sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, MP_ETH_BASE);
     sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[MP_ETH_IRQ]);
--
2.20.1

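[Note: a worked example of the new limit check, using made-up register
values. Suppose MSPLIM = 0x20001000 and SP = 0x20001010 when an
exception is taken; v7m_push_stack() computes

    frameptr = 0x20001010 - 0x20 = 0x20000ff0   /* below MSPLIM */

so no part of the frame is written; instead CFSR.STKOF is set, a
UsageFault is pended, and SP is forced to the limit value 0x20001000 --
the special exception-entry behaviour of rule R_ZLZG that the commit
message describes.]
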
A cut-and-paste error meant we were reading r4 from the v8M
callee-saves exception stack frame twice. This is harmless
since it just meant we did two memory accesses to the same
location, but it's unnecessary. Delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002150304.2287-1-peter.maydell@linaro.org
---
 target/arm/helper.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
     }

     pop_ok = pop_ok &&
-        v7m_stack_read(cpu, &env->regs[4], frameptr + 0x8, mmu_idx) &&
         v7m_stack_read(cpu, &env->regs[4], frameptr + 0x8, mmu_idx) &&
         v7m_stack_read(cpu, &env->regs[5], frameptr + 0xc, mmu_idx) &&
         v7m_stack_read(cpu, &env->regs[6], frameptr + 0x10, mmu_idx) &&
--
2.19.0


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

In commit ce4afed839 ("target/arm: Implement AArch32 HCR and HCR2")
the HCR_EL2 register was changed from type NO_RAW (no underlying
state and does not support raw access for state saving/loading) to
type CONST (TCG can assume the value to be constant), removing the
read/write accessors.
We forgot to remove the previous type ARM_CP_NO_RAW. This is not
really a problem since the field is overwritten. However it makes
code review confusing, so remove it.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200812111223.7787-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
       .access = PL2_RW,
       .readfn = arm_cp_read_zero, .writefn = arm_cp_write_ignore },
     { .name = "HCR_EL2", .state = ARM_CP_STATE_BOTH,
-      .type = ARM_CP_NO_RAW,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
       .access = PL2_RW,
       .type = ARM_CP_CONST, .resetvalue = 0 },
--
2.20.1

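[Note on why the stale flag was harmless: in a C designated initializer
the last assignment to a member wins, so in

    struct S { int x; };
    struct S s = { .x = 1, .x = 2 };   /* s.x == 2 */

the later ".type = ARM_CP_CONST, ..." line silently overrode the
leftover ".type = ARM_CP_NO_RAW"; the patch only removes dead text.]
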
From: Richard Henderson <richard.henderson@linaro.org>

Check for EL3 before testing CPTR_EL3.EZ. Return 0 when the exception
should be routed via AdvSIMDFPAccessTrap. Mirror the structure of
CheckSVEEnabled more closely.

Fixes: 5be5e8eda78
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 96 ++++++++++++++++++++++-----------------------
 1 file changed, 46 insertions(+), 50 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
     REGINFO_SENTINEL
 };

-/* Return the exception level to which SVE-disabled exceptions should
- * be taken, or 0 if SVE is enabled.
+/* Return the exception level to which exceptions should be taken
+ * via SVEAccessTrap.  If an exception should be routed through
+ * AArch64.AdvSIMDFPAccessTrap, return 0; fp_exception_el should
+ * take care of raising that exception.
+ * C.f. the ARM pseudocode function CheckSVEEnabled.
  */
 static int sve_exception_el(CPUARMState *env)
 {
 #ifndef CONFIG_USER_ONLY
     unsigned current_el = arm_current_el(env);

-    /* The CPACR.ZEN controls traps to EL1:
-     * 0, 2 : trap EL0 and EL1 accesses
-     * 1 : trap only EL0 accesses
-     * 3 : trap no accesses
+    if (current_el <= 1) {
+        bool disabled = false;
+
+        /* The CPACR.ZEN controls traps to EL1:
+         * 0, 2 : trap EL0 and EL1 accesses
+         * 1 : trap only EL0 accesses
+         * 3 : trap no accesses
+         */
+        if (!extract32(env->cp15.cpacr_el1, 16, 1)) {
+            disabled = true;
+        } else if (!extract32(env->cp15.cpacr_el1, 17, 1)) {
+            disabled = current_el == 0;
+        }
+        if (disabled) {
+            /* route_to_el2 */
+            return (arm_feature(env, ARM_FEATURE_EL2)
+                    && !arm_is_secure(env)
+                    && (env->cp15.hcr_el2 & HCR_TGE) ? 2 : 1);
+        }
+
+        /* Check CPACR.FPEN.  */
+        if (!extract32(env->cp15.cpacr_el1, 20, 1)) {
+            disabled = true;
+        } else if (!extract32(env->cp15.cpacr_el1, 21, 1)) {
+            disabled = current_el == 0;
+        }
+        if (disabled) {
+            return 0;
+        }
+    }
+
+    /* CPTR_EL2.  Since TZ and TFP are positive,
+     * they will be zero when EL2 is not present.
      */
-    switch (extract32(env->cp15.cpacr_el1, 16, 2)) {
-    default:
-        if (current_el <= 1) {
-            /* Trap to PL1, which might be EL1 or EL3 */
-            if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
-                return 3;
-            }
-            return 1;
+    if (current_el <= 2 && !arm_is_secure_below_el3(env)) {
+        if (env->cp15.cptr_el[2] & CPTR_TZ) {
+            return 2;
         }
-        break;
-    case 1:
-        if (current_el == 0) {
-            return 1;
+        if (env->cp15.cptr_el[2] & CPTR_TFP) {
+            return 0;
         }
-        break;
-    case 3:
-        break;
     }

-    /* Similarly for CPACR.FPEN, after having checked ZEN.  */
-    switch (extract32(env->cp15.cpacr_el1, 20, 2)) {
-    default:
-        if (current_el <= 1) {
-            if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
-                return 3;
-            }
-            return 1;
-        }
-        break;
-    case 1:
-        if (current_el == 0) {
-            return 1;
-        }
-        break;
-    case 3:
-        break;
-    }
-
-    /* CPTR_EL2.  Check both TZ and TFP.  */
-    if (current_el <= 2
-        && (env->cp15.cptr_el[2] & (CPTR_TFP | CPTR_TZ))
-        && !arm_is_secure_below_el3(env)) {
-        return 2;
-    }
-
-    /* CPTR_EL3.  Check both EZ and TFP.  */
-    if (!(env->cp15.cptr_el[3] & CPTR_EZ)
-        || (env->cp15.cptr_el[3] & CPTR_TFP)) {
+    /* CPTR_EL3.  Since EZ is negative we must check for EL3.  */
+    if (arm_feature(env, ARM_FEATURE_EL3)
+        && !(env->cp15.cptr_el[3] & CPTR_EZ)) {
         return 3;
     }
 #endif
--
2.19.0


From: Richard Henderson <richard.henderson@linaro.org>

We need more information than just the mmu_idx in order
to create the proper exception syndrome. Only change the
function signature so far.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200813200816.3037186-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/mte_helper.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(stzgm_tags)(CPUARMState *env, uint64_t ptr, uint64_t val)
 }

 /* Record a tag check failure.  */
-static void mte_check_fail(CPUARMState *env, int mmu_idx,
+static void mte_check_fail(CPUARMState *env, uint32_t desc,
                            uint64_t dirty_ptr, uintptr_t ra)
 {
+    int mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
     ARMMMUIdx arm_mmu_idx = core_to_aa64_mmu_idx(mmu_idx);
     int el, reg_el, tcf, select;
     uint64_t sctlr;
@@ -XXX,XX +XXX,XX @@ uint64_t mte_check1(CPUARMState *env, uint32_t desc,
     }

     if (unlikely(!mte_probe1_int(env, desc, ptr, ra, bit55))) {
-        int mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
-        mte_check_fail(env, mmu_idx, ptr, ra);
+        mte_check_fail(env, desc, ptr, ra);
     }

     return useronly_clean_ptr(ptr);
@@ -XXX,XX +XXX,XX @@ uint64_t mte_checkN(CPUARMState *env, uint32_t desc,

         fail_ofs = tag_first + n * TAG_GRANULE - ptr;
         fail_ofs = ROUND_UP(fail_ofs, esize);
-        mte_check_fail(env, mmu_idx, ptr + fail_ofs, ra);
+        mte_check_fail(env, desc, ptr + fail_ofs, ra);
     }

 done:
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(mte_check_zva)(CPUARMState *env, uint32_t desc, uint64_t ptr)
 fail:
     /* Locate the first nibble that differs.  */
     i = ctz64(mem_tag ^ ptr_tag) >> 4;
-    mte_check_fail(env, mmu_idx, align_ptr + i * TAG_GRANULE, ra);
+    mte_check_fail(env, desc, align_ptr + i * TAG_GRANULE, ra);

 done:
     return useronly_clean_ptr(ptr);
--
2.20.1


From: Richard Henderson <richard.henderson@linaro.org>

We are going to want to determine whether sve is enabled
for EL other than current.

Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
  * take care of raising that exception.
  * C.f. the ARM pseudocode function CheckSVEEnabled.
  */
-static int sve_exception_el(CPUARMState *env)
+static int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
-    unsigned current_el = arm_current_el(env);
-
-    if (current_el <= 1) {
+    if (el <= 1) {
         bool disabled = false;

         /* The CPACR.ZEN controls traps to EL1:
@@ -XXX,XX +XXX,XX @@ static int sve_exception_el(CPUARMState *env)
         if (!extract32(env->cp15.cpacr_el1, 16, 1)) {
             disabled = true;
         } else if (!extract32(env->cp15.cpacr_el1, 17, 1)) {
-            disabled = current_el == 0;
+            disabled = el == 0;
         }
         if (disabled) {
             /* route_to_el2 */
@@ -XXX,XX +XXX,XX @@ static int sve_exception_el(CPUARMState *env)
         if (!extract32(env->cp15.cpacr_el1, 20, 1)) {
             disabled = true;
         } else if (!extract32(env->cp15.cpacr_el1, 21, 1)) {
-            disabled = current_el == 0;
+            disabled = el == 0;
         }
         if (disabled) {
             return 0;
@@ -XXX,XX +XXX,XX @@ static int sve_exception_el(CPUARMState *env)
     /* CPTR_EL2.  Since TZ and TFP are positive,
      * they will be zero when EL2 is not present.
      */
-    if (current_el <= 2 && !arm_is_secure_below_el3(env)) {
+    if (el <= 2 && !arm_is_secure_below_el3(env)) {
         if (env->cp15.cptr_el[2] & CPTR_TZ) {
             return 2;
         }
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
 /* Return the exception level to which FP-disabled exceptions should
  * be taken, or 0 if FP is enabled.
  */
-static inline int fp_exception_el(CPUARMState *env)
+static int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
     int fpen;
-    int cur_el = arm_current_el(env);

     /* CPACR and the CPTR registers don't exist before v6, so FP is
      * always accessible
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
     ARMMMUIdx mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
-    int fp_el = fp_exception_el(env);
+    int current_el = arm_current_el(env);
+    int fp_el = fp_exception_el(env, current_el);
     uint32_t flags;

     if (is_a64(env)) {
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags |= (arm_regime_tbi1(env, mmu_idx) << ARM_TBFLAG_TBI1_SHIFT);

         if (arm_feature(env, ARM_FEATURE_SVE)) {
-            int sve_el = sve_exception_el(env);
+            int sve_el = sve_exception_el(env, current_el);
             uint32_t zcr_len;

             /* If SVE is disabled, but FP is enabled,
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             if (sve_el != 0 && fp_el == 0) {
                 zcr_len = 0;
             } else {
-                int current_el = arm_current_el(env);
                 ARMCPU *cpu = arm_env_get_cpu(env);

                 zcr_len = cpu->sve_max_vq - 1;
--
2.19.0


From: Richard Henderson <richard.henderson@linaro.org>

According to AArch64.TagCheckFault, none of the other ISS values are
provided, so we do not need to go so far as merge_syn_data_abort.
But we were missing the WnR bit.

Tested-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200813200816.3037186-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/mte_helper.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -XXX,XX +XXX,XX @@ static void mte_check_fail(CPUARMState *env, uint32_t desc,
 {
     int mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
     ARMMMUIdx arm_mmu_idx = core_to_aa64_mmu_idx(mmu_idx);
-    int el, reg_el, tcf, select;
+    int el, reg_el, tcf, select, is_write, syn;
     uint64_t sctlr;

     reg_el = regime_el(env, arm_mmu_idx);
@@ -XXX,XX +XXX,XX @@ static void mte_check_fail(CPUARMState *env, uint32_t desc,
          */
         cpu_restore_state(env_cpu(env), ra, true);
         env->exception.vaddress = dirty_ptr;
-        raise_exception(env, EXCP_DATA_ABORT,
-                        syn_data_abort_no_iss(el != 0, 0, 0, 0, 0, 0, 0x11),
-                        exception_target_el(env));
+
+        is_write = FIELD_EX32(desc, MTEDESC, WRITE);
+        syn = syn_data_abort_no_iss(el != 0, 0, 0, 0, 0, is_write, 0x11);
+        raise_exception(env, EXCP_DATA_ABORT, syn, exception_target_el(env));
         /* noreturn, but fall through to the assert anyway */

     case 0:
--
2.20.1

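[Two notes on the patches above. First, the "positive"/"negative"
wording in the new sve_exception_el() comments refers to the sense of
the trap bits: CPTR_EL2.TZ/TFP are 1-means-trap, so a register that
reads as zero because EL2 is absent naturally means "no trap", while
CPTR_EL3.EZ is 1-means-enabled, so an absent EL3 reading as zero would
wrongly look like "trap to EL3" -- hence the explicit
arm_feature(env, ARM_FEATURE_EL3) test. Second, in the WnR fix, the
0x11 fault status code passed to syn_data_abort_no_iss() is the DFSC
encoding for a synchronous tag check fault; the MTE descriptor's WRITE
field now supplies the WnR bit that the old call hardcoded to 0.]
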
Check the v8M stack limits when pushing the frame for a
non-secure function call via BLXNS.

In order to be able to generate the exception we need to
promote raise_exception() from being local to op_helper.c
so we can call it from helper.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-8-peter.maydell@linaro.org
---
 target/arm/internals.h | 9 +++++++++
 target/arm/helper.c    | 4 ++++
 target/arm/op_helper.c | 4 ++--
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_EXCRET, RES1, 7, 25) /* including the must-be-1 prefix */
 #define M_FAKE_FSR_NSC_EXEC 0xf /* NS executing in S&NSC memory */
 #define M_FAKE_FSR_SFAULT 0xe /* SecureFault INVTRAN, INVEP or AUVIOL */

+/**
+ * raise_exception: Raise the specified exception.
+ * Raise a guest exception with the specified value, syndrome register
+ * and target exception level. This should be called from helper functions,
+ * and never returns because we will longjump back up to the CPU main loop.
+ */
+void QEMU_NORETURN raise_exception(CPUARMState *env, uint32_t excp,
+                                   uint32_t syndrome, uint32_t target_el);
+
 /*
  * For AArch64, map a given EL to an index in the banked_spsr array.
  * Note that this mapping and the AArch32 mapping defined in bank_number()
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
                       "BLXNS with misaligned SP is UNPREDICTABLE\n");
     }

+    if (sp < v7m_sp_limit(env)) {
+        raise_exception(env, EXCP_STKOF, 0, 1);
+    }
+
     saved_psr = env->v7m.exception;
     if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK) {
         saved_psr |= XPSR_SFPA;
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@
 #define SIGNBIT (uint32_t)0x80000000
 #define SIGNBIT64 ((uint64_t)1 << 63)

-static void raise_exception(CPUARMState *env, uint32_t excp,
-                            uint32_t syndrome, uint32_t target_el)
+void raise_exception(CPUARMState *env, uint32_t excp,
+                     uint32_t syndrome, uint32_t target_el)
 {
     CPUState *cs = CPU(arm_env_get_cpu(env));

--
2.19.0


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Allow the device to execute the DMA transfers in a different
AddressSpace.

The A10 and H3 SoC keep using the system_memory address space,
but via the proper dma_memory_access() API.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200814110057.307-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/sd/allwinner-sdhost.h |  6 ++++++
 hw/arm/allwinner-a10.c           |  2 ++
 hw/arm/allwinner-h3.c            |  2 ++
 hw/sd/allwinner-sdhost.c         | 37 ++++++++++++++++++++++++++------
 4 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/include/hw/sd/allwinner-sdhost.h b/include/hw/sd/allwinner-sdhost.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/sd/allwinner-sdhost.h
+++ b/include/hw/sd/allwinner-sdhost.h
@@ -XXX,XX +XXX,XX @@ typedef struct AwSdHostState {
     /** Interrupt output signal to notify CPU */
     qemu_irq irq;

+    /** Memory region where DMA transfers are done */
+    MemoryRegion *dma_mr;
+
+    /** Address space used internally for DMA transfers */
+    AddressSpace dma_as;
+
     /** Number of bytes left in current DMA transfer */
     uint32_t transfer_cnt;

diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/allwinner-a10.c
+++ b/hw/arm/allwinner-a10.c
@@ -XXX,XX +XXX,XX @@ static void aw_a10_realize(DeviceState *dev, Error **errp)
     }

     /* SD/MMC */
+    object_property_set_link(OBJECT(&s->mmc0), "dma-memory",
+                             OBJECT(get_system_memory()), &error_fatal);
     sysbus_realize(SYS_BUS_DEVICE(&s->mmc0), &error_fatal);
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->mmc0), 0, AW_A10_MMC0_BASE);
     sysbus_connect_irq(SYS_BUS_DEVICE(&s->mmc0), 0, qdev_get_gpio_in(dev, 32));
diff --git a/hw/arm/allwinner-h3.c b/hw/arm/allwinner-h3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/allwinner-h3.c
+++ b/hw/arm/allwinner-h3.c
@@ -XXX,XX +XXX,XX @@ static void allwinner_h3_realize(DeviceState *dev, Error **errp)
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->sid), 0, s->memmap[AW_H3_SID]);

     /* SD/MMC */
+    object_property_set_link(OBJECT(&s->mmc0), "dma-memory",
+                             OBJECT(get_system_memory()), &error_fatal);
     sysbus_realize(SYS_BUS_DEVICE(&s->mmc0), &error_fatal);
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->mmc0), 0, s->memmap[AW_H3_MMC0]);
     sysbus_connect_irq(SYS_BUS_DEVICE(&s->mmc0), 0,
diff --git a/hw/sd/allwinner-sdhost.c b/hw/sd/allwinner-sdhost.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/allwinner-sdhost.c
+++ b/hw/sd/allwinner-sdhost.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/log.h"
 #include "qemu/module.h"
 #include "qemu/units.h"
+#include "qapi/error.h"
 #include "sysemu/blockdev.h"
+#include "sysemu/dma.h"
+#include "hw/qdev-properties.h"
 #include "hw/irq.h"
 #include "hw/sd/allwinner-sdhost.h"
 #include "migration/vmstate.h"
@@ -XXX,XX +XXX,XX @@ static uint32_t allwinner_sdhost_process_desc(AwSdHostState *s,
     uint8_t buf[1024];

     /* Read descriptor */
-    cpu_physical_memory_read(desc_addr, desc, sizeof(*desc));
+    dma_memory_read(&s->dma_as, desc_addr, desc, sizeof(*desc));
     if (desc->size == 0) {
         desc->size = klass->max_desc_size;
     } else if (desc->size > klass->max_desc_size) {
@@ -XXX,XX +XXX,XX @@ static uint32_t allwinner_sdhost_process_desc(AwSdHostState *s,

         /* Write to SD bus */
         if (is_write) {
-            cpu_physical_memory_read((desc->addr & DESC_SIZE_MASK) + num_done,
-                                     buf, buf_bytes);
+            dma_memory_read(&s->dma_as,
+                            (desc->addr & DESC_SIZE_MASK) + num_done,
+                            buf, buf_bytes);
             sdbus_write_data(&s->sdbus, buf, buf_bytes);

         /* Read from SD bus */
         } else {
             sdbus_read_data(&s->sdbus, buf, buf_bytes);
-            cpu_physical_memory_write((desc->addr & DESC_SIZE_MASK) + num_done,
-                                      buf, buf_bytes);
+            dma_memory_write(&s->dma_as,
+                             (desc->addr & DESC_SIZE_MASK) + num_done,
+                             buf, buf_bytes);
         }
         num_done += buf_bytes;
     }

     /* Clear hold flag and flush descriptor */
     desc->status &= ~DESC_STATUS_HOLD;
-    cpu_physical_memory_write(desc_addr, desc, sizeof(*desc));
+    dma_memory_write(&s->dma_as, desc_addr, desc, sizeof(*desc));

     return num_done;
 }
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_allwinner_sdhost = {
     }
 };

+static Property allwinner_sdhost_properties[] = {
+    DEFINE_PROP_LINK("dma-memory", AwSdHostState, dma_mr,
+                     TYPE_MEMORY_REGION, MemoryRegion *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
 static void allwinner_sdhost_init(Object *obj)
 {
     AwSdHostState *s = AW_SDHOST(obj);
@@ -XXX,XX +XXX,XX @@ static void allwinner_sdhost_init(Object *obj)
     sysbus_init_irq(SYS_BUS_DEVICE(s), &s->irq);
 }

+static void allwinner_sdhost_realize(DeviceState *dev, Error **errp)
+{
+    AwSdHostState *s = AW_SDHOST(dev);
+
+    if (!s->dma_mr) {
+        error_setg(errp, TYPE_AW_SDHOST " 'dma-memory' link not set");
+        return;
+    }
+
+    address_space_init(&s->dma_as, s->dma_mr, "sdhost-dma");
+}
+
 static void allwinner_sdhost_reset(DeviceState *dev)
 {
     AwSdHostState *s = AW_SDHOST(dev);
@@ -XXX,XX +XXX,XX @@ static void allwinner_sdhost_class_init(ObjectClass *klass, void *data)

     dc->reset = allwinner_sdhost_reset;
     dc->vmsd = &vmstate_allwinner_sdhost;
+    dc->realize = allwinner_sdhost_realize;
+    device_class_set_props(dc, allwinner_sdhost_properties);
 }

 static void allwinner_sdhost_sun4i_class_init(ObjectClass *klass, void *data)
--
2.20.1

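[Note: the musicpal, sdhost and (following) sun8i-emac patches all apply
the same recipe. A distilled sketch, using a hypothetical device "foo"
(the QEMU calls are the ones the patches themselves use):

    /* 1. Device state grows a link property and an address space: */
    MemoryRegion *dma_mr;   /* set by the board before realize */
    AddressSpace dma_as;    /* initialized at realize time */

    /* 2. The link is exposed as a property: */
    static Property foo_properties[] = {
        DEFINE_PROP_LINK("dma-memory", FooState, dma_mr,
                         TYPE_MEMORY_REGION, MemoryRegion *),
        DEFINE_PROP_END_OF_LIST(),
    };

    /* 3. realize() fails if the board forgot to wire it up: */
    if (!s->dma_mr) {
        error_setg(errp, "'dma-memory' link not set");
        return;
    }
    address_space_init(&s->dma_as, s->dma_mr, "foo-dma");

    /* 4. Transfers then use the DMA API instead of the global
     *    cpu_physical_memory_*() helpers: */
    dma_memory_read(&s->dma_as, addr, buf, len);

The board side points the link at whatever memory the device should
see, here always OBJECT(get_system_memory()).]
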
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Allow the device to execute the DMA transfers in a different
AddressSpace.

The H3 SoC keeps using the system_memory address space,
but via the proper dma_memory_access() API.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200814122907.27732-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/allwinner-sun8i-emac.h |  6 ++++
 hw/arm/allwinner-h3.c                 |  2 ++
 hw/net/allwinner-sun8i-emac.c         | 46 +++++++++++++++++----------
 3 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/include/hw/net/allwinner-sun8i-emac.h b/include/hw/net/allwinner-sun8i-emac.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/net/allwinner-sun8i-emac.h
+++ b/include/hw/net/allwinner-sun8i-emac.h
@@ -XXX,XX +XXX,XX @@ typedef struct AwSun8iEmacState {
     /** Interrupt output signal to notify CPU */
     qemu_irq irq;

+    /** Memory region where DMA transfers are done */
+    MemoryRegion *dma_mr;
+
+    /** Address space used internally for DMA transfers */
+    AddressSpace dma_as;
+
     /** Generic Network Interface Controller (NIC) for networking API */
     NICState *nic;

diff --git a/hw/arm/allwinner-h3.c b/hw/arm/allwinner-h3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/allwinner-h3.c
+++ b/hw/arm/allwinner-h3.c
@@ -XXX,XX +XXX,XX @@ static void allwinner_h3_realize(DeviceState *dev, Error **errp)
         qemu_check_nic_model(&nd_table[0], TYPE_AW_SUN8I_EMAC);
         qdev_set_nic_properties(DEVICE(&s->emac), &nd_table[0]);
     }
+    object_property_set_link(OBJECT(&s->emac), "dma-memory",
+                             OBJECT(get_system_memory()), &error_fatal);
     sysbus_realize(SYS_BUS_DEVICE(&s->emac), &error_fatal);
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->emac), 0, s->memmap[AW_H3_EMAC]);
     sysbus_connect_irq(SYS_BUS_DEVICE(&s->emac), 0,
diff --git a/hw/net/allwinner-sun8i-emac.c b/hw/net/allwinner-sun8i-emac.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/allwinner-sun8i-emac.c
+++ b/hw/net/allwinner-sun8i-emac.c
@@ -XXX,XX +XXX,XX @@

 #include "qemu/osdep.h"
 #include "qemu/units.h"
+#include "qapi/error.h"
 #include "hw/sysbus.h"
 #include "migration/vmstate.h"
 #include "net/net.h"
@@ -XXX,XX +XXX,XX @@
 #include "net/checksum.h"
 #include "qemu/module.h"
 #include "exec/cpu-common.h"
+#include "sysemu/dma.h"
 #include "hw/net/allwinner-sun8i-emac.h"

 /* EMAC register offsets */
@@ -XXX,XX +XXX,XX @@ static void allwinner_sun8i_emac_update_irq(AwSun8iEmacState *s)
     qemu_set_irq(s->irq, (s->int_sta & s->int_en) != 0);
 }

-static uint32_t allwinner_sun8i_emac_next_desc(FrameDescriptor *desc,
+static uint32_t allwinner_sun8i_emac_next_desc(AwSun8iEmacState *s,
+                                               FrameDescriptor *desc,
                                                size_t min_size)
 {
     uint32_t paddr = desc->next;

-    cpu_physical_memory_read(paddr, desc, sizeof(*desc));
+    dma_memory_read(&s->dma_as, paddr, desc, sizeof(*desc));

     if ((desc->status & DESC_STATUS_CTL) &&
         (desc->status2 & DESC_STATUS2_BUF_SIZE_MASK) >= min_size) {
@@ -XXX,XX +XXX,XX @@ static uint32_t allwinner_sun8i_emac_next_desc(FrameDescriptor *desc,
     }
 }

-static uint32_t allwinner_sun8i_emac_get_desc(FrameDescriptor *desc,
+static uint32_t allwinner_sun8i_emac_get_desc(AwSun8iEmacState *s,
+                                              FrameDescriptor *desc,
                                               uint32_t start_addr,
                                               size_t min_size)
 {
@@ -XXX,XX +XXX,XX @@ static uint32_t allwinner_sun8i_emac_get_desc(FrameDescriptor *desc,

     /* Note that the list is a cycle. Last entry points back to the head. */
     while (desc_addr != 0) {
-        cpu_physical_memory_read(desc_addr, desc, sizeof(*desc));
+        dma_memory_read(&s->dma_as, desc_addr, desc, sizeof(*desc));

         if ((desc->status & DESC_STATUS_CTL) &&
             (desc->status2 & DESC_STATUS2_BUF_SIZE_MASK) >= min_size) {
@@ -XXX,XX +XXX,XX @@ static uint32_t allwinner_sun8i_emac_rx_desc(AwSun8iEmacState *s,
                                              FrameDescriptor *desc,
                                              size_t min_size)
 {
-    return allwinner_sun8i_emac_get_desc(desc, s->rx_desc_curr, min_size);
+    return allwinner_sun8i_emac_get_desc(s, desc, s->rx_desc_curr, min_size);
 }

 static uint32_t allwinner_sun8i_emac_tx_desc(AwSun8iEmacState *s,
                                              FrameDescriptor *desc,
                                              size_t min_size)
 {
-    return allwinner_sun8i_emac_get_desc(desc, s->tx_desc_head, min_size);
+    return allwinner_sun8i_emac_get_desc(s, desc, s->tx_desc_head, min_size);
 }

-static void allwinner_sun8i_emac_flush_desc(FrameDescriptor *desc,
+static void allwinner_sun8i_emac_flush_desc(AwSun8iEmacState *s,
+                                            FrameDescriptor *desc,
                                             uint32_t phys_addr)
 {
-    cpu_physical_memory_write(phys_addr, desc, sizeof(*desc));
+    dma_memory_write(&s->dma_as, phys_addr, desc, sizeof(*desc));
 }

 static bool allwinner_sun8i_emac_can_receive(NetClientState *nc)
@@ -XXX,XX +XXX,XX @@ static ssize_t allwinner_sun8i_emac_receive(NetClientState *nc,
                            << RX_DESC_STATUS_FRM_LEN_SHIFT;
         }

-        cpu_physical_memory_write(desc.addr, buf, desc_bytes);
-        allwinner_sun8i_emac_flush_desc(&desc, s->rx_desc_curr);
+        dma_memory_write(&s->dma_as, desc.addr, buf, desc_bytes);
+        allwinner_sun8i_emac_flush_desc(s, &desc, s->rx_desc_curr);
         trace_allwinner_sun8i_emac_receive(s->rx_desc_curr, desc.addr,
                                            desc_bytes);

@@ -XXX,XX +XXX,XX @@ static ssize_t allwinner_sun8i_emac_receive(NetClientState *nc,
         bytes_left -= desc_bytes;

         /* Move to the next descriptor */
-        s->rx_desc_curr = allwinner_sun8i_emac_next_desc(&desc, 64);
+        s->rx_desc_curr = allwinner_sun8i_emac_next_desc(s, &desc, 64);
         if (!s->rx_desc_curr) {
             /* Not enough buffer space available */
             s->int_sta |= INT_STA_RX_BUF_UA;
@@ -XXX,XX +XXX,XX @@ static void allwinner_sun8i_emac_transmit(AwSun8iEmacState *s)
             desc.status |= TX_DESC_STATUS_LENGTH_ERR;
             break;
         }
-        cpu_physical_memory_read(desc.addr, packet_buf + packet_bytes, bytes);
157
+ dma_memory_read(&s->dma_as, desc.addr, packet_buf + packet_bytes, bytes);
158
packet_bytes += bytes;
159
desc.status &= ~DESC_STATUS_CTL;
160
- allwinner_sun8i_emac_flush_desc(&desc, s->tx_desc_curr);
161
+ allwinner_sun8i_emac_flush_desc(s, &desc, s->tx_desc_curr);
162
163
/* After the last descriptor, send the packet */
164
if (desc.status2 & TX_DESC_STATUS2_LAST_DESC) {
165
@@ -XXX,XX +XXX,XX @@ static void allwinner_sun8i_emac_transmit(AwSun8iEmacState *s)
166
packet_bytes = 0;
167
transmitted++;
168
}
169
- s->tx_desc_curr = allwinner_sun8i_emac_next_desc(&desc, 0);
170
+ s->tx_desc_curr = allwinner_sun8i_emac_next_desc(s, &desc, 0);
171
}
172
173
/* Raise transmit completed interrupt */
174
@@ -XXX,XX +XXX,XX @@ static uint64_t allwinner_sun8i_emac_read(void *opaque, hwaddr offset,
175
break;
176
case REG_TX_CUR_BUF: /* Transmit Current Buffer */
177
if (s->tx_desc_curr != 0) {
178
- cpu_physical_memory_read(s->tx_desc_curr, &desc, sizeof(desc));
179
+ dma_memory_read(&s->dma_as, s->tx_desc_curr, &desc, sizeof(desc));
180
value = desc.addr;
181
} else {
182
value = 0;
183
@@ -XXX,XX +XXX,XX @@ static uint64_t allwinner_sun8i_emac_read(void *opaque, hwaddr offset,
184
break;
185
case REG_RX_CUR_BUF: /* Receive Current Buffer */
186
if (s->rx_desc_curr != 0) {
187
- cpu_physical_memory_read(s->rx_desc_curr, &desc, sizeof(desc));
188
+ dma_memory_read(&s->dma_as, s->rx_desc_curr, &desc, sizeof(desc));
189
value = desc.addr;
190
} else {
191
value = 0;
192
@@ -XXX,XX +XXX,XX @@ static void allwinner_sun8i_emac_realize(DeviceState *dev, Error **errp)
193
{
194
AwSun8iEmacState *s = AW_SUN8I_EMAC(dev);
195
196
+ if (!s->dma_mr) {
197
+ error_setg(errp, TYPE_AW_SUN8I_EMAC " 'dma-memory' link not set");
198
+ return;
199
+ }
200
+
201
+ address_space_init(&s->dma_as, s->dma_mr, "emac-dma");
202
+
203
qemu_macaddr_default_if_unset(&s->conf.macaddr);
204
s->nic = qemu_new_nic(&net_allwinner_sun8i_emac_info, &s->conf,
205
object_get_typename(OBJECT(dev)), dev->id, s);
206
@@ -XXX,XX +XXX,XX @@ static void allwinner_sun8i_emac_realize(DeviceState *dev, Error **errp)
207
static Property allwinner_sun8i_emac_properties[] = {
208
DEFINE_NIC_PROPERTIES(AwSun8iEmacState, conf),
209
DEFINE_PROP_UINT8("phy-addr", AwSun8iEmacState, mii_phy_addr, 0),
210
+ DEFINE_PROP_LINK("dma-memory", AwSun8iEmacState, dma_mr,
211
+ TYPE_MEMORY_REGION, MemoryRegion *),
212
DEFINE_PROP_END_OF_LIST(),
213
};
214
215
--
216
2.20.1
217
218
diff view generated by jsdifflib
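Aside (not part of the patch above): the "dma-memory" link property follows the
usual QEMU pattern for DMA-capable sysbus devices. A minimal sketch of how a
board other than the H3 might route the EMAC's DMA through its own region
instead of plain system memory; the alias region and its name are hypothetical,
but the APIs are the ones used in the patch:

    /* Hypothetical board code: create a DMA window and link it to the
     * EMAC before the device is realized.
     */
    MemoryRegion *dma_mr = g_new0(MemoryRegion, 1);
    memory_region_init_alias(dma_mr, OBJECT(machine), "emac-dma-window",
                             get_system_memory(), 0, 4 * GiB);
    object_property_set_link(OBJECT(&s->emac), "dma-memory",
                             OBJECT(dma_mr), &error_fatal);
    /* Realize only after the link is set: allwinner_sun8i_emac_realize()
     * reports an error and bails out if dma_mr is still NULL.
     */
    sysbus_realize(SYS_BUS_DEVICE(&s->emac), &error_fatal);
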
In commit c79c0a314c43b78 we enabled emulation of external aborts
when the guest attempts to access a physical address with no
mapped device. In commit 4672cbd7bed88dc6 we suppress this for
most legacy boards to prevent breakage of previously working
guests, but we didn't suppress it in the 'virt' board, with
the rationale "we know that guests won't try to prod devices
that we don't describe in the device tree or ACPI tables". This
is mostly true, but we've had a report of a Linux guest image
that this did break. The problem seems to be that the guest
is (incorrectly) configured with a DEBUG_UART_PHYS value that
tells it there is a uart at 0x10009000 (which is true for
vexpress but not for virt), so in early bootup the kernel
probes this bogus address.

This is a misconfigured guest, so we don't need to worry
about it too much, but we can arrange that guests that ran
on QEMU v2.10 (before c79c0a314c43b78) will still run on
the "virt-2.10" board model, by suppressing external aborts
only for that version and earlier. This seems a reasonable
compromise: "virt-2.10" is supposed to behave the same way
that "virt" did in the 2.10 release, and making it do that
provides a usable workaround for guests with bugs like this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180925144127.31965-1-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 hw/arm/virt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void virt_machine_2_10_options(MachineClass *mc)
 {
     virt_machine_2_11_options(mc);
     SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_10);
+    /* before 2.11 we never faulted accesses to bad addresses */
+    mc->ignore_memory_transaction_failures = true;
 }
 DEFINE_VIRT_MACHINE(2, 10)
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

As we want to call qdev_connect_clock_in() before the device
is realized, we need to uninline cadence_uart_create() first.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200803105647.22223-2-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/char/cadence_uart.h | 17 -----------------
 hw/arm/xilinx_zynq.c           | 14 ++++++++++++--
 2 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/include/hw/char/cadence_uart.h b/include/hw/char/cadence_uart.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/char/cadence_uart.h
+++ b/include/hw/char/cadence_uart.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
     Clock *refclk;
 } CadenceUARTState;
 
-static inline DeviceState *cadence_uart_create(hwaddr addr,
-                                               qemu_irq irq,
-                                               Chardev *chr)
-{
-    DeviceState *dev;
-    SysBusDevice *s;
-
-    dev = qdev_new(TYPE_CADENCE_UART);
-    s = SYS_BUS_DEVICE(dev);
-    qdev_prop_set_chr(dev, "chardev", chr);
-    sysbus_realize_and_unref(s, &error_fatal);
-    sysbus_mmio_map(s, 0, addr);
-    sysbus_connect_irq(s, 0, irq);
-
-    return dev;
-}
-
 #endif
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -XXX,XX +XXX,XX @@ static void zynq_init(MachineState *machine)
     sysbus_create_simple(TYPE_CHIPIDEA, 0xE0002000, pic[53 - IRQ_OFFSET]);
     sysbus_create_simple(TYPE_CHIPIDEA, 0xE0003000, pic[76 - IRQ_OFFSET]);
 
-    dev = cadence_uart_create(0xE0000000, pic[59 - IRQ_OFFSET], serial_hd(0));
+    dev = qdev_new(TYPE_CADENCE_UART);
+    busdev = SYS_BUS_DEVICE(dev);
+    qdev_prop_set_chr(dev, "chardev", serial_hd(0));
+    sysbus_realize_and_unref(busdev, &error_fatal);
+    sysbus_mmio_map(busdev, 0, 0xE0000000);
+    sysbus_connect_irq(busdev, 0, pic[59 - IRQ_OFFSET]);
     qdev_connect_clock_in(dev, "refclk",
                           qdev_get_clock_out(slcr, "uart0_ref_clk"));
-    dev = cadence_uart_create(0xE0001000, pic[82 - IRQ_OFFSET], serial_hd(1));
+    dev = qdev_new(TYPE_CADENCE_UART);
+    busdev = SYS_BUS_DEVICE(dev);
+    qdev_prop_set_chr(dev, "chardev", serial_hd(1));
+    sysbus_realize_and_unref(busdev, &error_fatal);
+    sysbus_mmio_map(busdev, 0, 0xE0001000);
+    sysbus_connect_irq(busdev, 0, pic[82 - IRQ_OFFSET]);
     qdev_connect_clock_in(dev, "refclk",
                           qdev_get_clock_out(slcr, "uart1_ref_clk"));
 
--
2.20.1

Define EXCP_STKOF, and arrange for it to cause us to take
a UsageFault with CFSR.STKOF set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-3-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 2 ++
 target/arm/helper.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 #define EXCP_SEMIHOST       16   /* semihosting call */
 #define EXCP_NOCP           17   /* v7M NOCP UsageFault */
 #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
+#define EXCP_STKOF          19   /* v8M STKOF UsageFault */
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CFSR, UNDEFINSTR, 16 + 0, 1)
 FIELD(V7M_CFSR, INVSTATE, 16 + 1, 1)
 FIELD(V7M_CFSR, INVPC, 16 + 2, 1)
 FIELD(V7M_CFSR, NOCP, 16 + 3, 1)
+FIELD(V7M_CFSR, STKOF, 16 + 4, 1)
 FIELD(V7M_CFSR, UNALIGNED, 16 + 8, 1)
 FIELD(V7M_CFSR, DIVBYZERO, 16 + 9, 1)
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
             [EXCP_SEMIHOST] = "Semihosting call",
             [EXCP_NOCP] = "v7M NOCP UsageFault",
             [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
+            [EXCP_STKOF] = "v8M STKOF UsageFault",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_INVSTATE_MASK;
         break;
+    case EXCP_STKOF:
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
+        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
+        break;
     case EXCP_SWI:
         /* The PC already points to the next instruction. */
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC, env->v7m.secure);
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Clock canonical name is set in device_set_realized (see the block
added to hw/core/qdev.c in commit 0e6934f264).
If we connect a clock after the device is realized, this code is
not executed. This is currently not a problem as this name is only
used for trace events, however it disrupts tracing.

Fix by calling qdev_connect_clock_in() before realizing.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200803105647.22223-3-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xilinx_zynq.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -XXX,XX +XXX,XX @@ static void zynq_init(MachineState *machine)
                               1, 0x0066, 0x0022, 0x0000, 0x0000, 0x0555, 0x2aa,
                               0);
 
-    /* Create slcr, keep a pointer to connect clocks */
-    slcr = qdev_new("xilinx,zynq_slcr");
-    sysbus_realize_and_unref(SYS_BUS_DEVICE(slcr), &error_fatal);
-    sysbus_mmio_map(SYS_BUS_DEVICE(slcr), 0, 0xF8000000);
-
     /* Create the main clock source, and feed slcr with it */
     zynq_machine->ps_clk = CLOCK(object_new(TYPE_CLOCK));
     object_property_add_child(OBJECT(zynq_machine), "ps_clk",
                               OBJECT(zynq_machine->ps_clk));
     object_unref(OBJECT(zynq_machine->ps_clk));
     clock_set_hz(zynq_machine->ps_clk, PS_CLK_FREQUENCY);
+
+    /* Create slcr, keep a pointer to connect clocks */
+    slcr = qdev_new("xilinx,zynq_slcr");
     qdev_connect_clock_in(slcr, "ps_clk", zynq_machine->ps_clk);
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(slcr), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(slcr), 0, 0xF8000000);
 
     dev = qdev_new(TYPE_A9MPCORE_PRIV);
     qdev_prop_set_uint32(dev, "num-cpu", 1);
@@ -XXX,XX +XXX,XX @@ static void zynq_init(MachineState *machine)
     dev = qdev_new(TYPE_CADENCE_UART);
     busdev = SYS_BUS_DEVICE(dev);
     qdev_prop_set_chr(dev, "chardev", serial_hd(0));
+    qdev_connect_clock_in(dev, "refclk",
+                          qdev_get_clock_out(slcr, "uart0_ref_clk"));
     sysbus_realize_and_unref(busdev, &error_fatal);
     sysbus_mmio_map(busdev, 0, 0xE0000000);
     sysbus_connect_irq(busdev, 0, pic[59 - IRQ_OFFSET]);
-    qdev_connect_clock_in(dev, "refclk",
-                          qdev_get_clock_out(slcr, "uart0_ref_clk"));
     dev = qdev_new(TYPE_CADENCE_UART);
     busdev = SYS_BUS_DEVICE(dev);
     qdev_prop_set_chr(dev, "chardev", serial_hd(1));
+    qdev_connect_clock_in(dev, "refclk",
+                          qdev_get_clock_out(slcr, "uart1_ref_clk"));
     sysbus_realize_and_unref(busdev, &error_fatal);
     sysbus_mmio_map(busdev, 0, 0xE0001000);
     sysbus_connect_irq(busdev, 0, pic[82 - IRQ_OFFSET]);
-    qdev_connect_clock_in(dev, "refclk",
-                          qdev_get_clock_out(slcr, "uart1_ref_clk"));
 
     sysbus_create_varargs("cadence_ttc", 0xF8001000,
             pic[42-IRQ_OFFSET], pic[43-IRQ_OFFSET], pic[44-IRQ_OFFSET], NULL);
--
2.20.1

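Aside (not part of the patches above): the wire-up order that the xilinx_zynq
patches converge on is worth stating once in isolation. A minimal sketch using
only the qdev APIs from the diffs; the address and IRQ line are the zynq uart0
values:

    DeviceState *dev = qdev_new(TYPE_CADENCE_UART);
    qdev_prop_set_chr(dev, "chardev", serial_hd(0));
    /* Wire input clocks while the device is still unrealized, so that
     * device_set_realized() can compute the clock's canonical name.
     */
    qdev_connect_clock_in(dev, "refclk",
                          qdev_get_clock_out(slcr, "uart0_ref_clk"));
    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
    /* MMIO mapping and IRQ wiring are fine after realize. */
    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xE0000000);
    sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[59 - IRQ_OFFSET]);
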
Add the v8M stack checks for the VLDM/VSTM
(aka VPUSH/VPOP) instructions. This code is currently
unreachable because we haven't yet implemented M profile
floating point support, but since the change is simple,
we add it now because otherwise we're likely to forget to
do it later.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-13-peter.maydell@linaro.org
---
 target/arm/translate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                 if (insn & (1 << 24)) /* pre-decrement */
                     tcg_gen_addi_i32(addr, addr, -((insn & 0xff) << 2));
 
+                if (s->v8m_stackcheck && rn == 13 && w) {
+                    /*
+                     * Here 'addr' is the lowest address we will store to,
+                     * and is either the old SP (if post-increment) or
+                     * the new SP (if pre-decrement). For post-increment
+                     * where the old value is below the limit and the new
+                     * value is above, it is UNKNOWN whether the limit check
+                     * triggers; we choose to trigger.
+                     */
+                    gen_helper_v8m_stackcheck(cpu_env, addr);
+                }
+
                 if (dp)
                     offset = 8;
                 else
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

We want to assert the device is not realized. To avoid overloading
this header including "hw/qdev-core.h", uninline the function first.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200803105647.22223-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/qdev-clock.h | 6 +-----
 hw/core/qdev-clock.c    | 5 +++++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/hw/qdev-clock.h b/include/hw/qdev-clock.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/qdev-clock.h
+++ b/include/hw/qdev-clock.h
@@ -XXX,XX +XXX,XX @@ Clock *qdev_get_clock_out(DeviceState *dev, const char *name);
  * Set the source clock of input clock @name of device @dev to @source.
  * @source period update will be propagated to @name clock.
  */
-static inline void qdev_connect_clock_in(DeviceState *dev, const char *name,
-                                         Clock *source)
-{
-    clock_set_source(qdev_get_clock_in(dev, name), source);
-}
+void qdev_connect_clock_in(DeviceState *dev, const char *name, Clock *source);
 
 /**
  * qdev_alias_clock:
diff --git a/hw/core/qdev-clock.c b/hw/core/qdev-clock.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/qdev-clock.c
+++ b/hw/core/qdev-clock.c
@@ -XXX,XX +XXX,XX @@ Clock *qdev_alias_clock(DeviceState *dev, const char *name,
 
     return ncl->clock;
 }
+
+void qdev_connect_clock_in(DeviceState *dev, const char *name, Clock *source)
+{
+    clock_set_source(qdev_get_clock_in(dev, name), source);
+}
--
2.20.1

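Aside (not part of the patches, numbers invented for illustration): a concrete
example of the UNKNOWN case the translate.c comments keep referring to.
Suppose MSPLIM is 0x20000040 and SP is 0x20000038, i.e. SP is already below
the limit, and a pop of four registers with writeback executes: the final SP
would be 0x20000048, back above the limit. The v8M architecture leaves it
UNKNOWN whether the stack limit check fires here; QEMU passes the lower of the
two values to gen_helper_v8m_stackcheck() (0x20000038 < 0x20000040), so it
chooses to raise the STKOF UsageFault.
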
Add v8M stack checks for the 16-bit Thumb push/pop
encodings: STMDB, STMFD, LDM, LDMIA, LDMFD.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-12-peter.maydell@linaro.org
---
 target/arm/translate.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             store_reg(s, rd, tmp);
             break;
         case 4: case 5: case 0xc: case 0xd:
-            /* push/pop */
+            /*
+             * 0b1011_x10x_xxxx_xxxx
+             *  - push/pop
+             */
             addr = load_reg(s, 13);
             if (insn & (1 << 8))
                 offset = 4;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             if ((insn & (1 << 11)) == 0) {
                 tcg_gen_addi_i32(addr, addr, -offset);
             }
+
+            if (s->v8m_stackcheck) {
+                /*
+                 * Here 'addr' is the lower of "old SP" and "new SP";
+                 * if this is a pop that starts below the limit and ends
+                 * above it, it is UNKNOWN whether the limit check triggers;
+                 * we choose to trigger.
+                 */
+                gen_helper_v8m_stackcheck(cpu_env, addr);
+            }
+
             for (i = 0; i < 8; i++) {
                 if (insn & (1 << i)) {
                     if (insn & (1 << 11)) {
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Clock canonical name is set in device_set_realized (see the block
added to hw/core/qdev.c in commit 0e6934f264).
If we connect a clock after the device is realized, this code is
not executed. This is currently not a problem as this name is only
used for trace events, however it disrupts tracing.

Add a comment to document qdev_connect_clock_in() must be called
before the device is realized, and assert this condition.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200803105647.22223-5-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/qdev-clock.h | 2 ++
 hw/core/qdev-clock.c    | 1 +
 2 files changed, 3 insertions(+)

diff --git a/include/hw/qdev-clock.h b/include/hw/qdev-clock.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/qdev-clock.h
+++ b/include/hw/qdev-clock.h
@@ -XXX,XX +XXX,XX @@ Clock *qdev_get_clock_out(DeviceState *dev, const char *name);
  *
  * Set the source clock of input clock @name of device @dev to @source.
  * @source period update will be propagated to @name clock.
+ *
+ * Must be called before @dev is realized.
  */
 void qdev_connect_clock_in(DeviceState *dev, const char *name, Clock *source);
 
diff --git a/hw/core/qdev-clock.c b/hw/core/qdev-clock.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/qdev-clock.c
+++ b/hw/core/qdev-clock.c
@@ -XXX,XX +XXX,XX @@ Clock *qdev_alias_clock(DeviceState *dev, const char *name,
 
 void qdev_connect_clock_in(DeviceState *dev, const char *name, Clock *source)
 {
+    assert(!dev->realized);
     clock_set_source(qdev_get_clock_in(dev, name), source);
 }
--
2.20.1

Add v8M stack checks for the instructions in the T32
"load/store single" encoding class: these are the
"immediate pre-indexed" and "immediate, post-indexed"
LDR and STR instructions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-11-peter.maydell@linaro.org
---
 target/arm/translate.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 imm = -imm;
                 /* Fall through. */
             case 0xf: /* Pre-increment. */
-                tcg_gen_addi_i32(addr, addr, imm);
                 writeback = 1;
                 break;
             default:
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
         issinfo = writeback ? ISSInvalid : rs;
 
+        if (s->v8m_stackcheck && rn == 13 && writeback) {
+            /*
+             * Stackcheck. Here we know 'addr' is the current SP;
+             * if imm is +ve we're moving SP up, else down. It is
+             * UNKNOWN whether the limit check triggers when SP starts
+             * below the limit and ends up above it; we chose to do so.
+             */
+            if ((int32_t)imm < 0) {
+                TCGv_i32 newsp = tcg_temp_new_i32();
+
+                tcg_gen_addi_i32(newsp, addr, imm);
+                gen_helper_v8m_stackcheck(cpu_env, newsp);
+                tcg_temp_free_i32(newsp);
+            } else {
+                gen_helper_v8m_stackcheck(cpu_env, addr);
+            }
+        }
+
+        if (writeback && !postinc) {
+            tcg_gen_addi_i32(addr, addr, imm);
+        }
+
         if (insn & (1 << 20)) {
             /* Load. */
             tmp = tcg_temp_new_i32();
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

To better align the read/write accesses, display the value after
the offset (read accesses only display the offset).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200812190206.31595-2-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/unimp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/misc/unimp.c b/hw/misc/unimp.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/unimp.c
+++ b/hw/misc/unimp.c
@@ -XXX,XX +XXX,XX @@ static uint64_t unimp_read(void *opaque, hwaddr offset, unsigned size)
 {
     UnimplementedDeviceState *s = UNIMPLEMENTED_DEVICE(opaque);
 
-    qemu_log_mask(LOG_UNIMP, "%s: unimplemented device read "
+    qemu_log_mask(LOG_UNIMP, "%s: unimplemented device read "
                   "(size %d, offset 0x%" HWADDR_PRIx ")\n",
                   s->name, size, offset);
     return 0;
@@ -XXX,XX +XXX,XX @@ static void unimp_write(void *opaque, hwaddr offset,
     UnimplementedDeviceState *s = UNIMPLEMENTED_DEVICE(opaque);
 
     qemu_log_mask(LOG_UNIMP, "%s: unimplemented device write "
-                  "(size %d, value 0x%" PRIx64
-                  ", offset 0x%" HWADDR_PRIx ")\n",
-                  s->name, size, value, offset);
+                  "(size %d, offset 0x%" HWADDR_PRIx
+                  ", value 0x%" PRIx64 ")\n",
+                  s->name, size, offset, value);
 }
 
 static const MemoryRegionOps unimp_ops = {
--
2.20.1

Add the v8M stack checks for:
 * LDM (T2 encoding)
 * STM (T2 encoding)

This includes the 32-bit encodings of the instructions listed
in v8M ARM ARM rule R_YVWT as
 * LDM, LDMIA, LDMFD
 * LDMDB, LDMEA
 * POP (multiple registers)
 * PUSH (multiple registers)
 * STM, STMIA, STMEA
 * STMDB, STMFD

We perform the stack limit check before doing any other part
of the load or store.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-10-peter.maydell@linaro.org
---
 target/arm/translate.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
         } else {
             int i, loaded_base = 0;
             TCGv_i32 loaded_var;
+            bool wback = extract32(insn, 21, 1);
             /* Load/store multiple. */
             addr = load_reg(s, rn);
             offset = 0;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 if (insn & (1 << i))
                     offset += 4;
             }
+
             if (insn & (1 << 24)) {
                 tcg_gen_addi_i32(addr, addr, -offset);
             }
 
+            if (s->v8m_stackcheck && rn == 13 && wback) {
+                /*
+                 * If the writeback is incrementing SP rather than
+                 * decrementing it, and the initial SP is below the
+                 * stack limit but the final written-back SP would
+                 * be above, then we must not perform any memory
+                 * accesses, but it is IMPDEF whether we generate
+                 * an exception. We choose to do so in this case.
+                 * At this point 'addr' is the lowest address, so
+                 * either the original SP (if incrementing) or our
+                 * final SP (if decrementing), so that's what we check.
+                 */
+                gen_helper_v8m_stackcheck(cpu_env, addr);
+            }
+
             loaded_var = NULL;
             for (i = 0; i < 16; i++) {
                 if ((insn & (1 << i)) == 0)
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             if (loaded_base) {
                 store_reg(s, rn, loaded_var);
             }
-            if (insn & (1 << 21)) {
+            if (wback) {
                 /* Base register writeback. */
                 if (insn & (1 << 24)) {
                     tcg_gen_addi_i32(addr, addr, -offset);
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

To quickly notice the access size, display the value with the
width of the access (i.e. 16-bit access is displayed 0x0000,
while 8-bit access 0x00).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200812190206.31595-3-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/unimp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/misc/unimp.c b/hw/misc/unimp.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/unimp.c
+++ b/hw/misc/unimp.c
@@ -XXX,XX +XXX,XX @@ static void unimp_write(void *opaque, hwaddr offset,
 
     qemu_log_mask(LOG_UNIMP, "%s: unimplemented device write "
                   "(size %d, offset 0x%" HWADDR_PRIx
-                  ", value 0x%" PRIx64 ")\n",
-                  s->name, size, offset, value);
+                  ", value 0x%0*" PRIx64 ")\n",
+                  s->name, size, offset, size << 1, value);
 }
 
 static const MemoryRegionOps unimp_ops = {
--
2.20.1

Coverity complains (CID 1395628) that the multiply in the calculation
of the framebuffer base is performed as 32x32 but then used in a
context that takes a 64-bit hwaddr. This can't actually ever
overflow the 32-bit result, because of the constraints placed on
the s->config values in bcm2835_fb_validate_config(). But we
can placate Coverity anyway, by explicitly casting one of the
inputs to a hwaddr, so the whole expression is calculated with
64-bit arithmetic.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20181005133012.26490-1-peter.maydell@linaro.org
---
 hw/display/bcm2835_fb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/display/bcm2835_fb.c b/hw/display/bcm2835_fb.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/display/bcm2835_fb.c
+++ b/hw/display/bcm2835_fb.c
@@ -XXX,XX +XXX,XX @@ static void fb_update_display(void *opaque)
     }
 
     if (s->invalidate) {
-        hwaddr base = s->config.base + xoff + yoff * src_width;
+        hwaddr base = s->config.base + xoff + (hwaddr)yoff * src_width;
         framebuffer_update_memory_section(&s->fbsection, s->dma_mr,
                                           base,
                                           s->config.yres, src_width);
--
2.19.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

To give a better idea of how big the region the offset belongs
to is, display the value with the width of the region size
(i.e. a region of 0x1000 bytes uses the 0x000 format).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200812190206.31595-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/misc/unimp.h |  1 +
 hw/misc/unimp.c         | 10 ++++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/hw/misc/unimp.h b/include/hw/misc/unimp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/unimp.h
+++ b/include/hw/misc/unimp.h
@@ -XXX,XX +XXX,XX @@
 typedef struct {
     SysBusDevice parent_obj;
     MemoryRegion iomem;
+    unsigned offset_fmt_width;
     char *name;
     uint64_t size;
 } UnimplementedDeviceState;
diff --git a/hw/misc/unimp.c b/hw/misc/unimp.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/unimp.c
+++ b/hw/misc/unimp.c
@@ -XXX,XX +XXX,XX @@ static uint64_t unimp_read(void *opaque, hwaddr offset, unsigned size)
     UnimplementedDeviceState *s = UNIMPLEMENTED_DEVICE(opaque);
 
     qemu_log_mask(LOG_UNIMP, "%s: unimplemented device read "
-                  "(size %d, offset 0x%" HWADDR_PRIx ")\n",
-                  s->name, size, offset);
+                  "(size %d, offset 0x%0*" HWADDR_PRIx ")\n",
+                  s->name, size, s->offset_fmt_width, offset);
     return 0;
 }
 
@@ -XXX,XX +XXX,XX @@ static void unimp_write(void *opaque, hwaddr offset,
     UnimplementedDeviceState *s = UNIMPLEMENTED_DEVICE(opaque);
 
     qemu_log_mask(LOG_UNIMP, "%s: unimplemented device write "
-                  "(size %d, offset 0x%" HWADDR_PRIx
+                  "(size %d, offset 0x%0*" HWADDR_PRIx
                   ", value 0x%0*" PRIx64 ")\n",
-                  s->name, size, offset, size << 1, value);
+                  s->name, size, s->offset_fmt_width, offset, size << 1, value);
 }
 
 static const MemoryRegionOps unimp_ops = {
@@ -XXX,XX +XXX,XX @@ static void unimp_realize(DeviceState *dev, Error **errp)
         return;
     }
 
+    s->offset_fmt_width = DIV_ROUND_UP(64 - clz64(s->size - 1), 4);
+
     memory_region_init_io(&s->iomem, OBJECT(s), &unimp_ops, s,
                           s->name, s->size);
     sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
--
2.20.1

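Aside (not part of the patch): the offset_fmt_width computation is just
"number of hex digits needed for the largest valid offset", and is easy to
check by hand. For a region of size 0x1000 the largest offset is 0xFFF, so
clz64(0xFFF) = 52, 64 - 52 = 12 significant bits, and DIV_ROUND_UP(12, 4) = 3
hex digits: offsets print as 0x000 through 0xfff. For a 4 GiB region
(size 0x100000000), 64 - clz64(0xffffffff) = 32 bits gives 8 digits.
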
Add the v8M stack checks for:
 * LDRD (immediate)
 * STRD (immediate)

Loads and stores are more complicated than ADD/SUB/MOV, because we
must ensure that memory accesses below the stack limit are not
performed, so we can't simply do the check when we actually update
SP.

For these instructions, if the stack limit check triggers
we must not:
 * perform any memory access below the SP limit
 * update PC, SP or the load/store base register
but it is IMPDEF whether we:
 * perform any accesses above or equal to the SP limit
 * update destination registers for loads

For QEMU we choose to always check the limit before doing any other
part of the load or store, so we won't update any registers or
perform any memory accesses.

It is UNKNOWN whether the limit check triggers for a load or store
where the initial SP value is below the limit and one of the stores
would be below the limit, but the writeback moves SP to above the
limit. For QEMU we choose to trigger the check in this situation.

Note that limit checks happen only for loads and stores which update
SP via writeback; they do not happen for loads and stores which
simply use SP as a base register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-9-peter.maydell@linaro.org
---
 target/arm/translate.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
          * 0b1111_1001_x11x_xxxx_xxxx_xxxx_xxxx_xxxx
          *  - load/store dual (pre-indexed)
          */
+        bool wback = extract32(insn, 21, 1);
+
         if (rn == 15) {
             if (insn & (1 << 21)) {
                 /* UNPREDICTABLE */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             addr = load_reg(s, rn);
         }
         offset = (insn & 0xff) * 4;
-        if ((insn & (1 << 23)) == 0)
+        if ((insn & (1 << 23)) == 0) {
             offset = -offset;
+        }
+
+        if (s->v8m_stackcheck && rn == 13 && wback) {
+            /*
+             * Here 'addr' is the current SP; if offset is +ve we're
+             * moving SP up, else down. It is UNKNOWN whether the limit
+             * check triggers when SP starts below the limit and ends
+             * up above it; check whichever of the current and final
+             * SP is lower, so QEMU will trigger in that situation.
+             */
+            if ((int32_t)offset < 0) {
+                TCGv_i32 newsp = tcg_temp_new_i32();
+
+                tcg_gen_addi_i32(newsp, addr, offset);
+                gen_helper_v8m_stackcheck(cpu_env, newsp);
+                tcg_temp_free_i32(newsp);
+            } else {
+                gen_helper_v8m_stackcheck(cpu_env, addr);
+            }
+        }
+
         if (insn & (1 << 24)) {
             tcg_gen_addi_i32(addr, addr, offset);
             offset = 0;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 gen_aa32_st32(s, tmp, addr, get_mem_index(s));
                 tcg_temp_free_i32(tmp);
             }
-            if (insn & (1 << 21)) {
+            if (wback) {
                 /* Base writeback. */
                 tcg_gen_addi_i32(addr, addr, offset - 4);
                 store_reg(s, rn, addr);
--
2.19.0

From: Eduardo Habkost <ehabkost@redhat.com>

TYPE_ARM_SSE is a TYPE_SYS_BUS_DEVICE subclass, but
ARMSSEClass::parent_class is declared as DeviceClass.

It never caused any problems by pure luck:

We were not setting class_size for TYPE_ARM_SSE, so class_size of
TYPE_SYS_BUS_DEVICE was being used (sizeof(SysBusDeviceClass)).
This made the system allocate enough memory for TYPE_ARM_SSE
devices even though ARMSSEClass was too small for a sysbus
device.

Additionally, the ARMSSEClass::info field ended up at the same
offset as SysBusDeviceClass::explicit_ofw_unit_address. This
would make sysbus_get_fw_dev_path() crash for the device.
Luckily, sysbus_get_fw_dev_path() never gets called for
TYPE_ARM_SSE devices, because qdev_get_fw_dev_path() is only used
by the boot device code, and TYPE_ARM_SSE devices don't appear at
the fw_boot_order list.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20200826181006.4097163-1-ehabkost@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/armsse.h | 2 +-
 hw/arm/armsse.c         | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/hw/arm/armsse.h b/include/hw/arm/armsse.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/armsse.h
+++ b/include/hw/arm/armsse.h
@@ -XXX,XX +XXX,XX @@ typedef struct ARMSSE {
 typedef struct ARMSSEInfo ARMSSEInfo;
 
 typedef struct ARMSSEClass {
-    DeviceClass parent_class;
+    SysBusDeviceClass parent_class;
     const ARMSSEInfo *info;
 } ARMSSEClass;
 
diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armsse.c
+++ b/hw/arm/armsse.c
@@ -XXX,XX +XXX,XX @@ static const TypeInfo armsse_info = {
     .name = TYPE_ARMSSE,
     .parent = TYPE_SYS_BUS_DEVICE,
     .instance_size = sizeof(ARMSSE),
+    .class_size = sizeof(ARMSSEClass),
     .instance_init = armsse_init,
     .abstract = true,
     .interfaces = (InterfaceInfo[]) {
--
2.20.1

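Aside (not part of the patch): the QOM rule behind the armsse fix, as a minimal
sketch. When a class struct embeds its parent's class struct, .class_size must
name the subclass, otherwise QOM allocates only the parent's class_size and any
extra class-level fields overlay whatever the parent keeps at those offsets.
The type and field names below are invented for illustration:

    typedef struct MyDeviceState { SysBusDevice parent_obj; } MyDeviceState;

    typedef struct MyDeviceClass {
        SysBusDeviceClass parent_class;  /* must match .parent below */
        const void *info;                /* extra class-level field */
    } MyDeviceClass;

    static const TypeInfo my_device_info = {
        .name          = "x-mydev",               /* hypothetical type */
        .parent        = TYPE_SYS_BUS_DEVICE,
        .instance_size = sizeof(MyDeviceState),
        .class_size    = sizeof(MyDeviceClass),   /* the line armsse was missing */
    };
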
From: Richard Henderson <richard.henderson@linaro.org>

This fixes the endianness problem for softmmu, and moves the
main loop out of a macro and into an inlined function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/sve_helper.c | 351 ++++++++++++++++++++--------------------
 1 file changed, 172 insertions(+), 179 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ typedef intptr_t sve_ld1_host_fn(void *vd, void *vg, void *host,
  */
 typedef void sve_ld1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
                             target_ulong vaddr, int mmu_idx, uintptr_t ra);
+typedef sve_ld1_tlb_fn sve_st1_tlb_fn;
 
 /*
  * Generate the above primitives.
@@ -XXX,XX +XXX,XX @@ DO_LDFF1_LDNF1_2(dd, 3, 3)
 /*
  * Store contiguous data, protected by a governing predicate.
  */
-#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
-                  target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc); \
-    intptr_t ra = GETPC(); \
-    unsigned rd = simd_data(desc); \
-    void *vd = &env->vfp.zregs[rd]; \
-    for (i = 0; i < oprsz; ) { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            if (pg & 1) { \
-                TYPEM m = *(TYPEE *)(vd + H(i)); \
-                FN(env, addr, m, ra); \
-            } \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-            addr += sizeof(TYPEM); \
-        } while (i & 15); \
-    } \
+
+#ifdef CONFIG_SOFTMMU
+#define DO_ST_TLB(NAME, H, TYPEM, HOST, MOEND, TLB) \
+static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
+                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+{ \
+    TCGMemOpIdx oi = make_memop_idx(ctz32(sizeof(TYPEM)) | MOEND, mmu_idx); \
+    TLB(env, addr, *(TYPEM *)(vd + H(reg_off)), oi, ra); \
 }
-
-#define DO_ST1_D(NAME, FN, TYPEM) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
-                  target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
-    intptr_t ra = GETPC(); \
-    unsigned rd = simd_data(desc); \
-    uint64_t *d = &env->vfp.zregs[rd].d[0]; \
-    uint8_t *pg = vg; \
-    for (i = 0; i < oprsz; i += 1) { \
-        if (pg[H1(i)] & 1) { \
-            FN(env, addr, d[i], ra); \
-        } \
-        addr += sizeof(TYPEM); \
-    } \
+#else
+#define DO_ST_TLB(NAME, H, TYPEM, HOST, MOEND, TLB) \
+static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
+                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+{ \
+    HOST(g2h(addr), *(TYPEM *)(vd + H(reg_off))); \
 }
+#endif
 
-#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
-                  target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc); \
-    intptr_t ra = GETPC(); \
-    unsigned rd = simd_data(desc); \
-    void *d1 = &env->vfp.zregs[rd]; \
-    void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
-    for (i = 0; i < oprsz; ) { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            if (pg & 1) { \
-                TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
-                TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
-                FN(env, addr, m1, ra); \
-                FN(env, addr + sizeof(TYPEM), m2, ra); \
-            } \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-            addr += 2 * sizeof(TYPEM); \
-        } while (i & 15); \
-    } \
-}
+DO_ST_TLB(st1bb, H1, uint8_t, stb_p, 0, helper_ret_stb_mmu)
+DO_ST_TLB(st1bh, H1_2, uint16_t, stb_p, 0, helper_ret_stb_mmu)
+DO_ST_TLB(st1bs, H1_4, uint32_t, stb_p, 0, helper_ret_stb_mmu)
+DO_ST_TLB(st1bd, , uint64_t, stb_p, 0, helper_ret_stb_mmu)
 
-#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
-                  target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc); \
-    intptr_t ra = GETPC(); \
-    unsigned rd = simd_data(desc); \
-    void *d1 = &env->vfp.zregs[rd]; \
-    void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
-    void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
-    for (i = 0; i < oprsz; ) { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            if (pg & 1) { \
-                TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
-                TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
-                TYPEM m3 = *(TYPEE *)(d3 + H(i)); \
-                FN(env, addr, m1, ra); \
-                FN(env, addr + sizeof(TYPEM), m2, ra); \
-                FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \
-            } \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-            addr += 3 * sizeof(TYPEM); \
-        } while (i & 15); \
-    } \
-}
+DO_ST_TLB(st1hh_le, H1_2, uint16_t, stw_le_p, MO_LE, helper_le_stw_mmu)
+DO_ST_TLB(st1hs_le, H1_4, uint32_t, stw_le_p, MO_LE, helper_le_stw_mmu)
+DO_ST_TLB(st1hd_le, , uint64_t, stw_le_p, MO_LE, helper_le_stw_mmu)
 
-#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
-                  target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc); \
-    intptr_t ra = GETPC(); \
-    unsigned rd = simd_data(desc); \
-    void *d1 = &env->vfp.zregs[rd]; \
-    void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
-    void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
-    void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \
-    for (i = 0; i < oprsz; ) { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            if (pg & 1) { \
-                TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
-                TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
-                TYPEM m3 = *(TYPEE *)(d3 + H(i)); \
-                TYPEM m4 = *(TYPEE *)(d4 + H(i)); \
-                FN(env, addr, m1, ra); \
-                FN(env, addr + sizeof(TYPEM), m2, ra); \
-                FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \
-                FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \
-            } \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-            addr += 4 * sizeof(TYPEM); \
-        } while (i & 15); \
-    } \
-}
+DO_ST_TLB(st1ss_le, H1_4, uint32_t, stl_le_p, MO_LE, helper_le_stl_mmu)
+DO_ST_TLB(st1sd_le, , uint64_t, stl_le_p, MO_LE, helper_le_stl_mmu)
 
-DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2)
-DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4)
-DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t)
+DO_ST_TLB(st1dd_le, , uint64_t, stq_le_p, MO_LE, helper_le_stq_mmu)
 
-DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4)
-DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t)
+DO_ST_TLB(st1hh_be, H1_2, uint16_t, stw_be_p, MO_BE, helper_be_stw_mmu)
+DO_ST_TLB(st1hs_be, H1_4, uint32_t, stw_be_p, MO_BE, helper_be_stw_mmu)
+DO_ST_TLB(st1hd_be, , uint64_t, stw_be_p, MO_BE, helper_be_stw_mmu)
 
-DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t)
+DO_ST_TLB(st1ss_be, H1_4, uint32_t, stl_be_p, MO_BE, helper_be_stl_mmu)
+DO_ST_TLB(st1sd_be, , uint64_t, stl_be_p, MO_BE, helper_be_stl_mmu)
 
-DO_ST1(sve_st1bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
-DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
-DO_ST3(sve_st3bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
-DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
+DO_ST_TLB(st1dd_be, , uint64_t, stq_be_p, MO_BE, helper_be_stq_mmu)
 
-DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
-DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
-DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
-DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
+#undef DO_ST_TLB
 
-DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
-DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
-DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
-DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
-
-DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t)
-
-void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg,
-                         target_ulong addr, uint32_t desc)
+/*
+ * Common helpers for all contiguous 1,2,3,4-register predicated stores.
+ */
+static void sve_st1_r(CPUARMState *env, void *vg, target_ulong addr,
+                      uint32_t desc, const uintptr_t ra,
+                      const int esize, const int msize,
+                      sve_st1_tlb_fn *tlb_fn)
 {
-    intptr_t i, oprsz = simd_oprsz(desc) / 8;
-    intptr_t ra = GETPC();
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t i, oprsz = simd_oprsz(desc);
     unsigned rd = simd_data(desc);
-    uint64_t *d1 = &env->vfp.zregs[rd].d[0];
-    uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
-    uint8_t *pg = vg;
+    void *vd = &env->vfp.zregs[rd];
 
-    for (i = 0; i < oprsz; i += 1) {
-        if (pg[H1(i)] & 1) {
-            cpu_stq_data_ra(env, addr, d1[i], ra);
-            cpu_stq_data_ra(env, addr + 8, d2[i], ra);
-        }
-        addr += 2 * 8;
+    set_helper_retaddr(ra);
+    for (i = 0; i < oprsz; ) {
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+        do {
+            if (pg & 1) {
+                tlb_fn(env, vd, i, addr, mmu_idx, ra);
+            }
+            i += esize, pg >>= esize;
+            addr += msize;
+        } while (i & 15);
     }
+    set_helper_retaddr(0);
 }
 
-void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg,
-                         target_ulong addr, uint32_t desc)
+static void sve_st2_r(CPUARMState *env, void *vg, target_ulong addr,
+                      uint32_t desc, const uintptr_t ra,
+                      const int esize, const int msize,
+                      sve_st1_tlb_fn *tlb_fn)
 {
-    intptr_t i, oprsz = simd_oprsz(desc) / 8;
-    intptr_t ra = GETPC();
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t i, oprsz = simd_oprsz(desc);
     unsigned rd = simd_data(desc);
-    uint64_t *d1 = &env->vfp.zregs[rd].d[0];
-    uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
-    uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0];
-    uint8_t *pg = vg;
+    void *d1 = &env->vfp.zregs[rd];
+    void *d2 = &env->vfp.zregs[(rd + 1) & 31];
 
-    for (i = 0; i < oprsz; i += 1) {
-        if (pg[H1(i)] & 1) {
-            cpu_stq_data_ra(env, addr, d1[i], ra);
-            cpu_stq_data_ra(env, addr + 8, d2[i], ra);
-            cpu_stq_data_ra(env, addr + 16, d3[i], ra);
-        }
-        addr += 3 * 8;
+    set_helper_retaddr(ra);
+    for (i = 0; i < oprsz; ) {
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+        do {
+            if (pg & 1) {
+                tlb_fn(env, d1, i, addr, mmu_idx, ra);
+                tlb_fn(env, d2, i, addr + msize, mmu_idx, ra);
+            }
+            i += esize, pg >>= esize;
+            addr += 2 * msize;
+        } while (i & 15);
     }
+    set_helper_retaddr(0);
 }
 
-void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
-                         target_ulong addr, uint32_t desc)
+static void sve_st3_r(CPUARMState *env, void *vg, target_ulong addr,
+                      uint32_t desc, const uintptr_t ra,
+                      const int esize, const int msize,
+                      sve_st1_tlb_fn *tlb_fn)
 {
-    intptr_t i, oprsz = simd_oprsz(desc) / 8;
-    intptr_t ra = GETPC();
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t i, oprsz = simd_oprsz(desc);
     unsigned rd = simd_data(desc);
-    uint64_t *d1 = &env->vfp.zregs[rd].d[0];
-    uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
-    uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0];
-    uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0];
-    uint8_t *pg = vg;
+    void *d1 = &env->vfp.zregs[rd];
+    void *d2 = &env->vfp.zregs[(rd + 1) & 31];
+    void *d3 = &env->vfp.zregs[(rd + 2) & 31];
 
-    for (i = 0; i < oprsz; i += 1) {
-        if (pg[H1(i)] & 1) {
-            cpu_stq_data_ra(env, addr, d1[i], ra);
-            cpu_stq_data_ra(env, addr + 8, d2[i], ra);
-            cpu_stq_data_ra(env, addr + 16, d3[i], ra);
-            cpu_stq_data_ra(env, addr + 24, d4[i], ra);
-        }
-        addr += 4 * 8;
+    set_helper_retaddr(ra);
+    for (i = 0; i < oprsz; ) {
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+        do {
+            if (pg & 1) {
+                tlb_fn(env, d1, i, addr, mmu_idx, ra);
+                tlb_fn(env, d2, i, addr + msize, mmu_idx, ra);
+                tlb_fn(env, d3, i, addr + 2 * msize, mmu_idx, ra);
+            }
+            i += esize, pg >>= esize;
+            addr += 3 * msize;
+        } while (i & 15);
     }
+    set_helper_retaddr(0);
 }
 
+static void sve_st4_r(CPUARMState *env, void *vg, target_ulong addr,
+                      uint32_t desc, const uintptr_t ra,
+                      const int esize, const int msize,
+                      sve_st1_tlb_fn *tlb_fn)
+{
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t i, oprsz = simd_oprsz(desc);
+    unsigned rd = simd_data(desc);
+    void *d1 = &env->vfp.zregs[rd];
+    void *d2 = &env->vfp.zregs[(rd + 1) & 31];
+    void *d3 = &env->vfp.zregs[(rd + 2) & 31];
+    void *d4 = &env->vfp.zregs[(rd + 3) & 31];
+
+    set_helper_retaddr(ra);
+    for (i = 0; i < oprsz; ) {
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+        do {
+            if (pg & 1) {
+                tlb_fn(env, d1, i, addr, mmu_idx, ra);
+                tlb_fn(env, d2, i, addr + msize, mmu_idx, ra);
+                tlb_fn(env, d3, i, addr + 2 * msize, mmu_idx, ra);
+                tlb_fn(env, d4, i, addr + 3 * msize, mmu_idx, ra);
+            }
+            i += esize, pg >>= esize;
+            addr += 4 * msize;
+        } while (i & 15);
+    }
+    set_helper_retaddr(0);
+}
+
+#define DO_STN_1(N, NAME, ESIZE) \
+void __attribute__((flatten)) HELPER(sve_st##N##NAME##_r) \
+    (CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
+{ \
+    sve_st##N##_r(env, vg, addr, desc, GETPC(), ESIZE, 1, \
+                  sve_st1##NAME##_tlb); \
+}
+
+#define DO_STN_2(N, NAME, ESIZE, MSIZE) \
+void __attribute__((flatten)) HELPER(sve_st##N##NAME##_r) \
+    (CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
+{ \
+    sve_st##N##_r(env, vg, addr, desc, GETPC(), ESIZE, MSIZE, \
+                  arm_cpu_data_is_big_endian(env) \
+                  ? sve_st1##NAME##_be_tlb : sve_st1##NAME##_le_tlb); \
+}
+
+DO_STN_1(1, bb, 1)
+DO_STN_1(1, bh, 2)
+DO_STN_1(1, bs, 4)
+DO_STN_1(1, bd, 8)
+DO_STN_1(2, bb, 1)
+DO_STN_1(3, bb, 1)
+DO_STN_1(4, bb, 1)
+
+DO_STN_2(1, hh, 2, 2)
+DO_STN_2(1, hs, 4, 2)
+DO_STN_2(1, hd, 8, 2)
+DO_STN_2(2, hh, 2, 2)
+DO_STN_2(3, hh, 2, 2)
+DO_STN_2(4, hh, 2, 2)
+
+DO_STN_2(1, ss, 4, 4)
+DO_STN_2(1, sd, 8, 4)
+DO_STN_2(2, ss, 4, 4)
+DO_STN_2(3, ss, 4, 4)
+DO_STN_2(4, ss, 4, 4)
+
+DO_STN_2(1, dd, 8, 8)
+DO_STN_2(2, dd, 8, 8)
+DO_STN_2(3, dd, 8, 8)
+DO_STN_2(4, dd, 8, 8)
+
+#undef DO_STN_1
+#undef DO_STN_2
+
 /* Loads with a vector index. */
 
 #define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
--
2.19.0

From: Richard Henderson <richard.henderson@linaro.org>

Add left-shift to match the existing right-shift.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/qemu/int128.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -XXX,XX +XXX,XX @@ static inline Int128 int128_rshift(Int128 a, int n)
     return a >> n;
 }
 
+static inline Int128 int128_lshift(Int128 a, int n)
+{
+    return a << n;
+}
+
 static inline Int128 int128_add(Int128 a, Int128 b)
 {
     return a + b;
@@ -XXX,XX +XXX,XX @@ static inline Int128 int128_rshift(Int128 a, int n)
 }
 
+static inline Int128 int128_lshift(Int128 a, int n)
+{
+    uint64_t l = a.lo << (n & 63);
+    if (n >= 64) {
+        return int128_make128(0, l);
+    } else if (n > 0) {
+        return int128_make128(l, (a.hi << n) | (a.lo >> (64 - n)));
+    }
+    return a;
+}
+
 static inline Int128 int128_add(Int128 a, Int128 b)
 {
     uint64_t lo = a.lo + b.lo;
--
2.20.1

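Aside (not part of the patch): the three cases of the non-CONFIG_INT128
fallback above, spelled out with an invented value. Let
a = int128_make128(0xff, 0), i.e. lo = 0xff, hi = 0:

    int128_lshift(a, 0);   /* n == 0: 'a' is returned unchanged */
    int128_lshift(a, 8);   /* 0 < n < 64: lo becomes 0xff00 and the bits
                            * shifted out of lo (a.lo >> 56, none here)
                            * are OR'ed into hi */
    int128_lshift(a, 64);  /* n >= 64: lo << (n & 63) lands wholly in hi,
                            * lo becomes 0; here hi ends up 0xff */
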
From: Richard Henderson <richard.henderson@linaro.org>

SVE vector length can change when changing EL, or when writing
to one of the ZCR_ELn registers.

For correctness, our implementation requires that predicate bits
that are inaccessible are never set.  Which means noticing length
changes and zeroing the appropriate register bits.

Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h       |   4 ++
 target/arm/cpu64.c     |  42 -------------
 target/arm/helper.c    | 133 +++++++++++++++++++++++++++++++++++++----
 target/arm/op_helper.c |   1 +
 4 files changed, 125 insertions(+), 55 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
 int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el);
+#else
+static inline void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq) { }
+static inline void aarch64_sve_change_el(CPUARMState *env, int o, int n) { }
 #endif
 
 target_ulong do_arm_semihosting(CPUARMState *env);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_register_types(void)
 }
 
 type_init(aarch64_cpu_register_types)
-
-/* The manual says that when SVE is enabled and VQ is widened the
- * implementation is allowed to zero the previously inaccessible
- * portion of the registers.  The corollary to that is that when
- * SVE is enabled and VQ is narrowed we are also allowed to zero
- * the now inaccessible portion of the registers.
- *
- * The intent of this is that no predicate bit beyond VQ is ever set.
- * Which means that some operations on predicate registers themselves
- * may operate on full uint64_t or even unrolled across the maximum
- * uint64_t[4].  Performing 4 bits of host arithmetic unconditionally
- * may well be cheaper than conditionals to restrict the operation
- * to the relevant portion of a uint16_t[16].
- *
- * TODO: Need to call this for changes to the real system registers
- * and EL state changes.
- */
-void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
-{
-    int i, j;
-    uint64_t pmask;
-
-    assert(vq >= 1 && vq <= ARM_MAX_VQ);
-    assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
-
-    /* Zap the high bits of the zregs.  */
-    for (i = 0; i < 32; i++) {
-        memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
-    }
-
-    /* Zap the high bits of the pregs and ffr.  */
-    pmask = 0;
-    if (vq & 3) {
-        pmask = ~(-1ULL << (16 * (vq & 3)));
-    }
-    for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
-        for (i = 0; i < 17; ++i) {
-            env->vfp.pregs[i].p[j] &= pmask;
-        }
-        pmask = 0;
-    }
-}
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static int sve_exception_el(CPUARMState *env, int el)
     return 0;
 }
 
+/*
+ * Given that SVE is enabled, return the vector length for EL.
+ */
+static uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
+{
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    uint32_t zcr_len = cpu->sve_max_vq - 1;
+
+    if (el <= 1) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
+    }
+    if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
+    }
+    if (el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
+    }
+    return zcr_len;
+}
+
 static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                       uint64_t value)
 {
+    int cur_el = arm_current_el(env);
+    int old_len = sve_zcr_len_for_el(env, cur_el);
+    int new_len;
+
     /* Bits other than [3:0] are RAZ/WI.  */
     raw_write(env, ri, value & 0xf);
+
+    /*
+     * Because we arrived here, we know both FP and SVE are enabled;
+     * otherwise we would have trapped access to the ZCR_ELn register.
+     */
+    new_len = sve_zcr_len_for_el(env, cur_el);
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
 }
 
 static const ARMCPRegInfo zcr_el1_reginfo = {
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     unsigned int new_el = env->exception.target_el;
     target_ulong addr = env->cp15.vbar_el[new_el];
     unsigned int new_mode = aarch64_pstate_mode(new_el, true);
+    unsigned int cur_el = arm_current_el(env);
 
-    if (arm_current_el(env) < new_el) {
+    aarch64_sve_change_el(env, cur_el, new_el);
+
+    if (cur_el < new_el) {
         /* Entry vector offset depends on whether the implemented EL
          * immediately lower than the target level is using AArch32 or AArch64
          */
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     if (sve_el != 0 && fp_el == 0) {
         zcr_len = 0;
     } else {
-        ARMCPU *cpu = arm_env_get_cpu(env);
-
-        zcr_len = cpu->sve_max_vq - 1;
-        if (current_el <= 1) {
-            zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
-        }
-        if (current_el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
-            zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
-        }
-        if (current_el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
-            zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
-        }
+        zcr_len = sve_zcr_len_for_el(env, current_el);
     }
     flags |= sve_el << ARM_TBFLAG_SVEEXC_EL_SHIFT;
     flags |= zcr_len << ARM_TBFLAG_ZCR_LEN_SHIFT;
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     *pflags = flags;
     *cs_base = 0;
 }
+
+#ifdef TARGET_AARCH64
+/*
+ * The manual says that when SVE is enabled and VQ is widened the
+ * implementation is allowed to zero the previously inaccessible
+ * portion of the registers.  The corollary to that is that when
+ * SVE is enabled and VQ is narrowed we are also allowed to zero
+ * the now inaccessible portion of the registers.
+ *
+ * The intent of this is that no predicate bit beyond VQ is ever set.
+ * Which means that some operations on predicate registers themselves
+ * may operate on full uint64_t or even unrolled across the maximum
+ * uint64_t[4].  Performing 4 bits of host arithmetic unconditionally
+ * may well be cheaper than conditionals to restrict the operation
+ * to the relevant portion of a uint16_t[16].
+ */
+void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
+{
+    int i, j;
+    uint64_t pmask;
+
+    assert(vq >= 1 && vq <= ARM_MAX_VQ);
+    assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
+
+    /* Zap the high bits of the zregs.  */
+    for (i = 0; i < 32; i++) {
+        memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
+    }
+
+    /* Zap the high bits of the pregs and ffr.  */
+    pmask = 0;
+    if (vq & 3) {
+        pmask = ~(-1ULL << (16 * (vq & 3)));
+    }
+    for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
+        for (i = 0; i < 17; ++i) {
+            env->vfp.pregs[i].p[j] &= pmask;
+        }
+        pmask = 0;
+    }
+}
+
+/*
+ * Notice a change in SVE vector size when changing EL.
+ */
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el)
+{
+    int old_len, new_len;
+
+    /* Nothing to do if no SVE.  */
+    if (!arm_feature(env, ARM_FEATURE_SVE)) {
+        return;
+    }
+
+    /* Nothing to do if FP is disabled in either EL.  */
+    if (fp_exception_el(env, old_el) || fp_exception_el(env, new_el)) {
+        return;
+    }
+
+    /*
+     * DDI0584A.d sec 3.2: "If SVE instructions are disabled or trapped
+     * at ELx, or not available because the EL is in AArch32 state, then
+     * for all purposes other than a direct read, the ZCR_ELx.LEN field
+     * has an effective value of 0".
+     *
+     * Consider EL2 (aa64, vq=4) -> EL0 (aa32) -> EL1 (aa64, vq=0).
+     * If we ignore aa32 state, we would fail to see the vq4->vq0 transition
+     * from EL2->EL1.  Thus we go ahead and narrow when entering aa32 so that
+     * we already have the correct register contents when encountering the
+     * vq0->vq0 transition between EL0->EL1.
+     */
+    old_len = (arm_el_is_aa64(env, old_el) && !sve_exception_el(env, old_el)
+               ? sve_zcr_len_for_el(env, old_el) : 0);
+    new_len = (arm_el_is_aa64(env, new_el) && !sve_exception_el(env, new_el)
+               ? sve_zcr_len_for_el(env, new_el) : 0);
+
+    /* When changing vector length, clear inaccessible state.  */
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
+}
+#endif
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env)
                       "AArch64 EL%d PC 0x%" PRIx64 "\n",
                       cur_el, new_el, env->pc);
     }
+    aarch64_sve_change_el(env, cur_el, new_el);
 
     qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
--
2.19.0
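To make the LEN-clamping rule above concrete, a standalone sketch with invented register values (vq = zcr_len + 1, in units of 128 bits):

    #include <stdio.h>
    #include <stdint.h>

    #define MIN(a, b) ((a) < (b) ? (a) : (b))

    int main(void)
    {
        /* Hypothetical configuration: cpu->sve_max_vq = 16 (2048-bit SVE). */
        uint32_t zcr_len = 16 - 1;
        uint64_t zcr_el1 = 3, zcr_el2 = 7;  /* ZCR_EL1.LEN = 3, ZCR_EL2.LEN = 7 */

        /* Mirrors sve_zcr_len_for_el(env, 1) on a system with EL2 present. */
        zcr_len = MIN(zcr_len, 0xf & (uint32_t)zcr_el1);
        zcr_len = MIN(zcr_len, 0xf & (uint32_t)zcr_el2);

        /* min(15, 3, 7) = 3, i.e. VQ = 4: a 512-bit effective vector. */
        printf("zcr_len = %u, vq = %u, bits = %u\n",
               zcr_len, zcr_len + 1, (zcr_len + 1) * 128);
        return 0;
    }

Any transition that shrinks this result is what triggers aarch64_sve_narrow_vq(env, new_len + 1) in the patch, zeroing the now-inaccessible register bits.
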
From: Richard Henderson <richard.henderson@linaro.org>

Model the new function on gen_gvec_fn2 in translate-a64.c, but
indicating which kind of register and in which order.  Since there
is only one user of do_vector2_z, fold it into do_mov_z.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static int pred_gvec_reg_size(DisasContext *s)
 }
 
 /* Invoke a vector expander on two Zregs.  */
-static bool do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
-                         int esz, int rd, int rn)
+static void gen_gvec_fn_zz(DisasContext *s, GVecGen2Fn *gvec_fn,
+                           int esz, int rd, int rn)
 {
-    if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        gvec_fn(esz, vec_full_reg_offset(s, rd),
-                vec_full_reg_offset(s, rn), vsz, vsz);
-    }
-    return true;
+    unsigned vsz = vec_full_reg_size(s);
+    gvec_fn(esz, vec_full_reg_offset(s, rd),
+            vec_full_reg_offset(s, rn), vsz, vsz);
 }
 
 /* Invoke a vector expander on three Zregs.  */
@@ -XXX,XX +XXX,XX @@ static bool do_vector3_z(DisasContext *s, GVecGen3Fn *gvec_fn,
 /* Invoke a vector move on two Zregs.  */
 static bool do_mov_z(DisasContext *s, int rd, int rn)
 {
-    return do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn);
+    if (sve_access_check(s)) {
+        gen_gvec_fn_zz(s, tcg_gen_gvec_mov, MO_8, rd, rn);
+    }
+    return true;
 }
 
 /* Initialize a Zreg with replications of a 64-bit immediate.  */
--
2.20.1
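The shape this refactoring gives callers, in sketch form (hypothetical helper name; the pattern follows do_mov_z above — check the SVE access exactly once, then expand unconditionally):

    /* Hypothetical caller, following the same pattern as do_mov_z. */
    static bool do_not_z(DisasContext *s, int rd, int rn)
    {
        if (sve_access_check(s)) {
            /* Only reached when no exception will be raised. */
            gen_gvec_fn_zz(s, tcg_gen_gvec_not, MO_8, rd, rn);
        }
        return true;
    }

Moving the access check out of the expander and into the decode function is what lets later patches share the expanders between call sites.
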
From: Richard Henderson <richard.henderson@linaro.org>

Model gen_gvec_fn_zzz on gen_gvec_fn3 in translate-a64.c, but
indicating which kind of register and in which order.

Model do_zzz_fn on the other do_foo functions that take an
argument set and verify sve enabled.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 43 +++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn_zz(DisasContext *s, GVecGen2Fn *gvec_fn,
 }
 
 /* Invoke a vector expander on three Zregs.  */
-static bool do_vector3_z(DisasContext *s, GVecGen3Fn *gvec_fn,
-                         int esz, int rd, int rn, int rm)
+static void gen_gvec_fn_zzz(DisasContext *s, GVecGen3Fn *gvec_fn,
+                            int esz, int rd, int rn, int rm)
 {
-    if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        gvec_fn(esz, vec_full_reg_offset(s, rd),
-                vec_full_reg_offset(s, rn),
-                vec_full_reg_offset(s, rm), vsz, vsz);
-    }
-    return true;
+    unsigned vsz = vec_full_reg_size(s);
+    gvec_fn(esz, vec_full_reg_offset(s, rd),
+            vec_full_reg_offset(s, rn),
+            vec_full_reg_offset(s, rm), vsz, vsz);
 }
 
 /* Invoke a vector move on two Zregs.  */
@@ -XXX,XX +XXX,XX @@ const uint64_t pred_esz_masks[4] = {
 *** SVE Logical - Unpredicated Group
 */
 
+static bool do_zzz_fn(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *gvec_fn)
+{
+    if (sve_access_check(s)) {
+        gen_gvec_fn_zzz(s, gvec_fn, a->esz, a->rd, a->rn, a->rm);
+    }
+    return true;
+}
+
 static bool trans_AND_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_and);
 }
 
 static bool trans_ORR_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_or, 0, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_or);
 }
 
 static bool trans_EOR_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_xor, 0, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_xor);
 }
 
 static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_andc);
 }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a)
 
 static bool trans_ADD_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_add, a->esz, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_add);
 }
 
 static bool trans_SUB_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_sub, a->esz, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_sub);
 }
 
 static bool trans_SQADD_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_ssadd, a->esz, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_ssadd);
 }
 
 static bool trans_SQSUB_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_sssub, a->esz, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_sssub);
 }
 
 static bool trans_UQADD_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_usadd, a->esz, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_usadd);
 }
 
 static bool trans_UQSUB_zzz(DisasContext *s, arg_rrr_esz *a)
 {
-    return do_vector3_z(s, tcg_gen_gvec_ussub, a->esz, a->rd, a->rn, a->rm);
+    return do_zzz_fn(s, a, tcg_gen_gvec_ussub);
 }
 
 /*
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Uses tlb_vaddr_to_host for correct operation with softmmu.
Optimize for accesses within a single page or pair of pages.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/sve_helper.c | 731 +++++++++++++++++++++++++++++++---------
 1 file changed, 569 insertions(+), 162 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ static void swap_memmove(void *vd, void *vs, size_t n)
     }
 }
 
+/* Similarly for memset of 0.  */
+static void swap_memzero(void *vd, size_t n)
+{
+    uintptr_t d = (uintptr_t)vd;
+    uintptr_t o = (d | n) & 7;
+    size_t i;
+
+    /* Usually, the first bit of a predicate is set, so N is 0.  */
+    if (likely(n == 0)) {
+        return;
+    }
+
+#ifndef HOST_WORDS_BIGENDIAN
+    o = 0;
+#endif
+    switch (o) {
+    case 0:
+        memset(vd, 0, n);
+        break;
+
+    case 4:
+        for (i = 0; i < n; i += 4) {
+            *(uint32_t *)H1_4(d + i) = 0;
+        }
+        break;
+
+    case 2:
+    case 6:
+        for (i = 0; i < n; i += 2) {
+            *(uint16_t *)H1_2(d + i) = 0;
+        }
+        break;
+
+    default:
+        for (i = 0; i < n; i++) {
+            *(uint8_t *)H1(d + i) = 0;
+        }
+        break;
+    }
+}
+
 void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc)
 {
     intptr_t opr_sz = simd_oprsz(desc);
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
 /*
  * Load contiguous data, protected by a governing predicate.
  */
-#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \
-static void do_##NAME(CPUARMState *env, void *vd, void *vg, \
-                      target_ulong addr, intptr_t oprsz, \
-                      uintptr_t ra) \
-{ \
-    intptr_t i = 0; \
-    do { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            TYPEM m = 0; \
-            if (pg & 1) { \
-                m = FN(env, addr, ra); \
-            } \
-            *(TYPEE *)(vd + H(i)) = m; \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-            addr += sizeof(TYPEM); \
-        } while (i & 15); \
-    } while (i < oprsz); \
-} \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
-                  target_ulong addr, uint32_t desc) \
-{ \
-    do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg, \
-              addr, simd_oprsz(desc), GETPC()); \
-}
+
+/*
+ * Load elements into @vd, controlled by @vg, from @host + @mem_ofs.
+ * Memory is valid through @host + @mem_max.  The register element
+ * indicies are inferred from @mem_ofs, as modified by the types for
+ * which the helper is built.  Return the @mem_ofs of the first element
+ * not loaded (which is @mem_max if they are all loaded).
+ *
+ * For softmmu, we have fully validated the guest page.  For user-only,
+ * we cannot fully validate without taking the mmap lock, but since we
+ * know the access is within one host page, if any access is valid they
+ * all must be valid.  However, when @vg is all false, it may be that
+ * no access is valid.
+ */
+typedef intptr_t sve_ld1_host_fn(void *vd, void *vg, void *host,
+                                 intptr_t mem_ofs, intptr_t mem_max);
+
+/*
+ * Load one element into @vd + @reg_off from (@env, @vaddr, @ra).
+ * The controlling predicate is known to be true.
+ */
+typedef void sve_ld1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
+                            target_ulong vaddr, int mmu_idx, uintptr_t ra);
+
+/*
+ * Generate the above primitives.
+ */
+
+#define DO_LD_HOST(NAME, H, TYPEE, TYPEM, HOST) \
+static intptr_t sve_##NAME##_host(void *vd, void *vg, void *host, \
+                                  intptr_t mem_off, const intptr_t mem_max) \
+{ \
+    intptr_t reg_off = mem_off * (sizeof(TYPEE) / sizeof(TYPEM)); \
+    uint64_t *pg = vg; \
+    while (mem_off + sizeof(TYPEM) <= mem_max) { \
+        TYPEM val = 0; \
+        if (likely((pg[reg_off >> 6] >> (reg_off & 63)) & 1)) { \
+            val = HOST(host + mem_off); \
+        } \
+        *(TYPEE *)(vd + H(reg_off)) = val; \
+        mem_off += sizeof(TYPEM), reg_off += sizeof(TYPEE); \
+    } \
+    return mem_off; \
+}
 
+#ifdef CONFIG_SOFTMMU
+#define DO_LD_TLB(NAME, H, TYPEE, TYPEM, HOST, MOEND, TLB) \
+static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
+                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+{ \
+    TCGMemOpIdx oi = make_memop_idx(ctz32(sizeof(TYPEM)) | MOEND, mmu_idx); \
+    TYPEM val = TLB(env, addr, oi, ra); \
+    *(TYPEE *)(vd + H(reg_off)) = val; \
+}
+#else
+#define DO_LD_TLB(NAME, H, TYPEE, TYPEM, HOST, MOEND, TLB) \
+static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
+                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+{ \
+    TYPEM val = HOST(g2h(addr)); \
+    *(TYPEE *)(vd + H(reg_off)) = val; \
+}
+#endif
+
+#define DO_LD_PRIM_1(NAME, H, TE, TM) \
+    DO_LD_HOST(NAME, H, TE, TM, ldub_p) \
+    DO_LD_TLB(NAME, H, TE, TM, ldub_p, 0, helper_ret_ldub_mmu)
+
+DO_LD_PRIM_1(ld1bb,  H1,   uint8_t,  uint8_t)
+DO_LD_PRIM_1(ld1bhu, H1_2, uint16_t, uint8_t)
+DO_LD_PRIM_1(ld1bhs, H1_2, uint16_t, int8_t)
+DO_LD_PRIM_1(ld1bsu, H1_4, uint32_t, uint8_t)
+DO_LD_PRIM_1(ld1bss, H1_4, uint32_t, int8_t)
+DO_LD_PRIM_1(ld1bdu,     , uint64_t, uint8_t)
+DO_LD_PRIM_1(ld1bds,     , uint64_t, int8_t)
+
+#define DO_LD_PRIM_2(NAME, end, MOEND, H, TE, TM, PH, PT) \
+    DO_LD_HOST(NAME##_##end, H, TE, TM, PH##_##end##_p) \
+    DO_LD_TLB(NAME##_##end, H, TE, TM, PH##_##end##_p, \
+              MOEND, helper_##end##_##PT##_mmu)
+
+DO_LD_PRIM_2(ld1hh,  le, MO_LE, H1_2, uint16_t, uint16_t, lduw, lduw)
+DO_LD_PRIM_2(ld1hsu, le, MO_LE, H1_4, uint32_t, uint16_t, lduw, lduw)
+DO_LD_PRIM_2(ld1hss, le, MO_LE, H1_4, uint32_t, int16_t,  lduw, lduw)
+DO_LD_PRIM_2(ld1hdu, le, MO_LE,     , uint64_t, uint16_t, lduw, lduw)
+DO_LD_PRIM_2(ld1hds, le, MO_LE,     , uint64_t, int16_t,  lduw, lduw)
+
+DO_LD_PRIM_2(ld1ss,  le, MO_LE, H1_4, uint32_t, uint32_t, ldl, ldul)
+DO_LD_PRIM_2(ld1sdu, le, MO_LE,     , uint64_t, uint32_t, ldl, ldul)
+DO_LD_PRIM_2(ld1sds, le, MO_LE,     , uint64_t, int32_t,  ldl, ldul)
+
+DO_LD_PRIM_2(ld1dd,  le, MO_LE,     , uint64_t, uint64_t, ldq, ldq)
+
+DO_LD_PRIM_2(ld1hh,  be, MO_BE, H1_2, uint16_t, uint16_t, lduw, lduw)
+DO_LD_PRIM_2(ld1hsu, be, MO_BE, H1_4, uint32_t, uint16_t, lduw, lduw)
+DO_LD_PRIM_2(ld1hss, be, MO_BE, H1_4, uint32_t, int16_t,  lduw, lduw)
+DO_LD_PRIM_2(ld1hdu, be, MO_BE,     , uint64_t, uint16_t, lduw, lduw)
+DO_LD_PRIM_2(ld1hds, be, MO_BE,     , uint64_t, int16_t,  lduw, lduw)
+
+DO_LD_PRIM_2(ld1ss,  be, MO_BE, H1_4, uint32_t, uint32_t, ldl, ldul)
+DO_LD_PRIM_2(ld1sdu, be, MO_BE,     , uint64_t, uint32_t, ldl, ldul)
+DO_LD_PRIM_2(ld1sds, be, MO_BE,     , uint64_t, int32_t,  ldl, ldul)
+
+DO_LD_PRIM_2(ld1dd,  be, MO_BE,     , uint64_t, uint64_t, ldq, ldq)
+
+#undef DO_LD_TLB
+#undef DO_LD_HOST
+#undef DO_LD_PRIM_1
+#undef DO_LD_PRIM_2
+
+/*
+ * Skip through a sequence of inactive elements in the guarding predicate @vg,
+ * beginning at @reg_off bounded by @reg_max.  Return the offset of the active
+ * element >= @reg_off, or @reg_max if there were no active elements at all.
+ */
+static intptr_t find_next_active(uint64_t *vg, intptr_t reg_off,
+                                 intptr_t reg_max, int esz)
+{
+    uint64_t pg_mask = pred_esz_masks[esz];
+    uint64_t pg = (vg[reg_off >> 6] & pg_mask) >> (reg_off & 63);
+
+    /* In normal usage, the first element is active.  */
+    if (likely(pg & 1)) {
+        return reg_off;
+    }
+
+    if (pg == 0) {
+        reg_off &= -64;
+        do {
+            reg_off += 64;
+            if (unlikely(reg_off >= reg_max)) {
+                /* The entire predicate was false.  */
+                return reg_max;
+            }
+            pg = vg[reg_off >> 6] & pg_mask;
+        } while (pg == 0);
+    }
+    reg_off += ctz64(pg);
+
+    /* We should never see an out of range predicate bit set.  */
+    tcg_debug_assert(reg_off < reg_max);
+    return reg_off;
+}
+
+/*
+ * Return the maximum offset <= @mem_max which is still within the page
+ * referenced by @base + @mem_off.
+ */
+static intptr_t max_for_page(target_ulong base, intptr_t mem_off,
+                             intptr_t mem_max)
+{
+    target_ulong addr = base + mem_off;
+    intptr_t split = -(intptr_t)(addr | TARGET_PAGE_MASK);
+    return MIN(split, mem_max - mem_off) + mem_off;
+}
+
+static inline void set_helper_retaddr(uintptr_t ra)
+{
+#ifdef CONFIG_USER_ONLY
+    helper_retaddr = ra;
+#endif
+}
+
+/*
+ * The result of tlb_vaddr_to_host for user-only is just g2h(x),
+ * which is always non-null.  Elide the useless test.
+ */
+static inline bool test_host_page(void *host)
+{
+#ifdef CONFIG_USER_ONLY
+    return true;
+#else
+    return likely(host != NULL);
+#endif
+}
+
+/*
+ * Common helper for all contiguous one-register predicated loads.
+ */
+static void sve_ld1_r(CPUARMState *env, void *vg, const target_ulong addr,
+                      uint32_t desc, const uintptr_t retaddr,
+                      const int esz, const int msz,
+                      sve_ld1_host_fn *host_fn,
+                      sve_ld1_tlb_fn *tlb_fn)
+{
+    void *vd = &env->vfp.zregs[simd_data(desc)];
+    const int diffsz = esz - msz;
+    const intptr_t reg_max = simd_oprsz(desc);
+    const intptr_t mem_max = reg_max >> diffsz;
+    const int mmu_idx = cpu_mmu_index(env, false);
+    ARMVectorReg scratch;
+    void *host;
+    intptr_t split, reg_off, mem_off;
+
+    /* Find the first active element.  */
+    reg_off = find_next_active(vg, 0, reg_max, esz);
+    if (unlikely(reg_off == reg_max)) {
+        /* The entire predicate was false; no load occurs.  */
+        memset(vd, 0, reg_max);
+        return;
+    }
+    mem_off = reg_off >> diffsz;
+    set_helper_retaddr(retaddr);
+
+    /*
+     * If the (remaining) load is entirely within a single page, then:
+     * For softmmu, and the tlb hits, then no faults will occur;
+     * For user-only, either the first load will fault or none will.
+     * We can thus perform the load directly to the destination and
+     * Vd will be unmodified on any exception path.
+     */
+    split = max_for_page(addr, mem_off, mem_max);
+    if (likely(split == mem_max)) {
+        host = tlb_vaddr_to_host(env, addr + mem_off, MMU_DATA_LOAD, mmu_idx);
+        if (test_host_page(host)) {
+            mem_off = host_fn(vd, vg, host - mem_off, mem_off, mem_max);
+            tcg_debug_assert(mem_off == mem_max);
+            set_helper_retaddr(0);
+            /* After having taken any fault, zero leading inactive elements. */
+            swap_memzero(vd, reg_off);
+            return;
+        }
+    }
+
+    /*
+     * Perform the predicated read into a temporary, thus ensuring
+     * if the load of the last element faults, Vd is not modified.
+     */
+#ifdef CONFIG_USER_ONLY
+    swap_memzero(&scratch, reg_off);
+    host_fn(&scratch, vg, g2h(addr), mem_off, mem_max);
+#else
+    memset(&scratch, 0, reg_max);
+    goto start;
+    while (1) {
+        reg_off = find_next_active(vg, reg_off, reg_max, esz);
+        if (reg_off >= reg_max) {
+            break;
+        }
+        mem_off = reg_off >> diffsz;
+        split = max_for_page(addr, mem_off, mem_max);
+
+    start:
+        if (split - mem_off >= (1 << msz)) {
+            /* At least one whole element on this page.  */
+            host = tlb_vaddr_to_host(env, addr + mem_off,
+                                     MMU_DATA_LOAD, mmu_idx);
+            if (host) {
+                mem_off = host_fn(&scratch, vg, host - mem_off,
+                                  mem_off, split);
+                reg_off = mem_off << diffsz;
+                continue;
+            }
+        }
+
+        /*
+         * Perform one normal read.  This may fault, longjmping out to the
+         * main loop in order to raise an exception.  It may succeed, and
+         * as a side-effect load the TLB entry for the next round.  Finally,
+         * in the extremely unlikely case we're performing this operation
+         * on I/O memory, it may succeed but not bring in the TLB entry.
+         * But even then we have still made forward progress.
+         */
+        tlb_fn(env, &scratch, reg_off, addr + mem_off, mmu_idx, retaddr);
+        reg_off += 1 << esz;
+    }
+#endif
+
+    set_helper_retaddr(0);
+    memcpy(vd, &scratch, reg_max);
+}
+
+#define DO_LD1_1(NAME, ESZ) \
+void HELPER(sve_##NAME##_r)(CPUARMState *env, void *vg, \
+                            target_ulong addr, uint32_t desc) \
+{ \
+    sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, 0, \
+              sve_##NAME##_host, sve_##NAME##_tlb); \
+}
+
+/* TODO: Propagate the endian check back to the translator.  */
+#define DO_LD1_2(NAME, ESZ, MSZ) \
+void HELPER(sve_##NAME##_r)(CPUARMState *env, void *vg, \
+                            target_ulong addr, uint32_t desc) \
+{ \
+    if (arm_cpu_data_is_big_endian(env)) { \
+        sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+                  sve_##NAME##_be_host, sve_##NAME##_be_tlb); \
+    } else { \
+        sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+                  sve_##NAME##_le_host, sve_##NAME##_le_tlb); \
+    } \
+}
+
+DO_LD1_1(ld1bb,  0)
+DO_LD1_1(ld1bhu, 1)
+DO_LD1_1(ld1bhs, 1)
+DO_LD1_1(ld1bsu, 2)
+DO_LD1_1(ld1bss, 2)
+DO_LD1_1(ld1bdu, 3)
+DO_LD1_1(ld1bds, 3)
+
+DO_LD1_2(ld1hh,  1, 1)
+DO_LD1_2(ld1hsu, 2, 1)
+DO_LD1_2(ld1hss, 2, 1)
+DO_LD1_2(ld1hdu, 3, 1)
+DO_LD1_2(ld1hds, 3, 1)
+
+DO_LD1_2(ld1ss,  2, 2)
+DO_LD1_2(ld1sdu, 3, 2)
+DO_LD1_2(ld1sds, 3, 2)
+
+DO_LD1_2(ld1dd,  3, 3)
+
+#undef DO_LD1_1
+#undef DO_LD1_2
+
 #define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \
 void HELPER(NAME)(CPUARMState *env, void *vg, \
                   target_ulong addr, uint32_t desc) \
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(CPUARMState *env, void *vg, \
     } \
 }
 
-DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2)
-DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2)
-DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4)
-DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4)
-DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
-DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
-
-DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
-DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int16_t, H1_4)
-DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
-DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
-
-DO_LD1(sve_ld1sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, )
-DO_LD1(sve_ld1sds_r, cpu_ldl_data_ra, uint64_t, int32_t, )
-
-DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
 DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
 DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
 DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
 
-DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
 DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
 DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
 DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
 
-DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
 DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
 DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
 DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
 
-DO_LD1(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
 DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
 DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
 DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
 
-#undef DO_LD1
 #undef DO_LD2
 #undef DO_LD3
 #undef DO_LD4
 
 /*
  * Load contiguous data, first-fault and no-fault.
+ *
+ * For user-only, one could argue that we should hold the mmap_lock during
+ * the operation so that there is no race between page_check_range and the
+ * load operation.  However, unmapping pages out from under a running thread
+ * is extraordinarily unlikely.  This theoretical race condition also affects
+ * linux-user/ in its get_user/put_user macros.
+ *
+ * TODO: Construct some helpers, written in assembly, that interact with
+ * handle_cpu_signal to produce memory ops which can properly report errors
+ * without racing.
  */
 
-#ifdef CONFIG_USER_ONLY
-
 /* Fault on byte I.  All bits in FFR from I are cleared.  The vector
  * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE
  * option, which leaves subsequent data unchanged.
@@ -XXX,XX +XXX,XX @@ static void record_fault(CPUARMState *env, uintptr_t i, uintptr_t oprsz)
     }
 }
 
-/* Hold the mmap lock during the operation so that there is no race
- * between page_check_range and the load operation.  We expect the
- * usual case to have no faults at all, so we check the whole range
- * first and if successful defer to the normal load operation.
- *
- * TODO: Change mmap_lock to a rwlock so that multiple readers
- * can run simultaneously.  This will probably help other uses
- * within QEMU as well.
+/*
+ * Common helper for all contiguous first-fault loads.
  */
-#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
-static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg, \
-                               target_ulong addr, intptr_t oprsz, \
-                               bool first, uintptr_t ra) \
-{ \
-    intptr_t i = 0; \
-    do { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            TYPEM m = 0; \
-            if (pg & 1) { \
-                if (!first && \
-                    unlikely(page_check_range(addr, sizeof(TYPEM), \
-                                              PAGE_READ))) { \
-                    record_fault(env, i, oprsz); \
-                    return; \
-                } \
-                m = FN(env, addr, ra); \
-                first = false; \
-            } \
-            *(TYPEE *)(vd + H(i)) = m; \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-            addr += sizeof(TYPEM); \
-        } while (i & 15); \
-    } while (i < oprsz); \
-} \
-void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \
-                             target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t oprsz = simd_oprsz(desc); \
-    unsigned rd = simd_data(desc); \
-    void *vd = &env->vfp.zregs[rd]; \
-    mmap_lock(); \
-    if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \
-        do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \
-    } else { \
-        do_sve_ldff1##PART(env, vd, vg, addr, oprsz, true, GETPC()); \
-    } \
-    mmap_unlock(); \
-}
+static void sve_ldff1_r(CPUARMState *env, void *vg, const target_ulong addr,
+                        uint32_t desc, const uintptr_t retaddr,
+                        const int esz, const int msz,
+                        sve_ld1_host_fn *host_fn,
+                        sve_ld1_tlb_fn *tlb_fn)
+{
+    void *vd = &env->vfp.zregs[simd_data(desc)];
+    const int diffsz = esz - msz;
+    const intptr_t reg_max = simd_oprsz(desc);
+    const intptr_t mem_max = reg_max >> diffsz;
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t split, reg_off, mem_off;
+    void *host;
 
-/* No-fault loads are like first-fault loads without the
- * first faulting special case.
- */
-#define DO_LDNF1(PART) \
-void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \
-                             target_ulong addr, uint32_t desc) \
-{ \
-    intptr_t oprsz = simd_oprsz(desc); \
-    unsigned rd = simd_data(desc); \
-    void *vd = &env->vfp.zregs[rd]; \
-    mmap_lock(); \
-    if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \
-        do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \
-    } else { \
-        do_sve_ldff1##PART(env, vd, vg, addr, oprsz, false, GETPC()); \
-    } \
-    mmap_unlock(); \
-}
+    /* Skip to the first active element.  */
+    reg_off = find_next_active(vg, 0, reg_max, esz);
+    if (unlikely(reg_off == reg_max)) {
+        /* The entire predicate was false; no load occurs.  */
+        memset(vd, 0, reg_max);
+        return;
+    }
+    mem_off = reg_off >> diffsz;
+    set_helper_retaddr(retaddr);
 
+    /*
+     * If the (remaining) load is entirely within a single page, then:
+     * For softmmu, and the tlb hits, then no faults will occur;
+     * For user-only, either the first load will fault or none will.
+     * We can thus perform the load directly to the destination and
+     * Vd will be unmodified on any exception path.
+     */
+    split = max_for_page(addr, mem_off, mem_max);
+    if (likely(split == mem_max)) {
+        host = tlb_vaddr_to_host(env, addr + mem_off, MMU_DATA_LOAD, mmu_idx);
+        if (test_host_page(host)) {
+            mem_off = host_fn(vd, vg, host - mem_off, mem_off, mem_max);
+            tcg_debug_assert(mem_off == mem_max);
+            set_helper_retaddr(0);
+            /* After any fault, zero any leading inactive elements.  */
+            swap_memzero(vd, reg_off);
+            return;
+        }
+    }
+
+#ifdef CONFIG_USER_ONLY
+    /*
+     * The page(s) containing this first element at ADDR+MEM_OFF must
+     * be valid.  Considering that this first element may be misaligned
+     * and cross a page boundary itself, take the rest of the page from
+     * the last byte of the element.
+     */
+    split = max_for_page(addr, mem_off + (1 << msz) - 1, mem_max);
+    mem_off = host_fn(vd, vg, g2h(addr), mem_off, split);
+
+    /* After any fault, zero any leading inactive elements.  */
+    swap_memzero(vd, reg_off);
+    reg_off = mem_off << diffsz;
 #else
+    /*
+     * Perform one normal read, which will fault or not.
+     * But it is likely to bring the page into the tlb.
+     */
+    tlb_fn(env, vd, reg_off, addr + mem_off, mmu_idx, retaddr);
 
-/* TODO: System mode is not yet supported.
- * This would probably use tlb_vaddr_to_host.
- */
-#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
-void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \
-                             target_ulong addr, uint32_t desc) \
-{ \
-    g_assert_not_reached(); \
-}
-
-#define DO_LDNF1(PART) \
-void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \
-                             target_ulong addr, uint32_t desc) \
-{ \
-    g_assert_not_reached(); \
-}
+    /* After any fault, zero any leading predicated false elts.  */
+    swap_memzero(vd, reg_off);
+    mem_off += 1 << msz;
+    reg_off += 1 << esz;
 
+    /* Try again to read the balance of the page.  */
+    split = max_for_page(addr, mem_off - 1, mem_max);
+    if (split >= (1 << msz)) {
+        host = tlb_vaddr_to_host(env, addr + mem_off, MMU_DATA_LOAD, mmu_idx);
+        if (host) {
+            mem_off = host_fn(vd, vg, host - mem_off, mem_off, split);
+            reg_off = mem_off << diffsz;
+        }
+    }
 #endif
 
-DO_LDFF1(bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
-DO_LDFF1(bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2)
-DO_LDFF1(bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2)
-DO_LDFF1(bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4)
-DO_LDFF1(bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4)
-DO_LDFF1(bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
-DO_LDFF1(bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
+    set_helper_retaddr(0);
+    record_fault(env, reg_off, reg_max);
+}
 
-DO_LDFF1(hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
-DO_LDFF1(hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
-DO_LDFF1(hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4)
-DO_LDFF1(hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
-DO_LDFF1(hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
+/*
+ * Common helper for all contiguous no-fault loads.
+ */
+static void sve_ldnf1_r(CPUARMState *env, void *vg, const target_ulong addr,
+                        uint32_t desc, const int esz, const int msz,
+                        sve_ld1_host_fn *host_fn)
+{
+    void *vd = &env->vfp.zregs[simd_data(desc)];
+    const int diffsz = esz - msz;
+    const intptr_t reg_max = simd_oprsz(desc);
+    const intptr_t mem_max = reg_max >> diffsz;
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t split, reg_off, mem_off;
+    void *host;
 
-DO_LDFF1(ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
-DO_LDFF1(sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, )
-DO_LDFF1(sds_r, cpu_ldl_data_ra, uint64_t, int32_t, )
+#ifdef CONFIG_USER_ONLY
+    host = tlb_vaddr_to_host(env, addr, MMU_DATA_LOAD, mmu_idx);
+    if (likely(page_check_range(addr, mem_max, PAGE_READ) == 0)) {
+        /* The entire operation is valid and will not fault.  */
+        host_fn(vd, vg, host, 0, mem_max);
+        return;
+    }
+#endif
 
-DO_LDFF1(dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+    /* There will be no fault, so we may modify in advance.  */
+    memset(vd, 0, reg_max);
 
-#undef DO_LDFF1
+    /* Skip to the first active element.  */
+    reg_off = find_next_active(vg, 0, reg_max, esz);
+    if (unlikely(reg_off == reg_max)) {
+        /* The entire predicate was false; no load occurs.  */
+        return;
+    }
+    mem_off = reg_off >> diffsz;
 
-DO_LDNF1(bb_r)
-DO_LDNF1(bhu_r)
-DO_LDNF1(bhs_r)
-DO_LDNF1(bsu_r)
-DO_LDNF1(bss_r)
-DO_LDNF1(bdu_r)
-DO_LDNF1(bds_r)
+#ifdef CONFIG_USER_ONLY
+    if (page_check_range(addr + mem_off, 1 << msz, PAGE_READ) == 0) {
+        /* At least one load is valid; take the rest of the page.  */
+        split = max_for_page(addr, mem_off + (1 << msz) - 1, mem_max);
+        mem_off = host_fn(vd, vg, host, mem_off, split);
+        reg_off = mem_off << diffsz;
+    }
+#else
+    /*
+     * If the address is not in the TLB, we have no way to bring the
+     * entry into the TLB without also risking a fault.  Note that
+     * the corollary is that we never load from an address not in RAM.
+     *
+     * This last is out of spec, in a weird corner case.
+     * Per the MemNF/MemSingleNF pseudocode, a NF load from Device memory
+     * must not actually hit the bus -- it returns UNKNOWN data instead.
+     * But if you map non-RAM with Normal memory attributes and do a NF
+     * load then it should access the bus.  (Nobody ought actually do this
+     * in the real world, obviously.)
+     *
+     * Then there are the annoying special cases with watchpoints...
+     *
+     * TODO: Add a form of tlb_fill that does not raise an exception,
+     * with a form of tlb_vaddr_to_host and a set of loads to match.
+     * The non_fault_vaddr_to_host would handle everything, usually,
+     * and the loads would handle the iomem path for watchpoints.
+     */
+    host = tlb_vaddr_to_host(env, addr + mem_off, MMU_DATA_LOAD, mmu_idx);
+    split = max_for_page(addr, mem_off, mem_max);
+    if (host && split >= (1 << msz)) {
+        mem_off = host_fn(vd, vg, host - mem_off, mem_off, split);
+        reg_off = mem_off << diffsz;
+    }
+#endif
 
-DO_LDNF1(hh_r)
-DO_LDNF1(hsu_r)
-DO_LDNF1(hss_r)
-DO_LDNF1(hdu_r)
-DO_LDNF1(hds_r)
+    record_fault(env, reg_off, reg_max);
+}
 
-DO_LDNF1(ss_r)
-DO_LDNF1(sdu_r)
-DO_LDNF1(sds_r)
+#define DO_LDFF1_LDNF1_1(PART, ESZ) \
+void HELPER(sve_ldff1##PART##_r)(CPUARMState *env, void *vg, \
+                                 target_ulong addr, uint32_t desc) \
+{ \
+    sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, 0, \
+                sve_ld1##PART##_host, sve_ld1##PART##_tlb); \
+} \
+void HELPER(sve_ldnf1##PART##_r)(CPUARMState *env, void *vg, \
+                                 target_ulong addr, uint32_t desc) \
+{ \
+    sve_ldnf1_r(env, vg, addr, desc, ESZ, 0, sve_ld1##PART##_host); \
+}
 
-DO_LDNF1(dd_r)
+/* TODO: Propagate the endian check back to the translator.  */
+#define DO_LDFF1_LDNF1_2(PART, ESZ, MSZ) \
+void HELPER(sve_ldff1##PART##_r)(CPUARMState *env, void *vg, \
+                                 target_ulong addr, uint32_t desc) \
+{ \
+    if (arm_cpu_data_is_big_endian(env)) { \
+        sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+                    sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
+    } else { \
+        sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+                    sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
+    } \
+} \
+void HELPER(sve_ldnf1##PART##_r)(CPUARMState *env, void *vg, \
+                                 target_ulong addr, uint32_t desc) \
+{ \
+    if (arm_cpu_data_is_big_endian(env)) { \
+        sve_ldnf1_r(env, vg, addr, desc, ESZ, MSZ, \
+                    sve_ld1##PART##_be_host); \
+    } else { \
+        sve_ldnf1_r(env, vg, addr, desc, ESZ, MSZ, \
+                    sve_ld1##PART##_le_host); \
+    } \
+}
 
-#undef DO_LDNF1
+DO_LDFF1_LDNF1_1(bb,  0)
+DO_LDFF1_LDNF1_1(bhu, 1)
+DO_LDFF1_LDNF1_1(bhs, 1)
+DO_LDFF1_LDNF1_1(bsu, 2)
+DO_LDFF1_LDNF1_1(bss, 2)
+DO_LDFF1_LDNF1_1(bdu, 3)
+DO_LDFF1_LDNF1_1(bds, 3)
+
+DO_LDFF1_LDNF1_2(hh,  1, 1)
+DO_LDFF1_LDNF1_2(hsu, 2, 1)
+DO_LDFF1_LDNF1_2(hss, 2, 1)
+DO_LDFF1_LDNF1_2(hdu, 3, 1)
+DO_LDFF1_LDNF1_2(hds, 3, 1)
+
+DO_LDFF1_LDNF1_2(ss,  2, 2)
+DO_LDFF1_LDNF1_2(sdu, 3, 2)
+DO_LDFF1_LDNF1_2(sds, 3, 2)
+
+DO_LDFF1_LDNF1_2(dd,  3, 3)
+
+#undef DO_LDFF1_LDNF1_1
+#undef DO_LDFF1_LDNF1_2
 
 /*
  * Store contiguous data, protected by a governing predicate.
--
2.19.0
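A note on why swap_memzero (in the sve_helper.c patch above) cannot always use plain memset: on big-endian hosts the H1/H1_2/H1_4 macros XOR the address, so the first reg_off bytes of the register are not the first reg_off bytes of host memory. A worked instance (mine, assuming H1_4(x) = (x) ^ 4 as on 8-byte big-endian hosts): zeroing n = 4 bytes at offset 0 must clear host bytes 4..7 of the first 8-byte lane, which is exactly what the 'case 4' loop's *(uint32_t *)H1_4(d + 0) = 0 store does. The (d | n) & 7 dispatch simply picks the widest store unit consistent with the alignment of both the pointer and the length.
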
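A worked example for find_next_active (my simplified single-predicate-word copy; esz = 2 means 4-byte elements, so pred_esz_masks[2] = 0x1111111111111111 and only every fourth bit participates):

    #include <assert.h>
    #include <stdint.h>

    /* One 64-bit predicate word, esz = 2; returns 64 when nothing is active. */
    static int find_next_active_1word(uint64_t word, int reg_off)
    {
        const uint64_t pg_mask = 0x1111111111111111ull;
        uint64_t pg = (word & pg_mask) >> (reg_off & 63);
        if (pg & 1) {
            return reg_off;                          /* fast path */
        }
        return pg ? reg_off + __builtin_ctzll(pg) : 64;
    }

    int main(void)
    {
        /* Elements 0 and 1 inactive, element 2 active: predicate bit 8 set. */
        assert(find_next_active_1word(1ull << 8, 0) == 8);
        /* Fully false predicate word: reports "past the end". */
        assert(find_next_active_1word(0, 0) == 64);
        return 0;
    }

The real helper adds the multi-word loop and the tcg_debug_assert that no predicate bit beyond reg_max is ever set — the invariant the vector-length patch earlier in this queue establishes.
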
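The page-split arithmetic in max_for_page, worked through with concrete numbers (mine; 4K pages assumed, so TARGET_PAGE_MASK = ~0xfff):

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        const intptr_t page_mask = ~(intptr_t)0xfff;   /* 4K pages assumed */
        uint64_t addr = 0x10ffc;                       /* 4 bytes before a boundary */
        intptr_t split = -(intptr_t)(addr | page_mask);
        assert(split == 4);    /* bytes remaining on the current page */
        return 0;
    }

OR-ing with the page mask sets every bit above the page offset, so negating the result yields (page size - offset) in one step; MIN against mem_max - mem_off then keeps the chunk inside both the page and the operation.
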
diff view generated by jsdifflib
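Hand-expanding DO_LD1_2(ld1hh, 1, 1) from the patch above shows the endian dispatch concretely (my illustrative expansion, not compiler output):

    void HELPER(sve_ld1hh_r)(CPUARMState *env, void *vg,
                             target_ulong addr, uint32_t desc)
    {
        if (arm_cpu_data_is_big_endian(env)) {
            sve_ld1_r(env, vg, addr, desc, GETPC(), 1, 1,
                      sve_ld1hh_be_host, sve_ld1hh_be_tlb);
        } else {
            sve_ld1_r(env, vg, addr, desc, GETPC(), 1, 1,
                      sve_ld1hh_le_host, sve_ld1hh_le_tlb);
        }
    }

One runtime check selects between the _be/_le primitive pairs; the in-code TODO notes this check could eventually move back into the translator.
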
The Arm v8M architecture includes hardware stack limit checking.
When certain instructions update the stack pointer, if the new
value of SP is below the limit set in the associated limit register
then an exception is taken.  Add a TB flag that tracks whether
the limit-checking code needs to be emitted.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181002163556.10279-2-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  7 +++++++
 target/arm/translate.h |  1 +
 target/arm/helper.c    | 10 ++++++++++
 target/arm/translate.c |  1 +
 4 files changed, 19 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CCR, UNALIGN_TRP, 3, 1)
 FIELD(V7M_CCR, DIV_0_TRP, 4, 1)
 FIELD(V7M_CCR, BFHFNMIGN, 8, 1)
 FIELD(V7M_CCR, STKALIGN, 9, 1)
+FIELD(V7M_CCR, STKOFHFNMIGN, 10, 1)
 FIELD(V7M_CCR, DC, 16, 1)
 FIELD(V7M_CCR, IC, 17, 1)
+FIELD(V7M_CCR, BP, 18, 1)
 
 /* V7M SCR bits */
 FIELD(V7M_SCR, SLEEPONEXIT, 1, 1)
@@ -XXX,XX +XXX,XX @@ static inline bool arm_cpu_data_is_big_endian(CPUARMState *env)
 /* For M profile only, Handler (ie not Thread) mode */
 #define ARM_TBFLAG_HANDLER_SHIFT    21
 #define ARM_TBFLAG_HANDLER_MASK     (1 << ARM_TBFLAG_HANDLER_SHIFT)
+/* For M profile only, whether we should generate stack-limit checks */
+#define ARM_TBFLAG_STACKCHECK_SHIFT 22
+#define ARM_TBFLAG_STACKCHECK_MASK  (1 << ARM_TBFLAG_STACKCHECK_SHIFT)
 
 /* Bit usage when in AArch64 state */
 #define ARM_TBFLAG_TBI0_SHIFT 0        /* TBI0 for EL0/1 or TBI for EL2/3 */
@@ -XXX,XX +XXX,XX @@ static inline bool arm_cpu_data_is_big_endian(CPUARMState *env)
     (((F) & ARM_TBFLAG_BE_DATA_MASK) >> ARM_TBFLAG_BE_DATA_SHIFT)
 #define ARM_TBFLAG_HANDLER(F) \
     (((F) & ARM_TBFLAG_HANDLER_MASK) >> ARM_TBFLAG_HANDLER_SHIFT)
+#define ARM_TBFLAG_STACKCHECK(F) \
+    (((F) & ARM_TBFLAG_STACKCHECK_MASK) >> ARM_TBFLAG_STACKCHECK_SHIFT)
 #define ARM_TBFLAG_TBI0(F) \
     (((F) & ARM_TBFLAG_TBI0_MASK) >> ARM_TBFLAG_TBI0_SHIFT)
 #define ARM_TBFLAG_TBI1(F) \
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     int vec_stride;
     bool v7m_handler_mode;
     bool v8m_secure; /* true if v8M and we're in Secure mode */
+    bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags |= ARM_TBFLAG_HANDLER_MASK;
     }
 
+    /* v8M always applies stack limit checks unless CCR.STKOFHFNMIGN is
+     * suppressing them because the requested execution priority is less than 0.
+     */
+    if (arm_feature(env, ARM_FEATURE_V8) &&
+        arm_feature(env, ARM_FEATURE_M) &&
+        !((mmu_idx & ARM_MMU_IDX_M_NEGPRI) &&
+          (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_STKOFHFNMIGN_MASK))) {
+        flags |= ARM_TBFLAG_STACKCHECK_MASK;
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->v7m_handler_mode = ARM_TBFLAG_HANDLER(dc->base.tb->flags);
     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
         regime_is_secure(env, dc->mmu_idx);
+    dc->v8m_stackcheck = ARM_TBFLAG_STACKCHECK(dc->base.tb->flags);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
--
2.19.0

From: Richard Henderson <richard.henderson@linaro.org>

We want to ensure that access is checked by the time we ask
for a specific fp/vector register.  We want to ensure that
we do not emit two lots of code to raise an exception.

But sometimes it's difficult to cleanly organize the code
such that we never pass through sve_check_access exactly once.
Allow multiple calls so long as the result is true, that is,
no exception to be raised.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |  1 +
 target/arm/translate-a64.c | 27 ++++++++++++++++-----------
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      * that it is set at the point where we actually touch the FP regs.
      */
     bool fp_access_checked;
+    bool sve_access_checked;
     /* ARMv8 single-step state (this is distinct from the QEMU gdbstub
      * single-step support).
      */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void do_vec_ld(DisasContext *s, int destidx, int element,
  * unallocated-encoding checks (otherwise the syndrome information
  * for the resulting exception will be incorrect).
  */
-static inline bool fp_access_check(DisasContext *s)
+static bool fp_access_check(DisasContext *s)
 {
-    assert(!s->fp_access_checked);
-    s->fp_access_checked = true;
+    if (s->fp_excp_el) {
+        assert(!s->fp_access_checked);
+        s->fp_access_checked = true;
 
-    if (!s->fp_excp_el) {
-        return true;
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+        return false;
     }
-
-    gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-                       syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
-    return false;
+    s->fp_access_checked = true;
+    return true;
 }
 
 /* Check that SVE access is enabled.  If it is, return true.
@@ -XXX,XX +XXX,XX @@ static inline bool fp_access_check(DisasContext *s)
 bool sve_access_check(DisasContext *s)
 {
     if (s->sve_excp_el) {
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_sve_access_trap(),
-                           s->sve_excp_el);
+        assert(!s->sve_access_checked);
+        s->sve_access_checked = true;
+
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+                           syn_sve_access_trap(), s->sve_excp_el);
         return false;
     }
+    s->sve_access_checked = true;
     return fp_access_check(s);
 }
 
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
     s->base.pc_next += 4;
 
     s->fp_access_checked = false;
+    s->sve_access_checked = false;
 
     if (dc_isar_feature(aa64_bti, s)) {
         if (s->base.num_insns == 1) {
--
2.19.0
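What the relaxed sve_access_check assertion permits, in sketch form (hypothetical decode function and helper; the point is only that the check may now be reached twice so long as it succeeded the first time):

    static bool trans_EXAMPLE(DisasContext *s, arg_rrr_esz *a)  /* hypothetical */
    {
        if (!sve_access_check(s)) {   /* first call: may raise, sets the flag */
            return true;
        }
        /* ... decode detail that funnels into a shared expander ... */
        if (!sve_access_check(s)) {   /* second call: must succeed */
            return true;
        }
        return do_something(s, a);    /* hypothetical shared expander */
    }

Only the exception-raising path still asserts single entry, so two lots of trap code can never be emitted.
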
From: Richard Henderson <richard.henderson@linaro.org>

This fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    |  52 ++++++++++----
 target/arm/sve_helper.c    | 139 ++++++++++++++++++++++++-------------
 target/arm/translate-sve.c |  74 +++++++++++++-------
 3 files changed, 177 insertions(+), 88 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

This is the only user of the function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG,
 
 DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_le_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_be_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_le_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_be_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_le_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_stsd_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_le_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_stsd_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_le_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_be_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_stsd_le_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
 
 /* Stores with a vector index. */
 
-#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
-void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
-                  target_ulong base, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc); \
-    unsigned scale = simd_data(desc); \
-    uintptr_t ra = GETPC(); \
-    for (i = 0; i < oprsz; ) { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            if (likely(pg & 1)) { \
-                target_ulong off = *(TYPEI *)(vm + H1_4(i)); \
-                uint32_t d = *(uint32_t *)(vd + H1_4(i)); \
-                FN(env, base + (off << scale), d, ra); \
-            } \
-            i += sizeof(uint32_t), pg >>= sizeof(uint32_t); \
-        } while (i & 15); \
-    } \
+static void sve_st1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
+                       target_ulong base, uint32_t desc, uintptr_t ra,
+                       zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
+{
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t i, oprsz = simd_oprsz(desc);
+    unsigned scale = simd_data(desc);
+
+    set_helper_retaddr(ra);
+    for (i = 0; i < oprsz; ) {
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+        do {
+            if (likely(pg & 1)) {
+                target_ulong off = off_fn(vm, i);
+                tlb_fn(env, vd, i, base + (off << scale), mmu_idx, ra);
+            }
+            i += 4, pg >>= 4;
+        } while (i & 15);
+    }
+    set_helper_retaddr(0);
 }
 
-#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \
-void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
-                  target_ulong base, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
-    unsigned scale = simd_data(desc); \
-    uintptr_t ra = GETPC(); \
-    uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
-    for (i = 0; i < oprsz; i++) { \
-        if (likely(pg[H1(i)] & 1)) { \
-            target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \
-            FN(env, base + off, d[i], ra); \
-        } \
-    } \
+static void sve_st1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
+                       target_ulong base, uint32_t desc, uintptr_t ra,
+                       zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
+{
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t i, oprsz = simd_oprsz(desc) / 8;
+    unsigned scale = simd_data(desc);
+
+    set_helper_retaddr(ra);
+    for (i = 0; i < oprsz; i++) {
+        uint8_t pg = *(uint8_t *)(vg + H1(i));
+        if (likely(pg & 1)) {
+            target_ulong off = off_fn(vm, i * 8);
+            tlb_fn(env, vd, i * 8, base + (off << scale), mmu_idx, ra);
+        }
+    }
+    set_helper_retaddr(0);
 }
 
-DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra)
-DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra)
-DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra)
+#define DO_ST1_ZPZ_S(MEM, OFS) \
+void __attribute__((flatten)) HELPER(sve_st##MEM##_##OFS) \
+    (CPUARMState *env, void *vd, void *vg, void *vm, \
+     target_ulong base, uint32_t desc) \
+{ \
+    sve_st1_zs(env, vd, vg, vm, base, desc, GETPC(), \
+               off_##OFS##_s, sve_st1##MEM##_tlb); \
+}
 
-DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra)
-DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra)
-DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra)
+#define DO_ST1_ZPZ_D(MEM, OFS) \
+void __attribute__((flatten)) HELPER(sve_st##MEM##_##OFS) \
+    (CPUARMState *env, void *vd, void *vg, void *vm, \
+     target_ulong base, uint32_t desc) \
+{ \
+    sve_st1_zd(env, vd, vg, vm, base, desc, GETPC(), \
+               off_##OFS##_d, sve_st1##MEM##_tlb); \
+}
 
-DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra)
-DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra)
-DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra)
-DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra)
+DO_ST1_ZPZ_S(bs, zsu)
+DO_ST1_ZPZ_S(hs_le, zsu)
213
+DO_ST1_ZPZ_S(hs_be, zsu)
214
+DO_ST1_ZPZ_S(ss_le, zsu)
215
+DO_ST1_ZPZ_S(ss_be, zsu)
216
217
-DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra)
218
-DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra)
219
-DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra)
220
-DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra)
221
+DO_ST1_ZPZ_S(bs, zss)
222
+DO_ST1_ZPZ_S(hs_le, zss)
223
+DO_ST1_ZPZ_S(hs_be, zss)
224
+DO_ST1_ZPZ_S(ss_le, zss)
225
+DO_ST1_ZPZ_S(ss_be, zss)
226
227
-DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, cpu_stb_data_ra)
228
-DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra)
229
-DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra)
230
-DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra)
231
+DO_ST1_ZPZ_D(bd, zsu)
232
+DO_ST1_ZPZ_D(hd_le, zsu)
233
+DO_ST1_ZPZ_D(hd_be, zsu)
234
+DO_ST1_ZPZ_D(sd_le, zsu)
235
+DO_ST1_ZPZ_D(sd_be, zsu)
236
+DO_ST1_ZPZ_D(dd_le, zsu)
237
+DO_ST1_ZPZ_D(dd_be, zsu)
238
+
239
+DO_ST1_ZPZ_D(bd, zss)
240
+DO_ST1_ZPZ_D(hd_le, zss)
241
+DO_ST1_ZPZ_D(hd_be, zss)
242
+DO_ST1_ZPZ_D(sd_le, zss)
243
+DO_ST1_ZPZ_D(sd_be, zss)
244
+DO_ST1_ZPZ_D(dd_le, zss)
245
+DO_ST1_ZPZ_D(dd_be, zss)
246
+
247
+DO_ST1_ZPZ_D(bd, zd)
248
+DO_ST1_ZPZ_D(hd_le, zd)
249
+DO_ST1_ZPZ_D(hd_be, zd)
250
+DO_ST1_ZPZ_D(sd_le, zd)
251
+DO_ST1_ZPZ_D(sd_be, zd)
252
+DO_ST1_ZPZ_D(dd_le, zd)
253
+DO_ST1_ZPZ_D(dd_be, zd)
254
+
255
+#undef DO_ST1_ZPZ_S
256
+#undef DO_ST1_ZPZ_D
257
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
13
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
258
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
259
--- a/target/arm/translate-sve.c
15
--- a/target/arm/translate-sve.c
260
+++ b/target/arm/translate-sve.c
16
+++ b/target/arm/translate-sve.c
261
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
17
@@ -XXX,XX +XXX,XX @@ static void do_dupi_z(DisasContext *s, int rd, uint64_t word)
262
return true;
18
tcg_gen_gvec_dup_imm(MO_64, vec_full_reg_offset(s, rd), vsz, vsz, word);
263
}
19
}
264
20
265
-/* Indexed by [xs][msz]. */
21
-/* Invoke a vector expander on two Pregs. */
266
-static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = {
22
-static bool do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn,
267
- { gen_helper_sve_stbs_zsu,
23
- int esz, int rd, int rn)
268
- gen_helper_sve_sths_zsu,
24
-{
269
- gen_helper_sve_stss_zsu, },
25
- if (sve_access_check(s)) {
270
- { gen_helper_sve_stbs_zss,
26
- unsigned psz = pred_gvec_reg_size(s);
271
- gen_helper_sve_sths_zss,
27
- gvec_fn(esz, pred_full_reg_offset(s, rd),
272
- gen_helper_sve_stss_zss, },
28
- pred_full_reg_offset(s, rn), psz, psz);
273
+/* Indexed by [be][xs][msz]. */
29
- }
274
+static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][2][3] = {
30
- return true;
275
+ /* Little-endian */
31
-}
276
+ { { gen_helper_sve_stbs_zsu,
32
-
277
+ gen_helper_sve_sths_le_zsu,
33
/* Invoke a vector expander on three Pregs. */
278
+ gen_helper_sve_stss_le_zsu, },
34
static bool do_vector3_p(DisasContext *s, GVecGen3Fn *gvec_fn,
279
+ { gen_helper_sve_stbs_zss,
35
int esz, int rd, int rn, int rm)
280
+ gen_helper_sve_sths_le_zss,
36
@@ -XXX,XX +XXX,XX @@ static bool do_vecop4_p(DisasContext *s, const GVecGen4 *gvec_op,
281
+ gen_helper_sve_stss_le_zss, } },
37
/* Invoke a vector move on two Pregs. */
282
+ /* Big-endian */
38
static bool do_mov_p(DisasContext *s, int rd, int rn)
283
+ { { gen_helper_sve_stbs_zsu,
284
+ gen_helper_sve_sths_be_zsu,
285
+ gen_helper_sve_stss_be_zsu, },
286
+ { gen_helper_sve_stbs_zss,
287
+ gen_helper_sve_sths_be_zss,
288
+ gen_helper_sve_stss_be_zss, } },
289
};
290
291
/* Note that we overload xs=2 to indicate 64-bit offset. */
292
-static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = {
293
- { gen_helper_sve_stbd_zsu,
294
- gen_helper_sve_sthd_zsu,
295
- gen_helper_sve_stsd_zsu,
296
- gen_helper_sve_stdd_zsu, },
297
- { gen_helper_sve_stbd_zss,
298
- gen_helper_sve_sthd_zss,
299
- gen_helper_sve_stsd_zss,
300
- gen_helper_sve_stdd_zss, },
301
- { gen_helper_sve_stbd_zd,
302
- gen_helper_sve_sthd_zd,
303
- gen_helper_sve_stsd_zd,
304
- gen_helper_sve_stdd_zd, },
305
+static gen_helper_gvec_mem_scatter * const scatter_store_fn64[2][3][4] = {
306
+ /* Little-endian */
307
+ { { gen_helper_sve_stbd_zsu,
308
+ gen_helper_sve_sthd_le_zsu,
309
+ gen_helper_sve_stsd_le_zsu,
310
+ gen_helper_sve_stdd_le_zsu, },
311
+ { gen_helper_sve_stbd_zss,
312
+ gen_helper_sve_sthd_le_zss,
313
+ gen_helper_sve_stsd_le_zss,
314
+ gen_helper_sve_stdd_le_zss, },
315
+ { gen_helper_sve_stbd_zd,
316
+ gen_helper_sve_sthd_le_zd,
317
+ gen_helper_sve_stsd_le_zd,
318
+ gen_helper_sve_stdd_le_zd, } },
319
+ /* Big-endian */
320
+ { { gen_helper_sve_stbd_zsu,
321
+ gen_helper_sve_sthd_be_zsu,
322
+ gen_helper_sve_stsd_be_zsu,
323
+ gen_helper_sve_stdd_be_zsu, },
324
+ { gen_helper_sve_stbd_zss,
325
+ gen_helper_sve_sthd_be_zss,
326
+ gen_helper_sve_stsd_be_zss,
327
+ gen_helper_sve_stdd_be_zss, },
328
+ { gen_helper_sve_stbd_zd,
329
+ gen_helper_sve_sthd_be_zd,
330
+ gen_helper_sve_stsd_be_zd,
331
+ gen_helper_sve_stdd_be_zd, } },
332
};
333
334
static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
335
{
39
{
336
gen_helper_gvec_mem_scatter *fn;
40
- return do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn);
337
+ int be = s->be_data == MO_BE;
41
+ if (sve_access_check(s)) {
338
42
+ unsigned psz = pred_gvec_reg_size(s);
339
if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
43
+ tcg_gen_gvec_mov(MO_8, pred_full_reg_offset(s, rd),
340
return false;
44
+ pred_full_reg_offset(s, rn), psz, psz);
341
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
45
+ }
342
}
46
+ return true;
343
switch (a->esz) {
47
}
344
case MO_32:
48
345
- fn = scatter_store_fn32[a->xs][a->msz];
49
/* Set the cpu flags as per a return from an SVE helper. */
346
+ fn = scatter_store_fn32[be][a->xs][a->msz];
347
break;
348
case MO_64:
349
- fn = scatter_store_fn64[a->xs][a->msz];
350
+ fn = scatter_store_fn64[be][a->xs][a->msz];
351
break;
352
default:
353
g_assert_not_reached();
354
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
355
static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
356
{
357
gen_helper_gvec_mem_scatter *fn = NULL;
358
+ int be = s->be_data == MO_BE;
359
TCGv_i64 imm;
360
361
if (a->esz < a->msz) {
362
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
363
364
switch (a->esz) {
365
case MO_32:
366
- fn = scatter_store_fn32[0][a->msz];
367
+ fn = scatter_store_fn32[be][0][a->msz];
368
break;
369
case MO_64:
370
- fn = scatter_store_fn64[2][a->msz];
371
+ fn = scatter_store_fn64[be][2][a->msz];
372
break;
373
}
374
assert(fn != NULL);
375
--
50
--
376
2.19.0
51
2.20.1
377
52
378
53
diff view generated by jsdifflib
New patch
1
1
From: Richard Henderson <richard.henderson@linaro.org>
2
3
Move the check for !S into do_pppp_flags, which allows to merge in
4
do_vecop4_p. Split out gen_gvec_fn_ppp without sve_access_check,
5
to mirror gen_gvec_fn_zzz.
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20200815013145.539409-7-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
target/arm/translate-sve.c | 111 ++++++++++++++-----------------------
13
1 file changed, 43 insertions(+), 68 deletions(-)
14
15
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-sve.c
18
+++ b/target/arm/translate-sve.c
19
@@ -XXX,XX +XXX,XX @@ static void do_dupi_z(DisasContext *s, int rd, uint64_t word)
20
}
21
22
/* Invoke a vector expander on three Pregs. */
23
-static bool do_vector3_p(DisasContext *s, GVecGen3Fn *gvec_fn,
24
- int esz, int rd, int rn, int rm)
25
+static void gen_gvec_fn_ppp(DisasContext *s, GVecGen3Fn *gvec_fn,
26
+ int rd, int rn, int rm)
27
{
28
- if (sve_access_check(s)) {
29
- unsigned psz = pred_gvec_reg_size(s);
30
- gvec_fn(esz, pred_full_reg_offset(s, rd),
31
- pred_full_reg_offset(s, rn),
32
- pred_full_reg_offset(s, rm), psz, psz);
33
- }
34
- return true;
35
-}
36
-
37
-/* Invoke a vector operation on four Pregs. */
38
-static bool do_vecop4_p(DisasContext *s, const GVecGen4 *gvec_op,
39
- int rd, int rn, int rm, int rg)
40
-{
41
- if (sve_access_check(s)) {
42
- unsigned psz = pred_gvec_reg_size(s);
43
- tcg_gen_gvec_4(pred_full_reg_offset(s, rd),
44
- pred_full_reg_offset(s, rn),
45
- pred_full_reg_offset(s, rm),
46
- pred_full_reg_offset(s, rg),
47
- psz, psz, gvec_op);
48
- }
49
- return true;
50
+ unsigned psz = pred_gvec_reg_size(s);
51
+ gvec_fn(MO_64, pred_full_reg_offset(s, rd),
52
+ pred_full_reg_offset(s, rn),
53
+ pred_full_reg_offset(s, rm), psz, psz);
54
}
55
56
/* Invoke a vector move on two Pregs. */
57
@@ -XXX,XX +XXX,XX @@ static bool do_pppp_flags(DisasContext *s, arg_rprr_s *a,
58
int mofs = pred_full_reg_offset(s, a->rm);
59
int gofs = pred_full_reg_offset(s, a->pg);
60
61
+ if (!a->s) {
62
+ tcg_gen_gvec_4(dofs, nofs, mofs, gofs, psz, psz, gvec_op);
63
+ return true;
64
+ }
65
+
66
if (psz == 8) {
67
/* Do the operation and the flags generation in temps. */
68
TCGv_i64 pd = tcg_temp_new_i64();
69
@@ -XXX,XX +XXX,XX @@ static bool trans_AND_pppp(DisasContext *s, arg_rprr_s *a)
70
.fno = gen_helper_sve_and_pppp,
71
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
72
};
73
- if (a->s) {
74
- return do_pppp_flags(s, a, &op);
75
- } else if (a->rn == a->rm) {
76
- if (a->pg == a->rn) {
77
- return do_mov_p(s, a->rd, a->rn);
78
- } else {
79
- return do_vector3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->pg);
80
+
81
+ if (!a->s) {
82
+ if (!sve_access_check(s)) {
83
+ return true;
84
+ }
85
+ if (a->rn == a->rm) {
86
+ if (a->pg == a->rn) {
87
+ do_mov_p(s, a->rd, a->rn);
88
+ } else {
89
+ gen_gvec_fn_ppp(s, tcg_gen_gvec_and, a->rd, a->rn, a->pg);
90
+ }
91
+ return true;
92
+ } else if (a->pg == a->rn || a->pg == a->rm) {
93
+ gen_gvec_fn_ppp(s, tcg_gen_gvec_and, a->rd, a->rn, a->rm);
94
+ return true;
95
}
96
- } else if (a->pg == a->rn || a->pg == a->rm) {
97
- return do_vector3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm);
98
- } else {
99
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
100
}
101
+ return do_pppp_flags(s, a, &op);
102
}
103
104
static void gen_bic_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
105
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_pppp(DisasContext *s, arg_rprr_s *a)
106
.fno = gen_helper_sve_bic_pppp,
107
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
108
};
109
- if (a->s) {
110
- return do_pppp_flags(s, a, &op);
111
- } else if (a->pg == a->rn) {
112
- return do_vector3_p(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
113
- } else {
114
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
115
+
116
+ if (!a->s && a->pg == a->rn) {
117
+ if (sve_access_check(s)) {
118
+ gen_gvec_fn_ppp(s, tcg_gen_gvec_andc, a->rd, a->rn, a->rm);
119
+ }
120
+ return true;
121
}
122
+ return do_pppp_flags(s, a, &op);
123
}
124
125
static void gen_eor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
126
@@ -XXX,XX +XXX,XX @@ static bool trans_EOR_pppp(DisasContext *s, arg_rprr_s *a)
127
.fno = gen_helper_sve_eor_pppp,
128
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
129
};
130
- if (a->s) {
131
- return do_pppp_flags(s, a, &op);
132
- } else {
133
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
134
- }
135
+ return do_pppp_flags(s, a, &op);
136
}
137
138
static void gen_sel_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
139
@@ -XXX,XX +XXX,XX @@ static bool trans_SEL_pppp(DisasContext *s, arg_rprr_s *a)
140
.fno = gen_helper_sve_sel_pppp,
141
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
142
};
143
+
144
if (a->s) {
145
return false;
146
- } else {
147
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
148
}
149
+ return do_pppp_flags(s, a, &op);
150
}
151
152
static void gen_orr_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
153
@@ -XXX,XX +XXX,XX @@ static bool trans_ORR_pppp(DisasContext *s, arg_rprr_s *a)
154
.fno = gen_helper_sve_orr_pppp,
155
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
156
};
157
- if (a->s) {
158
- return do_pppp_flags(s, a, &op);
159
- } else if (a->pg == a->rn && a->rn == a->rm) {
160
+
161
+ if (!a->s && a->pg == a->rn && a->rn == a->rm) {
162
return do_mov_p(s, a->rd, a->rn);
163
- } else {
164
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
165
}
166
+ return do_pppp_flags(s, a, &op);
167
}
168
169
static void gen_orn_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
170
@@ -XXX,XX +XXX,XX @@ static bool trans_ORN_pppp(DisasContext *s, arg_rprr_s *a)
171
.fno = gen_helper_sve_orn_pppp,
172
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
173
};
174
- if (a->s) {
175
- return do_pppp_flags(s, a, &op);
176
- } else {
177
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
178
- }
179
+ return do_pppp_flags(s, a, &op);
180
}
181
182
static void gen_nor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
183
@@ -XXX,XX +XXX,XX @@ static bool trans_NOR_pppp(DisasContext *s, arg_rprr_s *a)
184
.fno = gen_helper_sve_nor_pppp,
185
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
186
};
187
- if (a->s) {
188
- return do_pppp_flags(s, a, &op);
189
- } else {
190
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
191
- }
192
+ return do_pppp_flags(s, a, &op);
193
}
194
195
static void gen_nand_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
196
@@ -XXX,XX +XXX,XX @@ static bool trans_NAND_pppp(DisasContext *s, arg_rprr_s *a)
197
.fno = gen_helper_sve_nand_pppp,
198
.prefer_i64 = TCG_TARGET_REG_BITS == 64,
199
};
200
- if (a->s) {
201
- return do_pppp_flags(s, a, &op);
202
- } else {
203
- return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
204
- }
205
+ return do_pppp_flags(s, a, &op);
206
}
207
208
/*
209
--
210
2.20.1
211
212
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This fixes the endianness problem for softmmu, and moves
3
The gvec operation was added after the initial implementation
4
the main loop out of a macro and into an inlined function.
4
of the SEL instruction and was missed in the conversion.
5
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
8
Message-id: 20200815013145.539409-8-richard.henderson@linaro.org
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20181005175350.30752-13-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/helper-sve.h | 84 +++++++++----
11
target/arm/translate-sve.c | 31 ++++++++-----------------------
13
target/arm/sve_helper.c | 225 ++++++++++++++++++++++++----------
12
1 file changed, 8 insertions(+), 23 deletions(-)
14
target/arm/translate-sve.c | 244 +++++++++++++++++++++++++------------
15
3 files changed, 386 insertions(+), 167 deletions(-)
16
13
17
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-sve.h
20
+++ b/target/arm/helper-sve.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_st1sd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
22
23
DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
24
void, env, ptr, ptr, ptr, tl, i32)
25
-DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG,
26
+DEF_HELPER_FLAGS_6(sve_ldhsu_le_zsu, TCG_CALL_NO_WG,
27
void, env, ptr, ptr, ptr, tl, i32)
28
-DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG,
29
+DEF_HELPER_FLAGS_6(sve_ldhsu_be_zsu, TCG_CALL_NO_WG,
30
+ void, env, ptr, ptr, ptr, tl, i32)
31
+DEF_HELPER_FLAGS_6(sve_ldss_le_zsu, TCG_CALL_NO_WG,
32
+ void, env, ptr, ptr, ptr, tl, i32)
33
+DEF_HELPER_FLAGS_6(sve_ldss_be_zsu, TCG_CALL_NO_WG,
34
void, env, ptr, ptr, ptr, tl, i32)
35
DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG,
36
void, env, ptr, ptr, ptr, tl, i32)
37
-DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG,
38
+DEF_HELPER_FLAGS_6(sve_ldhss_le_zsu, TCG_CALL_NO_WG,
39
+ void, env, ptr, ptr, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_6(sve_ldhss_be_zsu, TCG_CALL_NO_WG,
41
void, env, ptr, ptr, ptr, tl, i32)
42
43
DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG,
44
void, env, ptr, ptr, ptr, tl, i32)
45
-DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG,
46
+DEF_HELPER_FLAGS_6(sve_ldhsu_le_zss, TCG_CALL_NO_WG,
47
void, env, ptr, ptr, ptr, tl, i32)
48
-DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG,
49
+DEF_HELPER_FLAGS_6(sve_ldhsu_be_zss, TCG_CALL_NO_WG,
50
+ void, env, ptr, ptr, ptr, tl, i32)
51
+DEF_HELPER_FLAGS_6(sve_ldss_le_zss, TCG_CALL_NO_WG,
52
+ void, env, ptr, ptr, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_6(sve_ldss_be_zss, TCG_CALL_NO_WG,
54
void, env, ptr, ptr, ptr, tl, i32)
55
DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG,
56
void, env, ptr, ptr, ptr, tl, i32)
57
-DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG,
58
+DEF_HELPER_FLAGS_6(sve_ldhss_le_zss, TCG_CALL_NO_WG,
59
+ void, env, ptr, ptr, ptr, tl, i32)
60
+DEF_HELPER_FLAGS_6(sve_ldhss_be_zss, TCG_CALL_NO_WG,
61
void, env, ptr, ptr, ptr, tl, i32)
62
63
DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG,
64
void, env, ptr, ptr, ptr, tl, i32)
65
-DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG,
66
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zsu, TCG_CALL_NO_WG,
67
void, env, ptr, ptr, ptr, tl, i32)
68
-DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG,
69
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zsu, TCG_CALL_NO_WG,
70
void, env, ptr, ptr, ptr, tl, i32)
71
-DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG,
72
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zsu, TCG_CALL_NO_WG,
73
+ void, env, ptr, ptr, ptr, tl, i32)
74
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zsu, TCG_CALL_NO_WG,
75
+ void, env, ptr, ptr, ptr, tl, i32)
76
+DEF_HELPER_FLAGS_6(sve_lddd_le_zsu, TCG_CALL_NO_WG,
77
+ void, env, ptr, ptr, ptr, tl, i32)
78
+DEF_HELPER_FLAGS_6(sve_lddd_be_zsu, TCG_CALL_NO_WG,
79
void, env, ptr, ptr, ptr, tl, i32)
80
DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG,
81
void, env, ptr, ptr, ptr, tl, i32)
82
-DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG,
83
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zsu, TCG_CALL_NO_WG,
84
void, env, ptr, ptr, ptr, tl, i32)
85
-DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG,
86
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zsu, TCG_CALL_NO_WG,
87
+ void, env, ptr, ptr, ptr, tl, i32)
88
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zsu, TCG_CALL_NO_WG,
89
+ void, env, ptr, ptr, ptr, tl, i32)
90
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zsu, TCG_CALL_NO_WG,
91
void, env, ptr, ptr, ptr, tl, i32)
92
93
DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG,
94
void, env, ptr, ptr, ptr, tl, i32)
95
-DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG,
96
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zss, TCG_CALL_NO_WG,
97
void, env, ptr, ptr, ptr, tl, i32)
98
-DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG,
99
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zss, TCG_CALL_NO_WG,
100
void, env, ptr, ptr, ptr, tl, i32)
101
-DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG,
102
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zss, TCG_CALL_NO_WG,
103
+ void, env, ptr, ptr, ptr, tl, i32)
104
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zss, TCG_CALL_NO_WG,
105
+ void, env, ptr, ptr, ptr, tl, i32)
106
+DEF_HELPER_FLAGS_6(sve_lddd_le_zss, TCG_CALL_NO_WG,
107
+ void, env, ptr, ptr, ptr, tl, i32)
108
+DEF_HELPER_FLAGS_6(sve_lddd_be_zss, TCG_CALL_NO_WG,
109
void, env, ptr, ptr, ptr, tl, i32)
110
DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG,
111
void, env, ptr, ptr, ptr, tl, i32)
112
-DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG,
113
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zss, TCG_CALL_NO_WG,
114
void, env, ptr, ptr, ptr, tl, i32)
115
-DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG,
116
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zss, TCG_CALL_NO_WG,
117
+ void, env, ptr, ptr, ptr, tl, i32)
118
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zss, TCG_CALL_NO_WG,
119
+ void, env, ptr, ptr, ptr, tl, i32)
120
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zss, TCG_CALL_NO_WG,
121
void, env, ptr, ptr, ptr, tl, i32)
122
123
DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG,
124
void, env, ptr, ptr, ptr, tl, i32)
125
-DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG,
126
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zd, TCG_CALL_NO_WG,
127
void, env, ptr, ptr, ptr, tl, i32)
128
-DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG,
129
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zd, TCG_CALL_NO_WG,
130
void, env, ptr, ptr, ptr, tl, i32)
131
-DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG,
132
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zd, TCG_CALL_NO_WG,
133
+ void, env, ptr, ptr, ptr, tl, i32)
134
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zd, TCG_CALL_NO_WG,
135
+ void, env, ptr, ptr, ptr, tl, i32)
136
+DEF_HELPER_FLAGS_6(sve_lddd_le_zd, TCG_CALL_NO_WG,
137
+ void, env, ptr, ptr, ptr, tl, i32)
138
+DEF_HELPER_FLAGS_6(sve_lddd_be_zd, TCG_CALL_NO_WG,
139
void, env, ptr, ptr, ptr, tl, i32)
140
DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG,
141
void, env, ptr, ptr, ptr, tl, i32)
142
-DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
143
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zd, TCG_CALL_NO_WG,
144
void, env, ptr, ptr, ptr, tl, i32)
145
-DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
146
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zd, TCG_CALL_NO_WG,
147
+ void, env, ptr, ptr, ptr, tl, i32)
148
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zd, TCG_CALL_NO_WG,
149
+ void, env, ptr, ptr, ptr, tl, i32)
150
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zd, TCG_CALL_NO_WG,
151
void, env, ptr, ptr, ptr, tl, i32)
152
153
DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
154
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
155
index XXXXXXX..XXXXXXX 100644
156
--- a/target/arm/sve_helper.c
157
+++ b/target/arm/sve_helper.c
158
@@ -XXX,XX +XXX,XX @@ DO_STN_2(4, dd, 8, 8)
159
#undef DO_STN_1
160
#undef DO_STN_2
161
162
-/* Loads with a vector index. */
163
+/*
164
+ * Loads with a vector index.
165
+ */
166
167
-#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
168
-void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
169
- target_ulong base, uint32_t desc) \
170
-{ \
171
- intptr_t i, oprsz = simd_oprsz(desc); \
172
- unsigned scale = simd_data(desc); \
173
- uintptr_t ra = GETPC(); \
174
- for (i = 0; i < oprsz; ) { \
175
- uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
176
- do { \
177
- TYPEM m = 0; \
178
- if (pg & 1) { \
179
- target_ulong off = *(TYPEI *)(vm + H1_4(i)); \
180
- m = FN(env, base + (off << scale), ra); \
181
- } \
182
- *(uint32_t *)(vd + H1_4(i)) = m; \
183
- i += 4, pg >>= 4; \
184
- } while (i & 15); \
185
- } \
186
+/*
187
+ * Load the element at @reg + @reg_ofs, sign or zero-extend as needed.
188
+ */
189
+typedef target_ulong zreg_off_fn(void *reg, intptr_t reg_ofs);
190
+
191
+static target_ulong off_zsu_s(void *reg, intptr_t reg_ofs)
192
+{
193
+ return *(uint32_t *)(reg + H1_4(reg_ofs));
194
}
195
196
-#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
197
-void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
198
- target_ulong base, uint32_t desc) \
199
-{ \
200
- intptr_t i, oprsz = simd_oprsz(desc) / 8; \
201
- unsigned scale = simd_data(desc); \
202
- uintptr_t ra = GETPC(); \
203
- uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
204
- for (i = 0; i < oprsz; i++) { \
205
- TYPEM mm = 0; \
206
- if (pg[H1(i)] & 1) { \
207
- target_ulong off = (TYPEI)m[i]; \
208
- mm = FN(env, base + (off << scale), ra); \
209
- } \
210
- d[i] = mm; \
211
- } \
212
+static target_ulong off_zss_s(void *reg, intptr_t reg_ofs)
213
+{
214
+ return *(int32_t *)(reg + H1_4(reg_ofs));
215
}
216
217
-DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
218
-DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
219
-DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
220
-DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
221
-DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
222
+static target_ulong off_zsu_d(void *reg, intptr_t reg_ofs)
223
+{
224
+ return (uint32_t)*(uint64_t *)(reg + reg_ofs);
225
+}
226
227
-DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
228
-DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
229
-DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
230
-DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra)
231
-DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra)
232
+static target_ulong off_zss_d(void *reg, intptr_t reg_ofs)
233
+{
234
+ return (int32_t)*(uint64_t *)(reg + reg_ofs);
235
+}
236
237
-DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
238
-DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
239
-DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
240
-DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
241
-DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
242
-DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
243
-DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra)
244
+static target_ulong off_zd_d(void *reg, intptr_t reg_ofs)
245
+{
246
+ return *(uint64_t *)(reg + reg_ofs);
247
+}
248
249
-DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
250
-DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
251
-DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
252
-DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
253
-DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra)
254
-DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra)
255
-DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra)
256
+static void sve_ld1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
257
+ target_ulong base, uint32_t desc, uintptr_t ra,
258
+ zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
259
+{
260
+ const int mmu_idx = cpu_mmu_index(env, false);
261
+ intptr_t i, oprsz = simd_oprsz(desc);
262
+ unsigned scale = simd_data(desc);
263
+ ARMVectorReg scratch = { };
264
265
-DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra)
266
-DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
267
-DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
268
-DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
269
-DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
270
-DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
271
-DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
272
+ set_helper_retaddr(ra);
273
+ for (i = 0; i < oprsz; ) {
274
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
275
+ do {
276
+ if (likely(pg & 1)) {
277
+ target_ulong off = off_fn(vm, i);
278
+ tlb_fn(env, &scratch, i, base + (off << scale), mmu_idx, ra);
279
+ }
280
+ i += 4, pg >>= 4;
281
+ } while (i & 15);
282
+ }
283
+ set_helper_retaddr(0);
284
+
285
+ /* Wait until all exceptions have been raised to write back. */
286
+ memcpy(vd, &scratch, oprsz);
287
+}
288
+
289
+static void sve_ld1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
290
+ target_ulong base, uint32_t desc, uintptr_t ra,
291
+ zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
292
+{
293
+ const int mmu_idx = cpu_mmu_index(env, false);
294
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
295
+ unsigned scale = simd_data(desc);
296
+ ARMVectorReg scratch = { };
297
+
298
+ set_helper_retaddr(ra);
299
+ for (i = 0; i < oprsz; i++) {
300
+ uint8_t pg = *(uint8_t *)(vg + H1(i));
301
+ if (likely(pg & 1)) {
302
+ target_ulong off = off_fn(vm, i * 8);
303
+ tlb_fn(env, &scratch, i * 8, base + (off << scale), mmu_idx, ra);
304
+ }
305
+ }
306
+ set_helper_retaddr(0);
307
+
308
+ /* Wait until all exceptions have been raised to write back. */
309
+ memcpy(vd, &scratch, oprsz * 8);
310
+}
311
+
312
+#define DO_LD1_ZPZ_S(MEM, OFS) \
313
+void __attribute__((flatten)) HELPER(sve_ld##MEM##_##OFS) \
314
+ (CPUARMState *env, void *vd, void *vg, void *vm, \
315
+ target_ulong base, uint32_t desc) \
316
+{ \
317
+ sve_ld1_zs(env, vd, vg, vm, base, desc, GETPC(), \
318
+ off_##OFS##_s, sve_ld1##MEM##_tlb); \
319
+}
320
+
321
+#define DO_LD1_ZPZ_D(MEM, OFS) \
322
+void __attribute__((flatten)) HELPER(sve_ld##MEM##_##OFS) \
323
+ (CPUARMState *env, void *vd, void *vg, void *vm, \
324
+ target_ulong base, uint32_t desc) \
325
+{ \
326
+ sve_ld1_zd(env, vd, vg, vm, base, desc, GETPC(), \
327
+ off_##OFS##_d, sve_ld1##MEM##_tlb); \
328
+}
329
+
330
+DO_LD1_ZPZ_S(bsu, zsu)
331
+DO_LD1_ZPZ_S(bsu, zss)
332
+DO_LD1_ZPZ_D(bdu, zsu)
333
+DO_LD1_ZPZ_D(bdu, zss)
334
+DO_LD1_ZPZ_D(bdu, zd)
335
+
336
+DO_LD1_ZPZ_S(bss, zsu)
337
+DO_LD1_ZPZ_S(bss, zss)
338
+DO_LD1_ZPZ_D(bds, zsu)
339
+DO_LD1_ZPZ_D(bds, zss)
340
+DO_LD1_ZPZ_D(bds, zd)
341
+
342
+DO_LD1_ZPZ_S(hsu_le, zsu)
343
+DO_LD1_ZPZ_S(hsu_le, zss)
344
+DO_LD1_ZPZ_D(hdu_le, zsu)
345
+DO_LD1_ZPZ_D(hdu_le, zss)
346
+DO_LD1_ZPZ_D(hdu_le, zd)
347
+
348
+DO_LD1_ZPZ_S(hsu_be, zsu)
349
+DO_LD1_ZPZ_S(hsu_be, zss)
350
+DO_LD1_ZPZ_D(hdu_be, zsu)
351
+DO_LD1_ZPZ_D(hdu_be, zss)
352
+DO_LD1_ZPZ_D(hdu_be, zd)
353
+
354
+DO_LD1_ZPZ_S(hss_le, zsu)
355
+DO_LD1_ZPZ_S(hss_le, zss)
356
+DO_LD1_ZPZ_D(hds_le, zsu)
357
+DO_LD1_ZPZ_D(hds_le, zss)
358
+DO_LD1_ZPZ_D(hds_le, zd)
359
+
360
+DO_LD1_ZPZ_S(hss_be, zsu)
361
+DO_LD1_ZPZ_S(hss_be, zss)
362
+DO_LD1_ZPZ_D(hds_be, zsu)
363
+DO_LD1_ZPZ_D(hds_be, zss)
364
+DO_LD1_ZPZ_D(hds_be, zd)
365
+
366
+DO_LD1_ZPZ_S(ss_le, zsu)
367
+DO_LD1_ZPZ_S(ss_le, zss)
368
+DO_LD1_ZPZ_D(sdu_le, zsu)
369
+DO_LD1_ZPZ_D(sdu_le, zss)
370
+DO_LD1_ZPZ_D(sdu_le, zd)
371
+
372
+DO_LD1_ZPZ_S(ss_be, zsu)
373
+DO_LD1_ZPZ_S(ss_be, zss)
374
+DO_LD1_ZPZ_D(sdu_be, zsu)
375
+DO_LD1_ZPZ_D(sdu_be, zss)
376
+DO_LD1_ZPZ_D(sdu_be, zd)
377
+
378
+DO_LD1_ZPZ_D(sds_le, zsu)
379
+DO_LD1_ZPZ_D(sds_le, zss)
380
+DO_LD1_ZPZ_D(sds_le, zd)
381
+
382
+DO_LD1_ZPZ_D(sds_be, zsu)
383
+DO_LD1_ZPZ_D(sds_be, zss)
384
+DO_LD1_ZPZ_D(sds_be, zd)
385
+
386
+DO_LD1_ZPZ_D(dd_le, zsu)
387
+DO_LD1_ZPZ_D(dd_le, zss)
388
+DO_LD1_ZPZ_D(dd_le, zd)
389
+
390
+DO_LD1_ZPZ_D(dd_be, zsu)
391
+DO_LD1_ZPZ_D(dd_be, zss)
392
+DO_LD1_ZPZ_D(dd_be, zd)
393
+
394
+#undef DO_LD1_ZPZ_S
395
+#undef DO_LD1_ZPZ_D
396
397
/* First fault loads with a vector index. */
398
399
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
14
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
400
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
401
--- a/target/arm/translate-sve.c
16
--- a/target/arm/translate-sve.c
402
+++ b/target/arm/translate-sve.c
17
+++ b/target/arm/translate-sve.c
403
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
18
@@ -XXX,XX +XXX,XX @@ static bool trans_EOR_pppp(DisasContext *s, arg_rprr_s *a)
404
tcg_temp_free_i32(desc);
19
return do_pppp_flags(s, a, &op);
405
}
20
}
406
21
407
-/* Indexed by [ff][xs][u][msz]. */
22
-static void gen_sel_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
408
-static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = {
23
-{
409
- { { { gen_helper_sve_ldbss_zsu,
24
- tcg_gen_and_i64(pn, pn, pg);
410
- gen_helper_sve_ldhss_zsu,
25
- tcg_gen_andc_i64(pm, pm, pg);
411
- NULL, },
26
- tcg_gen_or_i64(pd, pn, pm);
412
- { gen_helper_sve_ldbsu_zsu,
27
-}
413
- gen_helper_sve_ldhsu_zsu,
28
-
414
- gen_helper_sve_ldssu_zsu, } },
29
-static void gen_sel_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
415
- { { gen_helper_sve_ldbss_zss,
30
- TCGv_vec pm, TCGv_vec pg)
416
- gen_helper_sve_ldhss_zss,
31
-{
417
- NULL, },
32
- tcg_gen_and_vec(vece, pn, pn, pg);
418
- { gen_helper_sve_ldbsu_zss,
33
- tcg_gen_andc_vec(vece, pm, pm, pg);
419
- gen_helper_sve_ldhsu_zss,
34
- tcg_gen_or_vec(vece, pd, pn, pm);
420
- gen_helper_sve_ldssu_zss, } } },
35
-}
421
+/* Indexed by [be][ff][xs][u][msz]. */
36
-
422
+static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][2][3] = {
37
static bool trans_SEL_pppp(DisasContext *s, arg_rprr_s *a)
423
+ /* Little-endian */
424
+ { { { { gen_helper_sve_ldbss_zsu,
425
+ gen_helper_sve_ldhss_le_zsu,
426
+ NULL, },
427
+ { gen_helper_sve_ldbsu_zsu,
428
+ gen_helper_sve_ldhsu_le_zsu,
429
+ gen_helper_sve_ldss_le_zsu, } },
430
+ { { gen_helper_sve_ldbss_zss,
431
+ gen_helper_sve_ldhss_le_zss,
432
+ NULL, },
433
+ { gen_helper_sve_ldbsu_zss,
434
+ gen_helper_sve_ldhsu_le_zss,
435
+ gen_helper_sve_ldss_le_zss, } } },
436
437
- { { { gen_helper_sve_ldffbss_zsu,
438
- gen_helper_sve_ldffhss_zsu,
439
- NULL, },
440
- { gen_helper_sve_ldffbsu_zsu,
441
- gen_helper_sve_ldffhsu_zsu,
442
- gen_helper_sve_ldffssu_zsu, } },
443
- { { gen_helper_sve_ldffbss_zss,
444
- gen_helper_sve_ldffhss_zss,
445
- NULL, },
446
- { gen_helper_sve_ldffbsu_zss,
447
- gen_helper_sve_ldffhsu_zss,
448
- gen_helper_sve_ldffssu_zss, } } }
449
+ /* First-fault */
450
+ { { { gen_helper_sve_ldffbss_zsu,
451
+ gen_helper_sve_ldffhss_zsu,
452
+ NULL, },
453
+ { gen_helper_sve_ldffbsu_zsu,
454
+ gen_helper_sve_ldffhsu_zsu,
455
+ gen_helper_sve_ldffssu_zsu, } },
456
+ { { gen_helper_sve_ldffbss_zss,
457
+ gen_helper_sve_ldffhss_zss,
458
+ NULL, },
459
+ { gen_helper_sve_ldffbsu_zss,
460
+ gen_helper_sve_ldffhsu_zss,
461
+ gen_helper_sve_ldffssu_zss, } } } },
462
+
463
+ /* Big-endian */
464
+ { { { { gen_helper_sve_ldbss_zsu,
465
+ gen_helper_sve_ldhss_be_zsu,
466
+ NULL, },
467
+ { gen_helper_sve_ldbsu_zsu,
468
+ gen_helper_sve_ldhsu_be_zsu,
469
+ gen_helper_sve_ldss_be_zsu, } },
470
+ { { gen_helper_sve_ldbss_zss,
471
+ gen_helper_sve_ldhss_be_zss,
472
+ NULL, },
473
+ { gen_helper_sve_ldbsu_zss,
474
+ gen_helper_sve_ldhsu_be_zss,
475
+ gen_helper_sve_ldss_be_zss, } } },
476
+
477
+ /* First-fault */
478
+ { { { gen_helper_sve_ldffbss_zsu,
479
+ gen_helper_sve_ldffhss_zsu,
480
+ NULL, },
481
+ { gen_helper_sve_ldffbsu_zsu,
482
+ gen_helper_sve_ldffhsu_zsu,
483
+ gen_helper_sve_ldffssu_zsu, } },
484
+ { { gen_helper_sve_ldffbss_zss,
485
+ gen_helper_sve_ldffhss_zss,
486
+ NULL, },
487
+ { gen_helper_sve_ldffbsu_zss,
488
+ gen_helper_sve_ldffhsu_zss,
489
+ gen_helper_sve_ldffssu_zss, } } } },
490
};
491
492
/* Note that we overload xs=2 to indicate 64-bit offset. */
493
-static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = {
494
- { { { gen_helper_sve_ldbds_zsu,
495
- gen_helper_sve_ldhds_zsu,
496
- gen_helper_sve_ldsds_zsu,
497
- NULL, },
498
- { gen_helper_sve_ldbdu_zsu,
499
- gen_helper_sve_ldhdu_zsu,
500
- gen_helper_sve_ldsdu_zsu,
501
- gen_helper_sve_ldddu_zsu, } },
502
- { { gen_helper_sve_ldbds_zss,
503
- gen_helper_sve_ldhds_zss,
504
- gen_helper_sve_ldsds_zss,
505
- NULL, },
506
- { gen_helper_sve_ldbdu_zss,
507
- gen_helper_sve_ldhdu_zss,
508
- gen_helper_sve_ldsdu_zss,
509
- gen_helper_sve_ldddu_zss, } },
510
- { { gen_helper_sve_ldbds_zd,
511
- gen_helper_sve_ldhds_zd,
512
- gen_helper_sve_ldsds_zd,
513
- NULL, },
514
- { gen_helper_sve_ldbdu_zd,
515
- gen_helper_sve_ldhdu_zd,
516
- gen_helper_sve_ldsdu_zd,
517
- gen_helper_sve_ldddu_zd, } } },
518
+static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][2][3][2][4] = {
519
+ /* Little-endian */
520
+ { { { { gen_helper_sve_ldbds_zsu,
521
+ gen_helper_sve_ldhds_le_zsu,
522
+ gen_helper_sve_ldsds_le_zsu,
523
+ NULL, },
524
+ { gen_helper_sve_ldbdu_zsu,
525
+ gen_helper_sve_ldhdu_le_zsu,
526
+ gen_helper_sve_ldsdu_le_zsu,
527
+ gen_helper_sve_lddd_le_zsu, } },
528
+ { { gen_helper_sve_ldbds_zss,
529
+ gen_helper_sve_ldhds_le_zss,
530
+ gen_helper_sve_ldsds_le_zss,
531
+ NULL, },
532
+ { gen_helper_sve_ldbdu_zss,
533
+ gen_helper_sve_ldhdu_le_zss,
534
+ gen_helper_sve_ldsdu_le_zss,
535
+ gen_helper_sve_lddd_le_zss, } },
536
+ { { gen_helper_sve_ldbds_zd,
537
+ gen_helper_sve_ldhds_le_zd,
538
+ gen_helper_sve_ldsds_le_zd,
539
+ NULL, },
540
+ { gen_helper_sve_ldbdu_zd,
541
+ gen_helper_sve_ldhdu_le_zd,
542
+ gen_helper_sve_ldsdu_le_zd,
543
+ gen_helper_sve_lddd_le_zd, } } },
544
545
- { { { gen_helper_sve_ldffbds_zsu,
546
- gen_helper_sve_ldffhds_zsu,
547
- gen_helper_sve_ldffsds_zsu,
548
- NULL, },
549
- { gen_helper_sve_ldffbdu_zsu,
550
- gen_helper_sve_ldffhdu_zsu,
551
- gen_helper_sve_ldffsdu_zsu,
552
- gen_helper_sve_ldffddu_zsu, } },
553
- { { gen_helper_sve_ldffbds_zss,
554
- gen_helper_sve_ldffhds_zss,
555
- gen_helper_sve_ldffsds_zss,
556
- NULL, },
557
- { gen_helper_sve_ldffbdu_zss,
558
- gen_helper_sve_ldffhdu_zss,
559
- gen_helper_sve_ldffsdu_zss,
560
- gen_helper_sve_ldffddu_zss, } },
561
- { { gen_helper_sve_ldffbds_zd,
562
- gen_helper_sve_ldffhds_zd,
563
- gen_helper_sve_ldffsds_zd,
564
- NULL, },
565
- { gen_helper_sve_ldffbdu_zd,
566
- gen_helper_sve_ldffhdu_zd,
567
- gen_helper_sve_ldffsdu_zd,
568
- gen_helper_sve_ldffddu_zd, } } }
569
+ /* First-fault */
570
+ { { { gen_helper_sve_ldffbds_zsu,
571
+ gen_helper_sve_ldffhds_zsu,
572
+ gen_helper_sve_ldffsds_zsu,
573
+ NULL, },
574
+ { gen_helper_sve_ldffbdu_zsu,
575
+ gen_helper_sve_ldffhdu_zsu,
576
+ gen_helper_sve_ldffsdu_zsu,
577
+ gen_helper_sve_ldffddu_zsu, } },
578
+ { { gen_helper_sve_ldffbds_zss,
579
+ gen_helper_sve_ldffhds_zss,
580
+ gen_helper_sve_ldffsds_zss,
581
+ NULL, },
582
+ { gen_helper_sve_ldffbdu_zss,
583
+ gen_helper_sve_ldffhdu_zss,
584
+ gen_helper_sve_ldffsdu_zss,
585
+ gen_helper_sve_ldffddu_zss, } },
586
+ { { gen_helper_sve_ldffbds_zd,
587
+ gen_helper_sve_ldffhds_zd,
588
+ gen_helper_sve_ldffsds_zd,
589
+ NULL, },
590
+ { gen_helper_sve_ldffbdu_zd,
591
+ gen_helper_sve_ldffhdu_zd,
592
+ gen_helper_sve_ldffsdu_zd,
593
+ gen_helper_sve_ldffddu_zd, } } } },
594
+
595
+ /* Big-endian */
596
+ { { { { gen_helper_sve_ldbds_zsu,
597
+ gen_helper_sve_ldhds_be_zsu,
598
+ gen_helper_sve_ldsds_be_zsu,
599
+ NULL, },
600
+ { gen_helper_sve_ldbdu_zsu,
601
+ gen_helper_sve_ldhdu_be_zsu,
602
+ gen_helper_sve_ldsdu_be_zsu,
603
+ gen_helper_sve_lddd_be_zsu, } },
604
+ { { gen_helper_sve_ldbds_zss,
605
+ gen_helper_sve_ldhds_be_zss,
606
+ gen_helper_sve_ldsds_be_zss,
607
+ NULL, },
608
+ { gen_helper_sve_ldbdu_zss,
609
+ gen_helper_sve_ldhdu_be_zss,
610
+ gen_helper_sve_ldsdu_be_zss,
611
+ gen_helper_sve_lddd_be_zss, } },
612
+ { { gen_helper_sve_ldbds_zd,
613
+ gen_helper_sve_ldhds_be_zd,
614
+ gen_helper_sve_ldsds_be_zd,
615
+ NULL, },
616
+ { gen_helper_sve_ldbdu_zd,
617
+ gen_helper_sve_ldhdu_be_zd,
618
+ gen_helper_sve_ldsdu_be_zd,
619
+ gen_helper_sve_lddd_be_zd, } } },
620
+
621
+ /* First-fault */
622
+ { { { gen_helper_sve_ldffbds_zsu,
623
+ gen_helper_sve_ldffhds_zsu,
624
+ gen_helper_sve_ldffsds_zsu,
625
+ NULL, },
626
+ { gen_helper_sve_ldffbdu_zsu,
627
+ gen_helper_sve_ldffhdu_zsu,
628
+ gen_helper_sve_ldffsdu_zsu,
629
+ gen_helper_sve_ldffddu_zsu, } },
630
+ { { gen_helper_sve_ldffbds_zss,
631
+ gen_helper_sve_ldffhds_zss,
632
+ gen_helper_sve_ldffsds_zss,
633
+ NULL, },
634
+ { gen_helper_sve_ldffbdu_zss,
635
+ gen_helper_sve_ldffhdu_zss,
636
+ gen_helper_sve_ldffsdu_zss,
637
+ gen_helper_sve_ldffddu_zss, } },
638
+ { { gen_helper_sve_ldffbds_zd,
639
+ gen_helper_sve_ldffhds_zd,
640
+ gen_helper_sve_ldffsds_zd,
641
+ NULL, },
642
+ { gen_helper_sve_ldffbdu_zd,
643
+ gen_helper_sve_ldffhdu_zd,
644
+ gen_helper_sve_ldffsdu_zd,
645
+ gen_helper_sve_ldffddu_zd, } } } },
646
};
647
648
static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
649
{
38
{
650
gen_helper_gvec_mem_scatter *fn = NULL;
39
- static const GVecGen4 op = {
651
+ int be = s->be_data == MO_BE;
40
- .fni8 = gen_sel_pg_i64,
652
41
- .fniv = gen_sel_pg_vec,
653
if (!sve_access_check(s)) {
42
- .fno = gen_helper_sve_sel_pppp,
654
return true;
43
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
655
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
44
- };
656
45
-
657
switch (a->esz) {
46
if (a->s) {
658
case MO_32:
47
return false;
659
- fn = gather_load_fn32[a->ff][a->xs][a->u][a->msz];
660
+ fn = gather_load_fn32[be][a->ff][a->xs][a->u][a->msz];
661
break;
662
case MO_64:
663
- fn = gather_load_fn64[a->ff][a->xs][a->u][a->msz];
664
+ fn = gather_load_fn64[be][a->ff][a->xs][a->u][a->msz];
665
break;
666
}
48
}
667
assert(fn != NULL);
49
- return do_pppp_flags(s, a, &op);
668
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
50
+ if (sve_access_check(s)) {
669
static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
51
+ unsigned psz = pred_gvec_reg_size(s);
670
{
52
+ tcg_gen_gvec_bitsel(MO_8, pred_full_reg_offset(s, a->rd),
671
gen_helper_gvec_mem_scatter *fn = NULL;
53
+ pred_full_reg_offset(s, a->pg),
672
+ int be = s->be_data == MO_BE;
54
+ pred_full_reg_offset(s, a->rn),
673
TCGv_i64 imm;
55
+ pred_full_reg_offset(s, a->rm), psz, psz);
674
56
+ }
675
if (a->esz < a->msz || (a->esz == a->msz && !a->u)) {
57
+ return true;
676
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
58
}
677
59
678
switch (a->esz) {
60
static void gen_orr_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
679
case MO_32:
680
- fn = gather_load_fn32[a->ff][0][a->u][a->msz];
681
+ fn = gather_load_fn32[be][a->ff][0][a->u][a->msz];
682
break;
683
case MO_64:
684
- fn = gather_load_fn64[a->ff][2][a->u][a->msz];
685
+ fn = gather_load_fn64[be][a->ff][2][a->u][a->msz];
686
break;
687
}
688
assert(fn != NULL);
689
--
61
--
690
2.19.0
62
2.20.1
691
63
692
64
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
We can choose the endianness at translation time, rather than
3
Model after gen_gvec_fn_zzz et al.
4
re-computing it at execution time.
5
4
6
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20181005175350.30752-12-richard.henderson@linaro.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20200815013145.539409-9-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
9
---
12
target/arm/helper-sve.h | 48 +++++++++++++++++--------
10
target/arm/translate-sve.c | 35 ++++++++++++++++-------------------
13
target/arm/sve_helper.c | 11 ++++--
11
1 file changed, 16 insertions(+), 19 deletions(-)
14
target/arm/translate-sve.c | 72 +++++++++++++++++++++++++++++---------
15
3 files changed, 96 insertions(+), 35 deletions(-)
16
12
17
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-sve.h
20
+++ b/target/arm/helper-sve.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
22
DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
23
DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
24
25
-DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
26
-DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
27
-DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
28
-DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
29
+DEF_HELPER_FLAGS_4(sve_st1hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
30
+DEF_HELPER_FLAGS_4(sve_st2hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
31
+DEF_HELPER_FLAGS_4(sve_st3hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
32
+DEF_HELPER_FLAGS_4(sve_st4hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
33
34
-DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
35
-DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
36
-DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
37
-DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
38
+DEF_HELPER_FLAGS_4(sve_st1hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
39
+DEF_HELPER_FLAGS_4(sve_st2hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_4(sve_st3hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
41
+DEF_HELPER_FLAGS_4(sve_st4hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
42
43
-DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
44
-DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
45
-DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
46
-DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_4(sve_st1ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_4(sve_st2ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
49
+DEF_HELPER_FLAGS_4(sve_st3ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
50
+DEF_HELPER_FLAGS_4(sve_st4ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
51
+
52
+DEF_HELPER_FLAGS_4(sve_st1ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_4(sve_st2ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
54
+DEF_HELPER_FLAGS_4(sve_st3ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
55
+DEF_HELPER_FLAGS_4(sve_st4ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
56
+
57
+DEF_HELPER_FLAGS_4(sve_st1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
58
+DEF_HELPER_FLAGS_4(sve_st2dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
59
+DEF_HELPER_FLAGS_4(sve_st3dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
60
+DEF_HELPER_FLAGS_4(sve_st4dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
61
+
62
+DEF_HELPER_FLAGS_4(sve_st1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
63
+DEF_HELPER_FLAGS_4(sve_st2dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
64
+DEF_HELPER_FLAGS_4(sve_st3dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
65
+DEF_HELPER_FLAGS_4(sve_st4dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
66
67
DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
68
DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
69
DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
70
71
-DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
72
-DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
73
+DEF_HELPER_FLAGS_4(sve_st1hs_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
74
+DEF_HELPER_FLAGS_4(sve_st1hd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
75
+DEF_HELPER_FLAGS_4(sve_st1hs_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
76
+DEF_HELPER_FLAGS_4(sve_st1hd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
77
78
-DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
79
+DEF_HELPER_FLAGS_4(sve_st1sd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
80
+DEF_HELPER_FLAGS_4(sve_st1sd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
81
82
DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
83
void, env, ptr, ptr, ptr, tl, i32)
84
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
85
index XXXXXXX..XXXXXXX 100644
86
--- a/target/arm/sve_helper.c
87
+++ b/target/arm/sve_helper.c
88
@@ -XXX,XX +XXX,XX @@ void __attribute__((flatten)) HELPER(sve_st##N##NAME##_r) \
89
}
90
91
#define DO_STN_2(N, NAME, ESIZE, MSIZE) \
92
-void __attribute__((flatten)) HELPER(sve_st##N##NAME##_r) \
93
+void __attribute__((flatten)) HELPER(sve_st##N##NAME##_le_r) \
94
(CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
95
{ \
96
sve_st##N##_r(env, vg, addr, desc, GETPC(), ESIZE, MSIZE, \
97
- arm_cpu_data_is_big_endian(env) \
98
- ? sve_st1##NAME##_be_tlb : sve_st1##NAME##_le_tlb); \
99
+ sve_st1##NAME##_le_tlb); \
100
+} \
101
+void __attribute__((flatten)) HELPER(sve_st##N##NAME##_be_r) \
102
+ (CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
103
+{ \
104
+ sve_st##N##_r(env, vg, addr, desc, GETPC(), ESIZE, MSIZE, \
105
+ sve_st1##NAME##_be_tlb); \
106
}
107
108
DO_STN_1(1, bb, 1)
109
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
13
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
110
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
111
--- a/target/arm/translate-sve.c
15
--- a/target/arm/translate-sve.c
112
+++ b/target/arm/translate-sve.c
16
+++ b/target/arm/translate-sve.c
113
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
17
@@ -XXX,XX +XXX,XX @@ static int pred_gvec_reg_size(DisasContext *s)
114
static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
18
return size_for_gvec(pred_full_reg_size(s));
115
int msz, int esz, int nreg)
19
}
20
21
-/* Invoke a vector expander on two Zregs. */
22
+/* Invoke an out-of-line helper on 3 Zregs and a predicate. */
23
+static void gen_gvec_ool_zzzp(DisasContext *s, gen_helper_gvec_4 *fn,
24
+ int rd, int rn, int rm, int pg, int data)
25
+{
26
+ unsigned vsz = vec_full_reg_size(s);
27
+ tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
28
+ vec_full_reg_offset(s, rn),
29
+ vec_full_reg_offset(s, rm),
30
+ pred_full_reg_offset(s, pg),
31
+ vsz, vsz, data, fn);
32
+}
33
34
+/* Invoke a vector expander on two Zregs. */
35
static void gen_gvec_fn_zz(DisasContext *s, GVecGen2Fn *gvec_fn,
36
int esz, int rd, int rn)
116
{
37
{
117
- static gen_helper_gvec_mem * const fn_single[4][4] = {
38
@@ -XXX,XX +XXX,XX @@ static bool trans_UQSUB_zzz(DisasContext *s, arg_rrr_esz *a)
118
- { gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r,
39
119
- gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r },
40
static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn)
120
- { NULL, gen_helper_sve_st1hh_r,
41
{
121
- gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r },
42
-    unsigned vsz = vec_full_reg_size(s);
     if (fn == NULL) {
         return false;
     }
     if (sve_access_check(s)) {
-        tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
-                           vec_full_reg_offset(s, a->rn),
-                           vec_full_reg_offset(s, a->rm),
-                           pred_full_reg_offset(s, a->pg),
-                           vsz, vsz, 0, fn);
+        gen_gvec_ool_zzzp(s, fn, a->rd, a->rn, a->rm, a->pg, 0);
     }
     return true;
 }
@@ -XXX,XX +XXX,XX @@ static void do_sel_z(DisasContext *s, int rd, int rn, int rm, int pg, int esz)
         gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h,
         gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d
     };
-    unsigned vsz = vec_full_reg_size(s);
-    tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       pred_full_reg_offset(s, pg),
-                       vsz, vsz, 0, fns[esz]);
+    gen_gvec_ool_zzzp(s, fns[esz], rd, rn, rm, pg, 0);
 }
 
 #define DO_ZPZZ(NAME, name) \
@@ -XXX,XX +XXX,XX @@ static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a)
 static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a)
 {
     if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
-                           vec_full_reg_offset(s, a->rn),
-                           vec_full_reg_offset(s, a->rm),
-                           pred_full_reg_offset(s, a->pg),
-                           vsz, vsz, a->esz, gen_helper_sve_splice);
+        gen_gvec_ool_zzzp(s, gen_helper_sve_splice,
+                          a->rd, a->rn, a->rm, a->pg, 0);
     }
     return true;
 }
--
2.20.1

-    { NULL, NULL,
-      gen_helper_sve_st1ss_r, gen_helper_sve_st1sd_r },
-    { NULL, NULL, NULL, gen_helper_sve_st1dd_r },
+    static gen_helper_gvec_mem * const fn_single[2][4][4] = {
+        { { gen_helper_sve_st1bb_r,
+            gen_helper_sve_st1bh_r,
+            gen_helper_sve_st1bs_r,
+            gen_helper_sve_st1bd_r },
+          { NULL,
+            gen_helper_sve_st1hh_le_r,
+            gen_helper_sve_st1hs_le_r,
+            gen_helper_sve_st1hd_le_r },
+          { NULL, NULL,
+            gen_helper_sve_st1ss_le_r,
+            gen_helper_sve_st1sd_le_r },
+          { NULL, NULL, NULL,
+            gen_helper_sve_st1dd_le_r } },
+        { { gen_helper_sve_st1bb_r,
+            gen_helper_sve_st1bh_r,
+            gen_helper_sve_st1bs_r,
+            gen_helper_sve_st1bd_r },
+          { NULL,
+            gen_helper_sve_st1hh_be_r,
+            gen_helper_sve_st1hs_be_r,
+            gen_helper_sve_st1hd_be_r },
+          { NULL, NULL,
+            gen_helper_sve_st1ss_be_r,
+            gen_helper_sve_st1sd_be_r },
+          { NULL, NULL, NULL,
+            gen_helper_sve_st1dd_be_r } },
     };
-    static gen_helper_gvec_mem * const fn_multiple[3][4] = {
-        { gen_helper_sve_st2bb_r, gen_helper_sve_st2hh_r,
-          gen_helper_sve_st2ss_r, gen_helper_sve_st2dd_r },
-        { gen_helper_sve_st3bb_r, gen_helper_sve_st3hh_r,
-          gen_helper_sve_st3ss_r, gen_helper_sve_st3dd_r },
-        { gen_helper_sve_st4bb_r, gen_helper_sve_st4hh_r,
-          gen_helper_sve_st4ss_r, gen_helper_sve_st4dd_r },
+    static gen_helper_gvec_mem * const fn_multiple[2][3][4] = {
+        { { gen_helper_sve_st2bb_r,
+            gen_helper_sve_st2hh_le_r,
+            gen_helper_sve_st2ss_le_r,
+            gen_helper_sve_st2dd_le_r },
+          { gen_helper_sve_st3bb_r,
+            gen_helper_sve_st3hh_le_r,
+            gen_helper_sve_st3ss_le_r,
+            gen_helper_sve_st3dd_le_r },
+          { gen_helper_sve_st4bb_r,
+            gen_helper_sve_st4hh_le_r,
+            gen_helper_sve_st4ss_le_r,
+            gen_helper_sve_st4dd_le_r } },
+        { { gen_helper_sve_st2bb_r,
+            gen_helper_sve_st2hh_be_r,
+            gen_helper_sve_st2ss_be_r,
+            gen_helper_sve_st2dd_be_r },
+          { gen_helper_sve_st3bb_r,
+            gen_helper_sve_st3hh_be_r,
+            gen_helper_sve_st3ss_be_r,
+            gen_helper_sve_st3dd_be_r },
+          { gen_helper_sve_st4bb_r,
+            gen_helper_sve_st4hh_be_r,
+            gen_helper_sve_st4ss_be_r,
+            gen_helper_sve_st4dd_be_r } },
     };
     gen_helper_gvec_mem *fn;
+    int be = s->be_data == MO_BE;
 
     if (nreg == 0) {
         /* ST1 */
-        fn = fn_single[msz][esz];
+        fn = fn_single[be][msz][esz];
     } else {
         /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
         assert(msz == esz);
-        fn = fn_multiple[nreg - 1][msz];
+        fn = fn_multiple[be][nreg - 1][msz];
     }
     assert(fn != NULL);
     do_mem_zpa(s, zt, pg, addr, fn);
--
2.19.0
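The dispatch shape in the store patch above is worth spelling out:
endianness is folded into the helper table as an extra leading index,
computed once per translation from s->be_data, so no per-access test is
needed at execution time. A minimal sketch of that pattern (the typedef
and table contents are illustrative stand-ins, not QEMU definitions):

    /* Illustrative sketch of the fn_single[be][msz][esz] lookup above.
     * "store_helper" is an invented type standing in for
     * gen_helper_gvec_mem. */
    typedef void store_helper(void);

    static store_helper *pick_st1(store_helper * const fns[2][4][4],
                                  int big_endian, int msz, int esz)
    {
        /* fns[0] is the little-endian table, fns[1] the big-endian one. */
        return fns[big_endian ? 1 : 0][msz][esz];
    }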
From: Richard Henderson <richard.henderson@linaro.org>

The existing clr functions have only one vector argument, and so
can only clear in place.  The existing movz functions have two
vector arguments, and so can clear while moving.  Merge them, with
a flag that controls the sense of active vs inactive elements
being cleared.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    |  5 ---
 target/arm/sve_helper.c    | 70 ++++++++------------------------------
 target/arm/translate-sve.c | 53 +++++++++++------------------
 3 files changed, 34 insertions(+), 94 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 
-DEF_HELPER_FLAGS_3(sve_clr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-
 DEF_HELPER_FLAGS_4(sve_movz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_movz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_movz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc)
     return flags;
 }
 
-/* Store zero into every active element of Zd.  We will use this for two
- * and three-operand predicated instructions for which logic dictates a
- * zero result.  In particular, logical shift by element size, which is
- * otherwise undefined on the host.
- *
- * For element sizes smaller than uint64_t, we use tables to expand
- * the N bits of the controlling predicate to a byte mask, and clear
- * those bytes.
+/*
+ * Copy Zn into Zd, and store zero into inactive elements.
+ * If inv, store zeros into the active elements.
 */
-void HELPER(sve_clr_b)(void *vd, void *vg, uint32_t desc)
-{
-    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
-    uint64_t *d = vd;
-    uint8_t *pg = vg;
-    for (i = 0; i < opr_sz; i += 1) {
-        d[i] &= ~expand_pred_b(pg[H1(i)]);
-    }
-}
-
-void HELPER(sve_clr_h)(void *vd, void *vg, uint32_t desc)
-{
-    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
-    uint64_t *d = vd;
-    uint8_t *pg = vg;
-    for (i = 0; i < opr_sz; i += 1) {
-        d[i] &= ~expand_pred_h(pg[H1(i)]);
-    }
-}
-
-void HELPER(sve_clr_s)(void *vd, void *vg, uint32_t desc)
-{
-    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
-    uint64_t *d = vd;
-    uint8_t *pg = vg;
-    for (i = 0; i < opr_sz; i += 1) {
-        d[i] &= ~expand_pred_s(pg[H1(i)]);
-    }
-}
-
-void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc)
-{
-    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
-    uint64_t *d = vd;
-    uint8_t *pg = vg;
-    for (i = 0; i < opr_sz; i += 1) {
-        if (pg[H1(i)] & 1) {
-            d[i] = 0;
-        }
-    }
-}
-
-/* Copy Zn into Zd, and store zero into inactive elements.  */
 void HELPER(sve_movz_b)(void *vd, void *vn, void *vg, uint32_t desc)
 {
     intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t inv = -(uint64_t)(simd_data(desc) & 1);
     uint64_t *d = vd, *n = vn;
     uint8_t *pg = vg;
+
     for (i = 0; i < opr_sz; i += 1) {
-        d[i] = n[i] & expand_pred_b(pg[H1(i)]);
+        d[i] = n[i] & (expand_pred_b(pg[H1(i)]) ^ inv);
     }
 }
 
 void HELPER(sve_movz_h)(void *vd, void *vn, void *vg, uint32_t desc)
 {
     intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t inv = -(uint64_t)(simd_data(desc) & 1);
     uint64_t *d = vd, *n = vn;
     uint8_t *pg = vg;
+
     for (i = 0; i < opr_sz; i += 1) {
-        d[i] = n[i] & expand_pred_h(pg[H1(i)]);
+        d[i] = n[i] & (expand_pred_h(pg[H1(i)]) ^ inv);
     }
 }
 
 void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc)
 {
     intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t inv = -(uint64_t)(simd_data(desc) & 1);
     uint64_t *d = vd, *n = vn;
     uint8_t *pg = vg;
+
     for (i = 0; i < opr_sz; i += 1) {
-        d[i] = n[i] & expand_pred_s(pg[H1(i)]);
+        d[i] = n[i] & (expand_pred_s(pg[H1(i)]) ^ inv);
     }
 }
 
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc)
     intptr_t i, opr_sz = simd_oprsz(desc) / 8;
     uint64_t *d = vd, *n = vn;
     uint8_t *pg = vg;
+    uint8_t inv = simd_data(desc);
+
     for (i = 0; i < opr_sz; i += 1) {
-        d[i] = n[i] & -(uint64_t)(pg[H1(i)] & 1);
+        d[i] = n[i] & -(uint64_t)((pg[H1(i)] ^ inv) & 1);
     }
 }
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_SADDV(DisasContext *s, arg_rpr_esz *a)
 *** SVE Shift by Immediate - Predicated Group
 */
 
-/* Store zero into every active element of Zd.  We will use this for two
- * and three-operand predicated instructions for which logic dictates a
- * zero result.
+/*
+ * Copy Zn into Zd, storing zeros into inactive elements.
+ * If invert, store zeros into the active elements.
 */
-static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz)
-{
-    static gen_helper_gvec_2 * const fns[4] = {
-        gen_helper_sve_clr_b, gen_helper_sve_clr_h,
-        gen_helper_sve_clr_s, gen_helper_sve_clr_d,
-    };
-    if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd),
-                           pred_full_reg_offset(s, pg),
-                           vsz, vsz, 0, fns[esz]);
-    }
-    return true;
-}
-
-/* Copy Zn into Zd, storing zeros into inactive elements. */
-static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz)
+static bool do_movz_zpz(DisasContext *s, int rd, int rn, int pg,
+                        int esz, bool invert)
 {
     static gen_helper_gvec_3 * const fns[4] = {
         gen_helper_sve_movz_b, gen_helper_sve_movz_h,
         gen_helper_sve_movz_s, gen_helper_sve_movz_d,
     };
-    unsigned vsz = vec_full_reg_size(s);
-    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
-                       vec_full_reg_offset(s, rn),
-                       pred_full_reg_offset(s, pg),
-                       vsz, vsz, 0, fns[esz]);
+
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
+                           vec_full_reg_offset(s, rn),
+                           pred_full_reg_offset(s, pg),
+                           vsz, vsz, invert, fns[esz]);
+    }
+    return true;
 }
 
 static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
@@ -XXX,XX +XXX,XX @@ static bool trans_LSR_zpzi(DisasContext *s, arg_rpri_esz *a)
     /* Shift by element size is architecturally valid.
        For logical shifts, it is a zeroing operation. */
     if (a->imm >= (8 << a->esz)) {
-        return do_clr_zp(s, a->rd, a->pg, a->esz);
+        return do_movz_zpz(s, a->rd, a->rd, a->pg, a->esz, true);
     } else {
         return do_zpzi_ool(s, a, fns[a->esz]);
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_LSL_zpzi(DisasContext *s, arg_rpri_esz *a)
     /* Shift by element size is architecturally valid.
       For logical shifts, it is a zeroing operation. */
     if (a->imm >= (8 << a->esz)) {
-        return do_clr_zp(s, a->rd, a->pg, a->esz);
+        return do_movz_zpz(s, a->rd, a->rd, a->pg, a->esz, true);
     } else {
         return do_zpzi_ool(s, a, fns[a->esz]);
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_ASRD(DisasContext *s, arg_rpri_esz *a)
     /* Shift by element size is architecturally valid.  For arithmetic
        right shift for division, it is a zeroing operation. */
     if (a->imm >= (8 << a->esz)) {
-        return do_clr_zp(s, a->rd, a->pg, a->esz);
+        return do_movz_zpz(s, a->rd, a->rd, a->pg, a->esz, true);
     } else {
         return do_zpzi_ool(s, a, fns[a->esz]);
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a)
 
     /* Zero the inactive elements. */
     gen_set_label(over);
-    do_movz_zpz(s, a->rd, a->rd, a->pg, esz);
-    return true;
+    return do_movz_zpz(s, a->rd, a->rd, a->pg, esz, false);
 }
 
 static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVPRFX_m(DisasContext *s, arg_rpr_esz *a)
 
 static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a)
 {
-    if (sve_access_check(s)) {
-        do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz);
-    }
-    return true;
+    return do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz, false);
 }
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

This implements the feature for softmmu, and moves the
main loop out of a macro and into a function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    |  84 ++++++++---
 target/arm/sve_helper.c    | 290 +++++++++++++++++++++++++++----------
 target/arm/translate-sve.c |  84 +++++------
 3 files changed, 321 insertions(+), 137 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_ldsds_be_zd, TCG_CALL_NO_WG,
 
 DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_be_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhss_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_be_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhss_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zss, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_LD1_ZPZ_D(dd_be, zd)
 
 /* First fault loads with a vector index. */
 
-#ifdef CONFIG_USER_ONLY
+/* Load one element into VD+REG_OFF from (ENV,VADDR) without faulting.
+ * The controlling predicate is known to be true.  Return true if the
+ * load was successful.
+ */
+typedef bool sve_ld1_nf_fn(CPUARMState *env, void *vd, intptr_t reg_off,
+                           target_ulong vaddr, int mmu_idx);
 
-#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \
-void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
-                  target_ulong base, uint32_t desc) \
-{ \
-    intptr_t i, oprsz = simd_oprsz(desc); \
-    unsigned scale = simd_data(desc); \
-    uintptr_t ra = GETPC(); \
-    bool first = true; \
-    mmap_lock(); \
-    for (i = 0; i < oprsz; ) { \
-        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
-        do { \
-            TYPEM m = 0; \
-            if (pg & 1) { \
-                target_ulong off = *(TYPEI *)(vm + H(i)); \
-                target_ulong addr = base + (off << scale); \
-                if (!first && \
-                    page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \
-                    record_fault(env, i, oprsz); \
-                    goto exit; \
-                } \
-                m = FN(env, addr, ra); \
-                first = false; \
-            } \
-            *(TYPEE *)(vd + H(i)) = m; \
-            i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
-        } while (i & 15); \
-    } \
- exit: \
-    mmap_unlock(); \
+#ifdef CONFIG_SOFTMMU
+#define DO_LD_NF(NAME, H, TYPEE, TYPEM, HOST) \
+static bool sve_ld##NAME##_nf(CPUARMState *env, void *vd, intptr_t reg_off, \
+                              target_ulong addr, int mmu_idx) \
+{ \
+    target_ulong next_page = -(addr | TARGET_PAGE_MASK); \
+    if (likely(next_page - addr >= sizeof(TYPEM))) { \
+        void *host = tlb_vaddr_to_host(env, addr, MMU_DATA_LOAD, mmu_idx); \
+        if (likely(host)) { \
+            TYPEM val = HOST(host); \
+            *(TYPEE *)(vd + H(reg_off)) = val; \
+            return true; \
+        } \
+    } \
+    return false; \
 }
-
 #else
-
-#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \
-void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
-                  target_ulong base, uint32_t desc) \
-{ \
-    g_assert_not_reached(); \
+#define DO_LD_NF(NAME, H, TYPEE, TYPEM, HOST) \
+static bool sve_ld##NAME##_nf(CPUARMState *env, void *vd, intptr_t reg_off, \
+                              target_ulong addr, int mmu_idx) \
+{ \
+    if (likely(page_check_range(addr, sizeof(TYPEM), PAGE_READ))) { \
+        TYPEM val = HOST(g2h(addr)); \
+        *(TYPEE *)(vd + H(reg_off)) = val; \
+        return true; \
+    } \
+    return false; \
 }
-
 #endif
 
-#define DO_LDFF1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
-    DO_LDFF1_ZPZ(NAME, uint32_t, TYPEI, TYPEM, FN, H1_4)
-#define DO_LDFF1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
-    DO_LDFF1_ZPZ(NAME, uint64_t, TYPEI, TYPEM, FN, )
+DO_LD_NF(bsu, H1_4, uint32_t, uint8_t, ldub_p)
+DO_LD_NF(bss, H1_4, uint32_t,  int8_t, ldsb_p)
+DO_LD_NF(bdu,     , uint64_t, uint8_t, ldub_p)
+DO_LD_NF(bds,     , uint64_t,  int8_t, ldsb_p)
 
-DO_LDFF1_ZPZ_S(sve_ldffbsu_zsu, uint32_t, uint8_t,  cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffbss_zsu, uint32_t, int8_t,   cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffhss_zsu, uint32_t, int16_t,  cpu_lduw_data_ra)
+DO_LD_NF(hsu_le, H1_4, uint32_t, uint16_t, lduw_le_p)
+DO_LD_NF(hss_le, H1_4, uint32_t,  int16_t, ldsw_le_p)
+DO_LD_NF(hsu_be, H1_4, uint32_t, uint16_t, lduw_be_p)
+DO_LD_NF(hss_be, H1_4, uint32_t,  int16_t, ldsw_be_p)
+DO_LD_NF(hdu_le,     , uint64_t, uint16_t, lduw_le_p)
+DO_LD_NF(hds_le,     , uint64_t,  int16_t, ldsw_le_p)
+DO_LD_NF(hdu_be,     , uint64_t, uint16_t, lduw_be_p)
+DO_LD_NF(hds_be,     , uint64_t,  int16_t, ldsw_be_p)
 
-DO_LDFF1_ZPZ_S(sve_ldffbsu_zss, int32_t, uint8_t,  cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffbss_zss, int32_t, int8_t,   cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_S(sve_ldffhss_zss, int32_t, int16_t,  cpu_lduw_data_ra)
+DO_LD_NF(ss_le,  H1_4, uint32_t, uint32_t, ldl_le_p)
+DO_LD_NF(ss_be,  H1_4, uint32_t, uint32_t, ldl_be_p)
+DO_LD_NF(sdu_le,     , uint64_t, uint32_t, ldl_le_p)
+DO_LD_NF(sds_le,     , uint64_t,  int32_t, ldl_le_p)
+DO_LD_NF(sdu_be,     , uint64_t, uint32_t, ldl_be_p)
+DO_LD_NF(sds_be,     , uint64_t,  int32_t, ldl_be_p)
 
-DO_LDFF1_ZPZ_D(sve_ldffbdu_zsu, uint32_t, uint8_t,  cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffbds_zsu, uint32_t, int8_t,   cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffhds_zsu, uint32_t, int16_t,  cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffsds_zsu, uint32_t, int32_t,  cpu_ldl_data_ra)
+DO_LD_NF(dd_le,      , uint64_t, uint64_t, ldq_le_p)
+DO_LD_NF(dd_be,      , uint64_t, uint64_t, ldq_be_p)
 
-DO_LDFF1_ZPZ_D(sve_ldffbdu_zss, int32_t, uint8_t,  cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffbds_zss, int32_t, int8_t,   cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffhds_zss, int32_t, int16_t,  cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffsds_zss, int32_t, int32_t,  cpu_ldl_data_ra)
+/*
+ * Common helper for all gather first-faulting loads.
+ */
+static inline void sve_ldff1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
+                                target_ulong base, uint32_t desc, uintptr_t ra,
+                                zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn,
+                                sve_ld1_nf_fn *nonfault_fn)
+{
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t reg_off, reg_max = simd_oprsz(desc);
+    unsigned scale = simd_data(desc);
+    target_ulong addr;
 
-DO_LDFF1_ZPZ_D(sve_ldffbdu_zd, uint64_t, uint8_t,  cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffbds_zd, uint64_t, int8_t,   cpu_ldub_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffhds_zd, uint64_t, int16_t,  cpu_lduw_data_ra)
-DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t,  cpu_ldl_data_ra)
+    /* Skip to the first true predicate. */
+    reg_off = find_next_active(vg, 0, reg_max, MO_32);
+    if (likely(reg_off < reg_max)) {
+        /* Perform one normal read, which will fault or not. */
+        set_helper_retaddr(ra);
+        addr = off_fn(vm, reg_off);
+        addr = base + (addr << scale);
+        tlb_fn(env, vd, reg_off, addr, mmu_idx, ra);
+
+        /* The rest of the reads will be non-faulting. */
+        set_helper_retaddr(0);
+    }
+
+    /* After any fault, zero the leading predicated false elements. */
+    swap_memzero(vd, reg_off);
+
+    while (likely((reg_off += 4) < reg_max)) {
+        uint64_t pg = *(uint64_t *)(vg + (reg_off >> 6) * 8);
+        if (likely((pg >> (reg_off & 63)) & 1)) {
+            addr = off_fn(vm, reg_off);
+            addr = base + (addr << scale);
+            if (!nonfault_fn(env, vd, reg_off, addr, mmu_idx)) {
+                record_fault(env, reg_off, reg_max);
+                break;
+            }
+        } else {
+            *(uint32_t *)(vd + H1_4(reg_off)) = 0;
+        }
+    }
+}
+
+static inline void sve_ldff1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
+                                target_ulong base, uint32_t desc, uintptr_t ra,
+                                zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn,
+                                sve_ld1_nf_fn *nonfault_fn)
+{
+    const int mmu_idx = cpu_mmu_index(env, false);
+    intptr_t reg_off, reg_max = simd_oprsz(desc);
+    unsigned scale = simd_data(desc);
+    target_ulong addr;
+
+    /* Skip to the first true predicate. */
+    reg_off = find_next_active(vg, 0, reg_max, MO_64);
+    if (likely(reg_off < reg_max)) {
+        /* Perform one normal read, which will fault or not. */
+        set_helper_retaddr(ra);
+        addr = off_fn(vm, reg_off);
+        addr = base + (addr << scale);
+        tlb_fn(env, vd, reg_off, addr, mmu_idx, ra);
+
+        /* The rest of the reads will be non-faulting. */
+        set_helper_retaddr(0);
+    }
+
+    /* After any fault, zero the leading predicated false elements. */
+    swap_memzero(vd, reg_off);
+
+    while (likely((reg_off += 8) < reg_max)) {
+        uint8_t pg = *(uint8_t *)(vg + H1(reg_off >> 3));
+        if (likely(pg & 1)) {
+            addr = off_fn(vm, reg_off);
+            addr = base + (addr << scale);
+            if (!nonfault_fn(env, vd, reg_off, addr, mmu_idx)) {
+                record_fault(env, reg_off, reg_max);
+                break;
+            }
+        } else {
+            *(uint64_t *)(vd + reg_off) = 0;
+        }
+    }
+}
+
+#define DO_LDFF1_ZPZ_S(MEM, OFS) \
+void HELPER(sve_ldff##MEM##_##OFS) \
+    (CPUARMState *env, void *vd, void *vg, void *vm, \
+     target_ulong base, uint32_t desc) \
+{ \
+    sve_ldff1_zs(env, vd, vg, vm, base, desc, GETPC(), \
+                 off_##OFS##_s, sve_ld1##MEM##_tlb, sve_ld##MEM##_nf); \
+}
+
+#define DO_LDFF1_ZPZ_D(MEM, OFS) \
+void HELPER(sve_ldff##MEM##_##OFS) \
+    (CPUARMState *env, void *vd, void *vg, void *vm, \
+     target_ulong base, uint32_t desc) \
+{ \
+    sve_ldff1_zd(env, vd, vg, vm, base, desc, GETPC(), \
+                 off_##OFS##_d, sve_ld1##MEM##_tlb, sve_ld##MEM##_nf); \
+}
+
+DO_LDFF1_ZPZ_S(bsu, zsu)
+DO_LDFF1_ZPZ_S(bsu, zss)
+DO_LDFF1_ZPZ_D(bdu, zsu)
+DO_LDFF1_ZPZ_D(bdu, zss)
+DO_LDFF1_ZPZ_D(bdu, zd)
+
+DO_LDFF1_ZPZ_S(bss, zsu)
+DO_LDFF1_ZPZ_S(bss, zss)
+DO_LDFF1_ZPZ_D(bds, zsu)
+DO_LDFF1_ZPZ_D(bds, zss)
+DO_LDFF1_ZPZ_D(bds, zd)
+
+DO_LDFF1_ZPZ_S(hsu_le, zsu)
+DO_LDFF1_ZPZ_S(hsu_le, zss)
+DO_LDFF1_ZPZ_D(hdu_le, zsu)
+DO_LDFF1_ZPZ_D(hdu_le, zss)
+DO_LDFF1_ZPZ_D(hdu_le, zd)
+
+DO_LDFF1_ZPZ_S(hsu_be, zsu)
+DO_LDFF1_ZPZ_S(hsu_be, zss)
+DO_LDFF1_ZPZ_D(hdu_be, zsu)
+DO_LDFF1_ZPZ_D(hdu_be, zss)
+DO_LDFF1_ZPZ_D(hdu_be, zd)
+
+DO_LDFF1_ZPZ_S(hss_le, zsu)
+DO_LDFF1_ZPZ_S(hss_le, zss)
+DO_LDFF1_ZPZ_D(hds_le, zsu)
+DO_LDFF1_ZPZ_D(hds_le, zss)
+DO_LDFF1_ZPZ_D(hds_le, zd)
+
+DO_LDFF1_ZPZ_S(hss_be, zsu)
+DO_LDFF1_ZPZ_S(hss_be, zss)
+DO_LDFF1_ZPZ_D(hds_be, zsu)
+DO_LDFF1_ZPZ_D(hds_be, zss)
+DO_LDFF1_ZPZ_D(hds_be, zd)
+
+DO_LDFF1_ZPZ_S(ss_le, zsu)
+DO_LDFF1_ZPZ_S(ss_le, zss)
+DO_LDFF1_ZPZ_D(sdu_le, zsu)
+DO_LDFF1_ZPZ_D(sdu_le, zss)
+DO_LDFF1_ZPZ_D(sdu_le, zd)
+
+DO_LDFF1_ZPZ_S(ss_be, zsu)
+DO_LDFF1_ZPZ_S(ss_be, zss)
+DO_LDFF1_ZPZ_D(sdu_be, zsu)
+DO_LDFF1_ZPZ_D(sdu_be, zss)
+DO_LDFF1_ZPZ_D(sdu_be, zd)
+
+DO_LDFF1_ZPZ_D(sds_le, zsu)
+DO_LDFF1_ZPZ_D(sds_le, zss)
+DO_LDFF1_ZPZ_D(sds_le, zd)
+
+DO_LDFF1_ZPZ_D(sds_be, zsu)
+DO_LDFF1_ZPZ_D(sds_be, zss)
+DO_LDFF1_ZPZ_D(sds_be, zd)
+
+DO_LDFF1_ZPZ_D(dd_le, zsu)
+DO_LDFF1_ZPZ_D(dd_le, zss)
+DO_LDFF1_ZPZ_D(dd_le, zd)
+
+DO_LDFF1_ZPZ_D(dd_be, zsu)
+DO_LDFF1_ZPZ_D(dd_be, zss)
+DO_LDFF1_ZPZ_D(dd_be, zd)
 
 /* Stores with a vector index. */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][2][3] = {
 
     /* First-fault */
     { { { gen_helper_sve_ldffbss_zsu,
-          gen_helper_sve_ldffhss_zsu,
+          gen_helper_sve_ldffhss_le_zsu,
           NULL, },
         { gen_helper_sve_ldffbsu_zsu,
-          gen_helper_sve_ldffhsu_zsu,
-          gen_helper_sve_ldffssu_zsu, } },
+          gen_helper_sve_ldffhsu_le_zsu,
+          gen_helper_sve_ldffss_le_zsu, } },
       { { gen_helper_sve_ldffbss_zss,
-          gen_helper_sve_ldffhss_zss,
+          gen_helper_sve_ldffhss_le_zss,
           NULL, },
         { gen_helper_sve_ldffbsu_zss,
-          gen_helper_sve_ldffhsu_zss,
-          gen_helper_sve_ldffssu_zss, } } } },
+          gen_helper_sve_ldffhsu_le_zss,
+          gen_helper_sve_ldffss_le_zss, } } } },
 
     /* Big-endian */
     { { { { gen_helper_sve_ldbss_zsu,
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][2][3] = {
 
     /* First-fault */
     { { { gen_helper_sve_ldffbss_zsu,
-          gen_helper_sve_ldffhss_zsu,
+          gen_helper_sve_ldffhss_be_zsu,
           NULL, },
         { gen_helper_sve_ldffbsu_zsu,
-          gen_helper_sve_ldffhsu_zsu,
-          gen_helper_sve_ldffssu_zsu, } },
+          gen_helper_sve_ldffhsu_be_zsu,
+          gen_helper_sve_ldffss_be_zsu, } },
       { { gen_helper_sve_ldffbss_zss,
-          gen_helper_sve_ldffhss_zss,
+          gen_helper_sve_ldffhss_be_zss,
           NULL, },
         { gen_helper_sve_ldffbsu_zss,
-          gen_helper_sve_ldffhsu_zss,
-          gen_helper_sve_ldffssu_zss, } } } },
+          gen_helper_sve_ldffhsu_be_zss,
+          gen_helper_sve_ldffss_be_zss, } } } },
 };
 
 /* Note that we overload xs=2 to indicate 64-bit offset. */
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][2][3][2][4] = {
 
     /* First-fault */
     { { { gen_helper_sve_ldffbds_zsu,
-          gen_helper_sve_ldffhds_zsu,
-          gen_helper_sve_ldffsds_zsu,
+          gen_helper_sve_ldffhds_le_zsu,
+          gen_helper_sve_ldffsds_le_zsu,
           NULL, },
         { gen_helper_sve_ldffbdu_zsu,
-          gen_helper_sve_ldffhdu_zsu,
-          gen_helper_sve_ldffsdu_zsu,
-          gen_helper_sve_ldffddu_zsu, } },
+          gen_helper_sve_ldffhdu_le_zsu,
+          gen_helper_sve_ldffsdu_le_zsu,
+          gen_helper_sve_ldffdd_le_zsu, } },
       { { gen_helper_sve_ldffbds_zss,
-          gen_helper_sve_ldffhds_zss,
-          gen_helper_sve_ldffsds_zss,
+          gen_helper_sve_ldffhds_le_zss,
+          gen_helper_sve_ldffsds_le_zss,
           NULL, },
         { gen_helper_sve_ldffbdu_zss,
-          gen_helper_sve_ldffhdu_zss,
-          gen_helper_sve_ldffsdu_zss,
-          gen_helper_sve_ldffddu_zss, } },
+          gen_helper_sve_ldffhdu_le_zss,
+          gen_helper_sve_ldffsdu_le_zss,
+          gen_helper_sve_ldffdd_le_zss, } },
       { { gen_helper_sve_ldffbds_zd,
-          gen_helper_sve_ldffhds_zd,
-          gen_helper_sve_ldffsds_zd,
+          gen_helper_sve_ldffhds_le_zd,
+          gen_helper_sve_ldffsds_le_zd,
           NULL, },
         { gen_helper_sve_ldffbdu_zd,
-          gen_helper_sve_ldffhdu_zd,
-          gen_helper_sve_ldffsdu_zd,
-          gen_helper_sve_ldffddu_zd, } } } },
+          gen_helper_sve_ldffhdu_le_zd,
+          gen_helper_sve_ldffsdu_le_zd,
+          gen_helper_sve_ldffdd_le_zd, } } } },
 
     /* Big-endian */
     { { { { gen_helper_sve_ldbds_zsu,
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][2][3][2][4] = {
 
     /* First-fault */
     { { { gen_helper_sve_ldffbds_zsu,
-          gen_helper_sve_ldffhds_zsu,
-          gen_helper_sve_ldffsds_zsu,
+          gen_helper_sve_ldffhds_be_zsu,
+          gen_helper_sve_ldffsds_be_zsu,
           NULL, },
         { gen_helper_sve_ldffbdu_zsu,
-          gen_helper_sve_ldffhdu_zsu,
-          gen_helper_sve_ldffsdu_zsu,
-          gen_helper_sve_ldffddu_zsu, } },
+          gen_helper_sve_ldffhdu_be_zsu,
+          gen_helper_sve_ldffsdu_be_zsu,
+          gen_helper_sve_ldffdd_be_zsu, } },
       { { gen_helper_sve_ldffbds_zss,
-          gen_helper_sve_ldffhds_zss,
-          gen_helper_sve_ldffsds_zss,
+          gen_helper_sve_ldffhds_be_zss,
+          gen_helper_sve_ldffsds_be_zss,
           NULL, },
         { gen_helper_sve_ldffbdu_zss,
-          gen_helper_sve_ldffhdu_zss,
-          gen_helper_sve_ldffsdu_zss,
-          gen_helper_sve_ldffddu_zss, } },
+          gen_helper_sve_ldffhdu_be_zss,
+          gen_helper_sve_ldffsdu_be_zss,
+          gen_helper_sve_ldffdd_be_zss, } },
       { { gen_helper_sve_ldffbds_zd,
-          gen_helper_sve_ldffhds_zd,
-          gen_helper_sve_ldffsds_zd,
+          gen_helper_sve_ldffhds_be_zd,
+          gen_helper_sve_ldffsds_be_zd,
           NULL, },
         { gen_helper_sve_ldffbdu_zd,
-          gen_helper_sve_ldffhdu_zd,
-          gen_helper_sve_ldffsdu_zd,
-          gen_helper_sve_ldffddu_zd, } } } },
+          gen_helper_sve_ldffhdu_be_zd,
+          gen_helper_sve_ldffsdu_be_zd,
+          gen_helper_sve_ldffdd_be_zd, } } } },
 };
 
 static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
--
2.19.0
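The first-fault contract implemented above is easy to state: only the
first active element may take a real, faulting load; every later active
element is probed without faulting, and a failed probe truncates the
result instead of raising an exception. A self-contained toy model of
that control flow (mem, probe_read and gather_ff are invented for
illustration; the real helpers use the softmmu TLB, swap_memzero() and
record_fault() to update the FFR):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Toy memory: a probe "faults" when the index is out of bounds. */
    enum { MEM_WORDS = 64 };
    static uint64_t mem[MEM_WORDS];

    static bool probe_read(size_t idx, uint64_t *val)
    {
        if (idx >= MEM_WORDS) {
            return false;       /* would fault: report instead of trapping */
        }
        *val = mem[idx];
        return true;
    }

    /* First-fault gather over predicated indices. */
    static size_t gather_ff(uint64_t *dst, const size_t *idx,
                            const bool *pred, size_t n)
    {
        size_t i = 0;

        while (i < n && !pred[i]) {
            dst[i++] = 0;       /* leading inactive elements are zeroed */
        }
        if (i == n) {
            return n;           /* no active elements at all */
        }
        /* The first active element may fault architecturally. */
        if (!probe_read(idx[i], &dst[i])) {
            return i;           /* a real CPU would take the fault here */
        }
        for (i++; i < n; i++) {
            if (!pred[i]) {
                dst[i] = 0;
            } else if (!probe_read(idx[i], &dst[i])) {
                break;          /* first-fault: suppress and truncate */
            }
        }
        return i;               /* number of elements produced */
    }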
From: Richard Henderson <richard.henderson@linaro.org>

Model after gen_gvec_fn_zzz et al.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static int pred_gvec_reg_size(DisasContext *s)
     return size_for_gvec(pred_full_reg_size(s));
 }
 
+/* Invoke an out-of-line helper on 2 Zregs and a predicate. */
+static void gen_gvec_ool_zzp(DisasContext *s, gen_helper_gvec_3 *fn,
+                             int rd, int rn, int pg, int data)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
+                       vec_full_reg_offset(s, rn),
+                       pred_full_reg_offset(s, pg),
+                       vsz, vsz, data, fn);
+}
+
 /* Invoke an out-of-line helper on 3 Zregs and a predicate. */
 static void gen_gvec_ool_zzzp(DisasContext *s, gen_helper_gvec_4 *fn,
                               int rd, int rn, int rm, int pg, int data)
@@ -XXX,XX +XXX,XX @@ static bool do_zpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_3 *fn)
         return false;
     }
     if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
-                           vec_full_reg_offset(s, a->rn),
-                           pred_full_reg_offset(s, a->pg),
-                           vsz, vsz, 0, fn);
+        gen_gvec_ool_zzp(s, fn, a->rd, a->rn, a->pg, 0);
     }
     return true;
 }
@@ -XXX,XX +XXX,XX @@ static bool do_movz_zpz(DisasContext *s, int rd, int rn, int pg,
     };
 
     if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
-                           vec_full_reg_offset(s, rn),
-                           pred_full_reg_offset(s, pg),
-                           vsz, vsz, invert, fns[esz]);
+        gen_gvec_ool_zzp(s, fns[esz], rd, rn, pg, invert);
     }
     return true;
 }
@@ -XXX,XX +XXX,XX @@ static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
                         gen_helper_gvec_3 *fn)
 {
     if (sve_access_check(s)) {
-        unsigned vsz = vec_full_reg_size(s);
-        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
-                           vec_full_reg_offset(s, a->rn),
-                           pred_full_reg_offset(s, a->pg),
-                           vsz, vsz, a->imm, fn);
+        gen_gvec_ool_zzp(s, fn, a->rd, a->rn, a->pg, a->imm);
     }
     return true;
 }
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The 16-byte load only uses 16 predicate bits.  But while
reusing the other load infrastructure, we find other bits
that are set and trigger an assert.  To avoid this and
retain the assert, zero-extend the predicate that we pass
to the LD1 helper.

Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
     unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_pg;
     TCGv_i32 desc;
+    int poff;
 
     /* Load the first quadword using the normal predicated load helpers. */
     desc = tcg_const_i32(simd_desc(16, 16, zt));
-    t_pg = tcg_temp_new_ptr();
 
-    tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+    poff = pred_full_reg_offset(s, pg);
+    if (vsz > 16) {
+        /*
+         * Zero-extend the first 16 bits of the predicate into a temporary.
+         * This avoids triggering an assert making sure we don't have bits
+         * set within a predicate beyond VQ, but we have lowered VQ to 1
+         * for this load operation.
+         */
+        TCGv_i64 tmp = tcg_temp_new_i64();
+#ifdef HOST_WORDS_BIGENDIAN
+        poff += 6;
+#endif
+        tcg_gen_ld16u_i64(tmp, cpu_env, poff);
+
+        poff = offsetof(CPUARMState, vfp.preg_tmp);
+        tcg_gen_st_i64(tmp, cpu_env, poff);
+        tcg_temp_free_i64(tmp);
+    }
+
+    t_pg = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(t_pg, cpu_env, poff);
+
     fns[msz](cpu_env, t_pg, addr, desc);
 
     tcg_temp_free_ptr(t_pg);
--
2.19.0
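To picture the fix above: the 16-byte quadword is described by only the
low 16 predicate bits, so when the full vector is wider, any stray
higher bits must be masked off before a helper that asserts a VQ=1
predicate layout ever sees them. A hedged sketch of the masking step in
plain C (preg_tmp here is a stand-in for the vfp.preg_tmp staging slot,
not the actual field access):

    #include <assert.h>
    #include <stdint.h>

    static uint64_t preg_tmp;   /* illustrative staging slot */

    static const uint64_t *stage_pred_for_vq1(const uint64_t *preg)
    {
        /* Keep only the 16 bits covering one 128-bit vector (VQ == 1). */
        preg_tmp = *preg & 0xffffu;
        /* The LD1 helper may now assert no bits are set beyond VQ. */
        assert((preg_tmp & ~0xffffull) == 0);
        return &preg_tmp;
    }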
From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 53 +++++++++++------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static int pred_gvec_reg_size(DisasContext *s)
     return size_for_gvec(pred_full_reg_size(s));
 }
 
+/* Invoke an out-of-line helper on 3 Zregs. */
+static void gen_gvec_ool_zzz(DisasContext *s, gen_helper_gvec_3 *fn,
+                             int rd, int rn, int rm, int data)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
+                       vec_full_reg_offset(s, rn),
+                       vec_full_reg_offset(s, rm),
+                       vsz, vsz, data, fn);
+}
+
 /* Invoke an out-of-line helper on 2 Zregs and a predicate. */
 static void gen_gvec_ool_zzp(DisasContext *s, gen_helper_gvec_3 *fn,

From: Richard Henderson <richard.henderson@linaro.org>

There is quite a lot of code required to compute cpu_mem_index,
or even put together the full TCGMemOpIdx.  This can easily be
done at translation time.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h     |   5 ++
 target/arm/sve_helper.c    | 138 +++++++++++++++++++------------------
 target/arm/translate-sve.c |  67 +++++++++++-------
 3 files changed, 121 insertions(+), 89 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t arm_debug_exception_fsr(CPUARMState *env)
     }
 }
 
+/* Note make_memop_idx reserves 4 bits for mmu_idx, and MO_BSWAP is bit 3.
+ * Thus a TCGMemOpIdx, without any MO_ALIGN bits, fits in 8 bits.
+ */
+#define MEMOPIDX_SHIFT  8
+
 #endif
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@
 
 #include "qemu/osdep.h"
 #include "cpu.h"
+#include "internals.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
 #include "exec/helper-proto.h"
@@ -XXX,XX +XXX,XX @@ typedef intptr_t sve_ld1_host_fn(void *vd, void *vg, void *host,
  * The controlling predicate is known to be true.
 */
 typedef void sve_ld1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
-                            target_ulong vaddr, int mmu_idx, uintptr_t ra);
+                            target_ulong vaddr, TCGMemOpIdx oi, uintptr_t ra);
 typedef sve_ld1_tlb_fn sve_st1_tlb_fn;
 
 /*
@@ -XXX,XX +XXX,XX @@ static intptr_t sve_##NAME##_host(void *vd, void *vg, void *host, \
 #ifdef CONFIG_SOFTMMU
 #define DO_LD_TLB(NAME, H, TYPEE, TYPEM, HOST, MOEND, TLB) \
 static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
-                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+                             target_ulong addr, TCGMemOpIdx oi, uintptr_t ra) \
 { \
-    TCGMemOpIdx oi = make_memop_idx(ctz32(sizeof(TYPEM)) | MOEND, mmu_idx); \
     TYPEM val = TLB(env, addr, oi, ra); \
     *(TYPEE *)(vd + H(reg_off)) = val; \
 }
 #else
 #define DO_LD_TLB(NAME, H, TYPEE, TYPEM, HOST, MOEND, TLB) \
 static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
-                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+                             target_ulong addr, TCGMemOpIdx oi, uintptr_t ra) \
 { \
     TYPEM val = HOST(g2h(addr)); \
     *(TYPEE *)(vd + H(reg_off)) = val; \
@@ -XXX,XX +XXX,XX @@ static void sve_ld1_r(CPUARMState *env, void *vg, const target_ulong addr,
                       sve_ld1_host_fn *host_fn,
                       sve_ld1_tlb_fn *tlb_fn)
 {
-    void *vd = &env->vfp.zregs[simd_data(desc)];
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int mmu_idx = get_mmuidx(oi);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
+    void *vd = &env->vfp.zregs[rd];
     const int diffsz = esz - msz;
     const intptr_t reg_max = simd_oprsz(desc);
     const intptr_t mem_max = reg_max >> diffsz;
-    const int mmu_idx = cpu_mmu_index(env, false);
     ARMVectorReg scratch;
     void *host;
     intptr_t split, reg_off, mem_off;
@@ -XXX,XX +XXX,XX @@ static void sve_ld1_r(CPUARMState *env, void *vg, const target_ulong addr,
      * on I/O memory, it may succeed but not bring in the TLB entry.
      * But even then we have still made forward progress.
      */
-    tlb_fn(env, &scratch, reg_off, addr + mem_off, mmu_idx, retaddr);
+    tlb_fn(env, &scratch, reg_off, addr + mem_off, oi, retaddr);
     reg_off += 1 << esz;
 }
 #endif
@@ -XXX,XX +XXX,XX @@ static void sve_ld2_r(CPUARMState *env, void *vg, target_ulong addr,
                       uint32_t desc, int size, uintptr_t ra,
                       sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     ARMVectorReg scratch[2] = { };
 
     set_helper_retaddr(ra);
@@ -XXX,XX +XXX,XX @@ static void sve_ld2_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
-                tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
+                tlb_fn(env, &scratch[0], i, addr, oi, ra);
+                tlb_fn(env, &scratch[1], i, addr + size, oi, ra);
             }
             i += size, pg >>= size;
             addr += 2 * size;
@@ -XXX,XX +XXX,XX @@ static void sve_ld3_r(CPUARMState *env, void *vg, target_ulong addr,
                       uint32_t desc, int size, uintptr_t ra,
                       sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     ARMVectorReg scratch[3] = { };
 
     set_helper_retaddr(ra);
@@ -XXX,XX +XXX,XX @@ static void sve_ld3_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
-                tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
-                tlb_fn(env, &scratch[2], i, addr + 2 * size, mmu_idx, ra);
+                tlb_fn(env, &scratch[0], i, addr, oi, ra);
+                tlb_fn(env, &scratch[1], i, addr + size, oi, ra);
+                tlb_fn(env, &scratch[2], i, addr + 2 * size, oi, ra);
             }
             i += size, pg >>= size;
             addr += 3 * size;
@@ -XXX,XX +XXX,XX @@ static void sve_ld4_r(CPUARMState *env, void *vg, target_ulong addr,
                       uint32_t desc, int size, uintptr_t ra,
                       sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     ARMVectorReg scratch[4] = { };
 
     set_helper_retaddr(ra);
@@ -XXX,XX +XXX,XX @@ static void sve_ld4_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
-                tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
-                tlb_fn(env, &scratch[2], i, addr + 2 * size, mmu_idx, ra);
-                tlb_fn(env, &scratch[3], i, addr + 3 * size, mmu_idx, ra);
+                tlb_fn(env, &scratch[0], i, addr, oi, ra);
+                tlb_fn(env, &scratch[1], i, addr + size, oi, ra);
+                tlb_fn(env, &scratch[2], i, addr + 2 * size, oi, ra);
+                tlb_fn(env, &scratch[3], i, addr + 3 * size, oi, ra);
             }
             i += size, pg >>= size;
             addr += 4 * size;
@@ -XXX,XX +XXX,XX @@ static void sve_ldff1_r(CPUARMState *env, void *vg, const target_ulong addr,
                         sve_ld1_host_fn *host_fn,
                         sve_ld1_tlb_fn *tlb_fn)
 {
-    void *vd = &env->vfp.zregs[simd_data(desc)];
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int mmu_idx = get_mmuidx(oi);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
+    void *vd = &env->vfp.zregs[rd];
     const int diffsz = esz - msz;
     const intptr_t reg_max = simd_oprsz(desc);
     const intptr_t mem_max = reg_max >> diffsz;
-    const int mmu_idx = cpu_mmu_index(env, false);
     intptr_t split, reg_off, mem_off;
     void *host;
 
@@ -XXX,XX +XXX,XX @@ static void sve_ldff1_r(CPUARMState *env, void *vg, const target_ulong addr,
      * Perform one normal read, which will fault or not.
      * But it is likely to bring the page into the tlb.
      */
-    tlb_fn(env, vd, reg_off, addr + mem_off, mmu_idx, retaddr);
+    tlb_fn(env, vd, reg_off, addr + mem_off, oi, retaddr);
 
     /* After any fault, zero any leading predicated false elts. */
     swap_memzero(vd, reg_off);
@@ -XXX,XX +XXX,XX @@ static void sve_ldnf1_r(CPUARMState *env, void *vg, const target_ulong addr,
                         uint32_t desc, const int esz, const int msz,
                         sve_ld1_host_fn *host_fn)
 {
-    void *vd = &env->vfp.zregs[simd_data(desc)];
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
+    void *vd = &env->vfp.zregs[rd];
     const int diffsz = esz - msz;
     const intptr_t reg_max = simd_oprsz(desc);
     const intptr_t mem_max = reg_max >> diffsz;
@@ -XXX,XX +XXX,XX @@ DO_LDFF1_LDNF1_2(dd, 3, 3)
 #ifdef CONFIG_SOFTMMU
 #define DO_ST_TLB(NAME, H, TYPEM, HOST, MOEND, TLB) \
 static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
-                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+                             target_ulong addr, TCGMemOpIdx oi, uintptr_t ra) \
 { \
-    TCGMemOpIdx oi = make_memop_idx(ctz32(sizeof(TYPEM)) | MOEND, mmu_idx); \
     TLB(env, addr, *(TYPEM *)(vd + H(reg_off)), oi, ra); \
 }
 #else
 #define DO_ST_TLB(NAME, H, TYPEM, HOST, MOEND, TLB) \
 static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
-                             target_ulong addr, int mmu_idx, uintptr_t ra) \
+                             target_ulong addr, TCGMemOpIdx oi, uintptr_t ra) \
 { \
     HOST(g2h(addr), *(TYPEM *)(vd + H(reg_off))); \
 }
@@ -XXX,XX +XXX,XX @@ static void sve_st1_r(CPUARMState *env, void *vg, target_ulong addr,
                       const int esize, const int msize,
                       sve_st1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     void *vd = &env->vfp.zregs[rd];
 
     set_helper_retaddr(ra);
@@ -XXX,XX +XXX,XX @@ static void sve_st1_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, vd, i, addr, mmu_idx, ra);
+                tlb_fn(env, vd, i, addr, oi, ra);
             }
             i += esize, pg >>= esize;
             addr += msize;
@@ -XXX,XX +XXX,XX @@ static void sve_st2_r(CPUARMState *env, void *vg, target_ulong addr,
                       const int esize, const int msize,
                       sve_st1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     void *d1 = &env->vfp.zregs[rd];
     void *d2 = &env->vfp.zregs[(rd + 1) & 31];
 
@@ -XXX,XX +XXX,XX @@ static void sve_st2_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, d1, i, addr, mmu_idx, ra);
-                tlb_fn(env, d2, i, addr + msize, mmu_idx, ra);
+                tlb_fn(env, d1, i, addr, oi, ra);
+                tlb_fn(env, d2, i, addr + msize, oi, ra);
             }
             i += esize, pg >>= esize;
             addr += 2 * msize;
@@ -XXX,XX +XXX,XX @@ static void sve_st3_r(CPUARMState *env, void *vg, target_ulong addr,
                       const int esize, const int msize,
                       sve_st1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     void *d1 = &env->vfp.zregs[rd];
     void *d2 = &env->vfp.zregs[(rd + 1) & 31];
     void *d3 = &env->vfp.zregs[(rd + 2) & 31];
@@ -XXX,XX +XXX,XX @@ static void sve_st3_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, d1, i, addr, mmu_idx, ra);
-                tlb_fn(env, d2, i, addr + msize, mmu_idx, ra);
-                tlb_fn(env, d3, i, addr + 2 * msize, mmu_idx, ra);
+                tlb_fn(env, d1, i, addr, oi, ra);
+                tlb_fn(env, d2, i, addr + msize, oi, ra);
+                tlb_fn(env, d3, i, addr + 2 * msize, oi, ra);
             }
             i += esize, pg >>= esize;
             addr += 3 * msize;
@@ -XXX,XX +XXX,XX @@ static void sve_st4_r(CPUARMState *env, void *vg, target_ulong addr,
                       const int esize, const int msize,
                       sve_st1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned rd = simd_data(desc);
     void *d1 = &env->vfp.zregs[rd];
     void *d2 = &env->vfp.zregs[(rd + 1) & 31];
     void *d3 = &env->vfp.zregs[(rd + 2) & 31];
@@ -XXX,XX +XXX,XX @@ static void sve_st4_r(CPUARMState *env, void *vg, target_ulong addr,
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
         do {
             if (pg & 1) {
-                tlb_fn(env, d1, i, addr, mmu_idx, ra);
-                tlb_fn(env, d2, i, addr + msize, mmu_idx, ra);
-                tlb_fn(env, d3, i, addr + 2 * msize, mmu_idx, ra);
-                tlb_fn(env, d4, i, addr + 3 * msize, mmu_idx, ra);
+                tlb_fn(env, d1, i, addr, oi, ra);
+                tlb_fn(env, d2, i, addr + msize, oi, ra);
+                tlb_fn(env, d3, i, addr + 2 * msize, oi, ra);
+                tlb_fn(env, d4, i, addr + 3 * msize, oi, ra);
             }
             i += esize, pg >>= esize;
             addr += 4 * msize;
@@ -XXX,XX +XXX,XX @@ static void sve_ld1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
                        target_ulong base, uint32_t desc, uintptr_t ra,
                        zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int scale = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned scale = simd_data(desc);
     ARMVectorReg scratch = { };
 
     set_helper_retaddr(ra);
@@ -XXX,XX +XXX,XX @@ static void sve_ld1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
         do {
             if (likely(pg & 1)) {
                 target_ulong off = off_fn(vm, i);
-                tlb_fn(env, &scratch, i, base + (off << scale), mmu_idx, ra);
+                tlb_fn(env, &scratch, i, base + (off << scale), oi, ra);
             }
             i += 4, pg >>= 4;
         } while (i & 15);
@@ -XXX,XX +XXX,XX @@ static void sve_ld1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
                        target_ulong base, uint32_t desc, uintptr_t ra,
                        zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int scale = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2);
     intptr_t i, oprsz = simd_oprsz(desc) / 8;
-    unsigned scale = simd_data(desc);
     ARMVectorReg scratch = { };
 
     set_helper_retaddr(ra);
@@ -XXX,XX +XXX,XX @@ static void sve_ld1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
         uint8_t pg = *(uint8_t *)(vg + H1(i));
         if (likely(pg & 1)) {
             target_ulong off = off_fn(vm, i * 8);
-            tlb_fn(env, &scratch, i * 8, base + (off << scale), mmu_idx, ra);
+            tlb_fn(env, &scratch, i * 8, base + (off << scale), oi, ra);
         }
     }
     set_helper_retaddr(0);
@@ -XXX,XX +XXX,XX @@ typedef bool sve_ld1_nf_fn(CPUARMState *env, void *vd, intptr_t reg_off,
 #ifdef CONFIG_SOFTMMU
 #define DO_LD_NF(NAME, H, TYPEE, TYPEM, HOST) \
 static bool sve_ld##NAME##_nf(CPUARMState *env, void *vd, intptr_t reg_off, \
-                            target_ulong addr, int mmu_idx) \
+                              target_ulong addr, int mmu_idx) \
 { \
     target_ulong next_page = -(addr | TARGET_PAGE_MASK); \
     if (likely(next_page - addr >= sizeof(TYPEM))) { \
@@ -XXX,XX +XXX,XX @@ static inline void sve_ldff1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
                                 zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn,
                                 sve_ld1_nf_fn *nonfault_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int mmu_idx = get_mmuidx(oi);
+    const int scale = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2);
     intptr_t reg_off, reg_max = simd_oprsz(desc);
-    unsigned scale = simd_data(desc);
     target_ulong addr;
 
     /* Skip to the first true predicate. */
@@ -XXX,XX +XXX,XX @@ static inline void sve_ldff1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
     set_helper_retaddr(ra);
     addr = off_fn(vm, reg_off);
     addr = base + (addr << scale);
-    tlb_fn(env, vd, reg_off, addr, mmu_idx, ra);
+    tlb_fn(env, vd, reg_off, addr, oi, ra);
 
     /* The rest of the reads will be non-faulting. */
     set_helper_retaddr(0);
@@ -XXX,XX +XXX,XX @@ static inline void sve_ldff1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
                                 zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn,
                                 sve_ld1_nf_fn *nonfault_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int mmu_idx = get_mmuidx(oi);
+    const int scale = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2);
     intptr_t reg_off, reg_max = simd_oprsz(desc);
-    unsigned scale = simd_data(desc);
     target_ulong addr;
 
     /* Skip to the first true predicate. */
@@ -XXX,XX +XXX,XX @@ static inline void sve_ldff1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
     set_helper_retaddr(ra);
     addr = off_fn(vm, reg_off);
     addr = base + (addr << scale);
-    tlb_fn(env, vd, reg_off, addr, mmu_idx, ra);
+    tlb_fn(env, vd, reg_off, addr, oi, ra);
 
     /* The rest of the reads will be non-faulting. */
     set_helper_retaddr(0);
@@ -XXX,XX +XXX,XX @@ static void sve_st1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
                        target_ulong base, uint32_t desc, uintptr_t ra,
                        zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int scale = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2);
     intptr_t i, oprsz = simd_oprsz(desc);
-    unsigned scale = simd_data(desc);
 
     set_helper_retaddr(ra);
     for (i = 0; i < oprsz; ) {
@@ -XXX,XX +XXX,XX @@ static void sve_st1_zs(CPUARMState *env, void *vd, void *vg, void *vm,
         do {
             if (likely(pg & 1)) {
                 target_ulong off = off_fn(vm, i);
-                tlb_fn(env, vd, i, base + (off << scale), mmu_idx, ra);
+                tlb_fn(env, vd, i, base + (off << scale), oi, ra);
             }
             i += 4, pg >>= 4;
         } while (i & 15);
@@ -XXX,XX +XXX,XX @@ static void sve_st1_zd(CPUARMState *env, void *vd, void *vg, void *vm,
                        target_ulong base, uint32_t desc, uintptr_t ra,
                        zreg_off_fn *off_fn, sve_ld1_tlb_fn *tlb_fn)
 {
-    const int mmu_idx = cpu_mmu_index(env, false);
+    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+    const int scale = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2);
     intptr_t i, oprsz = simd_oprsz(desc) / 8;
-    unsigned scale = simd_data(desc);
 
     set_helper_retaddr(ra);
     for (i = 0; i < oprsz; i++) {
         uint8_t pg = *(uint8_t *)(vg + H1(i));
         if (likely(pg & 1)) {
             target_ulong off = off_fn(vm, i * 8);
-            tlb_fn(env, vd, i * 8, base + (off << scale), mmu_idx, ra);
+            tlb_fn(env, vd, i * 8, base + (off << scale), oi, ra);
         }
     }
     set_helper_retaddr(0);
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t dtype_esz[16] = {
     3, 2, 1, 3
 };
 
+static TCGMemOpIdx sve_memopidx(DisasContext *s, int dtype)
+{
+    return make_memop_idx(s->be_data | dtype_mop[dtype], get_mem_index(s));
+}
+
 static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
-                       gen_helper_gvec_mem *fn)
+                       int dtype, gen_helper_gvec_mem *fn)
32
int rd, int rn, int pg, int data)
473
{
33
@@ -XXX,XX +XXX,XX @@ static bool do_zzw_ool(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 *fn)
474
unsigned vsz = vec_full_reg_size(s);
34
return false;
475
TCGv_ptr t_pg;
35
}
476
- TCGv_i32 desc;
36
if (sve_access_check(s)) {
477
+ TCGv_i32 t_desc;
37
- unsigned vsz = vec_full_reg_size(s);
478
+ int desc;
38
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
479
39
- vec_full_reg_offset(s, a->rn),
480
/* For e.g. LD4, there are not enough arguments to pass all 4
40
- vec_full_reg_offset(s, a->rm),
481
* registers as pointers, so encode the regno into the data field.
41
- vsz, vsz, 0, fn);
482
* For consistency, do this even for LD1.
42
+ gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, 0);
483
*/
484
- desc = tcg_const_i32(simd_desc(vsz, vsz, zt));
485
+ desc = sve_memopidx(s, dtype);
486
+ desc |= zt << MEMOPIDX_SHIFT;
487
+ desc = simd_desc(vsz, vsz, desc);
488
+ t_desc = tcg_const_i32(desc);
489
t_pg = tcg_temp_new_ptr();
490
491
tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
492
- fn(cpu_env, t_pg, addr, desc);
493
+ fn(cpu_env, t_pg, addr, t_desc);
494
495
tcg_temp_free_ptr(t_pg);
496
- tcg_temp_free_i32(desc);
497
+ tcg_temp_free_i32(t_desc);
498
}
499
500
static void do_ld_zpa(DisasContext *s, int zt, int pg,
501
@@ -XXX,XX +XXX,XX @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
502
* accessible via the instruction encoding.
503
*/
504
assert(fn != NULL);
505
- do_mem_zpa(s, zt, pg, addr, fn);
506
+ do_mem_zpa(s, zt, pg, addr, dtype, fn);
507
}
508
509
static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
510
@@ -XXX,XX +XXX,XX @@ static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
511
TCGv_i64 addr = new_tmp_a64(s);
512
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
513
tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
514
- do_mem_zpa(s, a->rd, a->pg, addr, fns[s->be_data == MO_BE][a->dtype]);
515
+ do_mem_zpa(s, a->rd, a->pg, addr, a->dtype,
516
+ fns[s->be_data == MO_BE][a->dtype]);
517
}
43
}
518
return true;
44
return true;
519
}
45
}
520
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
46
@@ -XXX,XX +XXX,XX @@ static bool trans_RDVL(DisasContext *s, arg_RDVL *a)
521
TCGv_i64 addr = new_tmp_a64(s);
47
static bool do_adr(DisasContext *s, arg_rrri *a, gen_helper_gvec_3 *fn)
522
48
{
523
tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off);
49
if (sve_access_check(s)) {
524
- do_mem_zpa(s, a->rd, a->pg, addr, fns[s->be_data == MO_BE][a->dtype]);
50
- unsigned vsz = vec_full_reg_size(s);
525
+ do_mem_zpa(s, a->rd, a->pg, addr, a->dtype,
51
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
526
+ fns[s->be_data == MO_BE][a->dtype]);
52
- vec_full_reg_offset(s, a->rn),
53
- vec_full_reg_offset(s, a->rm),
54
- vsz, vsz, a->imm, fn);
55
+ gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, a->imm);
527
}
56
}
528
return true;
57
return true;
529
}
58
}
530
@@ -XXX,XX +XXX,XX @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
59
@@ -XXX,XX +XXX,XX @@ static bool trans_FTSSEL(DisasContext *s, arg_rrr_esz *a)
531
};
60
return false;
532
unsigned vsz = vec_full_reg_size(s);
533
TCGv_ptr t_pg;
534
- TCGv_i32 desc;
535
- int poff;
536
+ TCGv_i32 t_desc;
537
+ int desc, poff;
538
539
/* Load the first quadword using the normal predicated load helpers. */
540
- desc = tcg_const_i32(simd_desc(16, 16, zt));
541
+ desc = sve_memopidx(s, msz_dtype(msz));
542
+ desc |= zt << MEMOPIDX_SHIFT;
543
+ desc = simd_desc(16, 16, desc);
544
+ t_desc = tcg_const_i32(desc);
545
546
poff = pred_full_reg_offset(s, pg);
547
if (vsz > 16) {
548
@@ -XXX,XX +XXX,XX @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
549
t_pg = tcg_temp_new_ptr();
550
tcg_gen_addi_ptr(t_pg, cpu_env, poff);
551
552
- fns[s->be_data == MO_BE][msz](cpu_env, t_pg, addr, desc);
553
+ fns[s->be_data == MO_BE][msz](cpu_env, t_pg, addr, t_desc);
554
555
tcg_temp_free_ptr(t_pg);
556
- tcg_temp_free_i32(desc);
557
+ tcg_temp_free_i32(t_desc);
558
559
/* Replicate that first quadword. */
560
if (vsz > 16) {
561
@@ -XXX,XX +XXX,XX @@ static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
562
fn = fn_multiple[be][nreg - 1][msz];
563
}
61
}
564
assert(fn != NULL);
62
if (sve_access_check(s)) {
565
- do_mem_zpa(s, zt, pg, addr, fn);
63
- unsigned vsz = vec_full_reg_size(s);
566
+ do_mem_zpa(s, zt, pg, addr, msz_dtype(msz), fn);
64
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
567
}
65
- vec_full_reg_offset(s, a->rn),
568
66
- vec_full_reg_offset(s, a->rm),
569
static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn)
67
- vsz, vsz, 0, fns[a->esz]);
570
@@ -XXX,XX +XXX,XX @@ static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
68
+ gen_gvec_ool_zzz(s, fns[a->esz], a->rd, a->rn, a->rm, 0);
571
*** SVE gather loads / scatter stores
69
}
572
*/
573
574
-static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
575
- TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn)
576
+static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
577
+ int scale, TCGv_i64 scalar, int msz,
578
+ gen_helper_gvec_mem_scatter *fn)
579
{
580
unsigned vsz = vec_full_reg_size(s);
581
- TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, scale));
582
TCGv_ptr t_zm = tcg_temp_new_ptr();
583
TCGv_ptr t_pg = tcg_temp_new_ptr();
584
TCGv_ptr t_zt = tcg_temp_new_ptr();
585
+ TCGv_i32 t_desc;
586
+ int desc;
587
+
588
+ desc = sve_memopidx(s, msz_dtype(msz));
589
+ desc |= scale << MEMOPIDX_SHIFT;
590
+ desc = simd_desc(vsz, vsz, desc);
591
+ t_desc = tcg_const_i32(desc);
592
593
tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
594
tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm));
595
tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt));
596
- fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc);
597
+ fn(cpu_env, t_zt, t_pg, t_zm, scalar, t_desc);
598
599
tcg_temp_free_ptr(t_zt);
600
tcg_temp_free_ptr(t_zm);
601
tcg_temp_free_ptr(t_pg);
602
- tcg_temp_free_i32(desc);
603
+ tcg_temp_free_i32(t_desc);
604
}
605
606
/* Indexed by [be][ff][xs][u][msz]. */
607
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
608
assert(fn != NULL);
609
610
do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
611
- cpu_reg_sp(s, a->rn), fn);
612
+ cpu_reg_sp(s, a->rn), a->msz, fn);
613
return true;
70
return true;
614
}
71
}
615
72
@@ -XXX,XX +XXX,XX @@ static bool trans_TBL(DisasContext *s, arg_rrr_esz *a)
616
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
73
};
617
* by loading the immediate into the scalar parameter.
74
618
*/
75
if (sve_access_check(s)) {
619
imm = tcg_const_i64(a->imm << a->msz);
76
- unsigned vsz = vec_full_reg_size(s);
620
- do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
77
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
621
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, fn);
78
- vec_full_reg_offset(s, a->rn),
622
tcg_temp_free_i64(imm);
79
- vec_full_reg_offset(s, a->rm),
80
- vsz, vsz, 0, fns[a->esz]);
81
+ gen_gvec_ool_zzz(s, fns[a->esz], a->rd, a->rn, a->rm, 0);
82
}
623
return true;
83
return true;
624
}
84
}
625
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
85
@@ -XXX,XX +XXX,XX @@ static bool do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data,
626
g_assert_not_reached();
86
gen_helper_gvec_3 *fn)
87
{
88
if (sve_access_check(s)) {
89
- unsigned vsz = vec_full_reg_size(s);
90
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
91
- vec_full_reg_offset(s, a->rn),
92
- vec_full_reg_offset(s, a->rm),
93
- vsz, vsz, data, fn);
94
+ gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, data);
627
}
95
}
628
do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
629
- cpu_reg_sp(s, a->rn), fn);
630
+ cpu_reg_sp(s, a->rn), a->msz, fn);
631
return true;
96
return true;
632
}
97
}
633
98
@@ -XXX,XX +XXX,XX @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a)
634
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
99
};
635
* by loading the immediate into the scalar parameter.
100
636
*/
101
if (sve_access_check(s)) {
637
imm = tcg_const_i64(a->imm << a->msz);
102
- unsigned vsz = vec_full_reg_size(s);
638
- do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
103
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
639
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, fn);
104
- vec_full_reg_offset(s, a->rn),
640
tcg_temp_free_i64(imm);
105
- vec_full_reg_offset(s, a->rm),
106
- vsz, vsz, 0, fns[a->u][a->sz]);
107
+ gen_gvec_ool_zzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, 0);
108
}
109
return true;
110
}
111
@@ -XXX,XX +XXX,XX @@ static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a)
112
};
113
114
if (sve_access_check(s)) {
115
- unsigned vsz = vec_full_reg_size(s);
116
- tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
117
- vec_full_reg_offset(s, a->rn),
118
- vec_full_reg_offset(s, a->rm),
119
- vsz, vsz, a->index, fns[a->u][a->sz]);
120
+ gen_gvec_ool_zzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, a->index);
121
}
641
return true;
122
return true;
642
}
123
}
643
--
124
--
644
2.19.0
125
2.20.1
645
126
646
127
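The descriptor packing above is the crux of the 2018 change: the translator folds the TCGMemOpIdx plus a small operand (the zt regno for contiguous loads, the scale for gathers) into the simd_desc data field, so the helpers no longer call cpu_mmu_index() at run time. A minimal standalone sketch of the pack/unpack round trip follows; the field layout mirrors the patch, but the constant values here are illustrative assumptions, not copied from QEMU headers.

#include <assert.h>
#include <stdint.h>

/* Assumed widths: SIMD_DATA_SHIFT as in tcg-gvec-desc.h, and
 * MEMOPIDX_SHIFT taken as the width reserved for the memop index. */
enum { SIMD_DATA_SHIFT = 10, MEMOPIDX_SHIFT = 10 };

static uint32_t extract_bits(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

static uint32_t pack_desc(uint32_t memopidx, uint32_t extra)
{
    /* extra is the scale (gather/scatter) or the zt regno (contiguous) */
    uint32_t data = memopidx | (extra << MEMOPIDX_SHIFT);
    return data << SIMD_DATA_SHIFT;   /* oprsz/maxsz fields omitted */
}

int main(void)
{
    uint32_t desc = pack_desc(0x2b, 3);
    /* The helpers unpack exactly as the diff does. */
    assert(extract_bits(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT) == 0x2b);
    assert(extract_bits(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 2) == 3);
    return 0;
}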
From: Richard Henderson <richard.henderson@linaro.org>

We can choose the endianness at translation time, rather than
re-computing it at execution time.

Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 117 +++++++++++++++-------
target/arm/sve_helper.c | 70 ++++++-------
target/arm/translate-sve.c | 196 +++++++++++++++++++++++++------------
3 files changed, 252 insertions(+), 131 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hsu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hsu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1sdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1sds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1sdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1sds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

-DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_##NAME##_r)(CPUARMState *env, void *vg, \
sve_##NAME##_host, sve_##NAME##_tlb); \
}

-/* TODO: Propagate the endian check back to the translator. */
#define DO_LD1_2(NAME, ESZ, MSZ) \
-void HELPER(sve_##NAME##_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- if (arm_cpu_data_is_big_endian(env)) { \
- sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
- sve_##NAME##_be_host, sve_##NAME##_be_tlb); \
- } else { \
- sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
- sve_##NAME##_le_host, sve_##NAME##_le_tlb); \
- } \
+void HELPER(sve_##NAME##_le_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+ sve_##NAME##_le_host, sve_##NAME##_le_tlb); \
+} \
+void HELPER(sve_##NAME##_be_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ld1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+ sve_##NAME##_be_host, sve_##NAME##_be_tlb); \
}

DO_LD1_1(ld1bb, 0)
@@ -XXX,XX +XXX,XX @@ void __attribute__((flatten)) HELPER(sve_ld##N##bb_r) \
}

#define DO_LDN_2(N, SUFF, SIZE) \
-void __attribute__((flatten)) HELPER(sve_ld##N##SUFF##_r) \
+void __attribute__((flatten)) HELPER(sve_ld##N##SUFF##_le_r) \
(CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
{ \
sve_ld##N##_r(env, vg, addr, desc, SIZE, GETPC(), \
- arm_cpu_data_is_big_endian(env) \
- ? sve_ld1##SUFF##_be_tlb : sve_ld1##SUFF##_le_tlb); \
+ sve_ld1##SUFF##_le_tlb); \
+} \
+void __attribute__((flatten)) HELPER(sve_ld##N##SUFF##_be_r) \
+ (CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
+{ \
+ sve_ld##N##_r(env, vg, addr, desc, SIZE, GETPC(), \
+ sve_ld1##SUFF##_be_tlb); \
}

DO_LDN_1(2)
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ldnf1##PART##_r)(CPUARMState *env, void *vg, \
sve_ldnf1_r(env, vg, addr, desc, ESZ, 0, sve_ld1##PART##_host); \
}

-/* TODO: Propagate the endian check back to the translator. */
#define DO_LDFF1_LDNF1_2(PART, ESZ, MSZ) \
-void HELPER(sve_ldff1##PART##_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
+void HELPER(sve_ldff1##PART##_le_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
{ \
- if (arm_cpu_data_is_big_endian(env)) { \
- sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
- sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
- } else { \
- sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
- sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
- } \
+ sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+ sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
} \
-void HELPER(sve_ldnf1##PART##_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
+void HELPER(sve_ldnf1##PART##_le_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
{ \
- if (arm_cpu_data_is_big_endian(env)) { \
- sve_ldnf1_r(env, vg, addr, desc, ESZ, MSZ, \
- sve_ld1##PART##_be_host); \
- } else { \
- sve_ldnf1_r(env, vg, addr, desc, ESZ, MSZ, \
- sve_ld1##PART##_le_host); \
- } \
+ sve_ldnf1_r(env, vg, addr, desc, ESZ, MSZ, sve_ld1##PART##_le_host); \
+} \
+void HELPER(sve_ldff1##PART##_be_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, \
+ sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
+} \
+void HELPER(sve_ldnf1##PART##_be_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnf1_r(env, vg, addr, desc, ESZ, MSZ, sve_ld1##PART##_be_host); \
}

DO_LDFF1_LDNF1_1(bb, 0)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
static void do_ld_zpa(DisasContext *s, int zt, int pg,
TCGv_i64 addr, int dtype, int nreg)
{
- static gen_helper_gvec_mem * const fns[16][4] = {
- { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
- gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
- { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
+ static gen_helper_gvec_mem * const fns[2][16][4] = {
+ /* Little-endian */
+ { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
+ gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
+ { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },

- { gen_helper_sve_ld1sds_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r,
- gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r },
- { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r,
+ gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r },
+ { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL },

- { gen_helper_sve_ld1hds_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hss_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r,
- gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r },
- { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r,
+ gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r },
+ { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL },

- { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r,
- gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r },
+ { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r,
+ gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } },
+
+ /* Big-endian */
+ { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
+ gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
+ { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r,
+ gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r },
+ { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r,
+ gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r },
+ { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r,
+ gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } }
};
- gen_helper_gvec_mem *fn = fns[dtype][nreg];
+ gen_helper_gvec_mem *fn = fns[s->be_data == MO_BE][dtype][nreg];

/* While there are holes in the table, they are not
* accessible via the instruction encoding.
@@ -XXX,XX +XXX,XX @@ static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)

static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
{
- static gen_helper_gvec_mem * const fns[16] = {
- gen_helper_sve_ldff1bb_r,
- gen_helper_sve_ldff1bhu_r,
- gen_helper_sve_ldff1bsu_r,
- gen_helper_sve_ldff1bdu_r,
+ static gen_helper_gvec_mem * const fns[2][16] = {
+ /* Little-endian */
+ { gen_helper_sve_ldff1bb_r,
+ gen_helper_sve_ldff1bhu_r,
+ gen_helper_sve_ldff1bsu_r,
+ gen_helper_sve_ldff1bdu_r,

- gen_helper_sve_ldff1sds_r,
- gen_helper_sve_ldff1hh_r,
- gen_helper_sve_ldff1hsu_r,
- gen_helper_sve_ldff1hdu_r,
+ gen_helper_sve_ldff1sds_le_r,
+ gen_helper_sve_ldff1hh_le_r,
+ gen_helper_sve_ldff1hsu_le_r,
+ gen_helper_sve_ldff1hdu_le_r,

- gen_helper_sve_ldff1hds_r,
- gen_helper_sve_ldff1hss_r,
- gen_helper_sve_ldff1ss_r,
- gen_helper_sve_ldff1sdu_r,
+ gen_helper_sve_ldff1hds_le_r,
+ gen_helper_sve_ldff1hss_le_r,
+ gen_helper_sve_ldff1ss_le_r,
+ gen_helper_sve_ldff1sdu_le_r,

- gen_helper_sve_ldff1bds_r,
- gen_helper_sve_ldff1bss_r,
- gen_helper_sve_ldff1bhs_r,
- gen_helper_sve_ldff1dd_r,
+ gen_helper_sve_ldff1bds_r,
+ gen_helper_sve_ldff1bss_r,
+ gen_helper_sve_ldff1bhs_r,
+ gen_helper_sve_ldff1dd_le_r },
+
+ /* Big-endian */
+ { gen_helper_sve_ldff1bb_r,
+ gen_helper_sve_ldff1bhu_r,
+ gen_helper_sve_ldff1bsu_r,
+ gen_helper_sve_ldff1bdu_r,
+
+ gen_helper_sve_ldff1sds_be_r,
+ gen_helper_sve_ldff1hh_be_r,
+ gen_helper_sve_ldff1hsu_be_r,
+ gen_helper_sve_ldff1hdu_be_r,
+
+ gen_helper_sve_ldff1hds_be_r,
+ gen_helper_sve_ldff1hss_be_r,
+ gen_helper_sve_ldff1ss_be_r,
+ gen_helper_sve_ldff1sdu_be_r,
+
+ gen_helper_sve_ldff1bds_r,
+ gen_helper_sve_ldff1bss_r,
+ gen_helper_sve_ldff1bhs_r,
+ gen_helper_sve_ldff1dd_be_r },
};

if (sve_access_check(s)) {
TCGv_i64 addr = new_tmp_a64(s);
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
- do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]);
+ do_mem_zpa(s, a->rd, a->pg, addr, fns[s->be_data == MO_BE][a->dtype]);
}
return true;
}

static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
{
- static gen_helper_gvec_mem * const fns[16] = {
- gen_helper_sve_ldnf1bb_r,
- gen_helper_sve_ldnf1bhu_r,
- gen_helper_sve_ldnf1bsu_r,
- gen_helper_sve_ldnf1bdu_r,
+ static gen_helper_gvec_mem * const fns[2][16] = {
+ /* Little-endian */
+ { gen_helper_sve_ldnf1bb_r,
+ gen_helper_sve_ldnf1bhu_r,
+ gen_helper_sve_ldnf1bsu_r,
+ gen_helper_sve_ldnf1bdu_r,

- gen_helper_sve_ldnf1sds_r,
- gen_helper_sve_ldnf1hh_r,
- gen_helper_sve_ldnf1hsu_r,
- gen_helper_sve_ldnf1hdu_r,
+ gen_helper_sve_ldnf1sds_le_r,
+ gen_helper_sve_ldnf1hh_le_r,
+ gen_helper_sve_ldnf1hsu_le_r,
+ gen_helper_sve_ldnf1hdu_le_r,

- gen_helper_sve_ldnf1hds_r,
- gen_helper_sve_ldnf1hss_r,
- gen_helper_sve_ldnf1ss_r,
- gen_helper_sve_ldnf1sdu_r,
+ gen_helper_sve_ldnf1hds_le_r,
+ gen_helper_sve_ldnf1hss_le_r,
+ gen_helper_sve_ldnf1ss_le_r,
+ gen_helper_sve_ldnf1sdu_le_r,

- gen_helper_sve_ldnf1bds_r,
- gen_helper_sve_ldnf1bss_r,
- gen_helper_sve_ldnf1bhs_r,
- gen_helper_sve_ldnf1dd_r,
+ gen_helper_sve_ldnf1bds_r,
+ gen_helper_sve_ldnf1bss_r,
+ gen_helper_sve_ldnf1bhs_r,
+ gen_helper_sve_ldnf1dd_le_r },
+
+ /* Big-endian */
+ { gen_helper_sve_ldnf1bb_r,
+ gen_helper_sve_ldnf1bhu_r,
+ gen_helper_sve_ldnf1bsu_r,
+ gen_helper_sve_ldnf1bdu_r,
+
+ gen_helper_sve_ldnf1sds_be_r,
+ gen_helper_sve_ldnf1hh_be_r,
+ gen_helper_sve_ldnf1hsu_be_r,
+ gen_helper_sve_ldnf1hdu_be_r,
+
+ gen_helper_sve_ldnf1hds_be_r,
+ gen_helper_sve_ldnf1hss_be_r,
+ gen_helper_sve_ldnf1ss_be_r,
+ gen_helper_sve_ldnf1sdu_be_r,
+
+ gen_helper_sve_ldnf1bds_r,
+ gen_helper_sve_ldnf1bss_r,
+ gen_helper_sve_ldnf1bhs_r,
+ gen_helper_sve_ldnf1dd_be_r },
};

if (sve_access_check(s)) {
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
TCGv_i64 addr = new_tmp_a64(s);

tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off);
- do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]);
+ do_mem_zpa(s, a->rd, a->pg, addr, fns[s->be_data == MO_BE][a->dtype]);
}
return true;
}

static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
{
- static gen_helper_gvec_mem * const fns[4] = {
- gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r,
- gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r,
+ static gen_helper_gvec_mem * const fns[2][4] = {
+ { gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_le_r,
+ gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld1dd_le_r },
+ { gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_be_r,
+ gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld1dd_be_r },
};
unsigned vsz = vec_full_reg_size(s);
TCGv_ptr t_pg;
@@ -XXX,XX +XXX,XX @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
t_pg = tcg_temp_new_ptr();
tcg_gen_addi_ptr(t_pg, cpu_env, poff);

- fns[msz](cpu_env, t_pg, addr, desc);
+ fns[s->be_data == MO_BE][msz](cpu_env, t_pg, addr, desc);

tcg_temp_free_ptr(t_pg);
tcg_temp_free_i32(desc);
--
2.19.0

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate-sve.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static int pred_gvec_reg_size(DisasContext *s)
return size_for_gvec(pred_full_reg_size(s));
}

+/* Invoke an out-of-line helper on 2 Zregs. */
+static void gen_gvec_ool_zz(DisasContext *s, gen_helper_gvec_2 *fn,
+ int rd, int rn, int data)
+{
+ unsigned vsz = vec_full_reg_size(s);
+ tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd),
+ vec_full_reg_offset(s, rn),
+ vsz, vsz, data, fn);
+}
+
/* Invoke an out-of-line helper on 3 Zregs. */
static void gen_gvec_ool_zzz(DisasContext *s, gen_helper_gvec_3 *fn,
int rd, int rn, int rm, int data)
@@ -XXX,XX +XXX,XX @@ static bool trans_FEXPA(DisasContext *s, arg_rr_esz *a)
return false;
}
if (sve_access_check(s)) {
- unsigned vsz = vec_full_reg_size(s);
- tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd),
- vec_full_reg_offset(s, a->rn),
- vsz, vsz, 0, fns[a->esz]);
+ gen_gvec_ool_zz(s, fns[a->esz], a->rd, a->rn, 0);
}
return true;
}
@@ -XXX,XX +XXX,XX @@ static bool trans_REV_v(DisasContext *s, arg_rr_esz *a)
};

if (sve_access_check(s)) {
- unsigned vsz = vec_full_reg_size(s);
- tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd),
- vec_full_reg_offset(s, a->rn),
- vsz, vsz, 0, fns[a->esz]);
+ gen_gvec_ool_zz(s, fns[a->esz], a->rd, a->rn, 0);
}
return true;
}
--
2.20.1
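The payoff of the fns[2][...] table layout above is that the little/big-endian decision moves from every helper invocation to a single table index computed once at translate time (s->be_data is already known when the TB is built). A toy sketch of the dispatch pattern follows; the names and printf bodies are stand-ins, not QEMU code.

#include <stdio.h>

typedef void helper_fn(const char *why);
static void ld1hh_le(const char *why) { printf("LE load: %s\n", why); }
static void ld1hh_be(const char *why) { printf("BE load: %s\n", why); }

/* Before: one helper re-tested endianness on every execution.
 * After: the translator indexes a [2][...] table once per translation. */
static helper_fn * const fns[2] = { ld1hh_le, ld1hh_be };

int main(void)
{
    int be_data = 1;                  /* known when the TB is translated */
    helper_fn *fn = fns[be_data];     /* picked once, baked into the TB */
    fn("selected at translation time");
    return 0;
}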
From: Richard Henderson <richard.henderson@linaro.org>

Given that the only field defined for this new register may only
be 0, we don't actually need to change anything except the name.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 3,
.access = PL1_R, .type = ARM_CP_CONST,
.resetvalue = 0 },
- { .name = "ID_AA64PFR4_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
+ { .name = "ID_AA64ZFR0_EL1", .state = ARM_CP_STATE_AA64,
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 4,
.access = PL1_R, .type = ARM_CP_CONST,
+ /* At present, only SVEver == 0 is defined anyway. */
.resetvalue = 0 },
{ .name = "ID_AA64PFR5_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 5,
--
2.19.0

From: Richard Henderson <richard.henderson@linaro.org>

Rather than require the user to fill in the immediate (shl or shr),
create full formats that include the immediate.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/sve.decode | 35 ++++++++++++++++-------------------
1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
@rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri

# Two register operand, one immediate operand, with predicate,
-# element size encoded as TSZHL. User must fill in imm.
-@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \
- &rpri_esz rn=%reg_movprfx esz=%tszimm_esz
+# element size encoded as TSZHL.
+@rdn_pg_tszimm_shl ........ .. ... ... ... pg:3 ..... rd:5 \
+ &rpri_esz rn=%reg_movprfx esz=%tszimm_esz imm=%tszimm_shl
+@rdn_pg_tszimm_shr ........ .. ... ... ... pg:3 ..... rd:5 \
+ &rpri_esz rn=%reg_movprfx esz=%tszimm_esz imm=%tszimm_shr

# Similarly without predicate.
-@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \
- &rri_esz esz=%tszimm16_esz
+@rd_rn_tszimm_shl ........ .. ... ... ...... rn:5 rd:5 \
+ &rri_esz esz=%tszimm16_esz imm=%tszimm16_shl
+@rd_rn_tszimm_shr ........ .. ... ... ...... rn:5 rd:5 \
+ &rri_esz esz=%tszimm16_esz imm=%tszimm16_shr

# Two register operand, one immediate operand, with 4-bit predicate.
# User must fill in imm.
@@ -XXX,XX +XXX,XX @@ UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn
### SVE Shift by Immediate - Predicated Group

# SVE bitwise shift by immediate (predicated)
-ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... \
- @rdn_pg_tszimm imm=%tszimm_shr
-LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... \
- @rdn_pg_tszimm imm=%tszimm_shr
-LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \
- @rdn_pg_tszimm imm=%tszimm_shl
-ASRD 00000100 .. 000 100 100 ... .. ... ..... \
- @rdn_pg_tszimm imm=%tszimm_shr
+ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... @rdn_pg_tszimm_shr
+LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rdn_pg_tszimm_shr
+LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm_shl
+ASRD 00000100 .. 000 100 100 ... .. ... ..... @rdn_pg_tszimm_shr

# SVE bitwise shift by vector (predicated)
ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm
@@ -XXX,XX +XXX,XX @@ RDVL 00000100 101 11111 01010 imm:s6 rd:5
### SVE Bitwise Shift - Unpredicated Group

# SVE bitwise shift by immediate (unpredicated)
-ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... \
- @rd_rn_tszimm imm=%tszimm16_shr
-LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... \
- @rd_rn_tszimm imm=%tszimm16_shr
-LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... \
- @rd_rn_tszimm imm=%tszimm16_shl
+ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... @rd_rn_tszimm_shr
+LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... @rd_rn_tszimm_shr
+LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... @rd_rn_tszimm_shl

# SVE bitwise shift by wide elements (unpredicated)
# Note esz != 3
--
2.20.1
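For reference, the shl/shr immediates these formats decode are not stored directly in the instruction: the 7-bit tsz:imm3 value encodes both the element size (top set bit of tsz) and the shift amount. The sketch below is modeled on the tszimm_* extractors in translate-sve.c; treat the exact helper names and formulas as assumptions recalled from that file.

#include <assert.h>

/* x is the 7-bit tsz:imm3 composite field. */
static int tszimm_esz(int x)
{
    x >>= 3;      /* discard imm3; the top set bit of tsz gives the size */
    int esz = -1;
    while (x) {
        x >>= 1;
        esz++;
    }
    return esz;   /* 0=B, 1=H, 2=S, 3=D */
}

static int tszimm_shr(int x) { return (16 << tszimm_esz(x)) - x; }
static int tszimm_shl(int x) { return x - (8 << tszimm_esz(x)); }

int main(void)
{
    /* Byte elements: tsz=0001, so x runs 8..15. */
    assert(tszimm_esz(8) == 0);
    assert(tszimm_shr(8) == 8);    /* right shifts cover 1..esize */
    assert(tszimm_shr(15) == 1);
    assert(tszimm_shl(8) == 0);    /* left shifts cover 0..esize-1 */
    assert(tszimm_shl(15) == 7);
    return 0;
}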
From: Richard Henderson <richard.henderson@linaro.org>

Unify add/sub helpers and add a parameter for rounding.
This will allow saturating non-rounding to reuse this code.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: fixed accidental use of '=' rather than '+=' in do_sqrdmlah_s]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/vec_helper.c | 80 +++++++++++++++--------------------------
1 file changed, 29 insertions(+), 51 deletions(-)

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
#endif

/* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
-static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
- int16_t src3, uint32_t *sat)
+static int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3,
+ bool neg, bool round, uint32_t *sat)
{
- /* Simplify:
+ /*
+ * Simplify:
* = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
* = ((a3 << 15) + (e1 * e2) + (1 << 14)) >> 15
*/
int32_t ret = (int32_t)src1 * src2;
- ret = ((int32_t)src3 << 15) + ret + (1 << 14);
+ if (neg) {
+ ret = -ret;
+ }
+ ret += ((int32_t)src3 << 15) + (round << 14);
ret >>= 15;
+
if (ret != (int16_t)ret) {
*sat = 1;
- ret = (ret < 0 ? -0x8000 : 0x7fff);
+ ret = (ret < 0 ? INT16_MIN : INT16_MAX);
}
return ret;
}
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
uint32_t src2, uint32_t src3)
{
uint32_t *sat = &env->vfp.qc[0];
- uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
- uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
+ uint16_t e1 = do_sqrdmlah_h(src1, src2, src3, false, true, sat);
+ uint16_t e2 = do_sqrdmlah_h(src1 >> 16, src2 >> 16, src3 >> 16,
+ false, true, sat);
return deposit32(e1, 16, 16, e2);
}

@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
uintptr_t i;

for (i = 0; i < opr_sz / 2; ++i) {
- d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
+ d[i] = do_sqrdmlah_h(n[i], m[i], d[i], false, true, vq);
}
clear_tail(d, opr_sz, simd_maxsz(desc));
}

-/* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
-static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
- int16_t src3, uint32_t *sat)
-{
- /* Similarly, using subtraction:
- * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
- * = ((a3 << 15) - (e1 * e2) + (1 << 14)) >> 15
- */
- int32_t ret = (int32_t)src1 * src2;
- ret = ((int32_t)src3 << 15) - ret + (1 << 14);
- ret >>= 15;
- if (ret != (int16_t)ret) {
- *sat = 1;
- ret = (ret < 0 ? -0x8000 : 0x7fff);
- }
- return ret;
-}
-
uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
uint32_t src2, uint32_t src3)
{
uint32_t *sat = &env->vfp.qc[0];
- uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
- uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
+ uint16_t e1 = do_sqrdmlah_h(src1, src2, src3, true, true, sat);
+ uint16_t e2 = do_sqrdmlah_h(src1 >> 16, src2 >> 16, src3 >> 16,
+ true, true, sat);
return deposit32(e1, 16, 16, e2);
}

@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
uintptr_t i;

for (i = 0; i < opr_sz / 2; ++i) {
- d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
+ d[i] = do_sqrdmlah_h(n[i], m[i], d[i], true, true, vq);
}
clear_tail(d, opr_sz, simd_maxsz(desc));
}

/* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
-static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
- int32_t src3, uint32_t *sat)
+static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3,
+ bool neg, bool round, uint32_t *sat)
{
/* Simplify similarly to int_qrdmlah_s16 above. */
int64_t ret = (int64_t)src1 * src2;
- ret = ((int64_t)src3 << 31) + ret + (1 << 30);
+ if (neg) {
+ ret = -ret;
+ }
+ ret += ((int64_t)src3 << 31) + (round << 30);
ret >>= 31;
+
if (ret != (int32_t)ret) {
*sat = 1;
ret = (ret < 0 ? INT32_MIN : INT32_MAX);
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
int32_t src2, int32_t src3)
{
uint32_t *sat = &env->vfp.qc[0];
- return inl_qrdmlah_s32(src1, src2, src3, sat);
+ return do_sqrdmlah_s(src1, src2, src3, false, true, sat);
}

void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
uintptr_t i;

for (i = 0; i < opr_sz / 4; ++i) {
- d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
+ d[i] = do_sqrdmlah_s(n[i], m[i], d[i], false, true, vq);
}
clear_tail(d, opr_sz, simd_maxsz(desc));
}

-/* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
-static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
- int32_t src3, uint32_t *sat)
-{
- /* Simplify similarly to int_qrdmlsh_s16 above. */
- int64_t ret = (int64_t)src1 * src2;
- ret = ((int64_t)src3 << 31) - ret + (1 << 30);
- ret >>= 31;
- if (ret != (int32_t)ret) {
- *sat = 1;
- ret = (ret < 0 ? INT32_MIN : INT32_MAX);
- }
- return ret;
-}
-
uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
int32_t src2, int32_t src3)
{
uint32_t *sat = &env->vfp.qc[0];
- return inl_qrdmlsh_s32(src1, src2, src3, sat);
+ return do_sqrdmlah_s(src1, src2, src3, true, true, sat);
}

void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
uintptr_t i;

for (i = 0; i < opr_sz / 4; ++i) {
- d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
+ d[i] = do_sqrdmlah_s(n[i], m[i], d[i], true, true, vq);
}
clear_tail(d, opr_sz, simd_maxsz(desc));
}

From: Richard Henderson <richard.henderson@linaro.org>

Use the same *_tlb primitives as we use for ld1.

For linux-user, this hoists the set of helper_retaddr. For softmmu,
hoists the computation of the current mmu_idx outside the loop,
fixes the endianness problem, and moves the main loop out of a
macro and into an inlined function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/sve_helper.c | 210 ++++++++++++++++++++++------------------
1 file changed, 117 insertions(+), 93 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_LD1_2(ld1dd, 3, 3)
#undef DO_LD1_1
#undef DO_LD1_2

-#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- intptr_t i, oprsz = simd_oprsz(desc); \
- intptr_t ra = GETPC(); \
- unsigned rd = simd_data(desc); \
- void *d1 = &env->vfp.zregs[rd]; \
- void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
- for (i = 0; i < oprsz; ) { \
- uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
- do { \
- TYPEM m1 = 0, m2 = 0; \
- if (pg & 1) { \
- m1 = FN(env, addr, ra); \
- m2 = FN(env, addr + sizeof(TYPEM), ra); \
- } \
- *(TYPEE *)(d1 + H(i)) = m1; \
- *(TYPEE *)(d2 + H(i)) = m2; \
- i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
- addr += 2 * sizeof(TYPEM); \
- } while (i & 15); \
- } \
+/*
+ * Common helpers for all contiguous 2,3,4-register predicated loads.
+ */
+static void sve_ld2_r(CPUARMState *env, void *vg, target_ulong addr,
+ uint32_t desc, int size, uintptr_t ra,
+ sve_ld1_tlb_fn *tlb_fn)
+{
+ const int mmu_idx = cpu_mmu_index(env, false);
+ intptr_t i, oprsz = simd_oprsz(desc);
+ unsigned rd = simd_data(desc);
+ ARMVectorReg scratch[2] = { };
+
+ set_helper_retaddr(ra);
+ for (i = 0; i < oprsz; ) {
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+ do {
+ if (pg & 1) {
+ tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
+ tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
+ }
+ i += size, pg >>= size;
+ addr += 2 * size;
+ } while (i & 15);
+ }
+ set_helper_retaddr(0);
+
+ /* Wait until all exceptions have been raised to write back. */
+ memcpy(&env->vfp.zregs[rd], &scratch[0], oprsz);
+ memcpy(&env->vfp.zregs[(rd + 1) & 31], &scratch[1], oprsz);
}

-#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- intptr_t i, oprsz = simd_oprsz(desc); \
- intptr_t ra = GETPC(); \
- unsigned rd = simd_data(desc); \
- void *d1 = &env->vfp.zregs[rd]; \
- void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
- void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
- for (i = 0; i < oprsz; ) { \
- uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
- do { \
- TYPEM m1 = 0, m2 = 0, m3 = 0; \
- if (pg & 1) { \
- m1 = FN(env, addr, ra); \
- m2 = FN(env, addr + sizeof(TYPEM), ra); \
- m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
- } \
- *(TYPEE *)(d1 + H(i)) = m1; \
- *(TYPEE *)(d2 + H(i)) = m2; \
- *(TYPEE *)(d3 + H(i)) = m3; \
- i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
- addr += 3 * sizeof(TYPEM); \
- } while (i & 15); \
- } \
+static void sve_ld3_r(CPUARMState *env, void *vg, target_ulong addr,
+ uint32_t desc, int size, uintptr_t ra,
+ sve_ld1_tlb_fn *tlb_fn)
+{
+ const int mmu_idx = cpu_mmu_index(env, false);
+ intptr_t i, oprsz = simd_oprsz(desc);
+ unsigned rd = simd_data(desc);
+ ARMVectorReg scratch[3] = { };
+
+ set_helper_retaddr(ra);
+ for (i = 0; i < oprsz; ) {
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+ do {
+ if (pg & 1) {
+ tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
+ tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
+ tlb_fn(env, &scratch[2], i, addr + 2 * size, mmu_idx, ra);
+ }
+ i += size, pg >>= size;
+ addr += 3 * size;
+ } while (i & 15);
+ }
+ set_helper_retaddr(0);
+
+ /* Wait until all exceptions have been raised to write back. */
+ memcpy(&env->vfp.zregs[rd], &scratch[0], oprsz);
+ memcpy(&env->vfp.zregs[(rd + 1) & 31], &scratch[1], oprsz);
+ memcpy(&env->vfp.zregs[(rd + 2) & 31], &scratch[2], oprsz);
}

-#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \
-void HELPER(NAME)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- intptr_t i, oprsz = simd_oprsz(desc); \
- intptr_t ra = GETPC(); \
- unsigned rd = simd_data(desc); \
- void *d1 = &env->vfp.zregs[rd]; \
- void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
- void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
- void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \
- for (i = 0; i < oprsz; ) { \
- uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
- do { \
- TYPEM m1 = 0, m2 = 0, m3 = 0, m4 = 0; \
- if (pg & 1) { \
- m1 = FN(env, addr, ra); \
- m2 = FN(env, addr + sizeof(TYPEM), ra); \
- m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
- m4 = FN(env, addr + 3 * sizeof(TYPEM), ra); \
- } \
- *(TYPEE *)(d1 + H(i)) = m1; \
- *(TYPEE *)(d2 + H(i)) = m2; \
- *(TYPEE *)(d3 + H(i)) = m3; \
- *(TYPEE *)(d4 + H(i)) = m4; \
- i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
- addr += 4 * sizeof(TYPEM); \
- } while (i & 15); \
- } \
+static void sve_ld4_r(CPUARMState *env, void *vg, target_ulong addr,
+ uint32_t desc, int size, uintptr_t ra,
+ sve_ld1_tlb_fn *tlb_fn)
+{
+ const int mmu_idx = cpu_mmu_index(env, false);
+ intptr_t i, oprsz = simd_oprsz(desc);
+ unsigned rd = simd_data(desc);
+ ARMVectorReg scratch[4] = { };
+
+ set_helper_retaddr(ra);
+ for (i = 0; i < oprsz; ) {
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+ do {
+ if (pg & 1) {
+ tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
+ tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
+ tlb_fn(env, &scratch[2], i, addr + 2 * size, mmu_idx, ra);
+ tlb_fn(env, &scratch[3], i, addr + 3 * size, mmu_idx, ra);
+ }
+ i += size, pg >>= size;
+ addr += 4 * size;
+ } while (i & 15);
+ }
+ set_helper_retaddr(0);
+
+ /* Wait until all exceptions have been raised to write back. */
+ memcpy(&env->vfp.zregs[rd], &scratch[0], oprsz);
+ memcpy(&env->vfp.zregs[(rd + 1) & 31], &scratch[1], oprsz);
+ memcpy(&env->vfp.zregs[(rd + 2) & 31], &scratch[2], oprsz);
+ memcpy(&env->vfp.zregs[(rd + 3) & 31], &scratch[3], oprsz);
}

-DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
-DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
-DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
+#define DO_LDN_1(N) \
+void __attribute__((flatten)) HELPER(sve_ld##N##bb_r) \
+ (CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
+{ \
+ sve_ld##N##_r(env, vg, addr, desc, 1, GETPC(), sve_ld1bb_tlb); \
+}

-DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
-DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
-DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
+#define DO_LDN_2(N, SUFF, SIZE) \
+void __attribute__((flatten)) HELPER(sve_ld##N##SUFF##_r) \
+ (CPUARMState *env, void *vg, target_ulong addr, uint32_t desc) \
+{ \
+ sve_ld##N##_r(env, vg, addr, desc, SIZE, GETPC(), \
+ arm_cpu_data_is_big_endian(env) \
+ ? sve_ld1##SUFF##_be_tlb : sve_ld1##SUFF##_le_tlb); \
+}

-DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
-DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
-DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
+DO_LDN_1(2)
+DO_LDN_1(3)
+DO_LDN_1(4)

-DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
-DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
-DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+DO_LDN_2(2, hh, 2)
+DO_LDN_2(3, hh, 2)
+DO_LDN_2(4, hh, 2)

-#undef DO_LD2
-#undef DO_LD3
-#undef DO_LD4
+DO_LDN_2(2, ss, 4)
+DO_LDN_2(3, ss, 4)
+DO_LDN_2(4, ss, 4)
+
+DO_LDN_2(2, dd, 8)
+DO_LDN_2(3, dd, 8)
+DO_LDN_2(4, dd, 8)
+
+#undef DO_LDN_1
246
+#undef DO_LDN_2
247
248
/*
249
* Load contiguous data, first-fault and no-fault.
250
--
181
--
251
2.19.0
182
2.20.1
252
183
253
184
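The saturating arithmetic being consolidated into do_sqrdmlah_s() above can
be checked in isolation. What follows is a minimal standalone sketch, not
QEMU code, mirroring the 32-bit SQRDMLSH step that the removed
inl_qrdmlsh_s32() implemented: compute (src3 << 31) - src1*src2 plus a
rounding constant, take the high half, and saturate while latching a
QC-style flag:

    /* qrdmlsh_demo.c -- build with: cc -o qrdmlsh_demo qrdmlsh_demo.c */
    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Round, take the high half, saturate to int32; *sat records overflow. */
    static int32_t qrdmlsh_s32(int32_t src1, int32_t src2, int32_t src3,
                               uint32_t *sat)
    {
        int64_t ret = (int64_t)src1 * src2;
        ret = ((int64_t)src3 << 31) - ret + (1 << 30);  /* round to nearest */
        ret >>= 31;                                     /* high half */
        if (ret != (int32_t)ret) {                      /* out of range? */
            *sat = 1;
            ret = (ret < 0 ? INT32_MIN : INT32_MAX);
        }
        return (int32_t)ret;
    }

    int main(void)
    {
        uint32_t qc = 0;
        /* In-range case: result is representable, qc stays clear. */
        printf("%" PRId32 " (qc=%u)\n",
               qrdmlsh_s32(3 << 20, 5 << 20, 100, &qc), qc);
        /* Overflowing case: result clamps to INT32_MAX and qc is set. */
        printf("%" PRId32 " (qc=%u)\n",
               qrdmlsh_s32(INT32_MIN, INT32_MAX, INT32_MAX, &qc), qc);
        return 0;
    }

The second call prints 2147483647 (qc=1), which is the sticky-saturation
behaviour the vector helpers above rely on via env->vfp.qc.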
From: Richard Henderson <richard.henderson@linaro.org>

Use the existing helpers to determine if (1) the fpu is enabled,
(2) sve state is enabled, and (3) the current sve vector length.

Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181005175350.30752-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           | 4 ++++
 target/arm/helper.c        | 6 +++---
 target/arm/translate-a64.c | 8 ++++++--
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ target_ulong do_arm_semihosting(CPUARMState *env);
 void aarch64_sync_32_to_64(CPUARMState *env);
 void aarch64_sync_64_to_32(CPUARMState *env);
 
+int fp_exception_el(CPUARMState *env, int cur_el);
+int sve_exception_el(CPUARMState *env, int cur_el);
+uint32_t sve_zcr_len_for_el(CPUARMState *env, int el);
+
 static inline bool is_a64(CPUARMState *env)
 {
     return env->aarch64;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
  * take care of raising that exception.
  * C.f. the ARM pseudocode function CheckSVEEnabled.
  */
-static int sve_exception_el(CPUARMState *env, int el)
+int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
     if (el <= 1) {
@@ -XXX,XX +XXX,XX @@ static int sve_exception_el(CPUARMState *env, int el)
 /*
  * Given that SVE is enabled, return the vector length for EL.
  */
-static uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
+uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
 {
     ARMCPU *cpu = arm_env_get_cpu(env);
     uint32_t zcr_len = cpu->sve_max_vq - 1;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
 /* Return the exception level to which FP-disabled exceptions should
  * be taken, or 0 if FP is enabled.
  */
-static int fp_exception_el(CPUARMState *env, int cur_el)
+int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
     int fpen;
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
         cpu_fprintf(f, "\n");
         return;
     }
+    if (fp_exception_el(env, el) != 0) {
+        cpu_fprintf(f, " FPU disabled\n");
+        return;
+    }
     cpu_fprintf(f, " FPCR=%08x FPSR=%08x\n",
                 vfp_get_fpcr(env), vfp_get_fpsr(env));
 
-    if (arm_feature(env, ARM_FEATURE_SVE)) {
-        int j, zcr_len = env->vfp.zcr_el[1] & 0xf; /* fix for system mode */
+    if (arm_feature(env, ARM_FEATURE_SVE) && sve_exception_el(env, el) == 0) {
+        int j, zcr_len = sve_zcr_len_for_el(env, el);
 
         for (i = 0; i <= FFR_PRED_NUM; i++) {
             bool eol;
--
2.19.0


From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  4 ++++
 target/arm/translate-a64.c | 16 ++++++++++++++++
 target/arm/vec_helper.c    | 29 +++++++++++++++++++++++++----
 3 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_mul_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_mul_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_mul_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                              data, gen_helper_gvec_fmlal_idx_a64);
         }
         return;
+
+    case 0x08: /* MUL */
+        if (!is_long && !is_scalar) {
+            static gen_helper_gvec_3 * const fns[3] = {
+                gen_helper_gvec_mul_idx_h,
+                gen_helper_gvec_mul_idx_s,
+                gen_helper_gvec_mul_idx_d,
+            };
+            tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
+                               vec_full_reg_offset(s, rn),
+                               vec_full_reg_offset(s, rm),
+                               is_q ? 16 : 8, vec_full_reg_size(s),
+                               index, fns[size - 1]);
+            return;
+        }
+        break;
     }
 
     if (size == 3) {
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)
  */
 
 #define DO_MUL_IDX(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+    intptr_t idx = simd_data(desc); \
+    TYPE *d = vd, *n = vn, *m = vm; \
+    for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
+        TYPE mm = m[H(i + idx)]; \
+        for (j = 0; j < segment; j++) { \
+            d[i + j] = n[i + j] * mm; \
+        } \
+    } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
+}
+
+DO_MUL_IDX(gvec_mul_idx_h, uint16_t, H2)
+DO_MUL_IDX(gvec_mul_idx_s, uint32_t, H4)
+DO_MUL_IDX(gvec_mul_idx_d, uint64_t, )
+
+#undef DO_MUL_IDX
+
+#define DO_FMUL_IDX(NAME, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
 { \
     intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
     clear_tail(d, oprsz, simd_maxsz(desc)); \
 }
 
-DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
-DO_MUL_IDX(gvec_fmul_idx_s, float32, H4)
-DO_MUL_IDX(gvec_fmul_idx_d, float64, )
+DO_FMUL_IDX(gvec_fmul_idx_h, float16, H2)
+DO_FMUL_IDX(gvec_fmul_idx_s, float32, H4)
+DO_FMUL_IDX(gvec_fmul_idx_d, float64, )
 
-#undef DO_MUL_IDX
+#undef DO_FMUL_IDX
 
 #define DO_FMLA_IDX(NAME, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
--
2.20.1


Add some comments to the Thumb decoder indicating what bits
of the instruction have been decoded at various points in
the code.

This is not an exhaustive set of comments; we're gradually
adding comments as we work with particular bits of the code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-6-peter.maydell@linaro.org
---
 target/arm/translate.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             tmp2 = load_reg(s, rm);
             if ((insn & 0x70) != 0)
                 goto illegal_op;
+            /*
+             * 0b1111_1010_0xxx_xxxx_1111_xxxx_0000_xxxx:
+             *  - MOV, MOVS (register-shifted register), flagsetting
+             */
             op = (insn >> 21) & 3;
             logic_cc = (insn & (1 << 20)) != 0;
             gen_arm_shift_reg(tmp, op, tmp2, logic_cc);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         rd = insn & 7;
         op = (insn >> 11) & 3;
         if (op == 3) {
-            /* add/subtract */
+            /*
+             * 0b0001_1xxx_xxxx_xxxx
+             *  - Add, subtract (three low registers)
+             *  - Add, subtract (two low registers and immediate)
+             */
             rn = (insn >> 3) & 7;
             tmp = load_reg(s, rn);
             if (insn & (1 << 10)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         }
         break;
     case 2: case 3:
-        /* arithmetic large immediate */
+        /*
+         * 0b001x_xxxx_xxxx_xxxx
+         *  - Add, subtract, compare, move (one low register and immediate)
+         */
         op = (insn >> 11) & 3;
         rd = (insn >> 8) & 0x7;
         if (op == 0) { /* mov */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             break;
         }
 
-        /* data processing register */
+        /*
+         * 0b0100_00xx_xxxx_xxxx
+         *  - Data-processing (two low registers)
+         */
         rd = insn & 7;
         rm = (insn >> 3) & 7;
         op = (insn >> 6) & 0xf;
--
2.19.0


From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        | 14 ++++++++++++++
 target/arm/translate-a64.c | 34 ++++++++++++++++++++++++++++++++++
 target/arm/vec_helper.c    | 25 +++++++++++++++++++++++++
 3 files changed, 73 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_mul_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_mul_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_mul_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_mla_idx_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_mla_idx_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_mla_idx_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_mls_idx_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_mls_idx_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_mls_idx_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
             return;
         }
         break;
+
+    case 0x10: /* MLA */
+        if (!is_long && !is_scalar) {
+            static gen_helper_gvec_4 * const fns[3] = {
+                gen_helper_gvec_mla_idx_h,
+                gen_helper_gvec_mla_idx_s,
+                gen_helper_gvec_mla_idx_d,
+            };
+            tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
+                               vec_full_reg_offset(s, rn),
+                               vec_full_reg_offset(s, rm),
+                               vec_full_reg_offset(s, rd),
+                               is_q ? 16 : 8, vec_full_reg_size(s),
+                               index, fns[size - 1]);
+            return;
+        }
+        break;
+
+    case 0x14: /* MLS */
+        if (!is_long && !is_scalar) {
+            static gen_helper_gvec_4 * const fns[3] = {
+                gen_helper_gvec_mls_idx_h,
+                gen_helper_gvec_mls_idx_s,
+                gen_helper_gvec_mls_idx_d,
+            };
+            tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
+                               vec_full_reg_offset(s, rn),
+                               vec_full_reg_offset(s, rm),
+                               vec_full_reg_offset(s, rd),
+                               is_q ? 16 : 8, vec_full_reg_size(s),
+                               index, fns[size - 1]);
+            return;
+        }
+        break;
     }
 
     if (size == 3) {
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_MUL_IDX(gvec_mul_idx_d, uint64_t, )
 
 #undef DO_MUL_IDX
 
+#define DO_MLA_IDX(NAME, TYPE, OP, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \
+{ \
+    intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+    intptr_t idx = simd_data(desc); \
+    TYPE *d = vd, *n = vn, *m = vm, *a = va; \
+    for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
+        TYPE mm = m[H(i + idx)]; \
+        for (j = 0; j < segment; j++) { \
+            d[i + j] = a[i + j] OP n[i + j] * mm; \
+        } \
+    } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
+}
+
+DO_MLA_IDX(gvec_mla_idx_h, uint16_t, +, H2)
+DO_MLA_IDX(gvec_mla_idx_s, uint32_t, +, H4)
+DO_MLA_IDX(gvec_mla_idx_d, uint64_t, +, )
+
+DO_MLA_IDX(gvec_mls_idx_h, uint16_t, -, H2)
+DO_MLA_IDX(gvec_mls_idx_s, uint32_t, -, H4)
+DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, )
+
+#undef DO_MLA_IDX
+
 #define DO_FMUL_IDX(NAME, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
 { \
--
2.20.1


Add code to insert calls to a helper function to do the stack
limit checking when we handle these forms of instruction
that write to SP:
 * ADD (SP plus immediate)
 * ADD (SP plus register)
 * SUB (SP minus immediate)
 * SUB (SP minus register)
 * MOV (register)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002163556.10279-5-peter.maydell@linaro.org
---
 target/arm/helper.h    |  2 ++
 target/arm/internals.h | 14 ++++++++
 target/arm/op_helper.c | 19 ++++++++++
 target/arm/translate.c | 80 +++++++++++++++++++++++++++++++++++++-----
 4 files changed, 106 insertions(+), 9 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(v7m_blxns, void, env, i32)
 
 DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 
+DEF_HELPER_2(v8m_stackcheck, void, env, i32)
+
 DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
 DEF_HELPER_3(set_cp_reg, void, env, ptr, i32)
 DEF_HELPER_2(get_cp_reg, i32, env, ptr)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline bool v7m_using_psp(CPUARMState *env)
         env->v7m.control[env->v7m.secure] & R_V7M_CONTROL_SPSEL_MASK;
 }
 
+/**
+ * v7m_sp_limit: Return SP limit for current CPU state
+ * Return the SP limit value for the current CPU security state
+ * and stack pointer.
+ */
+static inline uint32_t v7m_sp_limit(CPUARMState *env)
+{
+    if (v7m_using_psp(env)) {
+        return env->v7m.psplim[env->v7m.secure];
+    } else {
+        return env->v7m.msplim[env->v7m.secure];
+    }
+}
+
 #endif
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
 
 #endif /* !defined(CONFIG_USER_ONLY) */
 
+void HELPER(v8m_stackcheck)(CPUARMState *env, uint32_t newvalue)
+{
+    /*
+     * Perform the v8M stack limit check for SP updates from translated code,
+     * raising an exception if the limit is breached.
+     */
+    if (newvalue < v7m_sp_limit(env)) {
+        CPUState *cs = CPU(arm_env_get_cpu(env));
+
+        /*
+         * Stack limit exceptions are a rare case, so rather than syncing
+         * PC/condbits before the call, we use cpu_restore_state() to
+         * get them right before raising the exception.
+         */
+        cpu_restore_state(cs, GETPC(), true);
+        raise_exception(env, EXCP_STKOF, 0, 1);
+    }
+}
+
 uint32_t HELPER(add_setq)(CPUARMState *env, uint32_t a, uint32_t b)
 {
     uint32_t res = a + b;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void store_reg(DisasContext *s, int reg, TCGv_i32 var)
     tcg_temp_free_i32(var);
 }
 
+/*
+ * Variant of store_reg which applies v8M stack-limit checks before updating
+ * SP. If the check fails this will result in an exception being taken.
+ * We disable the stack checks for CONFIG_USER_ONLY because we have
+ * no idea what the stack limits should be in that case.
+ * If stack checking is not being done this just acts like store_reg().
+ */
+static void store_sp_checked(DisasContext *s, TCGv_i32 var)
+{
+#ifndef CONFIG_USER_ONLY
+    if (s->v8m_stackcheck) {
+        gen_helper_v8m_stackcheck(cpu_env, var);
+    }
+#endif
+    store_reg(s, 13, var);
+}
+
 /* Value extensions.  */
 #define gen_uxtb(var) tcg_gen_ext8u_i32(var, var)
 #define gen_uxth(var) tcg_gen_ext16u_i32(var, var)
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 if (gen_thumb2_data_op(s, op, conds, 0, tmp, tmp2))
                     goto illegal_op;
                 tcg_temp_free_i32(tmp2);
-                if (rd != 15) {
+                if (rd == 13 &&
+                    ((op == 2 && rn == 15) ||
+                     (op == 8 && rn == 13) ||
+                     (op == 13 && rn == 13))) {
+                    /* MOV SP, ... or ADD SP, SP, ... or SUB SP, SP, ... */
+                    store_sp_checked(s, tmp);
+                } else if (rd != 15) {
                     store_reg(s, rd, tmp);
                 } else {
                     tcg_temp_free_i32(tmp);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 gen_jmp(s, s->pc + offset);
             }
         } else {
-            /* Data processing immediate.  */
+            /*
+             * 0b1111_0xxx_xxxx_0xxx_xxxx_xxxx
+             *  - Data-processing (modified immediate, plain binary immediate)
+             */
             if (insn & (1 << 25)) {
+                /*
+                 * 0b1111_0x1x_xxxx_0xxx_xxxx_xxxx
+                 *  - Data-processing (plain binary immediate)
+                 */
                 if (insn & (1 << 24)) {
                     if (insn & (1 << 20))
                         goto illegal_op;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                         tmp = tcg_temp_new_i32();
                         tcg_gen_movi_i32(tmp, imm);
                     }
+                    store_reg(s, rd, tmp);
                 } else {
                     /* Add/sub 12-bit immediate.  */
                     if (rn == 15) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                         offset += imm;
                         tmp = tcg_temp_new_i32();
                         tcg_gen_movi_i32(tmp, offset);
+                        store_reg(s, rd, tmp);
                     } else {
                         tmp = load_reg(s, rn);
                         if (insn & (1 << 23))
                             tcg_gen_subi_i32(tmp, tmp, imm);
                         else
                             tcg_gen_addi_i32(tmp, tmp, imm);
+                        if (rn == 13 && rd == 13) {
+                            /* ADD SP, SP, imm or SUB SP, SP, imm */
+                            store_sp_checked(s, tmp);
+                        } else {
+                            store_reg(s, rd, tmp);
+                        }
                     }
                 }
-                store_reg(s, rd, tmp);
             }
         } else {
+            /*
+             * 0b1111_0x0x_xxxx_0xxx_xxxx_xxxx
+             *  - Data-processing (modified immediate)
+             */
             int shifter_out = 0;
             /* modified 12-bit immediate.  */
             shift = ((insn & 0x04000000) >> 23) | ((insn & 0x7000) >> 12);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 goto illegal_op;
             tcg_temp_free_i32(tmp2);
             rd = (insn >> 8) & 0xf;
-            if (rd != 15) {
+            if (rd == 13 && rn == 13
+                && (op == 8 || op == 13)) {
+                /* ADD(S) SP, SP, imm or SUB(S) SP, SP, imm */
+                store_sp_checked(s, tmp);
+            } else if (rd != 15) {
                 store_reg(s, rd, tmp);
             } else {
                 tcg_temp_free_i32(tmp);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             tmp2 = load_reg(s, rm);
             tcg_gen_add_i32(tmp, tmp, tmp2);
             tcg_temp_free_i32(tmp2);
-            store_reg(s, rd, tmp);
+            if (rd == 13) {
+                /* ADD SP, SP, reg */
+                store_sp_checked(s, tmp);
+            } else {
+                store_reg(s, rd, tmp);
+            }
             break;
         case 1: /* cmp */
             tmp = load_reg(s, rd);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             break;
         case 2: /* mov/cpy */
             tmp = load_reg(s, rm);
-            store_reg(s, rd, tmp);
+            if (rd == 13) {
+                /* MOV SP, reg */
+                store_sp_checked(s, tmp);
+            } else {
+                store_reg(s, rd, tmp);
+            }
             break;
         case 3:
         {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         break;
 
     case 10:
-        /* add to high reg */
+        /*
+         * 0b1010_xxxx_xxxx_xxxx
+         *  - Add PC/SP (immediate)
+         */
         rd = (insn >> 8) & 7;
         if (insn & (1 << 11)) {
             /* SP */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         op = (insn >> 8) & 0xf;
         switch (op) {
         case 0:
-            /* adjust stack pointer */
+            /*
+             * 0b1011_0000_xxxx_xxxx
+             *  - ADD (SP plus immediate)
+             *  - SUB (SP minus immediate)
+             */
             tmp = load_reg(s, 13);
             val = (insn & 0x7f) * 4;
             if (insn & (1 << 7))
                 val = -(int32_t)val;
             tcg_gen_addi_i32(tmp, tmp, val);
-            store_reg(s, 13, tmp);
+            store_sp_checked(s, tmp);
             break;
 
         case 2: /* sign/zero extend.  */
--
2.19.0


From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        | 10 ++++++++
 target/arm/translate-a64.c | 33 ++++++++++++++++++--------
 target/arm/vec_helper.c    | 48 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_mls_idx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_mls_idx_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(neon_sqdmulh_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(neon_sqdmulh_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_fpst(DisasContext *s, bool is_q, int rd, int rn,
     tcg_temp_free_ptr(fpst);
 }
 
+/* Expand a 3-operand + qc + operation using an out-of-line helper.  */
+static void gen_gvec_op3_qc(DisasContext *s, bool is_q, int rd, int rn,
+                            int rm, gen_helper_gvec_3_ptr *fn)
+{
+    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
+    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
+                       vec_full_reg_offset(s, rn),
+                       vec_full_reg_offset(s, rm), qc_ptr,
+                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
+    tcg_temp_free_ptr(qc_ptr);
+}
+
 /* Set ZF and NF based on a 64 bit result. This is alas fiddlier
  * than the 32 bit equivalent.
  */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
         }
         return;
+    case 0x16: /* SQDMULH, SQRDMULH */
+        {
+            static gen_helper_gvec_3_ptr * const fns[2][2] = {
+                { gen_helper_neon_sqdmulh_h, gen_helper_neon_sqrdmulh_h },
+                { gen_helper_neon_sqdmulh_s, gen_helper_neon_sqrdmulh_s },
+            };
+            gen_gvec_op3_qc(s, is_q, rd, rn, rm, fns[size - 1][u]);
+        }
+        return;
     case 0x11:
         if (!u) { /* CMTST */
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             genenvfn = fns[size][u];
             break;
         }
-    case 0x16: /* SQDMULH, SQRDMULH */
-        {
-            static NeonGenTwoOpEnvFn * const fns[2][2] = {
-                { gen_helper_neon_qdmulh_s16, gen_helper_neon_qrdmulh_s16 },
-                { gen_helper_neon_qdmulh_s32, gen_helper_neon_qrdmulh_s32 },
-            };
-            assert(size == 1 || size == 2);
-            genenvfn = fns[size - 1][u];
-            break;
-        }
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
+void HELPER(neon_sqdmulh_h)(void *vd, void *vn, void *vm,
+                            void *vq, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        d[i] = do_sqrdmlah_h(n[i], m[i], 0, false, false, vq);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(neon_sqrdmulh_h)(void *vd, void *vn, void *vm,
+                             void *vq, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        d[i] = do_sqrdmlah_h(n[i], m[i], 0, false, true, vq);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
 /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
 static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3,
                              bool neg, bool round, uint32_t *sat)
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
+void HELPER(neon_sqdmulh_s)(void *vd, void *vn, void *vm,
+                            void *vq, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int32_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = do_sqrdmlah_s(n[i], m[i], 0, false, false, vq);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(neon_sqrdmulh_s)(void *vd, void *vn, void *vm,
+                             void *vq, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int32_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = do_sqrdmlah_s(n[i], m[i], 0, false, true, vq);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
 /* Integer 8 and 16-bit dot-product.
  *
  * Note that for the loops herein, host endianness does not matter
--
2.20.1
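The v8M stack-limit rule that the patch above implements (a write to SP
traps when the new value would fall below the active MSPLIM or PSPLIM
register) can be modelled in a few lines. This is a simplified standalone
sketch with illustrative names, not QEMU's actual structures or helpers:

    /* stackcheck_demo.c -- build with: cc -o stackcheck_demo stackcheck_demo.c */
    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        uint32_t sp;        /* current stack pointer */
        uint32_t msplim;    /* main stack pointer limit */
        uint32_t psplim;    /* process stack pointer limit */
        bool use_psp;       /* which stack is active */
    } V8MStackModel;

    static uint32_t sp_limit(const V8MStackModel *s)
    {
        return s->use_psp ? s->psplim : s->msplim;
    }

    /* Check the limit before committing the SP update, as the helper does. */
    static bool checked_sp_write(V8MStackModel *s, uint32_t newvalue)
    {
        if (newvalue < sp_limit(s)) {
            printf("STKOF: SP 0x%08x would cross limit 0x%08x\n",
                   newvalue, sp_limit(s));
            return false;       /* QEMU raises EXCP_STKOF here instead */
        }
        s->sp = newvalue;
        return true;
    }

    int main(void)
    {
        V8MStackModel s = { .sp = 0x20004000, .msplim = 0x20001000 };
        checked_sp_write(&s, s.sp - 0x100);  /* ok: SUB SP, SP, #0x100 */
        checked_sp_write(&s, 0x20000ff0);    /* faults: below MSPLIM */
        return 0;
    }

Note that, as in the patch, only the new value is compared against the
limit; the check happens before the register is updated, so a faulting
instruction leaves SP untouched.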