1
Another target-arm queue, since we're over 30 patches
1
target-arm queue for 3.0:
2
already. Most of this is RTH's SVE-patches-part-1.
2
3
Thomas' fixes for introspection issues with a handful of
4
devices (including one microblaze one that I include in this
5
pullreq for convenience's sake), plus my bugfix for a
6
corner case of small MPU region support.
3
7
4
thanks
8
thanks
5
-- PMM
9
-- PMM
6
10
11
The following changes since commit 55b1f14cefcb19ce6d5e28c4c83404230888aa7e:
7
12
8
The following changes since commit d32e41a1188e929cc0fb16829ce3736046951e39:
13
Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.0-pull-request' into staging (2018-07-23 14:03:14 +0100)
9
10
Merge remote-tracking branch 'remotes/famz/tags/docker-and-block-pull-request' into staging (2018-05-18 14:11:52 +0100)
11
14
12
are available in the Git repository at:
15
are available in the Git repository at:
13
16
14
git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180518
17
git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180723
15
18
16
for you to fetch changes up to b94f8f60bd841c5b737185cd38263e26822f77ab:
19
for you to fetch changes up to 1ddc9b98c3cb89fe23a55ba924000fd645253e87:
17
20
18
target/arm: Implement SVE Permute - Extract Group (2018-05-18 17:48:09 +0100)
21
hw/intc/exynos4210_gic: Turn instance_init into realize function (2018-07-23 15:21:27 +0100)
19
22
20
----------------------------------------------------------------
23
----------------------------------------------------------------
21
target-arm queue:
24
target-arm queue:
22
* Initial part of SVE implementation (currently disabled)
25
* spitz, exynos: fix bugs when introspecting some devices
23
* smmuv3: fix some minor Coverity issues
26
* hw/microblaze/xlnx-zynqmp-pmu: Fix introspection problem in 'xlnx, zynqmp-pmu-soc'
24
* add model of Xilinx ZynqMP generic DMA controller
27
* target/arm: Correctly handle overlapping small MPU regions
25
* expose (most) Arm coprocessor/system registers to
28
* hw/sd/bcm2835_sdhost: Fix PIO mode writes
26
gdb via QEMU's gdbstub, for reads only
27
29
28
----------------------------------------------------------------
30
----------------------------------------------------------------
29
Abdallah Bouassida (3):
31
Guenter Roeck (1):
30
target/arm: Add "ARM_CP_NO_GDB" as a new bit field for ARMCPRegInfo type
32
hw/sd/bcm2835_sdhost: Fix PIO mode writes
31
target/arm: Add "_S" suffix to the secure version of a sysreg
32
target/arm: Add the XML dynamic generation
33
33
34
Eric Auger (2):
34
Peter Maydell (1):
35
hw/arm/smmuv3: Fix Coverity issue in smmuv3_record_event
35
target/arm: Correctly handle overlapping small MPU regions
36
hw/arm/smmu-common: Fix coverity issue in get_block_pte_address
37
36
38
Francisco Iglesias (2):
37
Thomas Huth (3):
39
xlnx-zdma: Add a model of the Xilinx ZynqMP generic DMA
38
hw/microblaze/xlnx-zynqmp-pmu: Fix introspection problem in 'xlnx, zynqmp-pmu-soc'
40
xlnx-zynqmp: Connect the ZynqMP GDMA and ADMA
39
hw/arm/spitz: Move problematic nand_init() code to realize function
40
hw/intc/exynos4210_gic: Turn instance_init into realize function
41
41
42
Richard Henderson (25):
42
hw/arm/spitz.c | 15 ++++++++++----
43
target/arm: Introduce translate-a64.h
43
hw/intc/exynos4210_gic.c | 6 +++---
44
target/arm: Add SVE decode skeleton
44
hw/microblaze/xlnx-zynqmp-pmu.c | 10 ++++-----
45
target/arm: Implement SVE Bitwise Logical - Unpredicated Group
45
hw/sd/bcm2835_sdhost.c | 20 ++++++++++++++----
46
target/arm: Implement SVE load vector/predicate
46
target/arm/helper.c | 46 +++++++++++++++++++++++++++++++++++++++++
47
target/arm: Implement SVE predicate test
47
5 files changed, 80 insertions(+), 17 deletions(-)
48
target/arm: Implement SVE Predicate Logical Operations Group
49
target/arm: Implement SVE Predicate Misc Group
50
target/arm: Implement SVE Integer Binary Arithmetic - Predicated Group
51
target/arm: Implement SVE Integer Reduction Group
52
target/arm: Implement SVE bitwise shift by immediate (predicated)
53
target/arm: Implement SVE bitwise shift by vector (predicated)
54
target/arm: Implement SVE bitwise shift by wide elements (predicated)
55
target/arm: Implement SVE Integer Arithmetic - Unary Predicated Group
56
target/arm: Implement SVE Integer Multiply-Add Group
57
target/arm: Implement SVE Integer Arithmetic - Unpredicated Group
58
target/arm: Implement SVE Index Generation Group
59
target/arm: Implement SVE Stack Allocation Group
60
target/arm: Implement SVE Bitwise Shift - Unpredicated Group
61
target/arm: Implement SVE Compute Vector Address Group
62
target/arm: Implement SVE floating-point exponential accelerator
63
target/arm: Implement SVE floating-point trig select coefficient
64
target/arm: Implement SVE Element Count Group
65
target/arm: Implement SVE Bitwise Immediate Group
66
target/arm: Implement SVE Integer Wide Immediate - Predicated Group
67
target/arm: Implement SVE Permute - Extract Group
68
48
69
hw/dma/Makefile.objs | 1 +
70
target/arm/Makefile.objs | 10 +
71
include/hw/arm/xlnx-zynqmp.h | 5 +
72
include/hw/dma/xlnx-zdma.h | 84 ++
73
include/qom/cpu.h | 5 +-
74
target/arm/cpu.h | 37 +-
75
target/arm/helper-sve.h | 427 +++++++++
76
target/arm/helper.h | 1 +
77
target/arm/translate-a64.h | 118 +++
78
gdbstub.c | 10 +
79
hw/arm/smmu-common.c | 4 +-
80
hw/arm/smmuv3.c | 2 +-
81
hw/arm/xlnx-zynqmp.c | 53 ++
82
hw/dma/xlnx-zdma.c | 832 +++++++++++++++++
83
target/arm/cpu.c | 1 +
84
target/arm/gdbstub.c | 76 ++
85
target/arm/helper.c | 57 +-
86
target/arm/sve_helper.c | 1562 +++++++++++++++++++++++++++++++
87
target/arm/translate-a64.c | 119 +--
88
target/arm/translate-sve.c | 2070 ++++++++++++++++++++++++++++++++++++++++++
89
.gitignore | 1 +
90
target/arm/sve.decode | 419 +++++++++
91
22 files changed, 5778 insertions(+), 116 deletions(-)
92
create mode 100644 include/hw/dma/xlnx-zdma.h
93
create mode 100644 target/arm/helper-sve.h
94
create mode 100644 target/arm/translate-a64.h
95
create mode 100644 hw/dma/xlnx-zdma.c
96
create mode 100644 target/arm/sve_helper.c
97
create mode 100644 target/arm/translate-sve.c
98
create mode 100644 target/arm/sve.decode
99
Deleted patch
1
From: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
2
1
3
This is preparation for the upcoming feature of dynamically creating an XML
4
description for the ARM sysregs.
5
A register that has ARM_CP_NO_GDB enabled will not be shown in the dynamic XML.
6
This bit is enabled automatically when creating CP_ANY wildcard aliases.
7
This bit could be enabled manually for any register we want to remove from the
8
dynamic XML description.
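
A minimal sketch of why the flag mask has to change along with the new bit. The three constants are copied from the cpu.h hunk in this patch; flag_in_mask() and reg_visible_to_gdb() are hypothetical helpers written only for this illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the cpu.h hunk in this patch. */
#define ARM_CP_NO_GDB        0x4000
#define OLD_ARM_CP_FLAG_MASK 0x30ff   /* before the patch */
#define NEW_ARM_CP_FLAG_MASK 0x70ff   /* after the patch */

/* Hypothetical helper: true if every bit of 'flag' survives masking. */
static bool flag_in_mask(int flag, int mask)
{
    assert(flag != 0);   /* a zero flag would trivially pass */
    return (flag & mask) == flag;
}

/*
 * Hypothetical helper: a register whose type has ARM_CP_NO_GDB set is
 * the kind the commit message says is left out of the dynamic XML.
 */
static bool reg_visible_to_gdb(int type)
{
    return (type & ARM_CP_NO_GDB) == 0;
}
```

With the old mask, flag_in_mask(ARM_CP_NO_GDB, OLD_ARM_CP_FLAG_MASK) is false, so the new bit would be dropped by any code that filters the type field through the mask; with the new mask it is kept.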
9
10
Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
13
Tested-by: Alex Bennée <alex.bennee@linaro.org>
14
Message-id: 1524153386-3550-2-git-send-email-abdallah.bouassida@lauterbach.com
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
17
target/arm/cpu.h | 3 ++-
18
target/arm/helper.c | 2 +-
19
2 files changed, 3 insertions(+), 2 deletions(-)
20
21
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
22
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/cpu.h
24
+++ b/target/arm/cpu.h
25
@@ -XXX,XX +XXX,XX @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
26
#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
27
#define ARM_CP_FPU 0x1000
28
#define ARM_CP_SVE 0x2000
29
+#define ARM_CP_NO_GDB 0x4000
30
/* Used only as a terminator for ARMCPRegInfo lists */
31
#define ARM_CP_SENTINEL 0xffff
32
/* Mask of only the flag bits in a type field */
33
-#define ARM_CP_FLAG_MASK 0x30ff
34
+#define ARM_CP_FLAG_MASK 0x70ff
35
36
/* Valid values for ARMCPRegInfo state field, indicating which of
37
* the AArch32 and AArch64 execution states this register is visible in.
38
diff --git a/target/arm/helper.c b/target/arm/helper.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/helper.c
41
+++ b/target/arm/helper.c
42
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
43
if (((r->crm == CP_ANY) && crm != 0) ||
44
((r->opc1 == CP_ANY) && opc1 != 0) ||
45
((r->opc2 == CP_ANY) && opc2 != 0)) {
46
- r2->type |= ARM_CP_ALIAS;
47
+ r2->type |= ARM_CP_ALIAS | ARM_CP_NO_GDB;
48
}
49
50
/* Check that raw accesses are either forbidden or handled. Note that
51
--
52
2.17.0
53
54
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Thomas Huth <thuth@redhat.com>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Valgrind complains:
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
5
Message-id: 20180516223007.10256-24-richard.henderson@linaro.org
5
echo "{'execute':'qmp_capabilities'} {'execute':'device-list-properties'," \
6
"'arguments':{'typename':'xlnx,zynqmp-pmu-soc'}}" \
7
"{'execute': 'human-monitor-command', " \
8
"'arguments': {'command-line': 'info qtree'}}" | \
9
valgrind -q microblazeel-softmmu/qemu-system-microblazeel -M none,accel=qtest -qmp stdio
10
[...]
11
==13605== Invalid read of size 8
12
==13605== at 0x2AC69A: qdev_print (qdev-monitor.c:686)
13
==13605== by 0x2AC69A: qbus_print (qdev-monitor.c:719)
14
==13605== by 0x2591E8: handle_hmp_command (monitor.c:3446)
15
16
Use the new object_initialize_child() and sysbus_init_child_obj() to
17
fix the issue.
18
19
Signed-off-by: Thomas Huth <thuth@redhat.com>
20
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
21
Message-id: 1531839343-13828-1-git-send-email-thuth@redhat.com
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
23
---
8
target/arm/translate-sve.c | 49 ++++++++++++++++++++++++++++++++++++++
24
hw/microblaze/xlnx-zynqmp-pmu.c | 10 ++++------
9
target/arm/sve.decode | 17 +++++++++++++
25
1 file changed, 4 insertions(+), 6 deletions(-)
10
2 files changed, 66 insertions(+)
11
26
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
27
diff --git a/hw/microblaze/xlnx-zynqmp-pmu.c b/hw/microblaze/xlnx-zynqmp-pmu.c
13
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
29
--- a/hw/microblaze/xlnx-zynqmp-pmu.c
15
+++ b/target/arm/translate-sve.c
30
+++ b/hw/microblaze/xlnx-zynqmp-pmu.c
16
@@ -XXX,XX +XXX,XX @@ static bool trans_SINCDEC_v(DisasContext *s, arg_incdec2_cnt *a,
31
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_pmu_soc_init(Object *obj)
17
return true;
32
{
33
XlnxZynqMPPMUSoCState *s = XLNX_ZYNQMP_PMU_SOC(obj);
34
35
- object_initialize(&s->cpu, sizeof(s->cpu),
36
- TYPE_MICROBLAZE_CPU);
37
- object_property_add_child(obj, "pmu-cpu", OBJECT(&s->cpu),
38
- &error_abort);
39
+ object_initialize_child(obj, "pmu-cpu", &s->cpu, sizeof(s->cpu),
40
+ TYPE_MICROBLAZE_CPU, &error_abort, NULL);
41
42
- object_initialize(&s->intc, sizeof(s->intc), TYPE_XLNX_PMU_IO_INTC);
43
- qdev_set_parent_bus(DEVICE(&s->intc), sysbus_get_default());
44
+ sysbus_init_child_obj(obj, "intc", &s->intc, sizeof(s->intc),
45
+ TYPE_XLNX_PMU_IO_INTC);
18
}
46
}
19
47
20
+/*
48
static void xlnx_zynqmp_pmu_soc_realize(DeviceState *dev, Error **errp)
21
+ *** SVE Bitwise Immediate Group
22
+ */
23
+
24
+static bool do_zz_dbm(DisasContext *s, arg_rr_dbm *a, GVecGen2iFn *gvec_fn)
25
+{
26
+ uint64_t imm;
27
+ if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1),
28
+ extract32(a->dbm, 0, 6),
29
+ extract32(a->dbm, 6, 6))) {
30
+ return false;
31
+ }
32
+ if (sve_access_check(s)) {
33
+ unsigned vsz = vec_full_reg_size(s);
34
+ gvec_fn(MO_64, vec_full_reg_offset(s, a->rd),
35
+ vec_full_reg_offset(s, a->rn), imm, vsz, vsz);
36
+ }
37
+ return true;
38
+}
39
+
40
+static bool trans_AND_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn)
41
+{
42
+ return do_zz_dbm(s, a, tcg_gen_gvec_andi);
43
+}
44
+
45
+static bool trans_ORR_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn)
46
+{
47
+ return do_zz_dbm(s, a, tcg_gen_gvec_ori);
48
+}
49
+
50
+static bool trans_EOR_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn)
51
+{
52
+ return do_zz_dbm(s, a, tcg_gen_gvec_xori);
53
+}
54
+
55
+static bool trans_DUPM(DisasContext *s, arg_DUPM *a, uint32_t insn)
56
+{
57
+ uint64_t imm;
58
+ if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1),
59
+ extract32(a->dbm, 0, 6),
60
+ extract32(a->dbm, 6, 6))) {
61
+ return false;
62
+ }
63
+ if (sve_access_check(s)) {
64
+ do_dupi_z(s, a->rd, imm);
65
+ }
66
+ return true;
67
+}
68
+
69
/*
70
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
71
*/
72
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
73
index XXXXXXX..XXXXXXX 100644
74
--- a/target/arm/sve.decode
75
+++ b/target/arm/sve.decode
76
@@ -XXX,XX +XXX,XX @@
77
78
&rr_esz rd rn esz
79
&rri rd rn imm
80
+&rr_dbm rd rn dbm
81
&rrri rd rn rm imm
82
&rri_esz rd rn imm esz
83
&rrr_esz rd rn rm esz
84
@@ -XXX,XX +XXX,XX @@
85
@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \
86
&rri_esz esz=%tszimm16_esz
87
88
+# Two register operand, one encoded bitmask.
89
+@rdn_dbm ........ .. .... dbm:13 rd:5 \
90
+ &rr_dbm rn=%reg_movprfx
91
+
92
# Basic Load/Store with 9-bit immediate offset
93
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
94
&rri imm=%imm9_16_10
95
@@ -XXX,XX +XXX,XX @@ INCDEC_v 00000100 .. 1 1 .... 1100 0 d:1 ..... ..... @incdec2_cnt u=1
96
# Note these require esz != 0.
97
SINCDEC_v 00000100 .. 1 0 .... 1100 d:1 u:1 ..... ..... @incdec2_cnt
98
99
+### SVE Bitwise Immediate Group
100
+
101
+# SVE bitwise logical with immediate (unpredicated)
102
+ORR_zzi 00000101 00 0000 ............. ..... @rdn_dbm
103
+EOR_zzi 00000101 01 0000 ............. ..... @rdn_dbm
104
+AND_zzi 00000101 10 0000 ............. ..... @rdn_dbm
105
+
106
+# SVE broadcast bitmask immediate
107
+DUPM 00000101 11 0000 dbm:13 rd:5
108
+
109
+### SVE Predicate Logical Operations Group
110
+
111
# SVE predicate logical operations
112
AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s
113
BIC_pppp 00100101 0. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s
114
--
49
--
115
2.17.0
50
2.17.1
116
51
117
52
1
From: Eric Auger <eric.auger@redhat.com>
1
From: Guenter Roeck <linux@roeck-us.net>
2
2
3
Coverity points out that this can overflow if n > 31,
3
Writes in PIO mode have two requirements:
4
because it's only doing 32-bit arithmetic. Let's use 1ULL instead
5
of 1. Also the formulae used to compute n can be replaced by
6
the level_shift() macro.
7
4
8
Reported-by: Peter Maydell <peter.maydell@linaro.org>
5
- A data interrupt must be generated after a write command has been
9
Signed-off-by: Eric Auger <eric.auger@redhat.com>
6
issued to indicate that the chip is ready to receive data.
10
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
- A block interrupt must be generated after each block to indicate
11
Message-id: 1526493784-25328-3-git-send-email-eric.auger@redhat.com
8
that the chip is ready to receive the next data block.
9
10
Rearrange the code to make this happen. Tested on raspi3 (in PIO mode)
11
and raspi2 (in DMA mode).
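
The two PIO write requirements listed above can be sketched as a small pure function. This is an illustration only: the DEMO_* bit values are placeholders rather than the device's real status bits, write_status() is a hypothetical name, and reading hbct as "bytes per block" and datacnt as "bytes still outstanding" is an assumption based on how the hunk below uses them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; not the device's real bits. */
enum {
    DEMO_BLOCK_IRPT = 1u << 0,
    DEMO_DATA_FLAG  = 1u << 1,
};

/*
 * Status bits to raise after part of a write has been consumed.
 * Mirrors the post-patch write path: the data flag is always set, and
 * the block interrupt only when a whole block has just completed.
 */
static unsigned write_status(uint32_t hbct, uint32_t datacnt,
                             bool block_irq_enabled)
{
    unsigned status = DEMO_DATA_FLAG;        /* ready to receive more data */

    /* hbct is tested first so the modulo can never divide by zero. */
    if (hbct != 0 && datacnt % hbct == 0 && block_irq_enabled) {
        status |= DEMO_BLOCK_IRPT;           /* a block boundary was reached */
    }
    return status;
}
```

For example, with 512-byte blocks and 1024 bytes outstanding both bits are raised, while 100 bytes outstanding raises only the data flag.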
12
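
For the Coverity fix in get_block_pte_address() described in the older commit message above, a minimal sketch of the 64-bit shift. block_shift() uses the expression that the patch removes; the commit message states that level_shift() can replace it, so the sketch assumes both give the same value. block_size() is a hypothetical name, and treating granule_sz as the log2 of the granule size (12 for 4K, 16 for 64K) is also an assumption. The inputs are chosen for the arithmetic, not as architecturally valid configurations.

```c
#include <assert.h>
#include <stdint.h>

/* Shift for a block at 'level': the expression removed by the patch. */
static int block_shift(int level, int granule_sz)
{
    return (granule_sz - 3) * (4 - level) + 3;
}

/* Block size in bytes, computed in 64-bit arithmetic. */
static uint64_t block_size(int level, int granule_sz)
{
    int n = block_shift(level, granule_sz);

    assert(n >= 0 && n < 64);   /* a shift count of 64 or more is undefined */
    return 1ULL << n;           /* "1 << n" would shift a 32-bit int */
}
```

With granule_sz = 16 and level = 1 the shift is 42, which a plain int shift cannot represent; the 1ULL form yields 4398046511104 bytes.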
13
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
14
Message-id: 1531779837-20557-1-git-send-email-linux@roeck-us.net
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
---
17
---
15
hw/arm/smmu-common.c | 4 ++--
18
hw/sd/bcm2835_sdhost.c | 20 ++++++++++++++++----
16
1 file changed, 2 insertions(+), 2 deletions(-)
19
1 file changed, 16 insertions(+), 4 deletions(-)
17
20
18
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
21
diff --git a/hw/sd/bcm2835_sdhost.c b/hw/sd/bcm2835_sdhost.c
19
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
20
--- a/hw/arm/smmu-common.c
23
--- a/hw/sd/bcm2835_sdhost.c
21
+++ b/hw/arm/smmu-common.c
24
+++ b/hw/sd/bcm2835_sdhost.c
22
@@ -XXX,XX +XXX,XX @@ static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
25
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_fifo_run(BCM2835SDHostState *s)
23
static inline hwaddr get_block_pte_address(uint64_t pte, int level,
26
uint32_t value = 0;
24
int granule_sz, uint64_t *bsz)
27
int n;
25
{
28
int is_read;
26
- int n = (granule_sz - 3) * (4 - level) + 3;
29
+ int is_write;
27
+ int n = level_shift(level, granule_sz);
30
28
31
is_read = (s->cmd & SDCMD_READ_CMD) != 0;
29
- *bsz = 1 << n;
32
- if (s->datacnt != 0 && (!is_read || sdbus_data_ready(&s->sdbus))) {
30
+ *bsz = 1ULL << n;
33
+ is_write = (s->cmd & SDCMD_WRITE_CMD) != 0;
31
return PTE_ADDRESS(pte, n);
34
+ if (s->datacnt != 0 && (is_write || sdbus_data_ready(&s->sdbus))) {
32
}
35
if (is_read) {
36
n = 0;
37
while (s->datacnt && s->fifo_len < BCM2835_SDHOST_FIFO_LEN) {
38
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_fifo_run(BCM2835SDHostState *s)
39
if (n != 0) {
40
bcm2835_sdhost_fifo_push(s, value);
41
s->status |= SDHSTS_DATA_FLAG;
42
+ if (s->config & SDHCFG_DATA_IRPT_EN) {
43
+ s->status |= SDHSTS_SDIO_IRPT;
44
+ }
45
}
46
- } else { /* write */
47
+ } else if (is_write) { /* write */
48
n = 0;
49
while (s->datacnt > 0 && (s->fifo_len > 0 || n > 0)) {
50
if (n == 0) {
51
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_fifo_run(BCM2835SDHostState *s)
52
s->edm &= ~SDEDM_FSM_MASK;
53
s->edm |= SDEDM_FSM_DATAMODE;
54
trace_bcm2835_sdhost_edm_change("datacnt 0", s->edm);
55
-
56
- if ((s->cmd & SDCMD_WRITE_CMD) &&
57
+ }
58
+ if (is_write) {
59
+ /* set block interrupt at end of each block transfer */
60
+ if (s->hbct && s->datacnt % s->hbct == 0 &&
61
(s->config & SDHCFG_BLOCK_IRPT_EN)) {
62
s->status |= SDHSTS_BLOCK_IRPT;
63
}
64
+ /* set data interrupt after each transfer */
65
+ s->status |= SDHSTS_DATA_FLAG;
66
+ if (s->config & SDHCFG_DATA_IRPT_EN) {
67
+ s->status |= SDHSTS_SDIO_IRPT;
68
+ }
69
}
70
}
33
71
34
--
72
--
35
2.17.0
73
2.17.1
36
74
37
75
1
From: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
1
To correctly handle small (less than TARGET_PAGE_SIZE) MPU regions,
2
we must correctly handle the case where the address being looked
3
up hits in an MPU region that is not small but the address is
4
in the same page as a small region. For instance if MPU region
5
1 covers an entire page from 0x2000 to 0x2400 and MPU region
6
2 is small and covers only 0x2200 to 0x2280, then for an access
7
to 0x2000 we must not return a result covering the full page
8
even though we hit the page-sized region 1. Otherwise we will
9
then cache that result in the TLB and accesses that should
10
hit region 2 will incorrectly find the region 1 information.
2
11
3
This is preparation for the upcoming feature of dynamically creating an XML
12
Check for the case where we miss an MPU region but it is still
4
description for the ARM sysregs.
13
within the same page, and in that case narrow the size we will
5
Add an "_S" suffix to the secure version of sysregs that have both S and NS views.
14
pass to tlb_set_page_with_attrs() for whatever the final
6
Replace (S) and (NS) by _S and _NS for the registers that are manually defined,
15
outcome is of the MPU lookup.
7
so all the registers follow the same convention.
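
The "_S" naming convention described in the older commit message above can be sketched as follows. The patch itself builds the name with g_strdup_printf(); this sketch swaps in snprintf() into a caller-supplied buffer so it has no GLib dependency, and secure_reg_name() is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "<name>_S" into 'out'. Returns false if the result did not fit,
 * in which case 'out' holds a truncated string.
 */
static bool secure_reg_name(char *out, size_t out_size, const char *name)
{
    assert(out != NULL && name != NULL && out_size > 0);

    int n = snprintf(out, out_size, "%s_S", name);
    return n >= 0 && (size_t)n < out_size;
}
```

Given "FCSEIDR" this produces "FCSEIDR_S", matching the renamed entry in the hunk below; the non-secure view keeps the plain name.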
8
16
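
The small-MPU-region case described in the newer commit message above can be checked with its own numbers. This is a sketch under stated assumptions: the 0x400 page size is inferred from "an entire page from 0x2000 to 0x2400", region 2 is read as the half-open range 0x2200 to 0x2280 (so its last byte is 0x227f), demo_ranges_overlap() is a local stand-in for an overlap test, and miss_needs_subpage() is a hypothetical name for the check the patch adds on a region miss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 0x400u                   /* page size in the example */
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1u))

/* Stand-in overlap test on the ranges [first, first + len). */
static bool demo_ranges_overlap(uint64_t first1, uint64_t len1,
                                uint64_t first2, uint64_t len2)
{
    assert(len1 > 0 && len2 > 0);

    uint64_t last1 = first1 + len1 - 1;
    uint64_t last2 = first2 + len2 - 1;
    return first1 <= last2 && first2 <= last1;
}

/*
 * True when 'address' misses the region [base, limit] but the region
 * still covers part of the page that contains 'address'.
 */
static bool miss_needs_subpage(uint32_t address, uint32_t base, uint32_t limit)
{
    if (address >= base && address <= limit) {
        return false;                            /* a hit, not a miss */
    }
    return limit >= base &&
           demo_ranges_overlap(base, (uint64_t)limit - base + 1,
                               address & DEMO_PAGE_MASK, DEMO_PAGE_SIZE);
}
```

An access to 0x2000 misses region 2 but shares its page, so the result for that access must not be cached as covering the whole page. An access to 0x2800 lies in a different page and is unaffected.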
9
Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
17
Reported-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
12
Tested-by: Alex Bennée <alex.bennee@linaro.org>
13
Message-id: 1524153386-3550-3-git-send-email-abdallah.bouassida@lauterbach.com
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
20
Message-id: 20180716133302.25989-1-peter.maydell@linaro.org
15
---
21
---
16
target/arm/helper.c | 29 ++++++++++++++++++-----------
22
target/arm/helper.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
17
1 file changed, 18 insertions(+), 11 deletions(-)
23
1 file changed, 46 insertions(+)
18
24
19
diff --git a/target/arm/helper.c b/target/arm/helper.c
25
diff --git a/target/arm/helper.c b/target/arm/helper.c
20
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.c
27
--- a/target/arm/helper.c
22
+++ b/target/arm/helper.c
28
+++ b/target/arm/helper.c
23
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
29
@@ -XXX,XX +XXX,XX @@
24
* the secure register to be properly reset and migrated. There is also no
30
#include "exec/semihost.h"
25
* v8 EL1 version of the register so the non-secure instance stands alone.
31
#include "sysemu/kvm.h"
26
*/
32
#include "fpu/softfloat.h"
27
- { .name = "FCSEIDR(NS)",
33
+#include "qemu/range.h"
28
+ { .name = "FCSEIDR",
34
29
.cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 0,
35
#define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
30
.access = PL1_RW, .secure = ARM_CP_SECSTATE_NS,
36
31
.fieldoffset = offsetof(CPUARMState, cp15.fcseidr_ns),
37
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_pmsav7(CPUARMState *env, uint32_t address,
32
.resetvalue = 0, .writefn = fcse_write, .raw_writefn = raw_write, },
38
}
33
- { .name = "FCSEIDR(S)",
39
34
+ { .name = "FCSEIDR_S",
40
if (address < base || address > base + rmask) {
35
.cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 0,
41
+ /*
36
.access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
42
+ * Address not in this region. We must check whether the
37
.fieldoffset = offsetof(CPUARMState, cp15.fcseidr_s),
43
+ * region covers addresses in the same page as our address.
38
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
44
+ * In that case we must not report a size that covers the
39
.access = PL1_RW, .secure = ARM_CP_SECSTATE_NS,
45
+ * whole page for a subsequent hit against a different MPU
40
.fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[1]),
46
+ * region or the background region, because it would result in
41
.resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
47
+ * incorrect TLB hits for subsequent accesses to addresses that
42
- { .name = "CONTEXTIDR(S)", .state = ARM_CP_STATE_AA32,
48
+ * are in this MPU region.
43
+ { .name = "CONTEXTIDR_S", .state = ARM_CP_STATE_AA32,
49
+ */
44
.cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
50
+ if (ranges_overlap(base, rmask,
45
.access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
51
+ address & TARGET_PAGE_MASK,
46
.fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
52
+ TARGET_PAGE_SIZE)) {
47
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
53
+ *page_size = 1;
48
cp15.c14_timer[GTIMER_PHYS].ctl),
54
+ }
49
.writefn = gt_phys_ctl_write, .raw_writefn = raw_write,
55
continue;
50
},
56
}
51
- { .name = "CNTP_CTL(S)",
57
52
+ { .name = "CNTP_CTL_S",
58
@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
53
.cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1,
59
sattrs->srvalid = true;
54
.secure = ARM_CP_SECSTATE_S,
60
sattrs->sregion = r;
55
.type = ARM_CP_IO | ARM_CP_ALIAS, .access = PL1_RW | PL0_R,
56
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
57
.accessfn = gt_ptimer_access,
58
.readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write,
59
},
60
- { .name = "CNTP_TVAL(S)",
61
+ { .name = "CNTP_TVAL_S",
62
.cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0,
63
.secure = ARM_CP_SECSTATE_S,
64
.type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL1_RW | PL0_R,
65
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
66
.accessfn = gt_ptimer_access,
67
.writefn = gt_phys_cval_write, .raw_writefn = raw_write,
68
},
69
- { .name = "CNTP_CVAL(S)", .cp = 15, .crm = 14, .opc1 = 2,
70
+ { .name = "CNTP_CVAL_S", .cp = 15, .crm = 14, .opc1 = 2,
71
.secure = ARM_CP_SECSTATE_S,
72
.access = PL1_RW | PL0_R,
73
.type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS,
74
@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp)
75
76
static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
77
void *opaque, int state, int secstate,
78
- int crm, int opc1, int opc2)
79
+ int crm, int opc1, int opc2,
80
+ const char *name)
81
{
82
/* Private utility function for define_one_arm_cp_reg_with_opaque():
83
* add a single reginfo struct to the hash table.
84
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
85
int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
86
int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
87
88
+ r2->name = g_strdup(name);
89
/* Reset the secure state to the specific incoming state. This is
90
* necessary as the register may have been defined with both states.
91
*/
92
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
93
/* Under AArch32 CP registers can be common
94
* (same for secure and non-secure world) or banked.
95
*/
96
+ char *name;
97
+
98
switch (r->secure) {
99
case ARM_CP_SECSTATE_S:
100
case ARM_CP_SECSTATE_NS:
101
add_cpreg_to_hashtable(cpu, r, opaque, state,
102
- r->secure, crm, opc1, opc2);
103
+ r->secure, crm, opc1, opc2,
104
+ r->name);
105
break;
106
default:
107
+ name = g_strdup_printf("%s_S", r->name);
108
add_cpreg_to_hashtable(cpu, r, opaque, state,
109
ARM_CP_SECSTATE_S,
110
- crm, opc1, opc2);
111
+ crm, opc1, opc2, name);
112
+ g_free(name);
113
add_cpreg_to_hashtable(cpu, r, opaque, state,
114
ARM_CP_SECSTATE_NS,
115
- crm, opc1, opc2);
116
+ crm, opc1, opc2, r->name);
117
break;
118
}
119
} else {
120
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
121
* of AArch32 */
122
add_cpreg_to_hashtable(cpu, r, opaque, state,
123
ARM_CP_SECSTATE_NS,
124
- crm, opc1, opc2);
125
+ crm, opc1, opc2, r->name);
126
}
61
}
62
+ } else {
63
+ /*
64
+ * Address not in this region. We must check whether the
65
+ * region covers addresses in the same page as our address.
66
+ * In that case we must not report a size that covers the
67
+ * whole page for a subsequent hit against a different MPU
68
+ * region or the background region, because it would result
69
+ * in incorrect TLB hits for subsequent accesses to
70
+ * addresses that are in this MPU region.
71
+ */
72
+ if (limit >= base &&
73
+ ranges_overlap(base, limit - base + 1,
74
+ addr_page_base,
75
+ TARGET_PAGE_SIZE)) {
76
+ sattrs->subpage = true;
77
+ }
127
}
78
}
128
}
79
}
80
}
81
@@ -XXX,XX +XXX,XX @@ static bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
82
}
83
84
if (address < base || address > limit) {
85
+ /*
86
+ * Address not in this region. We must check whether the
87
+ * region covers addresses in the same page as our address.
88
+ * In that case we must not report a size that covers the
89
+ * whole page for a subsequent hit against a different MPU
90
+ * region or the background region, because it would result in
91
+ * incorrect TLB hits for subsequent accesses to addresses that
92
+ * are in this MPU region.
93
+ */
94
+ if (limit >= base &&
95
+ ranges_overlap(base, limit - base + 1,
96
+ addr_page_base,
97
+ TARGET_PAGE_SIZE)) {
98
+ *is_subpage = true;
99
+ }
100
continue;
101
}
102
129
--
103
--
130
2.17.0
104
2.17.1
131
105
132
106
1
From: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
1
From: Thomas Huth <thuth@redhat.com>
2
2
3
Generate an XML description for the cp-regs.
3
nand_init() does not only create the NAND device, it also realizes
4
Register these regs with gdb_register_coprocessor().
4
the device with qdev_init_nofail() already. So we must not call
5
Add arm_gdb_get_sysreg() as the callback used to read those regs.
5
nand_init() from an instance_init function like sl_nand_init(),
6
Add a dummy arm_gdb_set_sysreg().
6
otherwise we get superfluous NAND devices in the QOM tree after
7
introspecting the 'sl-nand' device. So move the nand_init() to the
8
realize function of 'sl-nand' instead.
7
9
8
Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
10
Signed-off-by: Thomas Huth <thuth@redhat.com>
9
Tested-by: Alex Bennée <alex.bennee@linaro.org>
11
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Message-id: 1524153386-3550-4-git-send-email-abdallah.bouassida@lauterbach.com
12
Message-id: 1532006134-7701-1-git-send-email-thuth@redhat.com
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
15
---
14
include/qom/cpu.h | 5 ++-
16
hw/arm/spitz.c | 15 +++++++++++----
15
target/arm/cpu.h | 26 +++++++++++++++
17
1 file changed, 11 insertions(+), 4 deletions(-)
16
gdbstub.c | 10 ++++++
17
target/arm/cpu.c | 1 +
18
target/arm/gdbstub.c | 76 ++++++++++++++++++++++++++++++++++++++++++++
19
target/arm/helper.c | 26 +++++++++++++++
20
6 files changed, 143 insertions(+), 1 deletion(-)
21
18
22
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
19
diff --git a/hw/arm/spitz.c b/hw/arm/spitz.c
23
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
24
--- a/include/qom/cpu.h
21
--- a/hw/arm/spitz.c
25
+++ b/include/qom/cpu.h
22
+++ b/hw/arm/spitz.c
26
@@ -XXX,XX +XXX,XX @@ struct TranslationBlock;
23
@@ -XXX,XX +XXX,XX @@ static void sl_nand_init(Object *obj)
27
* before the insn which triggers a watchpoint rather than after it.
24
{
28
* @gdb_arch_name: Optional callback that returns the architecture name known
25
SLNANDState *s = SL_NAND(obj);
29
* to GDB. The caller must free the returned string with g_free.
26
SysBusDevice *dev = SYS_BUS_DEVICE(obj);
30
+ * @gdb_get_dynamic_xml: Callback to return dynamically generated XML for the
27
- DriveInfo *nand;
31
+ * gdb stub. Returns a pointer to the XML contents for the specified XML file
28
32
+ * or NULL if the CPU doesn't have a dynamically generated content for it.
29
s->ctl = 0;
33
* @cpu_exec_enter: Callback for cpu_exec preparation.
34
* @cpu_exec_exit: Callback for cpu_exec cleanup.
35
* @cpu_exec_interrupt: Callback for processing interrupts in cpu_exec.
36
@@ -XXX,XX +XXX,XX @@ typedef struct CPUClass {
37
const struct VMStateDescription *vmsd;
38
const char *gdb_core_xml_file;
39
gchar * (*gdb_arch_name)(CPUState *cpu);
40
-
41
+ const char * (*gdb_get_dynamic_xml)(CPUState *cpu, const char *xmlname);
42
void (*cpu_exec_enter)(CPUState *cpu);
43
void (*cpu_exec_exit)(CPUState *cpu);
44
bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request);
45
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/cpu.h
48
+++ b/target/arm/cpu.h
49
@@ -XXX,XX +XXX,XX @@ enum {
50
s<2n+1> maps to the most significant half of d<n>
51
*/
52
53
+/**
54
+ * DynamicGDBXMLInfo:
55
+ * @desc: Contains the XML descriptions.
56
+ * @num_cpregs: Number of the Coprocessor registers seen by GDB.
57
+ * @cpregs_keys: Array that contains the corresponding Key of
58
+ * a given cpreg with the same order of the cpreg in the XML description.
59
+ */
60
+typedef struct DynamicGDBXMLInfo {
61
+ char *desc;
62
+ int num_cpregs;
63
+ uint32_t *cpregs_keys;
64
+} DynamicGDBXMLInfo;
65
+
30
+
66
/* CPU state for each instance of a generic timer (in cp15 c14) */
31
+ memory_region_init_io(&s->iomem, obj, &sl_ops, s, "sl", 0x40);
67
typedef struct ARMGenericTimer {
32
+ sysbus_init_mmio(dev, &s->iomem);
68
uint64_t cval; /* Timer CompareValue register */
69
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
70
uint64_t *cpreg_vmstate_values;
71
int32_t cpreg_vmstate_array_len;
72
73
+ DynamicGDBXMLInfo dyn_xml;
74
+
75
/* Timers used by the generic (architected) timer */
76
QEMUTimer *gt_timer[NUM_GTIMERS];
77
/* GPIO outputs for generic timer */
78
@@ -XXX,XX +XXX,XX @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cpu, vaddr addr,
79
int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
80
int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
81
82
+/* Dynamically generates an XML description of the sysregs for the gdb stub
83
+ * from the cp_regs hashtable. Returns the number of registered sysregs.
84
+ */
85
+int arm_gen_dynamic_xml(CPUState *cpu);
86
+
87
+/* Returns the dynamically generated XML for the gdb stub.
88
+ * Returns a pointer to the XML contents for the specified XML file or NULL
89
+ * if the XML name doesn't match the predefined one.
90
+ */
91
+const char *arm_gdb_get_dynamic_xml(CPUState *cpu, const char *xmlname);
92
+
93
int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
94
int cpuid, void *opaque);
95
int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
96
diff --git a/gdbstub.c b/gdbstub.c
97
index XXXXXXX..XXXXXXX 100644
98
--- a/gdbstub.c
99
+++ b/gdbstub.c
100
@@ -XXX,XX +XXX,XX @@ static const char *get_feature_xml(const char *p, const char **newp,
101
}
102
return target_xml;
103
}
104
+ if (cc->gdb_get_dynamic_xml) {
105
+ CPUState *cpu = first_cpu;
106
+ char *xmlname = g_strndup(p, len);
107
+ const char *xml = cc->gdb_get_dynamic_xml(cpu, xmlname);
108
+
109
+ g_free(xmlname);
110
+ if (xml) {
111
+ return xml;
112
+ }
113
+ }
114
for (i = 0; ; i++) {
115
name = xml_builtin[i][0];
116
if (!name || (strncmp(name, p, len) == 0 && strlen(name) == len))
117
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
118
index XXXXXXX..XXXXXXX 100644
119
--- a/target/arm/cpu.c
120
+++ b/target/arm/cpu.c
121
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
122
cc->gdb_num_core_regs = 26;
123
cc->gdb_core_xml_file = "arm-core.xml";
124
cc->gdb_arch_name = arm_gdb_arch_name;
125
+ cc->gdb_get_dynamic_xml = arm_gdb_get_dynamic_xml;
126
cc->gdb_stop_before_watchpoint = true;
127
cc->debug_excp_handler = arm_debug_excp_handler;
128
cc->debug_check_watchpoint = arm_debug_check_watchpoint;
129
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
130
index XXXXXXX..XXXXXXX 100644
131
--- a/target/arm/gdbstub.c
132
+++ b/target/arm/gdbstub.c
133
@@ -XXX,XX +XXX,XX @@
134
#include "cpu.h"
135
#include "exec/gdbstub.h"
136
137
+typedef struct RegisterSysregXmlParam {
138
+ CPUState *cs;
139
+ GString *s;
140
+} RegisterSysregXmlParam;
141
+
142
/* Old gdb always expect FPA registers. Newer (xml-aware) gdb only expect
143
whatever the target description contains. Due to a historical mishap
144
the FPA registers appear in between core integer regs and the CPSR.
145
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
146
/* Unknown register. */
147
return 0;
148
}
149
+
150
+static void arm_gen_one_xml_reg_tag(GString *s, DynamicGDBXMLInfo *dyn_xml,
151
+ ARMCPRegInfo *ri, uint32_t ri_key,
152
+ int bitsize)
153
+{
154
+ g_string_append_printf(s, "<reg name=\"%s\"", ri->name);
155
+ g_string_append_printf(s, " bitsize=\"%d\"", bitsize);
156
+ g_string_append_printf(s, " group=\"cp_regs\"/>");
157
+ dyn_xml->num_cpregs++;
158
+ dyn_xml->cpregs_keys[dyn_xml->num_cpregs - 1] = ri_key;
159
+}
33
+}
160
+
34
+
161
+static void arm_register_sysreg_for_xml(gpointer key, gpointer value,
35
+static void sl_nand_realize(DeviceState *dev, Error **errp)
162
+ gpointer p)
163
+{
36
+{
164
+ uint32_t ri_key = *(uint32_t *)key;
37
+ SLNANDState *s = SL_NAND(dev);
165
+ ARMCPRegInfo *ri = value;
38
+ DriveInfo *nand;
166
+ RegisterSysregXmlParam *param = (RegisterSysregXmlParam *)p;
167
+ GString *s = param->s;
168
+ ARMCPU *cpu = ARM_CPU(param->cs);
169
+ CPUARMState *env = &cpu->env;
170
+ DynamicGDBXMLInfo *dyn_xml = &cpu->dyn_xml;
171
+
39
+
172
+ if (!(ri->type & (ARM_CP_NO_RAW | ARM_CP_NO_GDB))) {
40
/* FIXME use a qdev drive property instead of drive_get() */
173
+ if (arm_feature(env, ARM_FEATURE_AARCH64)) {
41
nand = drive_get(IF_MTD, 0, 0);
174
+ if (ri->state == ARM_CP_STATE_AA64) {
42
s->nand = nand_init(nand ? blk_by_legacy_dinfo(nand) : NULL,
175
+ arm_gen_one_xml_reg_tag(s , dyn_xml, ri, ri_key, 64);
43
s->manf_id, s->chip_id);
176
+ }
44
-
177
+ } else {
45
- memory_region_init_io(&s->iomem, obj, &sl_ops, s, "sl", 0x40);
178
+ if (ri->state == ARM_CP_STATE_AA32) {
46
- sysbus_init_mmio(dev, &s->iomem);
179
+ if (!arm_feature(env, ARM_FEATURE_EL3) &&
180
+ (ri->secure & ARM_CP_SECSTATE_S)) {
181
+ return;
182
+ }
183
+ if (ri->type & ARM_CP_64BIT) {
184
+ arm_gen_one_xml_reg_tag(s , dyn_xml, ri, ri_key, 64);
185
+ } else {
186
+ arm_gen_one_xml_reg_tag(s , dyn_xml, ri, ri_key, 32);
187
+ }
188
+ }
189
+ }
190
+ }
191
+}
192
+
193
+int arm_gen_dynamic_xml(CPUState *cs)
194
+{
195
+ ARMCPU *cpu = ARM_CPU(cs);
196
+ GString *s = g_string_new(NULL);
197
+ RegisterSysregXmlParam param = {cs, s};
198
+
199
+ cpu->dyn_xml.num_cpregs = 0;
200
+ cpu->dyn_xml.cpregs_keys = g_malloc(sizeof(uint32_t) *
201
+ g_hash_table_size(cpu->cp_regs));
202
+ g_string_printf(s, "<?xml version=\"1.0\"?>");
203
+ g_string_append_printf(s, "<!DOCTYPE target SYSTEM \"gdb-target.dtd\">");
204
+ g_string_append_printf(s, "<feature name=\"org.qemu.gdb.arm.sys.regs\">");
205
+ g_hash_table_foreach(cpu->cp_regs, arm_register_sysreg_for_xml, &param);
206
+ g_string_append_printf(s, "</feature>");
207
+ cpu->dyn_xml.desc = g_string_free(s, false);
208
+ return cpu->dyn_xml.num_cpregs;
209
+}
210
+
211
+const char *arm_gdb_get_dynamic_xml(CPUState *cs, const char *xmlname)
212
+{
213
+ ARMCPU *cpu = ARM_CPU(cs);
214
+
215
+ if (strcmp(xmlname, "system-registers.xml") == 0) {
216
+ return cpu->dyn_xml.desc;
217
+ }
218
+ return NULL;
219
+}
220
diff --git a/target/arm/helper.c b/target/arm/helper.c
221
index XXXXXXX..XXXXXXX 100644
222
--- a/target/arm/helper.c
223
+++ b/target/arm/helper.c
224
@@ -XXX,XX +XXX,XX @@ static void write_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri,
225
}
226
}
47
}
227
48
228
+static int arm_gdb_get_sysreg(CPUARMState *env, uint8_t *buf, int reg)
49
/* Spitz Keyboard */
229
+{
50
@@ -XXX,XX +XXX,XX @@ static void sl_nand_class_init(ObjectClass *klass, void *data)
230
+ ARMCPU *cpu = arm_env_get_cpu(env);
51
231
+ const ARMCPRegInfo *ri;
52
dc->vmsd = &vmstate_sl_nand_info;
232
+ uint32_t key;
53
dc->props = sl_nand_properties;
233
+
54
+ dc->realize = sl_nand_realize;
234
+ key = cpu->dyn_xml.cpregs_keys[reg];
55
/* Reason: init() method uses drive_get() */
235
+ ri = get_arm_cp_reginfo(cpu->cp_regs, key);
56
dc->user_creatable = false;
236
+ if (ri) {
237
+ if (cpreg_field_is_64bit(ri)) {
238
+ return gdb_get_reg64(buf, (uint64_t)read_raw_cp_reg(env, ri));
239
+ } else {
240
+ return gdb_get_reg32(buf, (uint32_t)read_raw_cp_reg(env, ri));
241
+ }
242
+ }
243
+ return 0;
244
+}
245
+
246
+static int arm_gdb_set_sysreg(CPUARMState *env, uint8_t *buf, int reg)
247
+{
248
+ return 0;
249
+}
250
+
251
static bool raw_accessors_invalid(const ARMCPRegInfo *ri)
252
{
253
/* Return true if the regdef would cause an assertion if you called
254
@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
255
gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
256
19, "arm-vfp.xml", 0);
257
}
258
+ gdb_register_coprocessor(cs, arm_gdb_get_sysreg, arm_gdb_set_sysreg,
259
+ arm_gen_dynamic_xml(cs),
260
+ "system-registers.xml", 0);
261
}
57
}
262
263
/* Sort alphabetically by type name, except for "any". */
264
--
58
--
265
2.17.0
59
2.17.1
266
60
267
61
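Note on the gdbstub patch above: the generated XML and the cpregs_keys array are filled in lockstep, so the Nth <reg> tag in the description corresponds to the Nth stored key. That is what lets arm_gdb_get_sysreg() turn a gdb register index back into a cp_regs lookup. The following is only a minimal standalone sketch of that idea in plain C; DemoReg, demo_gen_xml and the "demo.sys.regs" feature name are hypothetical and are not part of QEMU, and it uses snprintf on a fixed buffer rather than GString.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a register table entry (not ARMCPRegInfo). */
typedef struct {
    const char *name;
    uint32_t key;
    int bitsize;
} DemoReg;

/*
 * Append one <reg> tag per register to buf and store each register's key
 * at the array position that matches the tag's order in the XML.
 * Returns the number of registers emitted, or -1 if buf is too small.
 */
static int demo_gen_xml(const DemoReg *regs, int nregs,
                        char *buf, size_t buflen, uint32_t *keys)
{
    size_t used;
    int n;

    assert(regs && buf && keys);

    n = snprintf(buf, buflen, "<feature name=\"demo.sys.regs\">");
    if (n < 0 || (size_t)n >= buflen) {
        return -1;
    }
    used = (size_t)n;

    for (int i = 0; i < nregs; i++) {
        n = snprintf(buf + used, buflen - used,
                     "<reg name=\"%s\" bitsize=\"%d\" group=\"cp_regs\"/>",
                     regs[i].name, regs[i].bitsize);
        if (n < 0 || (size_t)n >= buflen - used) {
            return -1;
        }
        used += (size_t)n;
        keys[i] = regs[i].key;   /* index i <-> i-th tag in the XML */
    }

    n = snprintf(buf + used, buflen - used, "</feature>");
    if (n < 0 || (size_t)n >= buflen - used) {
        return -1;
    }
    return nregs;
}
```

A reader of the description can then map register index i straight to keys[i], which mirrors the cpregs_keys[reg] lookup in the patch.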
Deleted patch
1
From: Francisco Iglesias <frasse.iglesias@gmail.com>
2
1
3
Add a model of the generic DMA found on Xilinx ZynqMP.
4
5
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
6
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
7
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
8
Message-id: 20180503214201.29082-2-frasse.iglesias@gmail.com
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
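Reviewer-style note on the interrupt registers in this model: the IRQ line is driven from ISR & ~IMR, IMR is read-only to the guest and resets to all-masked (0xfff), writes to IEN clear mask bits, writes to IDS set them, and ISR is write-one-to-clear. The sketch below restates that behaviour as a tiny standalone C model. The DemoIrqRegs type and demo_* helpers are hypothetical names, and masking the written value to the 12 defined bits is an assumption of the sketch rather than something the patch does explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IRQ_BITS 0xfffu   /* 12 interrupt sources, bits 0..11 */

typedef struct {
    uint32_t isr;   /* pending status, write-one-to-clear */
    uint32_t imr;   /* interrupt mask, 1 = masked */
} DemoIrqRegs;

/* Reset state: nothing pending, every source masked. */
static void demo_reset(DemoIrqRegs *r)
{
    assert(r);
    r->isr = 0;
    r->imr = DEMO_IRQ_BITS;
}

/* Writing 1s to IEN unmasks the corresponding sources. */
static void demo_write_ien(DemoIrqRegs *r, uint32_t val)
{
    r->imr &= ~(val & DEMO_IRQ_BITS);
}

/* Writing 1s to IDS masks the corresponding sources again. */
static void demo_write_ids(DemoIrqRegs *r, uint32_t val)
{
    r->imr |= val & DEMO_IRQ_BITS;
}

/* Writing 1s to ISR clears the corresponding pending bits. */
static void demo_write_isr(DemoIrqRegs *r, uint32_t val)
{
    r->isr &= ~(val & DEMO_IRQ_BITS);
}

/* The IRQ line is high when any pending source is not masked. */
static bool demo_irq_level(const DemoIrqRegs *r)
{
    return (r->isr & ~r->imr) != 0;
}
```

For example, a pending DMA_DONE (bit 10) does not raise the line until the guest writes bit 10 to IEN, and the line drops again once the guest writes bit 10 to ISR.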
11
hw/dma/Makefile.objs | 1 +
12
include/hw/dma/xlnx-zdma.h | 84 ++++
13
hw/dma/xlnx-zdma.c | 832 +++++++++++++++++++++++++++++++++++++
14
3 files changed, 917 insertions(+)
15
create mode 100644 include/hw/dma/xlnx-zdma.h
16
create mode 100644 hw/dma/xlnx-zdma.c
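Two small details of the descriptor handling in hw/dma/xlnx-zdma.c may be easier to follow with a standalone sketch: 64-bit addresses are kept as an LSB/MSB pair of adjacent 32-bit registers, and a descriptor address must be a multiple of the 16-byte descriptor size. The sketch also shows that memset() converts its fill value to unsigned char, so a call such as memset(buf, 0xdeadbeef, ...) fills the buffer with the byte 0xef rather than a repeating 32-bit pattern. All demo_* names are hypothetical and are not the model's own helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DESCR_SIZE 16u   /* four 32-bit words per descriptor */

/* Combine regs[base] (low half) and regs[base + 1] (high half). */
static uint64_t demo_get_addr64(const uint32_t *regs, unsigned int base)
{
    assert(regs);
    return ((uint64_t)regs[base + 1] << 32) | regs[base];
}

/* Split a 64-bit address back into the LSB/MSB register pair. */
static void demo_put_addr64(uint32_t *regs, unsigned int base, uint64_t addr)
{
    assert(regs);
    regs[base] = (uint32_t)addr;
    regs[base + 1] = (uint32_t)(addr >> 32);
}

/* Descriptors must be aligned to their own size. */
static bool demo_descr_aligned(uint64_t addr)
{
    return (addr % DEMO_DESCR_SIZE) == 0;
}

/* The byte memset() actually stores for a given fill value. */
static unsigned char demo_memset_fill_byte(unsigned int value)
{
    return (unsigned char)value;
}
```

If a recognisable multi-byte poison pattern is wanted in the descriptor buffer, it would have to be written word by word instead of through a single memset() call.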
17
18
diff --git a/hw/dma/Makefile.objs b/hw/dma/Makefile.objs
19
index XXXXXXX..XXXXXXX 100644
20
--- a/hw/dma/Makefile.objs
21
+++ b/hw/dma/Makefile.objs
22
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ETRAXFS) += etraxfs_dma.o
23
common-obj-$(CONFIG_STP2000) += sparc32_dma.o
24
obj-$(CONFIG_XLNX_ZYNQMP) += xlnx_dpdma.o
25
obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx_dpdma.o
26
+common-obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx-zdma.o
27
28
obj-$(CONFIG_OMAP) += omap_dma.o soc_dma.o
29
obj-$(CONFIG_PXA2XX) += pxa2xx_dma.o
30
diff --git a/include/hw/dma/xlnx-zdma.h b/include/hw/dma/xlnx-zdma.h
31
new file mode 100644
32
index XXXXXXX..XXXXXXX
33
--- /dev/null
34
+++ b/include/hw/dma/xlnx-zdma.h
35
@@ -XXX,XX +XXX,XX @@
36
+/*
37
+ * QEMU model of the ZynqMP generic DMA
38
+ *
39
+ * Copyright (c) 2014 Xilinx Inc.
40
+ * Copyright (c) 2018 FEIMTECH AB
41
+ *
42
+ * Written by Edgar E. Iglesias <edgar.iglesias@xilinx.com>,
43
+ * Francisco Iglesias <francisco.iglesias@feimtech.se>
44
+ *
45
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
46
+ * of this software and associated documentation files (the "Software"), to deal
47
+ * in the Software without restriction, including without limitation the rights
48
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
49
+ * copies of the Software, and to permit persons to whom the Software is
50
+ * furnished to do so, subject to the following conditions:
51
+ *
52
+ * The above copyright notice and this permission notice shall be included in
53
+ * all copies or substantial portions of the Software.
54
+ *
55
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
56
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
57
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
58
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
59
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
60
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
61
+ * THE SOFTWARE.
62
+ */
63
+
64
+#ifndef XLNX_ZDMA_H
65
+#define XLNX_ZDMA_H
66
+
67
+#include "hw/sysbus.h"
68
+#include "hw/register.h"
69
+#include "sysemu/dma.h"
70
+
71
+#define ZDMA_R_MAX (0x204 / 4)
72
+
73
+typedef enum {
74
+ DISABLED = 0,
75
+ ENABLED = 1,
76
+ PAUSED = 2,
77
+} XlnxZDMAState;
78
+
79
+typedef union {
80
+ struct {
81
+ uint64_t addr;
82
+ uint32_t size;
83
+ uint32_t attr;
84
+ };
85
+ uint32_t words[4];
86
+} XlnxZDMADescr;
87
+
88
+typedef struct XlnxZDMA {
89
+ SysBusDevice parent_obj;
90
+ MemoryRegion iomem;
91
+ MemTxAttrs attr;
92
+ MemoryRegion *dma_mr;
93
+ AddressSpace *dma_as;
94
+ qemu_irq irq_zdma_ch_imr;
95
+
96
+ struct {
97
+ uint32_t bus_width;
98
+ } cfg;
99
+
100
+ XlnxZDMAState state;
101
+ bool error;
102
+
103
+ XlnxZDMADescr dsc_src;
104
+ XlnxZDMADescr dsc_dst;
105
+
106
+ uint32_t regs[ZDMA_R_MAX];
107
+ RegisterInfo regs_info[ZDMA_R_MAX];
108
+
109
+ /* We don't model the common bufs. Must be at least 16 bytes
110
+ to model write only mode. */
111
+ uint8_t buf[2048];
112
+} XlnxZDMA;
113
+
114
+#define TYPE_XLNX_ZDMA "xlnx.zdma"
115
+
116
+#define XLNX_ZDMA(obj) \
117
+ OBJECT_CHECK(XlnxZDMA, (obj), TYPE_XLNX_ZDMA)
118
+
119
+#endif /* XLNX_ZDMA_H */
120
diff --git a/hw/dma/xlnx-zdma.c b/hw/dma/xlnx-zdma.c
121
new file mode 100644
122
index XXXXXXX..XXXXXXX
123
--- /dev/null
124
+++ b/hw/dma/xlnx-zdma.c
125
@@ -XXX,XX +XXX,XX @@
126
+/*
127
+ * QEMU model of the ZynqMP generic DMA
128
+ *
129
+ * Copyright (c) 2014 Xilinx Inc.
130
+ * Copyright (c) 2018 FEIMTECH AB
131
+ *
132
+ * Written by Edgar E. Iglesias <edgar.iglesias@xilinx.com>,
133
+ * Francisco Iglesias <francisco.iglesias@feimtech.se>
134
+ *
135
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
136
+ * of this software and associated documentation files (the "Software"), to deal
137
+ * in the Software without restriction, including without limitation the rights
138
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
139
+ * copies of the Software, and to permit persons to whom the Software is
140
+ * furnished to do so, subject to the following conditions:
141
+ *
142
+ * The above copyright notice and this permission notice shall be included in
143
+ * all copies or substantial portions of the Software.
144
+ *
145
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
146
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
147
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
148
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
149
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
150
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
151
+ * THE SOFTWARE.
152
+ */
153
+
154
+#include "qemu/osdep.h"
155
+#include "hw/dma/xlnx-zdma.h"
156
+#include "qemu/bitops.h"
157
+#include "qemu/log.h"
158
+#include "qapi/error.h"
159
+
160
+#ifndef XLNX_ZDMA_ERR_DEBUG
161
+#define XLNX_ZDMA_ERR_DEBUG 0
162
+#endif
163
+
164
+REG32(ZDMA_ERR_CTRL, 0x0)
165
+ FIELD(ZDMA_ERR_CTRL, APB_ERR_RES, 0, 1)
166
+REG32(ZDMA_CH_ISR, 0x100)
167
+ FIELD(ZDMA_CH_ISR, DMA_PAUSE, 11, 1)
168
+ FIELD(ZDMA_CH_ISR, DMA_DONE, 10, 1)
169
+ FIELD(ZDMA_CH_ISR, AXI_WR_DATA, 9, 1)
170
+ FIELD(ZDMA_CH_ISR, AXI_RD_DATA, 8, 1)
171
+ FIELD(ZDMA_CH_ISR, AXI_RD_DST_DSCR, 7, 1)
172
+ FIELD(ZDMA_CH_ISR, AXI_RD_SRC_DSCR, 6, 1)
173
+ FIELD(ZDMA_CH_ISR, IRQ_DST_ACCT_ERR, 5, 1)
174
+ FIELD(ZDMA_CH_ISR, IRQ_SRC_ACCT_ERR, 4, 1)
175
+ FIELD(ZDMA_CH_ISR, BYTE_CNT_OVRFL, 3, 1)
176
+ FIELD(ZDMA_CH_ISR, DST_DSCR_DONE, 2, 1)
177
+ FIELD(ZDMA_CH_ISR, SRC_DSCR_DONE, 1, 1)
178
+ FIELD(ZDMA_CH_ISR, INV_APB, 0, 1)
179
+REG32(ZDMA_CH_IMR, 0x104)
180
+ FIELD(ZDMA_CH_IMR, DMA_PAUSE, 11, 1)
181
+ FIELD(ZDMA_CH_IMR, DMA_DONE, 10, 1)
182
+ FIELD(ZDMA_CH_IMR, AXI_WR_DATA, 9, 1)
183
+ FIELD(ZDMA_CH_IMR, AXI_RD_DATA, 8, 1)
184
+ FIELD(ZDMA_CH_IMR, AXI_RD_DST_DSCR, 7, 1)
185
+ FIELD(ZDMA_CH_IMR, AXI_RD_SRC_DSCR, 6, 1)
186
+ FIELD(ZDMA_CH_IMR, IRQ_DST_ACCT_ERR, 5, 1)
187
+ FIELD(ZDMA_CH_IMR, IRQ_SRC_ACCT_ERR, 4, 1)
188
+ FIELD(ZDMA_CH_IMR, BYTE_CNT_OVRFL, 3, 1)
189
+ FIELD(ZDMA_CH_IMR, DST_DSCR_DONE, 2, 1)
190
+ FIELD(ZDMA_CH_IMR, SRC_DSCR_DONE, 1, 1)
191
+ FIELD(ZDMA_CH_IMR, INV_APB, 0, 1)
192
+REG32(ZDMA_CH_IEN, 0x108)
193
+ FIELD(ZDMA_CH_IEN, DMA_PAUSE, 11, 1)
194
+ FIELD(ZDMA_CH_IEN, DMA_DONE, 10, 1)
195
+ FIELD(ZDMA_CH_IEN, AXI_WR_DATA, 9, 1)
196
+ FIELD(ZDMA_CH_IEN, AXI_RD_DATA, 8, 1)
197
+ FIELD(ZDMA_CH_IEN, AXI_RD_DST_DSCR, 7, 1)
198
+ FIELD(ZDMA_CH_IEN, AXI_RD_SRC_DSCR, 6, 1)
199
+ FIELD(ZDMA_CH_IEN, IRQ_DST_ACCT_ERR, 5, 1)
200
+ FIELD(ZDMA_CH_IEN, IRQ_SRC_ACCT_ERR, 4, 1)
201
+ FIELD(ZDMA_CH_IEN, BYTE_CNT_OVRFL, 3, 1)
202
+ FIELD(ZDMA_CH_IEN, DST_DSCR_DONE, 2, 1)
203
+ FIELD(ZDMA_CH_IEN, SRC_DSCR_DONE, 1, 1)
204
+ FIELD(ZDMA_CH_IEN, INV_APB, 0, 1)
205
+REG32(ZDMA_CH_IDS, 0x10c)
206
+ FIELD(ZDMA_CH_IDS, DMA_PAUSE, 11, 1)
207
+ FIELD(ZDMA_CH_IDS, DMA_DONE, 10, 1)
208
+ FIELD(ZDMA_CH_IDS, AXI_WR_DATA, 9, 1)
209
+ FIELD(ZDMA_CH_IDS, AXI_RD_DATA, 8, 1)
210
+ FIELD(ZDMA_CH_IDS, AXI_RD_DST_DSCR, 7, 1)
211
+ FIELD(ZDMA_CH_IDS, AXI_RD_SRC_DSCR, 6, 1)
212
+ FIELD(ZDMA_CH_IDS, IRQ_DST_ACCT_ERR, 5, 1)
213
+ FIELD(ZDMA_CH_IDS, IRQ_SRC_ACCT_ERR, 4, 1)
214
+ FIELD(ZDMA_CH_IDS, BYTE_CNT_OVRFL, 3, 1)
215
+ FIELD(ZDMA_CH_IDS, DST_DSCR_DONE, 2, 1)
216
+ FIELD(ZDMA_CH_IDS, SRC_DSCR_DONE, 1, 1)
217
+ FIELD(ZDMA_CH_IDS, INV_APB, 0, 1)
218
+REG32(ZDMA_CH_CTRL0, 0x110)
219
+ FIELD(ZDMA_CH_CTRL0, OVR_FETCH, 7, 1)
220
+ FIELD(ZDMA_CH_CTRL0, POINT_TYPE, 6, 1)
221
+ FIELD(ZDMA_CH_CTRL0, MODE, 4, 2)
222
+ FIELD(ZDMA_CH_CTRL0, RATE_CTRL, 3, 1)
223
+ FIELD(ZDMA_CH_CTRL0, CONT_ADDR, 2, 1)
224
+ FIELD(ZDMA_CH_CTRL0, CONT, 1, 1)
225
+REG32(ZDMA_CH_CTRL1, 0x114)
226
+ FIELD(ZDMA_CH_CTRL1, DST_ISSUE, 5, 5)
227
+ FIELD(ZDMA_CH_CTRL1, SRC_ISSUE, 0, 5)
228
+REG32(ZDMA_CH_FCI, 0x118)
229
+ FIELD(ZDMA_CH_FCI, PROG_CELL_CNT, 2, 2)
230
+ FIELD(ZDMA_CH_FCI, SIDE, 1, 1)
231
+ FIELD(ZDMA_CH_FCI, EN, 0, 1)
232
+REG32(ZDMA_CH_STATUS, 0x11c)
233
+ FIELD(ZDMA_CH_STATUS, STATE, 0, 2)
234
+REG32(ZDMA_CH_DATA_ATTR, 0x120)
235
+ FIELD(ZDMA_CH_DATA_ATTR, ARBURST, 26, 2)
236
+ FIELD(ZDMA_CH_DATA_ATTR, ARCACHE, 22, 4)
237
+ FIELD(ZDMA_CH_DATA_ATTR, ARQOS, 18, 4)
238
+ FIELD(ZDMA_CH_DATA_ATTR, ARLEN, 14, 4)
239
+ FIELD(ZDMA_CH_DATA_ATTR, AWBURST, 12, 2)
240
+ FIELD(ZDMA_CH_DATA_ATTR, AWCACHE, 8, 4)
241
+ FIELD(ZDMA_CH_DATA_ATTR, AWQOS, 4, 4)
242
+ FIELD(ZDMA_CH_DATA_ATTR, AWLEN, 0, 4)
243
+REG32(ZDMA_CH_DSCR_ATTR, 0x124)
244
+ FIELD(ZDMA_CH_DSCR_ATTR, AXCOHRNT, 8, 1)
245
+ FIELD(ZDMA_CH_DSCR_ATTR, AXCACHE, 4, 4)
246
+ FIELD(ZDMA_CH_DSCR_ATTR, AXQOS, 0, 4)
247
+REG32(ZDMA_CH_SRC_DSCR_WORD0, 0x128)
248
+REG32(ZDMA_CH_SRC_DSCR_WORD1, 0x12c)
249
+ FIELD(ZDMA_CH_SRC_DSCR_WORD1, MSB, 0, 17)
250
+REG32(ZDMA_CH_SRC_DSCR_WORD2, 0x130)
251
+ FIELD(ZDMA_CH_SRC_DSCR_WORD2, SIZE, 0, 30)
252
+REG32(ZDMA_CH_SRC_DSCR_WORD3, 0x134)
253
+ FIELD(ZDMA_CH_SRC_DSCR_WORD3, CMD, 3, 2)
254
+ FIELD(ZDMA_CH_SRC_DSCR_WORD3, INTR, 2, 1)
255
+ FIELD(ZDMA_CH_SRC_DSCR_WORD3, TYPE, 1, 1)
256
+ FIELD(ZDMA_CH_SRC_DSCR_WORD3, COHRNT, 0, 1)
257
+REG32(ZDMA_CH_DST_DSCR_WORD0, 0x138)
258
+REG32(ZDMA_CH_DST_DSCR_WORD1, 0x13c)
259
+ FIELD(ZDMA_CH_DST_DSCR_WORD1, MSB, 0, 17)
260
+REG32(ZDMA_CH_DST_DSCR_WORD2, 0x140)
261
+ FIELD(ZDMA_CH_DST_DSCR_WORD2, SIZE, 0, 30)
262
+REG32(ZDMA_CH_DST_DSCR_WORD3, 0x144)
263
+ FIELD(ZDMA_CH_DST_DSCR_WORD3, INTR, 2, 1)
264
+ FIELD(ZDMA_CH_DST_DSCR_WORD3, TYPE, 1, 1)
265
+ FIELD(ZDMA_CH_DST_DSCR_WORD3, COHRNT, 0, 1)
266
+REG32(ZDMA_CH_WR_ONLY_WORD0, 0x148)
267
+REG32(ZDMA_CH_WR_ONLY_WORD1, 0x14c)
268
+REG32(ZDMA_CH_WR_ONLY_WORD2, 0x150)
269
+REG32(ZDMA_CH_WR_ONLY_WORD3, 0x154)
270
+REG32(ZDMA_CH_SRC_START_LSB, 0x158)
271
+REG32(ZDMA_CH_SRC_START_MSB, 0x15c)
272
+ FIELD(ZDMA_CH_SRC_START_MSB, ADDR, 0, 17)
273
+REG32(ZDMA_CH_DST_START_LSB, 0x160)
274
+REG32(ZDMA_CH_DST_START_MSB, 0x164)
275
+ FIELD(ZDMA_CH_DST_START_MSB, ADDR, 0, 17)
276
+REG32(ZDMA_CH_RATE_CTRL, 0x18c)
277
+ FIELD(ZDMA_CH_RATE_CTRL, CNT, 0, 12)
278
+REG32(ZDMA_CH_SRC_CUR_PYLD_LSB, 0x168)
279
+REG32(ZDMA_CH_SRC_CUR_PYLD_MSB, 0x16c)
280
+ FIELD(ZDMA_CH_SRC_CUR_PYLD_MSB, ADDR, 0, 17)
281
+REG32(ZDMA_CH_DST_CUR_PYLD_LSB, 0x170)
282
+REG32(ZDMA_CH_DST_CUR_PYLD_MSB, 0x174)
283
+ FIELD(ZDMA_CH_DST_CUR_PYLD_MSB, ADDR, 0, 17)
284
+REG32(ZDMA_CH_SRC_CUR_DSCR_LSB, 0x178)
285
+REG32(ZDMA_CH_SRC_CUR_DSCR_MSB, 0x17c)
286
+ FIELD(ZDMA_CH_SRC_CUR_DSCR_MSB, ADDR, 0, 17)
287
+REG32(ZDMA_CH_DST_CUR_DSCR_LSB, 0x180)
288
+REG32(ZDMA_CH_DST_CUR_DSCR_MSB, 0x184)
289
+ FIELD(ZDMA_CH_DST_CUR_DSCR_MSB, ADDR, 0, 17)
290
+REG32(ZDMA_CH_TOTAL_BYTE, 0x188)
291
+REG32(ZDMA_CH_RATE_CNTL, 0x18c)
292
+ FIELD(ZDMA_CH_RATE_CNTL, CNT, 0, 12)
293
+REG32(ZDMA_CH_IRQ_SRC_ACCT, 0x190)
294
+ FIELD(ZDMA_CH_IRQ_SRC_ACCT, CNT, 0, 8)
295
+REG32(ZDMA_CH_IRQ_DST_ACCT, 0x194)
296
+ FIELD(ZDMA_CH_IRQ_DST_ACCT, CNT, 0, 8)
297
+REG32(ZDMA_CH_DBG0, 0x198)
298
+ FIELD(ZDMA_CH_DBG0, CMN_BUF_FREE, 0, 9)
299
+REG32(ZDMA_CH_DBG1, 0x19c)
300
+ FIELD(ZDMA_CH_DBG1, CMN_BUF_OCC, 0, 9)
301
+REG32(ZDMA_CH_CTRL2, 0x200)
302
+ FIELD(ZDMA_CH_CTRL2, EN, 0, 1)
303
+
304
+enum {
305
+ PT_REG = 0,
306
+ PT_MEM = 1,
307
+};
308
+
309
+enum {
310
+ CMD_HALT = 1,
311
+ CMD_STOP = 2,
312
+};
313
+
314
+enum {
315
+ RW_MODE_RW = 0,
316
+ RW_MODE_WO = 1,
317
+ RW_MODE_RO = 2,
318
+};
319
+
320
+enum {
321
+ DTYPE_LINEAR = 0,
322
+ DTYPE_LINKED = 1,
323
+};
324
+
325
+enum {
326
+ AXI_BURST_FIXED = 0,
327
+ AXI_BURST_INCR = 1,
328
+};
329
+
330
+static void zdma_ch_imr_update_irq(XlnxZDMA *s)
331
+{
332
+ bool pending;
333
+
334
+ pending = s->regs[R_ZDMA_CH_ISR] & ~s->regs[R_ZDMA_CH_IMR];
335
+
336
+ qemu_set_irq(s->irq_zdma_ch_imr, pending);
337
+}
338
+
339
+static void zdma_ch_isr_postw(RegisterInfo *reg, uint64_t val64)
340
+{
341
+ XlnxZDMA *s = XLNX_ZDMA(reg->opaque);
342
+ zdma_ch_imr_update_irq(s);
343
+}
344
+
345
+static uint64_t zdma_ch_ien_prew(RegisterInfo *reg, uint64_t val64)
346
+{
347
+ XlnxZDMA *s = XLNX_ZDMA(reg->opaque);
348
+ uint32_t val = val64;
349
+
350
+ s->regs[R_ZDMA_CH_IMR] &= ~val;
351
+ zdma_ch_imr_update_irq(s);
352
+ return 0;
353
+}
354
+
355
+static uint64_t zdma_ch_ids_prew(RegisterInfo *reg, uint64_t val64)
356
+{
357
+ XlnxZDMA *s = XLNX_ZDMA(reg->opaque);
358
+ uint32_t val = val64;
359
+
360
+ s->regs[R_ZDMA_CH_IMR] |= val;
361
+ zdma_ch_imr_update_irq(s);
362
+ return 0;
363
+}
364
+
365
+static void zdma_set_state(XlnxZDMA *s, XlnxZDMAState state)
366
+{
367
+ s->state = state;
368
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_STATUS, STATE, state);
369
+
370
+ /* Signal error if we have an error condition. */
371
+ if (s->error) {
372
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_STATUS, STATE, 3);
373
+ }
374
+}
375
+
376
+static void zdma_src_done(XlnxZDMA *s)
377
+{
378
+ unsigned int cnt;
379
+ cnt = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_IRQ_SRC_ACCT, CNT);
380
+ cnt++;
381
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_IRQ_SRC_ACCT, CNT, cnt);
382
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, SRC_DSCR_DONE, true);
383
+
384
+ /* Did we overflow? */
385
+ if (cnt != ARRAY_FIELD_EX32(s->regs, ZDMA_CH_IRQ_SRC_ACCT, CNT)) {
386
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, IRQ_SRC_ACCT_ERR, true);
387
+ }
388
+ zdma_ch_imr_update_irq(s);
389
+}
390
+
391
+static void zdma_dst_done(XlnxZDMA *s)
392
+{
393
+ unsigned int cnt;
394
+ cnt = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_IRQ_DST_ACCT, CNT);
395
+ cnt++;
396
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_IRQ_DST_ACCT, CNT, cnt);
397
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, DST_DSCR_DONE, true);
398
+
399
+ /* Did we overflow? */
400
+ if (cnt != ARRAY_FIELD_EX32(s->regs, ZDMA_CH_IRQ_DST_ACCT, CNT)) {
401
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, IRQ_DST_ACCT_ERR, true);
402
+ }
403
+ zdma_ch_imr_update_irq(s);
404
+}
405
+
406
+static uint64_t zdma_get_regaddr64(XlnxZDMA *s, unsigned int basereg)
407
+{
408
+ uint64_t addr;
409
+
410
+ addr = s->regs[basereg + 1];
411
+ addr <<= 32;
412
+ addr |= s->regs[basereg];
413
+
414
+ return addr;
415
+}
416
+
417
+static void zdma_put_regaddr64(XlnxZDMA *s, unsigned int basereg, uint64_t addr)
418
+{
419
+ s->regs[basereg] = addr;
420
+ s->regs[basereg + 1] = addr >> 32;
421
+}
422
+
423
+static bool zdma_load_descriptor(XlnxZDMA *s, uint64_t addr, void *buf)
424
+{
425
+ /* ZDMA descriptors must be aligned to their own size. */
426
+ if (addr % sizeof(XlnxZDMADescr)) {
427
+ qemu_log_mask(LOG_GUEST_ERROR,
428
+ "zdma: unaligned descriptor at %" PRIx64,
429
+ addr);
430
+ memset(buf, 0xdeadbeef, sizeof(XlnxZDMADescr));
431
+ s->error = true;
432
+ return false;
433
+ }
434
+
435
+ address_space_rw(s->dma_as, addr, s->attr,
436
+ buf, sizeof(XlnxZDMADescr), false);
437
+ return true;
438
+}
439
+
440
+static void zdma_load_src_descriptor(XlnxZDMA *s)
441
+{
442
+ uint64_t src_addr;
443
+ unsigned int ptype = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, POINT_TYPE);
444
+
445
+ if (ptype == PT_REG) {
446
+ memcpy(&s->dsc_src, &s->regs[R_ZDMA_CH_SRC_DSCR_WORD0],
447
+ sizeof(s->dsc_src));
448
+ return;
449
+ }
450
+
451
+ src_addr = zdma_get_regaddr64(s, R_ZDMA_CH_SRC_CUR_DSCR_LSB);
452
+
453
+ if (!zdma_load_descriptor(s, src_addr, &s->dsc_src)) {
454
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, AXI_RD_SRC_DSCR, true);
455
+ }
456
+}
457
+
458
+static void zdma_load_dst_descriptor(XlnxZDMA *s)
459
+{
460
+ uint64_t dst_addr;
461
+ unsigned int ptype = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, POINT_TYPE);
462
+
463
+ if (ptype == PT_REG) {
464
+ memcpy(&s->dsc_dst, &s->regs[R_ZDMA_CH_DST_DSCR_WORD0],
465
+ sizeof(s->dsc_dst));
466
+ return;
467
+ }
468
+
469
+ dst_addr = zdma_get_regaddr64(s, R_ZDMA_CH_DST_CUR_DSCR_LSB);
470
+
471
+ if (!zdma_load_descriptor(s, dst_addr, &s->dsc_dst)) {
472
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, AXI_RD_DST_DSCR, true);
473
+ }
474
+}
475
+
476
+static uint64_t zdma_update_descr_addr(XlnxZDMA *s, bool type,
477
+ unsigned int basereg)
478
+{
479
+ uint64_t addr, next;
480
+
481
+ if (type == DTYPE_LINEAR) {
482
+ next = zdma_get_regaddr64(s, basereg);
483
+ next += sizeof(s->dsc_dst);
484
+ zdma_put_regaddr64(s, basereg, next);
485
+ } else {
486
+ addr = zdma_get_regaddr64(s, basereg);
487
+ addr += sizeof(s->dsc_dst);
488
+ address_space_rw(s->dma_as, addr, s->attr, (void *) &next, 8, false);
489
+ zdma_put_regaddr64(s, basereg, next);
490
+ }
491
+ return next;
492
+}
493
+
494
+static void zdma_write_dst(XlnxZDMA *s, uint8_t *buf, uint32_t len)
495
+{
496
+ uint32_t dst_size, dlen;
497
+ bool dst_intr, dst_type;
498
+ unsigned int ptype = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, POINT_TYPE);
499
+ unsigned int rw_mode = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, MODE);
500
+ unsigned int burst_type = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_DATA_ATTR,
501
+ AWBURST);
502
+
503
+ /* FIXED burst types are only supported in simple dma mode. */
504
+ if (ptype != PT_REG) {
505
+ burst_type = AXI_BURST_INCR;
506
+ }
507
+
508
+ while (len) {
509
+ dst_size = FIELD_EX32(s->dsc_dst.words[2], ZDMA_CH_DST_DSCR_WORD2,
510
+ SIZE);
511
+ dst_type = FIELD_EX32(s->dsc_dst.words[3], ZDMA_CH_DST_DSCR_WORD3,
512
+ TYPE);
513
+ if (dst_size == 0 && ptype == PT_MEM) {
514
+ uint64_t next;
515
+ next = zdma_update_descr_addr(s, dst_type,
516
+ R_ZDMA_CH_DST_CUR_DSCR_LSB);
517
+ zdma_load_descriptor(s, next, &s->dsc_dst);
518
+ dst_size = FIELD_EX32(s->dsc_dst.words[2], ZDMA_CH_DST_DSCR_WORD2,
519
+ SIZE);
520
+ dst_type = FIELD_EX32(s->dsc_dst.words[3], ZDMA_CH_DST_DSCR_WORD3,
521
+ TYPE);
522
+ }
523
+
524
+ /* Match what hardware does by ignoring the dst_size and only using
525
+ * the src size for Simple register mode. */
526
+ if (ptype == PT_REG && rw_mode != RW_MODE_WO) {
527
+ dst_size = len;
528
+ }
529
+
530
+ dst_intr = FIELD_EX32(s->dsc_dst.words[3], ZDMA_CH_DST_DSCR_WORD3,
531
+ INTR);
532
+
533
+ dlen = len > dst_size ? dst_size : len;
534
+ if (burst_type == AXI_BURST_FIXED) {
535
+ if (dlen > (s->cfg.bus_width / 8)) {
536
+ dlen = s->cfg.bus_width / 8;
537
+ }
538
+ }
539
+
540
+ address_space_rw(s->dma_as, s->dsc_dst.addr, s->attr, buf, dlen,
541
+ true);
542
+ if (burst_type == AXI_BURST_INCR) {
543
+ s->dsc_dst.addr += dlen;
544
+ }
545
+ dst_size -= dlen;
546
+ buf += dlen;
547
+ len -= dlen;
548
+
549
+ if (dst_size == 0 && dst_intr) {
550
+ zdma_dst_done(s);
551
+ }
552
+
553
+ /* Write back to buffered descriptor. */
554
+ s->dsc_dst.words[2] = FIELD_DP32(s->dsc_dst.words[2],
555
+ ZDMA_CH_DST_DSCR_WORD2,
556
+ SIZE,
557
+ dst_size);
558
+ }
559
+}
560
+
561
+static void zdma_process_descr(XlnxZDMA *s)
562
+{
563
+ uint64_t src_addr;
564
+ uint32_t src_size, len;
565
+ unsigned int src_cmd;
566
+ bool src_intr, src_type;
567
+ unsigned int ptype = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, POINT_TYPE);
568
+ unsigned int rw_mode = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, MODE);
569
+ unsigned int burst_type = ARRAY_FIELD_EX32(s->regs, ZDMA_CH_DATA_ATTR,
570
+ ARBURST);
571
+
572
+ src_addr = s->dsc_src.addr;
573
+ src_size = FIELD_EX32(s->dsc_src.words[2], ZDMA_CH_SRC_DSCR_WORD2, SIZE);
574
+ src_cmd = FIELD_EX32(s->dsc_src.words[3], ZDMA_CH_SRC_DSCR_WORD3, CMD);
575
+ src_type = FIELD_EX32(s->dsc_src.words[3], ZDMA_CH_SRC_DSCR_WORD3, TYPE);
576
+ src_intr = FIELD_EX32(s->dsc_src.words[3], ZDMA_CH_SRC_DSCR_WORD3, INTR);
577
+
578
+ /* FIXED burst types and non-rw modes are only supported in
579
+ * simple dma mode.
580
+ */
581
+ if (ptype != PT_REG) {
582
+ if (rw_mode != RW_MODE_RW) {
583
+ qemu_log_mask(LOG_GUEST_ERROR,
584
+ "zDMA: rw-mode=%d but not simple DMA mode.\n",
585
+ rw_mode);
586
+ }
587
+ if (burst_type != AXI_BURST_INCR) {
588
+ qemu_log_mask(LOG_GUEST_ERROR,
589
+ "zDMA: burst_type=%d but not simple DMA mode.\n",
590
+ burst_type);
591
+ }
592
+ burst_type = AXI_BURST_INCR;
593
+ rw_mode = RW_MODE_RW;
594
+ }
595
+
596
+ if (rw_mode == RW_MODE_WO) {
597
+ /* In Simple DMA Write-Only, we need to push DST size bytes
598
+ * regardless of what SRC size is set to. */
599
+ src_size = FIELD_EX32(s->dsc_dst.words[2], ZDMA_CH_DST_DSCR_WORD2,
600
+ SIZE);
601
+ memcpy(s->buf, &s->regs[R_ZDMA_CH_WR_ONLY_WORD0], s->cfg.bus_width / 8);
602
+ }
603
+
604
+ while (src_size) {
605
+ len = src_size > ARRAY_SIZE(s->buf) ? ARRAY_SIZE(s->buf) : src_size;
606
+ if (burst_type == AXI_BURST_FIXED) {
607
+ if (len > (s->cfg.bus_width / 8)) {
608
+ len = s->cfg.bus_width / 8;
609
+ }
610
+ }
611
+
612
+ if (rw_mode == RW_MODE_WO) {
613
+ if (len > s->cfg.bus_width / 8) {
614
+ len = s->cfg.bus_width / 8;
615
+ }
616
+ } else {
617
+ address_space_rw(s->dma_as, src_addr, s->attr, s->buf, len,
618
+ false);
619
+ if (burst_type == AXI_BURST_INCR) {
620
+ src_addr += len;
621
+ }
622
+ }
623
+
624
+ if (rw_mode != RW_MODE_RO) {
625
+ zdma_write_dst(s, s->buf, len);
626
+ }
627
+
628
+ s->regs[R_ZDMA_CH_TOTAL_BYTE] += len;
629
+ src_size -= len;
630
+ }
631
+
632
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, DMA_DONE, true);
633
+
634
+ if (src_intr) {
635
+ zdma_src_done(s);
636
+ }
637
+
638
+ /* Load next descriptor. */
639
+ if (ptype == PT_REG || src_cmd == CMD_STOP) {
640
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_CTRL2, EN, 0);
641
+ zdma_set_state(s, DISABLED);
642
+ return;
643
+ }
644
+
645
+ if (src_cmd == CMD_HALT) {
646
+ zdma_set_state(s, PAUSED);
647
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, DMA_PAUSE, 1);
648
+ zdma_ch_imr_update_irq(s);
649
+ return;
650
+ }
651
+
652
+ zdma_update_descr_addr(s, src_type, R_ZDMA_CH_SRC_CUR_DSCR_LSB);
653
+}
654
+
655
+static void zdma_run(XlnxZDMA *s)
656
+{
657
+ while (s->state == ENABLED && !s->error) {
658
+ zdma_load_src_descriptor(s);
659
+
660
+ if (s->error) {
661
+ zdma_set_state(s, DISABLED);
662
+ } else {
663
+ zdma_process_descr(s);
664
+ }
665
+ }
666
+
667
+ zdma_ch_imr_update_irq(s);
668
+}
669
+
670
+static void zdma_update_descr_addr_from_start(XlnxZDMA *s)
671
+{
672
+ uint64_t src_addr, dst_addr;
673
+
674
+ src_addr = zdma_get_regaddr64(s, R_ZDMA_CH_SRC_START_LSB);
675
+ zdma_put_regaddr64(s, R_ZDMA_CH_SRC_CUR_DSCR_LSB, src_addr);
676
+ dst_addr = zdma_get_regaddr64(s, R_ZDMA_CH_DST_START_LSB);
677
+ zdma_put_regaddr64(s, R_ZDMA_CH_DST_CUR_DSCR_LSB, dst_addr);
678
+ zdma_load_dst_descriptor(s);
679
+}
680
+
681
+static void zdma_ch_ctrlx_postw(RegisterInfo *reg, uint64_t val64)
682
+{
683
+ XlnxZDMA *s = XLNX_ZDMA(reg->opaque);
684
+
685
+ if (ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL2, EN)) {
686
+ s->error = false;
687
+
688
+ if (s->state == PAUSED &&
689
+ ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, CONT)) {
690
+ if (ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, CONT_ADDR) == 1) {
691
+ zdma_update_descr_addr_from_start(s);
692
+ } else {
693
+ bool src_type = FIELD_EX32(s->dsc_src.words[3],
694
+ ZDMA_CH_SRC_DSCR_WORD3, TYPE);
695
+ zdma_update_descr_addr(s, src_type,
696
+ R_ZDMA_CH_SRC_CUR_DSCR_LSB);
697
+ }
698
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_CTRL0, CONT, false);
699
+ zdma_set_state(s, ENABLED);
700
+ } else if (s->state == DISABLED) {
701
+ zdma_update_descr_addr_from_start(s);
702
+ zdma_set_state(s, ENABLED);
703
+ }
704
+ } else {
705
+ /* Leave Paused state? */
706
+ if (s->state == PAUSED &&
707
+ ARRAY_FIELD_EX32(s->regs, ZDMA_CH_CTRL0, CONT)) {
708
+ zdma_set_state(s, DISABLED);
709
+ }
710
+ }
711
+
712
+ zdma_run(s);
713
+}
714
+
715
+static RegisterAccessInfo zdma_regs_info[] = {
716
+ { .name = "ZDMA_ERR_CTRL", .addr = A_ZDMA_ERR_CTRL,
717
+ .rsvd = 0xfffffffe,
718
+ },{ .name = "ZDMA_CH_ISR", .addr = A_ZDMA_CH_ISR,
719
+ .rsvd = 0xfffff000,
720
+ .w1c = 0xfff,
721
+ .post_write = zdma_ch_isr_postw,
722
+ },{ .name = "ZDMA_CH_IMR", .addr = A_ZDMA_CH_IMR,
723
+ .reset = 0xfff,
724
+ .rsvd = 0xfffff000,
725
+ .ro = 0xfff,
726
+ },{ .name = "ZDMA_CH_IEN", .addr = A_ZDMA_CH_IEN,
727
+ .rsvd = 0xfffff000,
728
+ .pre_write = zdma_ch_ien_prew,
729
+ },{ .name = "ZDMA_CH_IDS", .addr = A_ZDMA_CH_IDS,
730
+ .rsvd = 0xfffff000,
731
+ .pre_write = zdma_ch_ids_prew,
732
+ },{ .name = "ZDMA_CH_CTRL0", .addr = A_ZDMA_CH_CTRL0,
733
+ .reset = 0x80,
734
+ .rsvd = 0xffffff01,
735
+ .post_write = zdma_ch_ctrlx_postw,
736
+ },{ .name = "ZDMA_CH_CTRL1", .addr = A_ZDMA_CH_CTRL1,
737
+ .reset = 0x3ff,
738
+ .rsvd = 0xfffffc00,
739
+ },{ .name = "ZDMA_CH_FCI", .addr = A_ZDMA_CH_FCI,
740
+ .rsvd = 0xffffffc0,
741
+ },{ .name = "ZDMA_CH_STATUS", .addr = A_ZDMA_CH_STATUS,
742
+ .rsvd = 0xfffffffc,
743
+ .ro = 0x3,
744
+ },{ .name = "ZDMA_CH_DATA_ATTR", .addr = A_ZDMA_CH_DATA_ATTR,
745
+ .reset = 0x483d20f,
746
+ .rsvd = 0xf0000000,
747
+ },{ .name = "ZDMA_CH_DSCR_ATTR", .addr = A_ZDMA_CH_DSCR_ATTR,
748
+ .rsvd = 0xfffffe00,
749
+ },{ .name = "ZDMA_CH_SRC_DSCR_WORD0", .addr = A_ZDMA_CH_SRC_DSCR_WORD0,
750
+ },{ .name = "ZDMA_CH_SRC_DSCR_WORD1", .addr = A_ZDMA_CH_SRC_DSCR_WORD1,
751
+ .rsvd = 0xfffe0000,
752
+ },{ .name = "ZDMA_CH_SRC_DSCR_WORD2", .addr = A_ZDMA_CH_SRC_DSCR_WORD2,
753
+ .rsvd = 0xc0000000,
754
+ },{ .name = "ZDMA_CH_SRC_DSCR_WORD3", .addr = A_ZDMA_CH_SRC_DSCR_WORD3,
755
+ .rsvd = 0xffffffe0,
756
+ },{ .name = "ZDMA_CH_DST_DSCR_WORD0", .addr = A_ZDMA_CH_DST_DSCR_WORD0,
757
+ },{ .name = "ZDMA_CH_DST_DSCR_WORD1", .addr = A_ZDMA_CH_DST_DSCR_WORD1,
758
+ .rsvd = 0xfffe0000,
759
+ },{ .name = "ZDMA_CH_DST_DSCR_WORD2", .addr = A_ZDMA_CH_DST_DSCR_WORD2,
760
+ .rsvd = 0xc0000000,
761
+ },{ .name = "ZDMA_CH_DST_DSCR_WORD3", .addr = A_ZDMA_CH_DST_DSCR_WORD3,
762
+ .rsvd = 0xfffffffa,
763
+ },{ .name = "ZDMA_CH_WR_ONLY_WORD0", .addr = A_ZDMA_CH_WR_ONLY_WORD0,
764
+ },{ .name = "ZDMA_CH_WR_ONLY_WORD1", .addr = A_ZDMA_CH_WR_ONLY_WORD1,
765
+ },{ .name = "ZDMA_CH_WR_ONLY_WORD2", .addr = A_ZDMA_CH_WR_ONLY_WORD2,
766
+ },{ .name = "ZDMA_CH_WR_ONLY_WORD3", .addr = A_ZDMA_CH_WR_ONLY_WORD3,
767
+ },{ .name = "ZDMA_CH_SRC_START_LSB", .addr = A_ZDMA_CH_SRC_START_LSB,
768
+ },{ .name = "ZDMA_CH_SRC_START_MSB", .addr = A_ZDMA_CH_SRC_START_MSB,
769
+ .rsvd = 0xfffe0000,
770
+ },{ .name = "ZDMA_CH_DST_START_LSB", .addr = A_ZDMA_CH_DST_START_LSB,
771
+ },{ .name = "ZDMA_CH_DST_START_MSB", .addr = A_ZDMA_CH_DST_START_MSB,
772
+ .rsvd = 0xfffe0000,
773
+ },{ .name = "ZDMA_CH_SRC_CUR_PYLD_LSB", .addr = A_ZDMA_CH_SRC_CUR_PYLD_LSB,
774
+ .ro = 0xffffffff,
775
+ },{ .name = "ZDMA_CH_SRC_CUR_PYLD_MSB", .addr = A_ZDMA_CH_SRC_CUR_PYLD_MSB,
776
+ .rsvd = 0xfffe0000,
777
+ .ro = 0x1ffff,
778
+ },{ .name = "ZDMA_CH_DST_CUR_PYLD_LSB", .addr = A_ZDMA_CH_DST_CUR_PYLD_LSB,
779
+ .ro = 0xffffffff,
780
+ },{ .name = "ZDMA_CH_DST_CUR_PYLD_MSB", .addr = A_ZDMA_CH_DST_CUR_PYLD_MSB,
781
+ .rsvd = 0xfffe0000,
782
+ .ro = 0x1ffff,
783
+ },{ .name = "ZDMA_CH_SRC_CUR_DSCR_LSB", .addr = A_ZDMA_CH_SRC_CUR_DSCR_LSB,
784
+ .ro = 0xffffffff,
785
+ },{ .name = "ZDMA_CH_SRC_CUR_DSCR_MSB", .addr = A_ZDMA_CH_SRC_CUR_DSCR_MSB,
786
+ .rsvd = 0xfffe0000,
787
+ .ro = 0x1ffff,
788
+ },{ .name = "ZDMA_CH_DST_CUR_DSCR_LSB", .addr = A_ZDMA_CH_DST_CUR_DSCR_LSB,
789
+ .ro = 0xffffffff,
790
+ },{ .name = "ZDMA_CH_DST_CUR_DSCR_MSB", .addr = A_ZDMA_CH_DST_CUR_DSCR_MSB,
791
+ .rsvd = 0xfffe0000,
792
+ .ro = 0x1ffff,
793
+ },{ .name = "ZDMA_CH_TOTAL_BYTE", .addr = A_ZDMA_CH_TOTAL_BYTE,
794
+ .w1c = 0xffffffff,
795
+ },{ .name = "ZDMA_CH_RATE_CNTL", .addr = A_ZDMA_CH_RATE_CNTL,
796
+ .rsvd = 0xfffff000,
797
+ },{ .name = "ZDMA_CH_IRQ_SRC_ACCT", .addr = A_ZDMA_CH_IRQ_SRC_ACCT,
798
+ .rsvd = 0xffffff00,
799
+ .ro = 0xff,
800
+ .cor = 0xff,
801
+ },{ .name = "ZDMA_CH_IRQ_DST_ACCT", .addr = A_ZDMA_CH_IRQ_DST_ACCT,
802
+ .rsvd = 0xffffff00,
803
+ .ro = 0xff,
804
+ .cor = 0xff,
805
+ },{ .name = "ZDMA_CH_DBG0", .addr = A_ZDMA_CH_DBG0,
806
+ .rsvd = 0xfffffe00,
807
+ .ro = 0x1ff,
808
+ },{ .name = "ZDMA_CH_DBG1", .addr = A_ZDMA_CH_DBG1,
809
+ .rsvd = 0xfffffe00,
810
+ .ro = 0x1ff,
811
+ },{ .name = "ZDMA_CH_CTRL2", .addr = A_ZDMA_CH_CTRL2,
812
+ .rsvd = 0xfffffffe,
813
+ .post_write = zdma_ch_ctrlx_postw,
814
+ }
815
+};
816
+
817
+static void zdma_reset(DeviceState *dev)
818
+{
819
+ XlnxZDMA *s = XLNX_ZDMA(dev);
820
+ unsigned int i;
821
+
822
+ for (i = 0; i < ARRAY_SIZE(s->regs_info); ++i) {
823
+ register_reset(&s->regs_info[i]);
824
+ }
825
+
826
+ zdma_ch_imr_update_irq(s);
827
+}
828
+
829
+static uint64_t zdma_read(void *opaque, hwaddr addr, unsigned size)
830
+{
831
+ XlnxZDMA *s = XLNX_ZDMA(opaque);
832
+ RegisterInfo *r = &s->regs_info[addr / 4];
833
+
834
+ if (!r->data) {
835
+ qemu_log("%s: Decode error: read from %" HWADDR_PRIx "\n",
836
+ object_get_canonical_path(OBJECT(s)),
837
+ addr);
838
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, INV_APB, true);
839
+ zdma_ch_imr_update_irq(s);
840
+ return 0;
841
+ }
842
+ return register_read(r, ~0, NULL, false);
843
+}
844
+
845
+static void zdma_write(void *opaque, hwaddr addr, uint64_t value,
846
+ unsigned size)
847
+{
848
+ XlnxZDMA *s = XLNX_ZDMA(opaque);
849
+ RegisterInfo *r = &s->regs_info[addr / 4];
850
+
851
+ if (!r->data) {
852
+ qemu_log("%s: Decode error: write to %" HWADDR_PRIx "=%" PRIx64 "\n",
853
+ object_get_canonical_path(OBJECT(s)),
854
+ addr, value);
855
+ ARRAY_FIELD_DP32(s->regs, ZDMA_CH_ISR, INV_APB, true);
856
+ zdma_ch_imr_update_irq(s);
857
+ return;
858
+ }
859
+ register_write(r, value, ~0, NULL, false);
860
+}
861
+
862
+static const MemoryRegionOps zdma_ops = {
863
+ .read = zdma_read,
864
+ .write = zdma_write,
865
+ .endianness = DEVICE_LITTLE_ENDIAN,
866
+ .valid = {
867
+ .min_access_size = 4,
868
+ .max_access_size = 4,
869
+ },
870
+};
871
+
872
+static void zdma_realize(DeviceState *dev, Error **errp)
873
+{
874
+ XlnxZDMA *s = XLNX_ZDMA(dev);
875
+ unsigned int i;
876
+
877
+ for (i = 0; i < ARRAY_SIZE(zdma_regs_info); ++i) {
878
+ RegisterInfo *r = &s->regs_info[zdma_regs_info[i].addr / 4];
879
+
880
+ *r = (RegisterInfo) {
881
+ .data = (uint8_t *)&s->regs[
882
+ zdma_regs_info[i].addr / 4],
883
+ .data_size = sizeof(uint32_t),
884
+ .access = &zdma_regs_info[i],
885
+ .opaque = s,
886
+ };
887
+ }
888
+
889
+ if (s->dma_mr) {
890
+ s->dma_as = g_malloc0(sizeof(AddressSpace));
891
+ address_space_init(s->dma_as, s->dma_mr, NULL);
892
+ } else {
893
+ s->dma_as = &address_space_memory;
894
+ }
895
+ s->attr = MEMTXATTRS_UNSPECIFIED;
896
+}
897
+
898
+static void zdma_init(Object *obj)
899
+{
900
+ XlnxZDMA *s = XLNX_ZDMA(obj);
901
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
902
+
903
+ memory_region_init_io(&s->iomem, obj, &zdma_ops, s,
904
+ TYPE_XLNX_ZDMA, ZDMA_R_MAX * 4);
905
+ sysbus_init_mmio(sbd, &s->iomem);
906
+ sysbus_init_irq(sbd, &s->irq_zdma_ch_imr);
907
+
908
+ object_property_add_link(obj, "dma", TYPE_MEMORY_REGION,
909
+ (Object **)&s->dma_mr,
910
+ qdev_prop_allow_set_link_before_realize,
911
+ OBJ_PROP_LINK_UNREF_ON_RELEASE,
912
+ &error_abort);
913
+}
914
+
915
+static const VMStateDescription vmstate_zdma = {
916
+ .name = TYPE_XLNX_ZDMA,
917
+ .version_id = 1,
918
+ .minimum_version_id = 1,
919
+ .minimum_version_id_old = 1,
920
+ .fields = (VMStateField[]) {
921
+ VMSTATE_UINT32_ARRAY(regs, XlnxZDMA, ZDMA_R_MAX),
922
+ VMSTATE_UINT32(state, XlnxZDMA),
923
+ VMSTATE_UINT32_ARRAY(dsc_src.words, XlnxZDMA, 4),
924
+ VMSTATE_UINT32_ARRAY(dsc_dst.words, XlnxZDMA, 4),
925
+ VMSTATE_END_OF_LIST(),
926
+ }
927
+};
928
+
929
+static Property zdma_props[] = {
930
+ DEFINE_PROP_UINT32("bus-width", XlnxZDMA, cfg.bus_width, 64),
931
+ DEFINE_PROP_END_OF_LIST(),
932
+};
933
+
934
+static void zdma_class_init(ObjectClass *klass, void *data)
935
+{
936
+ DeviceClass *dc = DEVICE_CLASS(klass);
937
+
938
+ dc->reset = zdma_reset;
939
+ dc->realize = zdma_realize;
940
+ dc->props = zdma_props;
941
+ dc->vmsd = &vmstate_zdma;
942
+}
943
+
944
+static const TypeInfo zdma_info = {
945
+ .name = TYPE_XLNX_ZDMA,
946
+ .parent = TYPE_SYS_BUS_DEVICE,
947
+ .instance_size = sizeof(XlnxZDMA),
948
+ .class_init = zdma_class_init,
949
+ .instance_init = zdma_init,
950
+};
951
+
952
+static void zdma_register_types(void)
953
+{
954
+ type_register_static(&zdma_info);
955
+}
956
+
957
+type_init(zdma_register_types)
958
--
959
2.17.0
960
961
Deleted patch
1
From: Francisco Iglesias <frasse.iglesias@gmail.com>
2
1
3
The ZynqMP contains two instances of a generic DMA, the GDMA, located in the
4
FPD (full power domain), and the ADMA, located in LPD (low power domain). This
5
patch adds these two DMAs to the ZynqMP board.
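
For orientation, the address and interrupt tables added by this patch follow
a regular layout: each DMA has eight channels, each channel occupies its own
0x10000-byte window, and the interrupt numbers are consecutive. A minimal
sketch of that layout, using the values from the tables in the diff; the
macro and helper names here are hypothetical and do not appear in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from gdma_ch_addr/adma_ch_addr and gdma_ch_intr/adma_ch_intr
 * in the patch; the names are illustrative only. */
#define ZDMA_NUM_CH     8
#define ZDMA_CH_STRIDE  0x10000u
#define GDMA_CH0_ADDR   0xFD500000u
#define ADMA_CH0_ADDR   0xFFA80000u
#define GDMA_CH0_INTR   124
#define ADMA_CH0_INTR   77

/* MMIO base address of channel 'ch', given the address of channel 0. */
static uint64_t zdma_ch_addr(uint64_t ch0_addr, unsigned ch)
{
    assert(ch < ZDMA_NUM_CH);
    return ch0_addr + (uint64_t)ch * ZDMA_CH_STRIDE;
}

/* Interrupt number of channel 'ch', given the interrupt of channel 0. */
static int zdma_ch_intr(int ch0_intr, unsigned ch)
{
    assert(ch < ZDMA_NUM_CH);
    return ch0_intr + (int)ch;
}
```

The patch keeps explicit tables rather than computing these values, which
makes it easy to compare them against the hardware documentation line by
line.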
6
7
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
8
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
9
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
10
Message-id: 20180503214201.29082-3-frasse.iglesias@gmail.com
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
include/hw/arm/xlnx-zynqmp.h | 5 ++++
14
hw/arm/xlnx-zynqmp.c | 53 ++++++++++++++++++++++++++++++++++++
15
2 files changed, 58 insertions(+)
16
17
diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/include/hw/arm/xlnx-zynqmp.h
20
+++ b/include/hw/arm/xlnx-zynqmp.h
21
@@ -XXX,XX +XXX,XX @@
22
#include "hw/sd/sdhci.h"
23
#include "hw/ssi/xilinx_spips.h"
24
#include "hw/dma/xlnx_dpdma.h"
25
+#include "hw/dma/xlnx-zdma.h"
26
#include "hw/display/xlnx_dp.h"
27
#include "hw/intc/xlnx-zynqmp-ipi.h"
28
#include "hw/timer/xlnx-zynqmp-rtc.h"
29
@@ -XXX,XX +XXX,XX @@
30
#define XLNX_ZYNQMP_NUM_UARTS 2
31
#define XLNX_ZYNQMP_NUM_SDHCI 2
32
#define XLNX_ZYNQMP_NUM_SPIS 2
33
+#define XLNX_ZYNQMP_NUM_GDMA_CH 8
34
+#define XLNX_ZYNQMP_NUM_ADMA_CH 8
35
36
#define XLNX_ZYNQMP_NUM_QSPI_BUS 2
37
#define XLNX_ZYNQMP_NUM_QSPI_BUS_CS 2
38
@@ -XXX,XX +XXX,XX @@ typedef struct XlnxZynqMPState {
39
XlnxDPDMAState dpdma;
40
XlnxZynqMPIPI ipi;
41
XlnxZynqMPRTC rtc;
42
+ XlnxZDMA gdma[XLNX_ZYNQMP_NUM_GDMA_CH];
43
+ XlnxZDMA adma[XLNX_ZYNQMP_NUM_ADMA_CH];
44
45
char *boot_cpu;
46
ARMCPU *boot_cpu_ptr;
47
diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/hw/arm/xlnx-zynqmp.c
50
+++ b/hw/arm/xlnx-zynqmp.c
51
@@ -XXX,XX +XXX,XX @@ static const int spi_intr[XLNX_ZYNQMP_NUM_SPIS] = {
52
19, 20,
53
};
54
55
+static const uint64_t gdma_ch_addr[XLNX_ZYNQMP_NUM_GDMA_CH] = {
56
+ 0xFD500000, 0xFD510000, 0xFD520000, 0xFD530000,
57
+ 0xFD540000, 0xFD550000, 0xFD560000, 0xFD570000
58
+};
59
+
60
+static const int gdma_ch_intr[XLNX_ZYNQMP_NUM_GDMA_CH] = {
61
+ 124, 125, 126, 127, 128, 129, 130, 131
62
+};
63
+
64
+static const uint64_t adma_ch_addr[XLNX_ZYNQMP_NUM_ADMA_CH] = {
65
+ 0xFFA80000, 0xFFA90000, 0xFFAA0000, 0xFFAB0000,
66
+ 0xFFAC0000, 0xFFAD0000, 0xFFAE0000, 0xFFAF0000
67
+};
68
+
69
+static const int adma_ch_intr[XLNX_ZYNQMP_NUM_ADMA_CH] = {
70
+ 77, 78, 79, 80, 81, 82, 83, 84
71
+};
72
+
73
typedef struct XlnxZynqMPGICRegion {
74
int region_index;
75
uint32_t address;
76
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_init(Object *obj)
77
78
object_initialize(&s->rtc, sizeof(s->rtc), TYPE_XLNX_ZYNQMP_RTC);
79
qdev_set_parent_bus(DEVICE(&s->rtc), sysbus_get_default());
80
+
81
+ for (i = 0; i < XLNX_ZYNQMP_NUM_GDMA_CH; i++) {
82
+ object_initialize(&s->gdma[i], sizeof(s->gdma[i]), TYPE_XLNX_ZDMA);
83
+ qdev_set_parent_bus(DEVICE(&s->gdma[i]), sysbus_get_default());
84
+ }
85
+
86
+ for (i = 0; i < XLNX_ZYNQMP_NUM_ADMA_CH; i++) {
87
+ object_initialize(&s->adma[i], sizeof(s->adma[i]), TYPE_XLNX_ZDMA);
88
+ qdev_set_parent_bus(DEVICE(&s->adma[i]), sysbus_get_default());
89
+ }
90
}
91
92
static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
93
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
94
}
95
sysbus_mmio_map(SYS_BUS_DEVICE(&s->rtc), 0, RTC_ADDR);
96
sysbus_connect_irq(SYS_BUS_DEVICE(&s->rtc), 0, gic_spi[RTC_IRQ]);
97
+
98
+ for (i = 0; i < XLNX_ZYNQMP_NUM_GDMA_CH; i++) {
99
+ object_property_set_uint(OBJECT(&s->gdma[i]), 128, "bus-width", &err);
100
+ object_property_set_bool(OBJECT(&s->gdma[i]), true, "realized", &err);
101
+ if (err) {
102
+ error_propagate(errp, err);
103
+ return;
104
+ }
105
+
106
+ sysbus_mmio_map(SYS_BUS_DEVICE(&s->gdma[i]), 0, gdma_ch_addr[i]);
107
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->gdma[i]), 0,
108
+ gic_spi[gdma_ch_intr[i]]);
109
+ }
110
+
111
+ for (i = 0; i < XLNX_ZYNQMP_NUM_ADMA_CH; i++) {
112
+ object_property_set_bool(OBJECT(&s->adma[i]), true, "realized", &err);
113
+ if (err) {
114
+ error_propagate(errp, err);
115
+ return;
116
+ }
117
+
118
+ sysbus_mmio_map(SYS_BUS_DEVICE(&s->adma[i]), 0, adma_ch_addr[i]);
119
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->adma[i]), 0,
120
+ gic_spi[adma_ch_intr[i]]);
121
+ }
122
}
123
124
static Property xlnx_zynqmp_props[] = {
125
--
126
2.17.0
127
128
Deleted patch
1
From: Eric Auger <eric.auger@redhat.com>
2
1
3
Coverity complains about use of uninitialized Evt struct.
4
The EVT_SET_TYPE and similar setters use deposit32() on fields
5
in the struct, so they read the uninitialized existing values.
6
In cases where we don't set all the fields in the event struct
7
we'll end up leaking random uninitialized data from QEMU's
8
stack into the guest.
9
10
Initializing the struct with "Evt evt = {};" ought to satisfy
11
Coverity and fix the data leak.
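
To illustrate the problem described above: a deposit-style setter replaces
only the bits of one field and preserves everything else in the word, so
whatever was in the word before the call survives outside the field. The
sketch below models this with an explicit stale value instead of a truly
uninitialized variable (reading one would be undefined behaviour in C).
deposit32_model mirrors the usual semantics of deposit32(); EvtModel, the
eight-word layout and the 8-bit type field at bit 0 are assumptions made for
the example.

```c
#include <assert.h>
#include <stdint.h>

/* Replace 'length' bits of 'value' starting at bit 'start' with 'fieldval';
 * all bits outside the field keep their previous contents. */
static uint32_t deposit32_model(uint32_t value, int start, int length,
                                uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical stand-in for the event record. */
typedef struct {
    uint32_t word[8];
} EvtModel;

/* Setter in the style of EVT_SET_TYPE: read-modify-write of word 0. */
static void evt_model_set_type(EvtModel *e, uint32_t type)
{
    e->word[0] = deposit32_model(e->word[0], 0, 8, type);
}
```

Starting from a stale word of 0xDEADBEEF, setting the type to 0x10 yields
0xDEADBE10: the upper 24 bits of old data remain and would be written to the
guest-visible event queue. Starting from a zeroed struct, the same call
yields 0x10.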
12
13
Signed-off-by: Eric Auger <eric.auger@redhat.com>
14
Reported-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
16
Message-id: 1526493784-25328-2-git-send-email-eric.auger@redhat.com
17
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
---
20
hw/arm/smmuv3.c | 2 +-
21
1 file changed, 1 insertion(+), 1 deletion(-)
22
23
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
24
index XXXXXXX..XXXXXXX 100644
25
--- a/hw/arm/smmuv3.c
26
+++ b/hw/arm/smmuv3.c
27
@@ -XXX,XX +XXX,XX @@ static MemTxResult smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
28
29
void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *info)
30
{
31
- Evt evt;
32
+ Evt evt = {};
33
MemTxResult r;
34
35
if (!smmuv3_eventq_enabled(s)) {
36
--
37
2.17.0
38
39
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Thomas Huth <thuth@redhat.com>
2
2
3
Move the declarations and inline helpers that will be common to both translate-a64.c
3
The instance_init function of the "exynos4210.gic" device creates a
4
and translate-sve.c.
4
new "arm_gic" device and immediately realizes it with qdev_init_nofail().
5
This will leave a lot of objects in the QOM tree during introspection of
6
the "exynos4210.gic" device, e.g. reproducible by starting QEMU like this:
5
7
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
qemu-system-aarch64 -M none -nodefaults -nographic -monitor stdio
9
10
And then by running "info qom-tree" at the HMP monitor, followed by
11
"device_add exynos4210.gic,help" and finally checking "info qom-tree"
12
again.
13
14
Also note that qdev_init_nofail() can exit QEMU in case of errors - and
15
this must never happen during an instance_init function, otherwise QEMU
16
could terminate unexpectedly during introspection of a device.
17
18
Since most of the code that follows the qdev_init_nofail() depends on
19
the realized "gicbusdev", the easiest solution to the problem is to
20
turn the whole instance_init function into a realize function instead.
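
The distinction this fix relies on can be shown with a toy model, assuming
only what the message above states: instance_init runs for every instance,
including ones created purely for introspection, so it must not create
lasting objects or fail; realize runs only for devices that are actually
used. ToyDevice and the toy_* functions are hypothetical and are not QOM
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyDevice {
    bool realized;
    int children_created;   /* stands in for objects left in the QOM tree */
} ToyDevice;

/* Runs for every instance: only sets up local state, cannot fail. */
static void toy_instance_init(ToyDevice *d)
{
    assert(d != NULL);
    d->realized = false;
    d->children_created = 0;
}

/* Runs only when the device is really used: may create children and
 * may report an error to the caller instead of exiting. */
static bool toy_realize(ToyDevice *d, const char **errp)
{
    assert(d != NULL && errp != NULL);
    if (d->realized) {
        *errp = "device already realized";
        return false;
    }
    d->children_created = 1;   /* e.g. create and realize the inner GIC */
    d->realized = true;
    return true;
}

/* Introspection creates and inspects an instance but never realizes it. */
static int toy_introspect(void)
{
    ToyDevice d;

    toy_instance_init(&d);
    return d.children_created;
}
```

In this model, introspection leaves no children behind, while a second call
to toy_realize() reports an error to the caller rather than terminating the
program.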
21
22
Signed-off-by: Thomas Huth <thuth@redhat.com>
23
Message-id: 1532337784-334-1-git-send-email-thuth@redhat.com
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
24
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20180516223007.10256-2-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
25
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
26
---
12
target/arm/translate-a64.h | 118 +++++++++++++++++++++++++++++++++++++
27
hw/intc/exynos4210_gic.c | 6 +++---
13
target/arm/translate-a64.c | 112 +++++------------------------------
28
1 file changed, 3 insertions(+), 3 deletions(-)
14
2 files changed, 133 insertions(+), 97 deletions(-)
15
create mode 100644 target/arm/translate-a64.h
16
29
17
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
30
diff --git a/hw/intc/exynos4210_gic.c b/hw/intc/exynos4210_gic.c
18
new file mode 100644
19
index XXXXXXX..XXXXXXX
20
--- /dev/null
21
+++ b/target/arm/translate-a64.h
22
@@ -XXX,XX +XXX,XX @@
23
+/*
24
+ * AArch64 translation, common definitions.
25
+ *
26
+ * This library is free software; you can redistribute it and/or
27
+ * modify it under the terms of the GNU Lesser General Public
28
+ * License as published by the Free Software Foundation; either
29
+ * version 2 of the License, or (at your option) any later version.
30
+ *
31
+ * This library is distributed in the hope that it will be useful,
32
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
33
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
34
+ * Lesser General Public License for more details.
35
+ *
36
+ * You should have received a copy of the GNU Lesser General Public
37
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
38
+ */
39
+
40
+#ifndef TARGET_ARM_TRANSLATE_A64_H
41
+#define TARGET_ARM_TRANSLATE_A64_H
42
+
43
+void unallocated_encoding(DisasContext *s);
44
+
45
+#define unsupported_encoding(s, insn) \
46
+ do { \
47
+ qemu_log_mask(LOG_UNIMP, \
48
+ "%s:%d: unsupported instruction encoding 0x%08x " \
49
+ "at pc=%016" PRIx64 "\n", \
50
+ __FILE__, __LINE__, insn, s->pc - 4); \
51
+ unallocated_encoding(s); \
52
+ } while (0)
53
+
54
+TCGv_i64 new_tmp_a64(DisasContext *s);
55
+TCGv_i64 new_tmp_a64_zero(DisasContext *s);
56
+TCGv_i64 cpu_reg(DisasContext *s, int reg);
57
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg);
58
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf);
59
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf);
60
+void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v);
61
+TCGv_ptr get_fpstatus_ptr(bool);
62
+bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
63
+ unsigned int imms, unsigned int immr);
64
+uint64_t vfp_expand_imm(int size, uint8_t imm8);
65
+bool sve_access_check(DisasContext *s);
66
+
67
+/* We should have at some point before trying to access an FP register
68
+ * done the necessary access check, so assert that
69
+ * (a) we did the check and
70
+ * (b) we didn't then just plough ahead anyway if it failed.
71
+ * Print the instruction pattern in the abort message so we can figure
72
+ * out what we need to fix if a user encounters this problem in the wild.
73
+ */
74
+static inline void assert_fp_access_checked(DisasContext *s)
75
+{
76
+#ifdef CONFIG_DEBUG_TCG
77
+ if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
78
+ fprintf(stderr, "target-arm: FP access check missing for "
79
+ "instruction 0x%08x\n", s->insn);
80
+ abort();
81
+ }
82
+#endif
83
+}
84
+
85
+/* Return the offset into CPUARMState of an element of specified
86
+ * size, 'element' places in from the least significant end of
87
+ * the FP/vector register Qn.
88
+ */
89
+static inline int vec_reg_offset(DisasContext *s, int regno,
90
+ int element, TCGMemOp size)
91
+{
92
+ int offs = 0;
93
+#ifdef HOST_WORDS_BIGENDIAN
94
+ /* This is complicated slightly because vfp.zregs[n].d[0] is
95
+ * still the low half and vfp.zregs[n].d[1] the high half
96
+ * of the 128 bit vector, even on big endian systems.
97
+ * Calculate the offset assuming a fully bigendian 128 bits,
98
+ * then XOR to account for the order of the two 64 bit halves.
99
+ */
100
+ offs += (16 - ((element + 1) * (1 << size)));
101
+ offs ^= 8;
102
+#else
103
+ offs += element * (1 << size);
104
+#endif
105
+ offs += offsetof(CPUARMState, vfp.zregs[regno]);
106
+ assert_fp_access_checked(s);
107
+ return offs;
108
+}
109
+
110
+/* Return the offset into CPUARMState of the "whole" vector register Qn. */
111
+static inline int vec_full_reg_offset(DisasContext *s, int regno)
112
+{
113
+ assert_fp_access_checked(s);
114
+ return offsetof(CPUARMState, vfp.zregs[regno]);
115
+}
116
+
117
+/* Return a newly allocated pointer to the vector register. */
118
+static inline TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
119
+{
120
+ TCGv_ptr ret = tcg_temp_new_ptr();
121
+ tcg_gen_addi_ptr(ret, cpu_env, vec_full_reg_offset(s, regno));
122
+ return ret;
123
+}
124
+
125
+/* Return the byte size of the "whole" vector register, VL / 8. */
126
+static inline int vec_full_reg_size(DisasContext *s)
127
+{
128
+ return s->sve_len;
129
+}
130
+
131
+bool disas_sve(DisasContext *, uint32_t);
132
+
133
+/* Note that the gvec expanders operate on offsets + sizes. */
134
+typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
135
+typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
136
+ uint32_t, uint32_t);
137
+typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
138
+ uint32_t, uint32_t, uint32_t);
139
+
140
+#endif /* TARGET_ARM_TRANSLATE_A64_H */
141
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
142
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
143
--- a/target/arm/translate-a64.c
32
--- a/hw/intc/exynos4210_gic.c
144
+++ b/target/arm/translate-a64.c
33
+++ b/hw/intc/exynos4210_gic.c
145
@@ -XXX,XX +XXX,XX @@
34
@@ -XXX,XX +XXX,XX @@ static void exynos4210_gic_set_irq(void *opaque, int irq, int level)
146
#include "exec/log.h"
35
qemu_set_irq(qdev_get_gpio_in(s->gic, irq), level);
147
36
}
148
#include "trace-tcg.h"
37
149
+#include "translate-a64.h"
38
-static void exynos4210_gic_init(Object *obj)
150
39
+static void exynos4210_gic_realize(DeviceState *dev, Error **errp)
151
static TCGv_i64 cpu_X[32];
152
static TCGv_i64 cpu_pc;
153
154
/* Load/store exclusive handling */
155
static TCGv_i64 cpu_exclusive_high;
156
-static TCGv_i64 cpu_reg(DisasContext *s, int reg);
157
158
static const char *regnames[] = {
159
"x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7",
160
@@ -XXX,XX +XXX,XX @@ typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
161
typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
162
typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, TCGMemOp);
163
164
-/* Note that the gvec expanders operate on offsets + sizes. */
165
-typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
166
-typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
167
- uint32_t, uint32_t);
168
-typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
169
- uint32_t, uint32_t, uint32_t);
170
-
171
/* initialize TCG globals. */
172
void a64_translate_init(void)
173
{
40
{
174
@@ -XXX,XX +XXX,XX @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
41
- DeviceState *dev = DEVICE(obj);
175
}
42
+ Object *obj = OBJECT(dev);
43
Exynos4210GicState *s = EXYNOS4210_GIC(obj);
44
SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
45
const char cpu_prefix[] = "exynos4210-gic-alias_cpu";
46
@@ -XXX,XX +XXX,XX @@ static void exynos4210_gic_class_init(ObjectClass *klass, void *data)
47
DeviceClass *dc = DEVICE_CLASS(klass);
48
49
dc->props = exynos4210_gic_properties;
50
+ dc->realize = exynos4210_gic_realize;
176
}
51
}
177
52
178
-static void unallocated_encoding(DisasContext *s)
53
static const TypeInfo exynos4210_gic_info = {
179
+void unallocated_encoding(DisasContext *s)
54
.name = TYPE_EXYNOS4210_GIC,
180
{
55
.parent = TYPE_SYS_BUS_DEVICE,
181
/* Unallocated and reserved encodings are uncategorized */
56
.instance_size = sizeof(Exynos4210GicState),
182
gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
57
- .instance_init = exynos4210_gic_init,
183
default_exception_el(s));
58
.class_init = exynos4210_gic_class_init,
184
}
59
};
185
186
-#define unsupported_encoding(s, insn) \
187
- do { \
188
- qemu_log_mask(LOG_UNIMP, \
189
- "%s:%d: unsupported instruction encoding 0x%08x " \
190
- "at pc=%016" PRIx64 "\n", \
191
- __FILE__, __LINE__, insn, s->pc - 4); \
192
- unallocated_encoding(s); \
193
- } while (0)
194
-
195
static void init_tmp_a64_array(DisasContext *s)
196
{
197
#ifdef CONFIG_DEBUG_TCG
198
@@ -XXX,XX +XXX,XX @@ static void free_tmp_a64(DisasContext *s)
199
init_tmp_a64_array(s);
200
}
201
202
-static TCGv_i64 new_tmp_a64(DisasContext *s)
203
+TCGv_i64 new_tmp_a64(DisasContext *s)
204
{
205
assert(s->tmp_a64_count < TMP_A64_MAX);
206
return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_new_i64();
207
}
208
209
-static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
210
+TCGv_i64 new_tmp_a64_zero(DisasContext *s)
211
{
212
TCGv_i64 t = new_tmp_a64(s);
213
tcg_gen_movi_i64(t, 0);
214
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
215
* to cpu_X[31] and ZR accesses to a temporary which can be discarded.
216
* This is the point of the _sp forms.
217
*/
218
-static TCGv_i64 cpu_reg(DisasContext *s, int reg)
219
+TCGv_i64 cpu_reg(DisasContext *s, int reg)
220
{
221
if (reg == 31) {
222
return new_tmp_a64_zero(s);
223
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 cpu_reg(DisasContext *s, int reg)
224
}
225
226
/* register access for when 31 == SP */
227
-static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
228
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
229
{
230
return cpu_X[reg];
231
}
232
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
233
* representing the register contents. This TCGv is an auto-freed
234
* temporary so it need not be explicitly freed, and may be modified.
235
*/
236
-static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
237
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
238
{
239
TCGv_i64 v = new_tmp_a64(s);
240
if (reg != 31) {
241
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
242
return v;
243
}
244
245
-static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
246
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
247
{
248
TCGv_i64 v = new_tmp_a64(s);
249
if (sf) {
250
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
251
return v;
252
}
253
254
-/* We should have at some point before trying to access an FP register
255
- * done the necessary access check, so assert that
256
- * (a) we did the check and
257
- * (b) we didn't then just plough ahead anyway if it failed.
258
- * Print the instruction pattern in the abort message so we can figure
259
- * out what we need to fix if a user encounters this problem in the wild.
260
- */
261
-static inline void assert_fp_access_checked(DisasContext *s)
262
-{
263
-#ifdef CONFIG_DEBUG_TCG
264
- if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
265
- fprintf(stderr, "target-arm: FP access check missing for "
266
- "instruction 0x%08x\n", s->insn);
267
- abort();
268
- }
269
-#endif
270
-}
271
-
272
-/* Return the offset into CPUARMState of an element of specified
273
- * size, 'element' places in from the least significant end of
274
- * the FP/vector register Qn.
275
- */
276
-static inline int vec_reg_offset(DisasContext *s, int regno,
277
- int element, TCGMemOp size)
278
-{
279
- int offs = 0;
280
-#ifdef HOST_WORDS_BIGENDIAN
281
- /* This is complicated slightly because vfp.zregs[n].d[0] is
282
- * still the low half and vfp.zregs[n].d[1] the high half
283
- * of the 128 bit vector, even on big endian systems.
284
- * Calculate the offset assuming a fully bigendian 128 bits,
285
- * then XOR to account for the order of the two 64 bit halves.
286
- */
287
- offs += (16 - ((element + 1) * (1 << size)));
288
- offs ^= 8;
289
-#else
290
- offs += element * (1 << size);
291
-#endif
292
- offs += offsetof(CPUARMState, vfp.zregs[regno]);
293
- assert_fp_access_checked(s);
294
- return offs;
295
-}
296
-
297
-/* Return the offset info CPUARMState of the "whole" vector register Qn. */
298
-static inline int vec_full_reg_offset(DisasContext *s, int regno)
299
-{
300
- assert_fp_access_checked(s);
301
- return offsetof(CPUARMState, vfp.zregs[regno]);
302
-}
303
-
304
-/* Return a newly allocated pointer to the vector register. */
305
-static TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
306
-{
307
- TCGv_ptr ret = tcg_temp_new_ptr();
308
- tcg_gen_addi_ptr(ret, cpu_env, vec_full_reg_offset(s, regno));
309
- return ret;
310
-}
311
-
312
-/* Return the byte size of the "whole" vector register, VL / 8. */
313
-static inline int vec_full_reg_size(DisasContext *s)
314
-{
315
- /* FIXME SVE: We should put the composite ZCR_EL* value into tb->flags.
316
- In the meantime this is just the AdvSIMD length of 128. */
317
- return 128 / 8;
318
-}
319
-
320
/* Return the offset into CPUARMState of a slice (from
321
* the least significant end) of FP register Qn (ie
322
* Dn, Sn, Hn or Bn).
323
@@ -XXX,XX +XXX,XX @@ static void clear_vec_high(DisasContext *s, bool is_q, int rd)
324
}
325
}
326
327
-static void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v)
328
+void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v)
329
{
330
unsigned ofs = fp_reg_offset(s, reg, MO_64);
331
332
@@ -XXX,XX +XXX,XX @@ static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v)
333
tcg_temp_free_i64(tmp);
334
}
335
336
-static TCGv_ptr get_fpstatus_ptr(bool is_f16)
337
+TCGv_ptr get_fpstatus_ptr(bool is_f16)
338
{
339
TCGv_ptr statusptr = tcg_temp_new_ptr();
340
int offset;
341
@@ -XXX,XX +XXX,XX @@ static inline bool fp_access_check(DisasContext *s)
342
/* Check that SVE access is enabled. If it is, return true.
343
* If not, emit code to generate an appropriate exception and return false.
344
*/
345
-static inline bool sve_access_check(DisasContext *s)
346
+bool sve_access_check(DisasContext *s)
347
{
348
if (s->sve_excp_el) {
349
gen_exception_insn(s, 4, EXCP_UDEF, syn_sve_access_trap(),
350
s->sve_excp_el);
351
return false;
352
}
353
- return true;
354
+ return fp_access_check(s);
355
}
356
357
/*
358
@@ -XXX,XX +XXX,XX @@ static inline uint64_t bitmask64(unsigned int length)
359
* value (ie should cause a guest UNDEF exception), and true if they are
360
* valid, in which case the decoded bit pattern is written to result.
361
*/
362
-static bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
363
- unsigned int imms, unsigned int immr)
364
+bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
365
+ unsigned int imms, unsigned int immr)
366
{
367
uint64_t mask;
368
unsigned e, levels, s, r;
369
@@ -XXX,XX +XXX,XX @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
370
* the range 01....1xx to 10....0xx, and the most significant 4 bits of
371
* the mantissa; see VFPExpandImm() in the v8 ARM ARM.
372
*/
373
-static uint64_t vfp_expand_imm(int size, uint8_t imm8)
374
+uint64_t vfp_expand_imm(int size, uint8_t imm8)
375
{
376
uint64_t imm;
377
60
378
--
61
--
379
2.17.0
62
2.17.1
380
63
381
64
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Including only 4, as-yet unimplemented, instruction patterns
4
so that the whole thing compiles.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20180516223007.10256-3-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/Makefile.objs | 10 ++++++
12
target/arm/translate-a64.c | 7 ++++-
13
target/arm/translate-sve.c | 63 ++++++++++++++++++++++++++++++++++++++
14
.gitignore | 1 +
15
target/arm/sve.decode | 45 +++++++++++++++++++++++++++
16
5 files changed, 125 insertions(+), 1 deletion(-)
17
create mode 100644 target/arm/translate-sve.c
18
create mode 100644 target/arm/sve.decode
19
20
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
21
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/Makefile.objs
23
+++ b/target/arm/Makefile.objs
24
@@ -XXX,XX +XXX,XX @@ obj-y += gdbstub.o
25
obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
26
obj-y += crypto_helper.o
27
obj-$(CONFIG_SOFTMMU) += arm-powerctl.o
28
+
29
+DECODETREE = $(SRC_PATH)/scripts/decodetree.py
30
+
31
+target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
32
+    $(call quiet-command,\
33
+     $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
34
+     "GEN", $(TARGET_DIR)$@)
35
+
36
+target/arm/translate-sve.o: target/arm/decode-sve.inc.c
37
+obj-$(TARGET_AARCH64) += translate-sve.o
38
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate-a64.c
41
+++ b/target/arm/translate-a64.c
42
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
43
s->fp_access_checked = false;
44
45
switch (extract32(insn, 25, 4)) {
46
- case 0x0: case 0x1: case 0x2: case 0x3: /* UNALLOCATED */
47
+ case 0x0: case 0x1: case 0x3: /* UNALLOCATED */
48
unallocated_encoding(s);
49
break;
50
+ case 0x2:
51
+ if (!arm_dc_feature(s, ARM_FEATURE_SVE) || !disas_sve(s, insn)) {
52
+ unallocated_encoding(s);
53
+ }
54
+ break;
55
case 0x8: case 0x9: /* Data processing - immediate */
56
disas_data_proc_imm(s, insn);
57
break;
58
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
59
new file mode 100644
60
index XXXXXXX..XXXXXXX
61
--- /dev/null
62
+++ b/target/arm/translate-sve.c
63
@@ -XXX,XX +XXX,XX @@
64
+/*
65
+ * AArch64 SVE translation
66
+ *
67
+ * Copyright (c) 2018 Linaro, Ltd
68
+ *
69
+ * This library is free software; you can redistribute it and/or
70
+ * modify it under the terms of the GNU Lesser General Public
71
+ * License as published by the Free Software Foundation; either
72
+ * version 2 of the License, or (at your option) any later version.
73
+ *
74
+ * This library is distributed in the hope that it will be useful,
75
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
76
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
77
+ * Lesser General Public License for more details.
78
+ *
79
+ * You should have received a copy of the GNU Lesser General Public
80
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
81
+ */
82
+
83
+#include "qemu/osdep.h"
84
+#include "cpu.h"
85
+#include "exec/exec-all.h"
86
+#include "tcg-op.h"
87
+#include "tcg-op-gvec.h"
88
+#include "qemu/log.h"
89
+#include "arm_ldst.h"
90
+#include "translate.h"
91
+#include "internals.h"
92
+#include "exec/helper-proto.h"
93
+#include "exec/helper-gen.h"
94
+#include "exec/log.h"
95
+#include "trace-tcg.h"
96
+#include "translate-a64.h"
97
+
98
+/*
99
+ * Include the generated decoder.
100
+ */
101
+
102
+#include "decode-sve.inc.c"
103
+
104
+/*
105
+ * Implement all of the translator functions referenced by the decoder.
106
+ */
107
+
108
+static bool trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn)
109
+{
110
+ return false;
111
+}
112
+
113
+static bool trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn)
114
+{
115
+ return false;
116
+}
117
+
118
+static bool trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn)
119
+{
120
+ return false;
121
+}
122
+
123
+static bool trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
124
+{
125
+ return false;
126
+}
127
diff --git a/.gitignore b/.gitignore
128
index XXXXXXX..XXXXXXX 100644
129
--- a/.gitignore
130
+++ b/.gitignore
131
@@ -XXX,XX +XXX,XX @@ trace-dtrace-root.h
132
trace-dtrace-root.dtrace
133
trace-ust-all.h
134
trace-ust-all.c
135
+/target/arm/decode-sve.inc.c
136
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
137
new file mode 100644
138
index XXXXXXX..XXXXXXX
139
--- /dev/null
140
+++ b/target/arm/sve.decode
141
@@ -XXX,XX +XXX,XX @@
142
+# AArch64 SVE instruction descriptions
143
+#
144
+# Copyright (c) 2017 Linaro, Ltd
145
+#
146
+# This library is free software; you can redistribute it and/or
147
+# modify it under the terms of the GNU Lesser General Public
148
+# License as published by the Free Software Foundation; either
149
+# version 2 of the License, or (at your option) any later version.
150
+#
151
+# This library is distributed in the hope that it will be useful,
152
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
153
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
154
+# Lesser General Public License for more details.
155
+#
156
+# You should have received a copy of the GNU Lesser General Public
157
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
158
+
159
+#
160
+# This file is processed by scripts/decodetree.py
161
+#
162
+
163
+###########################################################################
164
+# Named attribute sets. These are used to make nice(er) names
165
+# when creating helpers common to those for the individual
166
+# instruction patterns.
167
+
168
+&rrr_esz rd rn rm esz
169
+
170
+###########################################################################
171
+# Named instruction formats. These are generally used to
172
+# reduce the amount of duplication between instruction patterns.
173
+
174
+# Three operand with unused vector element size
175
+@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
176
+
177
+###########################################################################
178
+# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
179
+
180
+### SVE Logical - Unpredicated Group
181
+
182
+# SVE bitwise logical operations (unpredicated)
183
+AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
184
+ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
185
+EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
186
+BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
187
--
188
2.17.0
189
190
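
Note on the decode skeleton above: the four patterns in sve.decode differ only in bits [23:22], and the @rd_rn_rm_e0 format takes rm, rn and rd from fixed bit positions. The following is a hand-written sketch of the matching that the generated decoder is expected to perform. It is not the output of scripts/decodetree.py, and ArgRRREsz, LogicalOp, bits() and decode_logical_zzz() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical argument struct mirroring the "&rrr_esz rd rn rm esz" set. */
typedef struct {
    int rd, rn, rm, esz;
} ArgRRREsz;

typedef enum { OP_NONE, OP_AND, OP_ORR, OP_EOR, OP_BIC } LogicalOp;

/* Fixed bits shared by the four patterns: insn[31:21] and insn[15:10]. */
#define LOGICAL_ZZZ_MASK 0xffe0fc00u

/* Extract LEN bits of VALUE starting at bit START. */
static uint32_t bits(uint32_t value, int start, int len)
{
    assert(start >= 0 && len > 0 && len < 32 && start + len <= 32);
    return (value >> start) & ((1u << len) - 1);
}

/* Match INSN against the four patterns; fill *A on success. */
static LogicalOp decode_logical_zzz(uint32_t insn, ArgRRREsz *a)
{
    LogicalOp op;

    switch (insn & LOGICAL_ZZZ_MASK) {
    case 0x04203000u: op = OP_AND; break;   /* 00000100 00 1 ... */
    case 0x04603000u: op = OP_ORR; break;   /* 00000100 01 1 ... */
    case 0x04a03000u: op = OP_EOR; break;   /* 00000100 10 1 ... */
    case 0x04e03000u: op = OP_BIC; break;   /* 00000100 11 1 ... */
    default:
        return OP_NONE;
    }

    /* Field positions from "@rd_rn_rm_e0 ... rm:5 ... ... rn:5 rd:5". */
    a->rm = (int)bits(insn, 16, 5);
    a->rn = (int)bits(insn, 5, 5);
    a->rd = (int)bits(insn, 0, 5);
    a->esz = 0;                             /* "esz=0" in the format */
    return op;
}
```

Passing a word built from the AND_zzz pattern with rm=2, rn=1 and rd=0 (0x04223020) should return OP_AND with those three fields filled in. Any word whose fixed bits do not match returns OP_NONE, which corresponds to disas_sve() returning false and the caller raising unallocated_encoding().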
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
These were the instructions that were stubbed out when
4
introducing the decode skeleton.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20180516223007.10256-4-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/translate-sve.c | 55 ++++++++++++++++++++++++++++++++------
12
1 file changed, 47 insertions(+), 8 deletions(-)
13
14
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/translate-sve.c
17
+++ b/target/arm/translate-sve.c
18
@@ -XXX,XX +XXX,XX @@
19
* Implement all of the translator functions referenced by the decoder.
20
*/
21
22
-static bool trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn)
23
+/* Invoke a vector expander on two Zregs. */
24
+static bool do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
25
+ int esz, int rd, int rn)
26
{
27
- return false;
28
+ if (sve_access_check(s)) {
29
+ unsigned vsz = vec_full_reg_size(s);
30
+ gvec_fn(esz, vec_full_reg_offset(s, rd),
31
+ vec_full_reg_offset(s, rn), vsz, vsz);
32
+ }
33
+ return true;
34
}
35
36
-static bool trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn)
37
+/* Invoke a vector expander on three Zregs. */
38
+static bool do_vector3_z(DisasContext *s, GVecGen3Fn *gvec_fn,
39
+ int esz, int rd, int rn, int rm)
40
{
41
- return false;
42
+ if (sve_access_check(s)) {
43
+ unsigned vsz = vec_full_reg_size(s);
44
+ gvec_fn(esz, vec_full_reg_offset(s, rd),
45
+ vec_full_reg_offset(s, rn),
46
+ vec_full_reg_offset(s, rm), vsz, vsz);
47
+ }
48
+ return true;
49
}
50
51
-static bool trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn)
52
+/* Invoke a vector move on two Zregs. */
53
+static bool do_mov_z(DisasContext *s, int rd, int rn)
54
{
55
- return false;
56
+ return do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn);
57
}
58
59
-static bool trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
60
+/*
61
+ *** SVE Logical - Unpredicated Group
62
+ */
63
+
64
+static bool trans_AND_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
65
{
66
- return false;
67
+ return do_vector3_z(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm);
68
+}
69
+
70
+static bool trans_ORR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
71
+{
72
+ if (a->rn == a->rm) { /* MOV */
73
+ return do_mov_z(s, a->rd, a->rn);
74
+ } else {
75
+ return do_vector3_z(s, tcg_gen_gvec_or, 0, a->rd, a->rn, a->rm);
76
+ }
77
+}
78
+
79
+static bool trans_EOR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
80
+{
81
+ return do_vector3_z(s, tcg_gen_gvec_xor, 0, a->rd, a->rn, a->rm);
82
+}
83
+
84
+static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
85
+{
86
+ return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
87
}
88
--
89
2.17.0
90
91
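
Note on the patch above: do_vector3_z() hands the whole register, vsz bytes, to a gvec expander, and trans_ORR_zzz() turns ORR with rn == rm into a plain move. A minimal host-side sketch of that idea follows. It assumes the vector is processed as 64-bit lanes and uses hypothetical names (Lane3Fn, expand_vector3, lane_*); it does not model TCG itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-lane operation taking two source lanes. */
typedef uint64_t (*Lane3Fn)(uint64_t n, uint64_t m);

static uint64_t lane_and(uint64_t n, uint64_t m)  { return n & m; }
static uint64_t lane_or(uint64_t n, uint64_t m)   { return n | m; }
static uint64_t lane_xor(uint64_t n, uint64_t m)  { return n ^ m; }
static uint64_t lane_andc(uint64_t n, uint64_t m) { return n & ~m; } /* BIC */

/* Apply FN lane by lane across a vector of VSZ bytes. */
static void expand_vector3(Lane3Fn fn, uint64_t *d, const uint64_t *n,
                           const uint64_t *m, size_t vsz)
{
    assert(vsz % 8 == 0);
    for (size_t i = 0; i < vsz / 8; i++) {
        d[i] = fn(n[i], m[i]);
    }
}
```

Because x | x == x for every lane, expand_vector3(lane_or, d, n, n, vsz) always leaves d equal to n, which is why the MOV special case is safe.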
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20180516223007.10256-5-richard.henderson@linaro.org
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
---
7
target/arm/translate-sve.c | 127 +++++++++++++++++++++++++++++++++++++
8
target/arm/sve.decode | 20 ++++++
9
2 files changed, 147 insertions(+)
10
11
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-sve.c
14
+++ b/target/arm/translate-sve.c
15
@@ -XXX,XX +XXX,XX @@
16
* Implement all of the translator functions referenced by the decoder.
17
*/
18
19
+/* Return the offset into CPUARMState of the predicate vector register Pn.
20
+ * Note for this purpose, FFR is P16.
21
+ */
22
+static inline int pred_full_reg_offset(DisasContext *s, int regno)
23
+{
24
+ return offsetof(CPUARMState, vfp.pregs[regno]);
25
+}
26
+
27
+/* Return the byte size of the whole predicate register, VL / 64. */
28
+static inline int pred_full_reg_size(DisasContext *s)
29
+{
30
+ return s->sve_len >> 3;
31
+}
32
+
33
/* Invoke a vector expander on two Zregs. */
34
static bool do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
35
int esz, int rd, int rn)
36
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
37
{
38
return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
39
}
40
+
41
+/*
42
+ *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
43
+ */
44
+
45
+/* Subroutine loading a vector register at VOFS of LEN bytes.
46
+ * The load should begin at the address Rn + IMM.
47
+ */
48
+
49
+static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
50
+ int rn, int imm)
51
+{
52
+ uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
53
+ uint32_t len_remain = len % 8;
54
+ uint32_t nparts = len / 8 + ctpop8(len_remain);
55
+ int midx = get_mem_index(s);
56
+ TCGv_i64 addr, t0, t1;
57
+
58
+ addr = tcg_temp_new_i64();
59
+ t0 = tcg_temp_new_i64();
60
+
61
+ /* Note that unpredicated load/store of vector/predicate registers
62
+ * are defined as a stream of bytes, which equates to little-endian
63
+ * operations on larger quantities. There is no nice way to force
64
+ * a little-endian load for aarch64_be-linux-user out of line.
65
+ *
66
+ * Attempt to keep code expansion to a minimum by limiting the
67
+ * amount of unrolling done.
68
+ */
69
+ if (nparts <= 4) {
70
+ int i;
71
+
72
+ for (i = 0; i < len_align; i += 8) {
73
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
74
+ tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
75
+ tcg_gen_st_i64(t0, cpu_env, vofs + i);
76
+ }
77
+ } else {
78
+ TCGLabel *loop = gen_new_label();
79
+ TCGv_ptr tp, i = tcg_const_local_ptr(0);
80
+
81
+ gen_set_label(loop);
82
+
83
+ /* Minimize the number of local temps that must be re-read from
84
+ * the stack each iteration. Instead, re-compute values other
85
+ * than the loop counter.
86
+ */
87
+ tp = tcg_temp_new_ptr();
88
+ tcg_gen_addi_ptr(tp, i, imm);
89
+ tcg_gen_extu_ptr_i64(addr, tp);
90
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
91
+
92
+ tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
93
+
94
+ tcg_gen_add_ptr(tp, cpu_env, i);
95
+ tcg_gen_addi_ptr(i, i, 8);
96
+ tcg_gen_st_i64(t0, tp, vofs);
97
+ tcg_temp_free_ptr(tp);
98
+
99
+ tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
100
+ tcg_temp_free_ptr(i);
101
+ }
102
+
103
+ /* Predicate register loads can be any multiple of 2.
104
+ * Note that we still store the entire 64-bit unit into cpu_env.
105
+ */
106
+ if (len_remain) {
107
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
108
+
109
+ switch (len_remain) {
110
+ case 2:
111
+ case 4:
112
+ case 8:
113
+ tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
114
+ break;
115
+
116
+ case 6:
117
+ t1 = tcg_temp_new_i64();
118
+ tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL);
119
+ tcg_gen_addi_i64(addr, addr, 4);
120
+ tcg_gen_qemu_ld_i64(t1, addr, midx, MO_LEUW);
121
+ tcg_gen_deposit_i64(t0, t0, t1, 32, 32);
122
+ tcg_temp_free_i64(t1);
123
+ break;
124
+
125
+ default:
126
+ g_assert_not_reached();
127
+ }
128
+ tcg_gen_st_i64(t0, cpu_env, vofs + len_align);
129
+ }
130
+ tcg_temp_free_i64(addr);
131
+ tcg_temp_free_i64(t0);
132
+}
133
+
134
+static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
135
+{
136
+ if (sve_access_check(s)) {
137
+ int size = vec_full_reg_size(s);
138
+ int off = vec_full_reg_offset(s, a->rd);
139
+ do_ldr(s, off, size, a->rn, a->imm * size);
140
+ }
141
+ return true;
142
+}
143
+
144
+static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
145
+{
146
+ if (sve_access_check(s)) {
147
+ int size = pred_full_reg_size(s);
148
+ int off = pred_full_reg_offset(s, a->rd);
149
+ do_ldr(s, off, size, a->rn, a->imm * size);
150
+ }
151
+ return true;
152
+}
153
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
154
index XXXXXXX..XXXXXXX 100644
155
--- a/target/arm/sve.decode
156
+++ b/target/arm/sve.decode
157
@@ -XXX,XX +XXX,XX @@
158
# This file is processed by scripts/decodetree.py
159
#
160
161
+###########################################################################
162
+# Named fields. These are primarily for disjoint fields.
163
+
164
+%imm9_16_10 16:s6 10:3
165
+
166
###########################################################################
167
# Named attribute sets. These are used to make nice(er) names
168
# when creating helpers common to those for the individual
169
# instruction patterns.
170
171
+&rri rd rn imm
172
&rrr_esz rd rn rm esz
173
174
###########################################################################
175
@@ -XXX,XX +XXX,XX @@
176
# Three operand with unused vector element size
177
@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
178
179
+# Basic Load/Store with 9-bit immediate offset
180
+@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
181
+ &rri imm=%imm9_16_10
182
+@rd_rn_i9 ........ ........ ...... rn:5 rd:5 \
183
+ &rri imm=%imm9_16_10
184
+
185
###########################################################################
186
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
187
188
@@ -XXX,XX +XXX,XX @@ AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
189
ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
190
EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
191
BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
192
+
193
+### SVE Memory - 32-bit Gather and Unsized Contiguous Group
194
+
195
+# SVE load predicate register
196
+LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9
197
+
198
+# SVE load vector register
199
+LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
200
--
201
2.17.0
202
203
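
Note on the LDR patch above: two pieces of arithmetic are easy to misread. The field "%imm9_16_10 16:s6 10:3" is a signed 9-bit immediate split across the instruction, and do_ldr() decides between unrolled and looped code from the number of memory accesses. The sketch below restates both in plain C, assuming len is a multiple of 2 as the patch comment says. LdrPlan, plan_ldr(), popcount8() and imm9_16_10() are hypothetical names, not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* imm = sign-extended insn[21:16] concatenated with insn[12:10]. */
static int imm9_16_10(uint32_t insn)
{
    int hi = (int)((insn >> 16) & 0x3f);
    int lo = (int)((insn >> 10) & 0x7);

    if (hi & 0x20) {
        hi -= 0x40;             /* sign-extend the 6-bit high part */
    }
    return hi * 8 + lo;
}

/* Count the set bits in the low 8 bits of X. */
static unsigned popcount8(uint32_t x)
{
    unsigned n = 0;

    for (x &= 0xff; x; x &= x - 1) {
        n++;
    }
    return n;
}

typedef struct {
    uint32_t len_align;     /* bytes moved as whole 64-bit units */
    uint32_t len_remain;    /* trailing bytes: 0, 2, 4 or 6 */
    uint32_t nparts;        /* number of memory accesses needed */
    bool unrolled;          /* true if the 64-bit loads are emitted inline */
} LdrPlan;

/* Mirror the size computations at the top of do_ldr(). */
static LdrPlan plan_ldr(uint32_t len)
{
    LdrPlan p;

    assert(len % 2 == 0);
    p.len_align = len & ~7u;
    p.len_remain = len % 8;
    p.nparts = len / 8 + popcount8(p.len_remain);
    p.unrolled = p.nparts <= 4;
    return p;
}
```

For example, a 16-byte vector register needs two 64-bit loads and is unrolled, while a 256-byte one needs 32 and takes the loop. A 6-byte predicate has no aligned part and is loaded as a 4-byte plus a 2-byte access. The byte offset added to Rn is the decoded immediate multiplied by the register size, as in trans_LDR_zri() and trans_LDR_pri().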
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-6-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/Makefile.objs | 2 +-
9
target/arm/helper-sve.h | 21 ++++++++++
10
target/arm/helper.h | 1 +
11
target/arm/sve_helper.c | 78 ++++++++++++++++++++++++++++++++++++++
12
target/arm/translate-sve.c | 65 +++++++++++++++++++++++++++++++
13
target/arm/sve.decode | 5 +++
14
6 files changed, 171 insertions(+), 1 deletion(-)
15
create mode 100644 target/arm/helper-sve.h
16
create mode 100644 target/arm/sve_helper.c
17
18
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/Makefile.objs
21
+++ b/target/arm/Makefile.objs
22
@@ -XXX,XX +XXX,XX @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
23
     "GEN", $(TARGET_DIR)$@)
24
25
target/arm/translate-sve.o: target/arm/decode-sve.inc.c
26
-obj-$(TARGET_AARCH64) += translate-sve.o
27
+obj-$(TARGET_AARCH64) += translate-sve.o sve_helper.o
28
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
29
new file mode 100644
30
index XXXXXXX..XXXXXXX
31
--- /dev/null
32
+++ b/target/arm/helper-sve.h
33
@@ -XXX,XX +XXX,XX @@
34
+/*
35
+ * AArch64 SVE specific helper definitions
36
+ *
37
+ * Copyright (c) 2018 Linaro, Ltd
38
+ *
39
+ * This library is free software; you can redistribute it and/or
40
+ * modify it under the terms of the GNU Lesser General Public
41
+ * License as published by the Free Software Foundation; either
42
+ * version 2 of the License, or (at your option) any later version.
43
+ *
44
+ * This library is distributed in the hope that it will be useful,
45
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
46
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
47
+ * Lesser General Public License for more details.
48
+ *
49
+ * You should have received a copy of the GNU Lesser General Public
50
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
51
+ */
52
+
53
+DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64)
54
+DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
55
diff --git a/target/arm/helper.h b/target/arm/helper.h
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/helper.h
58
+++ b/target/arm/helper.h
59
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
60
61
#ifdef TARGET_AARCH64
62
#include "helper-a64.h"
63
+#include "helper-sve.h"
64
#endif
65
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
66
new file mode 100644
67
index XXXXXXX..XXXXXXX
68
--- /dev/null
69
+++ b/target/arm/sve_helper.c
70
@@ -XXX,XX +XXX,XX @@
71
+/*
72
+ * ARM SVE Operations
73
+ *
74
+ * Copyright (c) 2018 Linaro, Ltd.
75
+ *
76
+ * This library is free software; you can redistribute it and/or
77
+ * modify it under the terms of the GNU Lesser General Public
78
+ * License as published by the Free Software Foundation; either
79
+ * version 2 of the License, or (at your option) any later version.
80
+ *
81
+ * This library is distributed in the hope that it will be useful,
82
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
83
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
84
+ * Lesser General Public License for more details.
85
+ *
86
+ * You should have received a copy of the GNU Lesser General Public
87
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
88
+ */
89
+
90
+#include "qemu/osdep.h"
91
+#include "cpu.h"
92
+#include "exec/exec-all.h"
93
+#include "exec/cpu_ldst.h"
94
+#include "exec/helper-proto.h"
95
+#include "tcg/tcg-gvec-desc.h"
96
+
97
+
98
+/* Return a value for NZCV as per the ARM PredTest pseudofunction.
99
+ *
100
+ * The return value has bit 31 set if N is set, bit 1 set if Z is clear,
101
+ * and bit 0 set if C is set. Compare the definitions of these variables
102
+ * within CPUARMState.
103
+ */
104
+
105
+/* For no G bits set, NZCV = C. */
106
+#define PREDTEST_INIT 1
107
+
108
+/* This is an iterative function, called for each Pd and Pg word
109
+ * moving forward.
110
+ */
111
+static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags)
112
+{
113
+ if (likely(g)) {
114
+ /* Compute N from first D & G.
115
+ Use bit 2 to signal first G bit seen. */
116
+ if (!(flags & 4)) {
117
+ flags |= ((d & (g & -g)) != 0) << 31;
118
+ flags |= 4;
119
+ }
120
+
121
+ /* Accumulate Z from each D & G. */
122
+ flags |= ((d & g) != 0) << 1;
123
+
124
+ /* Compute C from last !(D & G). Replace previous. */
125
+ flags = deposit32(flags, 0, 1, (d & pow2floor(g)) == 0);
126
+ }
127
+ return flags;
128
+}
129
+
130
+/* The same for a single word predicate. */
131
+uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g)
132
+{
133
+ return iter_predtest_fwd(d, g, PREDTEST_INIT);
134
+}
135
+
136
+/* The same for a multi-word predicate. */
137
+uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words)
138
+{
139
+ uint32_t flags = PREDTEST_INIT;
140
+ uint64_t *d = vd, *g = vg;
141
+ uintptr_t i = 0;
142
+
143
+ do {
144
+ flags = iter_predtest_fwd(d[i], g[i], flags);
145
+ } while (++i < words);
146
+
147
+ return flags;
148
+}
149
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
150
index XXXXXXX..XXXXXXX 100644
151
--- a/target/arm/translate-sve.c
152
+++ b/target/arm/translate-sve.c
153
@@ -XXX,XX +XXX,XX @@ static bool do_mov_z(DisasContext *s, int rd, int rn)
154
return do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn);
155
}
156
157
+/* Set the cpu flags as per a return from an SVE helper. */
158
+static void do_pred_flags(TCGv_i32 t)
159
+{
160
+ tcg_gen_mov_i32(cpu_NF, t);
161
+ tcg_gen_andi_i32(cpu_ZF, t, 2);
162
+ tcg_gen_andi_i32(cpu_CF, t, 1);
163
+ tcg_gen_movi_i32(cpu_VF, 0);
164
+}
165
+
166
+/* Subroutines computing the ARM PredTest pseudofunction. */
167
+static void do_predtest1(TCGv_i64 d, TCGv_i64 g)
168
+{
169
+ TCGv_i32 t = tcg_temp_new_i32();
170
+
171
+ gen_helper_sve_predtest1(t, d, g);
172
+ do_pred_flags(t);
173
+ tcg_temp_free_i32(t);
174
+}
175
+
176
+static void do_predtest(DisasContext *s, int dofs, int gofs, int words)
177
+{
178
+ TCGv_ptr dptr = tcg_temp_new_ptr();
179
+ TCGv_ptr gptr = tcg_temp_new_ptr();
180
+ TCGv_i32 t;
181
+
182
+ tcg_gen_addi_ptr(dptr, cpu_env, dofs);
183
+ tcg_gen_addi_ptr(gptr, cpu_env, gofs);
184
+ t = tcg_const_i32(words);
185
+
186
+ gen_helper_sve_predtest(t, dptr, gptr, t);
187
+ tcg_temp_free_ptr(dptr);
188
+ tcg_temp_free_ptr(gptr);
189
+
190
+ do_pred_flags(t);
191
+ tcg_temp_free_i32(t);
192
+}
193
+
194
/*
195
*** SVE Logical - Unpredicated Group
196
*/
197
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
198
return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
199
}
200
201
+/*
202
+ *** SVE Predicate Misc Group
203
+ */
204
+
205
+static bool trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn)
206
+{
207
+ if (sve_access_check(s)) {
208
+ int nofs = pred_full_reg_offset(s, a->rn);
209
+ int gofs = pred_full_reg_offset(s, a->pg);
210
+ int words = DIV_ROUND_UP(pred_full_reg_size(s), 8);
211
+
212
+ if (words == 1) {
213
+ TCGv_i64 pn = tcg_temp_new_i64();
214
+ TCGv_i64 pg = tcg_temp_new_i64();
215
+
216
+ tcg_gen_ld_i64(pn, cpu_env, nofs);
217
+ tcg_gen_ld_i64(pg, cpu_env, gofs);
218
+ do_predtest1(pn, pg);
219
+
220
+ tcg_temp_free_i64(pn);
221
+ tcg_temp_free_i64(pg);
222
+ } else {
223
+ do_predtest(s, nofs, gofs, words);
224
+ }
225
+ }
226
+ return true;
227
+}
228
+
229
/*
230
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
231
*/
232
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
233
index XXXXXXX..XXXXXXX 100644
234
--- a/target/arm/sve.decode
235
+++ b/target/arm/sve.decode
236
@@ -XXX,XX +XXX,XX @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
237
EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
238
BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
239
240
+### SVE Predicate Misc Group
241
+
242
+# SVE predicate test
243
+PTEST 00100101 01 010000 11 pg:4 0 rn:4 0 0000
244
+
245
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
246
247
# SVE load predicate register
248
--
249
2.17.0
250
251
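
Note on the PTEST patch above: the helper packs N, Z and C into one 32-bit value (bit 31 for N, bit 1 set when Z is clear, bit 0 for C) and uses bit 2 as a private "first active element seen" marker. The standalone sketch below follows the same steps as iter_predtest_fwd() so the encoding can be checked by hand. highest_set_bit() stands in for pow2floor(), and all names here are illustrative rather than QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* With no governing bits set, only C is set. */
#define PREDTEST_INIT 1u

/* Isolate the most significant set bit of a nonzero X. */
static uint64_t highest_set_bit(uint64_t x)
{
    assert(x != 0);
    while (x & (x - 1)) {
        x &= x - 1;         /* clear the lowest set bit */
    }
    return x;
}

/* Fold one 64-bit word of Pd (D) and Pg (G) into FLAGS. */
static uint32_t predtest_word(uint64_t d, uint64_t g, uint32_t flags)
{
    if (g) {
        if (!(flags & 4)) {
            /* N comes from the first active element only. */
            flags |= (uint32_t)((d & (g & -g)) != 0) << 31;
            flags |= 4;
        }
        /* Bit 1 accumulates "some active element is true" (Z clear). */
        flags |= (uint32_t)((d & g) != 0) << 1;
        /* C is set when the last active element so far is false. */
        flags = (flags & ~1u) | (uint32_t)((d & highest_set_bit(g)) == 0);
    }
    return flags;
}

/* Multi-word form, walking the predicate from low to high words. */
static uint32_t predtest(const uint64_t *d, const uint64_t *g, unsigned words)
{
    uint32_t flags = PREDTEST_INIT;

    for (unsigned i = 0; i < words; i++) {
        flags = predtest_word(d[i], g[i], flags);
    }
    return flags;
}

/* Unpack the architectural flags from the packed value. */
static int flag_n(uint32_t flags) { return (int)(flags >> 31); }
static int flag_z(uint32_t flags) { return (flags & 2) == 0; }
static int flag_c(uint32_t flags) { return (int)(flags & 1); }
```

do_pred_flags() relies on this layout: copying the value to cpu_NF keeps the sign bit, while the masks 2 and 1 keep the marker in bit 2 out of cpu_ZF and cpu_CF.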
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-7-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/cpu.h | 4 +-
9
target/arm/helper-sve.h | 10 +
10
target/arm/sve_helper.c | 39 ++++
11
target/arm/translate-sve.c | 361 +++++++++++++++++++++++++++++++++++++
12
target/arm/sve.decode | 16 ++
13
5 files changed, 429 insertions(+), 1 deletion(-)
14
15
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.h
18
+++ b/target/arm/cpu.h
19
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
20
#ifdef TARGET_AARCH64
21
/* Store FFR as pregs[16] to make it easier to treat as any other. */
22
ARMPredicateReg pregs[17];
23
+ /* Scratch space for aa64 sve predicate temporary. */
24
+ ARMPredicateReg preg_tmp;
25
#endif
26
27
uint32_t xregs[16];
28
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
29
int vec_len;
30
int vec_stride;
31
32
- /* scratch space when Tn are not sufficient. */
33
+ /* Scratch space for aa32 neon expansion. */
34
uint32_t scratch[8];
35
36
/* There are a number of distinct float control structures:
37
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
38
index XXXXXXX..XXXXXXX 100644
39
--- a/target/arm/helper-sve.h
40
+++ b/target/arm/helper-sve.h
41
@@ -XXX,XX +XXX,XX @@
42
43
DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64)
44
DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
45
+
46
+DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
49
+DEF_HELPER_FLAGS_5(sve_sel_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_5(sve_orr_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
51
+DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
53
+DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG,
54
+ void, ptr, ptr, ptr, ptr, i32)
55
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/sve_helper.c
58
+++ b/target/arm/sve_helper.c
59
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words)
60
61
return flags;
62
}
63
+
64
+#define LOGICAL_PPPP(NAME, FUNC) \
65
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
66
+{ \
67
+ uintptr_t opr_sz = simd_oprsz(desc); \
68
+ uint64_t *d = vd, *n = vn, *m = vm, *g = vg; \
69
+ uintptr_t i; \
70
+ for (i = 0; i < opr_sz / 8; ++i) { \
71
+ d[i] = FUNC(n[i], m[i], g[i]); \
72
+ } \
73
+}
74
+
75
+#define DO_AND(N, M, G) (((N) & (M)) & (G))
76
+#define DO_BIC(N, M, G) (((N) & ~(M)) & (G))
77
+#define DO_EOR(N, M, G) (((N) ^ (M)) & (G))
78
+#define DO_ORR(N, M, G) (((N) | (M)) & (G))
79
+#define DO_ORN(N, M, G) (((N) | ~(M)) & (G))
80
+#define DO_NOR(N, M, G) (~((N) | (M)) & (G))
81
+#define DO_NAND(N, M, G) (~((N) & (M)) & (G))
82
+#define DO_SEL(N, M, G) (((N) & (G)) | ((M) & ~(G)))
83
+
84
+LOGICAL_PPPP(sve_and_pppp, DO_AND)
85
+LOGICAL_PPPP(sve_bic_pppp, DO_BIC)
86
+LOGICAL_PPPP(sve_eor_pppp, DO_EOR)
87
+LOGICAL_PPPP(sve_sel_pppp, DO_SEL)
88
+LOGICAL_PPPP(sve_orr_pppp, DO_ORR)
89
+LOGICAL_PPPP(sve_orn_pppp, DO_ORN)
90
+LOGICAL_PPPP(sve_nor_pppp, DO_NOR)
91
+LOGICAL_PPPP(sve_nand_pppp, DO_NAND)
92
+
93
+#undef DO_AND
94
+#undef DO_BIC
95
+#undef DO_EOR
96
+#undef DO_ORR
97
+#undef DO_ORN
98
+#undef DO_NOR
99
+#undef DO_NAND
100
+#undef DO_SEL
101
+#undef LOGICAL_PPPP
102
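
Note on the LOGICAL_PPPP helpers above: every operation except SEL masks its result with the governing predicate, so inactive elements of the destination become zero, whereas SEL takes inactive elements from the second source. A small sketch of that distinction follows, using hypothetical function names in place of the DO_* macros and a plain byte count in place of the gvec descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-word predicate operation: sources N, M and guard G. */
typedef uint64_t (*PredFn)(uint64_t n, uint64_t m, uint64_t g);

static uint64_t pred_and(uint64_t n, uint64_t m, uint64_t g)
{
    return (n & m) & g;             /* inactive bits become 0 */
}

static uint64_t pred_nor(uint64_t n, uint64_t m, uint64_t g)
{
    return ~(n | m) & g;            /* inactive bits become 0 */
}

static uint64_t pred_sel(uint64_t n, uint64_t m, uint64_t g)
{
    return (n & g) | (m & ~g);      /* inactive bits come from M */
}

/* Apply FN to each 64-bit word of an OPR_SZ-byte predicate. */
static void logical_pppp(PredFn fn, uint64_t *d, const uint64_t *n,
                         const uint64_t *m, const uint64_t *g,
                         size_t opr_sz)
{
    assert(opr_sz % 8 == 0);
    for (size_t i = 0; i < opr_sz / 8; i++) {
        d[i] = fn(n[i], m[i], g[i]);
    }
}
```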
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
103
index XXXXXXX..XXXXXXX 100644
104
--- a/target/arm/translate-sve.c
105
+++ b/target/arm/translate-sve.c
106
@@ -XXX,XX +XXX,XX @@ static inline int pred_full_reg_size(DisasContext *s)
107
return s->sve_len >> 3;
108
}
109
110
+/* Round up the size of a register to a size allowed by
111
+ * the tcg vector infrastructure. Any operation which uses this
112
+ * size may assume that the bits above pred_full_reg_size are zero,
113
+ * and must leave them the same way.
114
+ *
115
+ * Note that this is not needed for the vector registers as they
116
+ * are always properly sized for tcg vectors.
117
+ */
118
+static int size_for_gvec(int size)
119
+{
120
+ if (size <= 8) {
121
+ return 8;
122
+ } else {
123
+ return QEMU_ALIGN_UP(size, 16);
124
+ }
125
+}
126
+
127
+static int pred_gvec_reg_size(DisasContext *s)
128
+{
129
+ return size_for_gvec(pred_full_reg_size(s));
130
+}
131
+
132
/* Invoke a vector expander on two Zregs. */
133
static bool do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
134
int esz, int rd, int rn)
135
@@ -XXX,XX +XXX,XX @@ static bool do_mov_z(DisasContext *s, int rd, int rn)
136
return do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn);
137
}
138
139
+/* Invoke a vector expander on two Pregs. */
140
+static bool do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn,
141
+ int esz, int rd, int rn)
142
+{
143
+ if (sve_access_check(s)) {
144
+ unsigned psz = pred_gvec_reg_size(s);
145
+ gvec_fn(esz, pred_full_reg_offset(s, rd),
146
+ pred_full_reg_offset(s, rn), psz, psz);
147
+ }
148
+ return true;
149
+}
150
+
151
+/* Invoke a vector expander on three Pregs. */
152
+static bool do_vector3_p(DisasContext *s, GVecGen3Fn *gvec_fn,
153
+ int esz, int rd, int rn, int rm)
154
+{
155
+ if (sve_access_check(s)) {
156
+ unsigned psz = pred_gvec_reg_size(s);
157
+ gvec_fn(esz, pred_full_reg_offset(s, rd),
158
+ pred_full_reg_offset(s, rn),
159
+ pred_full_reg_offset(s, rm), psz, psz);
160
+ }
161
+ return true;
162
+}
163
+
164
+/* Invoke a vector operation on four Pregs. */
165
+static bool do_vecop4_p(DisasContext *s, const GVecGen4 *gvec_op,
166
+ int rd, int rn, int rm, int rg)
167
+{
168
+ if (sve_access_check(s)) {
169
+ unsigned psz = pred_gvec_reg_size(s);
170
+ tcg_gen_gvec_4(pred_full_reg_offset(s, rd),
171
+ pred_full_reg_offset(s, rn),
172
+ pred_full_reg_offset(s, rm),
173
+ pred_full_reg_offset(s, rg),
174
+ psz, psz, gvec_op);
175
+ }
176
+ return true;
177
+}
178
+
179
+/* Invoke a vector move on two Pregs. */
180
+static bool do_mov_p(DisasContext *s, int rd, int rn)
181
+{
182
+ return do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn);
183
+}
184
+
185
/* Set the cpu flags as per a return from an SVE helper. */
186
static void do_pred_flags(TCGv_i32 t)
187
{
188
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
189
return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
190
}
191
192
+/*
193
+ *** SVE Predicate Logical Operations Group
194
+ */
195
+
196
+static bool do_pppp_flags(DisasContext *s, arg_rprr_s *a,
197
+ const GVecGen4 *gvec_op)
198
+{
199
+ if (!sve_access_check(s)) {
200
+ return true;
201
+ }
202
+
203
+ unsigned psz = pred_gvec_reg_size(s);
204
+ int dofs = pred_full_reg_offset(s, a->rd);
205
+ int nofs = pred_full_reg_offset(s, a->rn);
206
+ int mofs = pred_full_reg_offset(s, a->rm);
207
+ int gofs = pred_full_reg_offset(s, a->pg);
208
+
209
+ if (psz == 8) {
210
+ /* Do the operation and the flags generation in temps. */
211
+ TCGv_i64 pd = tcg_temp_new_i64();
212
+ TCGv_i64 pn = tcg_temp_new_i64();
213
+ TCGv_i64 pm = tcg_temp_new_i64();
214
+ TCGv_i64 pg = tcg_temp_new_i64();
215
+
216
+ tcg_gen_ld_i64(pn, cpu_env, nofs);
217
+ tcg_gen_ld_i64(pm, cpu_env, mofs);
218
+ tcg_gen_ld_i64(pg, cpu_env, gofs);
219
+
220
+ gvec_op->fni8(pd, pn, pm, pg);
221
+ tcg_gen_st_i64(pd, cpu_env, dofs);
222
+
223
+ do_predtest1(pd, pg);
224
+
225
+ tcg_temp_free_i64(pd);
226
+ tcg_temp_free_i64(pn);
227
+ tcg_temp_free_i64(pm);
228
+ tcg_temp_free_i64(pg);
229
+ } else {
230
+ /* The operation and flags generation is large. The computation
231
+ * of the flags depends on the original contents of the guarding
232
+ * predicate. If the destination overwrites the guarding predicate,
233
+ * then the easiest way to get this right is to save a copy.
234
+ */
235
+ int tofs = gofs;
236
+ if (a->rd == a->pg) {
237
+ tofs = offsetof(CPUARMState, vfp.preg_tmp);
238
+ tcg_gen_gvec_mov(0, tofs, gofs, psz, psz);
239
+ }
240
+
241
+ tcg_gen_gvec_4(dofs, nofs, mofs, gofs, psz, psz, gvec_op);
242
+ do_predtest(s, dofs, tofs, psz / 8);
243
+ }
244
+ return true;
245
+}
246
+
247
+static void gen_and_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
248
+{
249
+ tcg_gen_and_i64(pd, pn, pm);
250
+ tcg_gen_and_i64(pd, pd, pg);
251
+}
252
+
253
+static void gen_and_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
254
+ TCGv_vec pm, TCGv_vec pg)
255
+{
256
+ tcg_gen_and_vec(vece, pd, pn, pm);
257
+ tcg_gen_and_vec(vece, pd, pd, pg);
258
+}
259
+
260
+static bool trans_AND_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
261
+{
262
+ static const GVecGen4 op = {
263
+ .fni8 = gen_and_pg_i64,
264
+ .fniv = gen_and_pg_vec,
265
+ .fno = gen_helper_sve_and_pppp,
266
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
267
+ };
268
+ if (a->s) {
269
+ return do_pppp_flags(s, a, &op);
270
+ } else if (a->rn == a->rm) {
271
+ if (a->pg == a->rn) {
272
+ return do_mov_p(s, a->rd, a->rn);
273
+ } else {
274
+ return do_vector3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->pg);
275
+ }
276
+ } else if (a->pg == a->rn || a->pg == a->rm) {
277
+ return do_vector3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm);
278
+ } else {
279
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
280
+ }
281
+}
282
+
283
+static void gen_bic_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
284
+{
285
+ tcg_gen_andc_i64(pd, pn, pm);
286
+ tcg_gen_and_i64(pd, pd, pg);
287
+}
288
+
289
+static void gen_bic_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
290
+ TCGv_vec pm, TCGv_vec pg)
291
+{
292
+ tcg_gen_andc_vec(vece, pd, pn, pm);
293
+ tcg_gen_and_vec(vece, pd, pd, pg);
294
+}
295
+
296
+static bool trans_BIC_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
297
+{
298
+ static const GVecGen4 op = {
299
+ .fni8 = gen_bic_pg_i64,
300
+ .fniv = gen_bic_pg_vec,
301
+ .fno = gen_helper_sve_bic_pppp,
302
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
303
+ };
304
+ if (a->s) {
305
+ return do_pppp_flags(s, a, &op);
306
+ } else if (a->pg == a->rn) {
307
+ return do_vector3_p(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
308
+ } else {
309
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
310
+ }
311
+}
312
+
313
+static void gen_eor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
314
+{
315
+ tcg_gen_xor_i64(pd, pn, pm);
316
+ tcg_gen_and_i64(pd, pd, pg);
317
+}
318
+
319
+static void gen_eor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
320
+ TCGv_vec pm, TCGv_vec pg)
321
+{
322
+ tcg_gen_xor_vec(vece, pd, pn, pm);
323
+ tcg_gen_and_vec(vece, pd, pd, pg);
324
+}
325
+
326
+static bool trans_EOR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
327
+{
328
+ static const GVecGen4 op = {
329
+ .fni8 = gen_eor_pg_i64,
330
+ .fniv = gen_eor_pg_vec,
331
+ .fno = gen_helper_sve_eor_pppp,
332
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
333
+ };
334
+ if (a->s) {
335
+ return do_pppp_flags(s, a, &op);
336
+ } else {
337
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
338
+ }
339
+}
340
+
341
+static void gen_sel_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
342
+{
343
+ tcg_gen_and_i64(pn, pn, pg);
344
+ tcg_gen_andc_i64(pm, pm, pg);
345
+ tcg_gen_or_i64(pd, pn, pm);
346
+}
347
+
348
+static void gen_sel_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
349
+ TCGv_vec pm, TCGv_vec pg)
350
+{
351
+ tcg_gen_and_vec(vece, pn, pn, pg);
352
+ tcg_gen_andc_vec(vece, pm, pm, pg);
353
+ tcg_gen_or_vec(vece, pd, pn, pm);
354
+}
355
+
356
+static bool trans_SEL_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
357
+{
358
+ static const GVecGen4 op = {
359
+ .fni8 = gen_sel_pg_i64,
360
+ .fniv = gen_sel_pg_vec,
361
+ .fno = gen_helper_sve_sel_pppp,
362
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
363
+ };
364
+ if (a->s) {
365
+ return false;
366
+ } else {
367
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
368
+ }
369
+}
370
+
371
+static void gen_orr_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
372
+{
373
+ tcg_gen_or_i64(pd, pn, pm);
374
+ tcg_gen_and_i64(pd, pd, pg);
375
+}
376
+
377
+static void gen_orr_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
378
+ TCGv_vec pm, TCGv_vec pg)
379
+{
380
+ tcg_gen_or_vec(vece, pd, pn, pm);
381
+ tcg_gen_and_vec(vece, pd, pd, pg);
382
+}
383
+
384
+static bool trans_ORR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
385
+{
386
+ static const GVecGen4 op = {
387
+ .fni8 = gen_orr_pg_i64,
388
+ .fniv = gen_orr_pg_vec,
389
+ .fno = gen_helper_sve_orr_pppp,
390
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
391
+ };
392
+ if (a->s) {
393
+ return do_pppp_flags(s, a, &op);
394
+ } else if (a->pg == a->rn && a->rn == a->rm) {
395
+ return do_mov_p(s, a->rd, a->rn);
396
+ } else {
397
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
398
+ }
399
+}
400
+
401
+static void gen_orn_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
402
+{
403
+ tcg_gen_orc_i64(pd, pn, pm);
404
+ tcg_gen_and_i64(pd, pd, pg);
405
+}
406
+
407
+static void gen_orn_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
408
+ TCGv_vec pm, TCGv_vec pg)
409
+{
410
+ tcg_gen_orc_vec(vece, pd, pn, pm);
411
+ tcg_gen_and_vec(vece, pd, pd, pg);
412
+}
413
+
414
+static bool trans_ORN_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
415
+{
416
+ static const GVecGen4 op = {
417
+ .fni8 = gen_orn_pg_i64,
418
+ .fniv = gen_orn_pg_vec,
419
+ .fno = gen_helper_sve_orn_pppp,
420
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
421
+ };
422
+ if (a->s) {
423
+ return do_pppp_flags(s, a, &op);
424
+ } else {
425
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
426
+ }
427
+}
428
+
429
+static void gen_nor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
430
+{
431
+ tcg_gen_or_i64(pd, pn, pm);
432
+ tcg_gen_andc_i64(pd, pg, pd);
433
+}
434
+
435
+static void gen_nor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
436
+ TCGv_vec pm, TCGv_vec pg)
437
+{
438
+ tcg_gen_or_vec(vece, pd, pn, pm);
439
+ tcg_gen_andc_vec(vece, pd, pg, pd);
440
+}
441
+
442
+static bool trans_NOR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
443
+{
444
+ static const GVecGen4 op = {
445
+ .fni8 = gen_nor_pg_i64,
446
+ .fniv = gen_nor_pg_vec,
447
+ .fno = gen_helper_sve_nor_pppp,
448
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
449
+ };
450
+ if (a->s) {
451
+ return do_pppp_flags(s, a, &op);
452
+ } else {
453
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
454
+ }
455
+}
456
+
457
+static void gen_nand_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg)
458
+{
459
+ tcg_gen_and_i64(pd, pn, pm);
460
+ tcg_gen_andc_i64(pd, pg, pd);
461
+}
462
+
463
+static void gen_nand_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn,
464
+ TCGv_vec pm, TCGv_vec pg)
465
+{
466
+ tcg_gen_and_vec(vece, pd, pn, pm);
467
+ tcg_gen_andc_vec(vece, pd, pg, pd);
468
+}
469
+
470
+static bool trans_NAND_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn)
471
+{
472
+ static const GVecGen4 op = {
473
+ .fni8 = gen_nand_pg_i64,
474
+ .fniv = gen_nand_pg_vec,
475
+ .fno = gen_helper_sve_nand_pppp,
476
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
477
+ };
478
+ if (a->s) {
479
+ return do_pppp_flags(s, a, &op);
480
+ } else {
481
+ return do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg);
482
+ }
483
+}
484
+
485
/*
486
*** SVE Predicate Misc Group
487
*/
488
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
489
index XXXXXXX..XXXXXXX 100644
490
--- a/target/arm/sve.decode
491
+++ b/target/arm/sve.decode
492
@@ -XXX,XX +XXX,XX @@
493
494
&rri rd rn imm
495
&rrr_esz rd rn rm esz
496
+&rprr_s rd pg rn rm s
497
498
###########################################################################
499
# Named instruction formats. These are generally used to
500
@@ -XXX,XX +XXX,XX @@
501
# Three operand with unused vector element size
502
@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
503
504
+# Three predicate operand, with governing predicate, flag setting
505
+@pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s
506
+
507
# Basic Load/Store with 9-bit immediate offset
508
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
509
&rri imm=%imm9_16_10
510
@@ -XXX,XX +XXX,XX @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
511
EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
512
BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
513
514
+### SVE Predicate Logical Operations Group
515
+
516
+# SVE predicate logical operations
517
+AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s
518
+BIC_pppp 00100101 0. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s
519
+EOR_pppp 00100101 0. 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm_s
520
+SEL_pppp 00100101 0. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s
521
+ORR_pppp 00100101 1. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s
522
+ORN_pppp 00100101 1. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s
523
+NOR_pppp 00100101 1. 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm_s
524
+NAND_pppp 00100101 1. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s
525
+
526
### SVE Predicate Misc Group
527
528
# SVE predicate test
529
--
530
2.17.0
531
532
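Reviewer's sketch (not part of the patch): the fni8 expanders in the patch above reduce to bitwise identities on one 64-bit predicate word. The plain-C model below mirrors them and checks the shortcuts that trans_AND_pppp takes when its operands alias. The function names (and_pg, sel_pg, and_shortcuts_hold, and so on) are hypothetical and exist only for illustration. The model assumes the whole predicate fits in a single uint64_t, as in the psz == 8 path, and it does not model the flag-setting forms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical plain-C models of the fni8 expanders: pn and pm are the
 * source predicates, pg the governing predicate, one 64-bit word each. */
static uint64_t and_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return (pn & pm) & pg; }
static uint64_t bic_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return (pn & ~pm) & pg; }
static uint64_t eor_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return (pn ^ pm) & pg; }
static uint64_t sel_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return (pn & pg) | (pm & ~pg); }
static uint64_t orr_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return (pn | pm) & pg; }
static uint64_t orn_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return (pn | ~pm) & pg; }
static uint64_t nor_pg(uint64_t pn, uint64_t pm, uint64_t pg)  { return pg & ~(pn | pm); }
static uint64_t nand_pg(uint64_t pn, uint64_t pm, uint64_t pg) { return pg & ~(pn & pm); }

/* Returns 1 if the aliasing shortcuts used by trans_AND_pppp agree with
 * the general and_pg() model for the given operand values. */
static int and_shortcuts_hold(uint64_t pn, uint64_t pm, uint64_t pg)
{
    /* rn == rm == pg: the result is a plain move of pn. */
    int mov_ok = and_pg(pn, pn, pn) == pn;
    /* rn == rm, pg distinct: a three-operand AND of pn with pg. */
    int and3_ok = and_pg(pn, pn, pg) == (pn & pg);
    /* pg == rn: the governing predicate is already one of the AND operands. */
    int alias_ok = and_pg(pg, pm, pg) == (pg & pm);
    return mov_ok && and3_ok && alias_ok;
}
```

For instance, sel_pg(0xF0, 0x0F, 0xAA) takes the bits of pn where pg is set and the bits of pm elsewhere, which gives 0xA5.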
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20180516223007.10256-8-richard.henderson@linaro.org
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
---
7
target/arm/cpu.h | 4 +
8
target/arm/helper-sve.h | 3 +
9
target/arm/sve_helper.c | 84 +++++++++++++++
10
target/arm/translate-sve.c | 209 +++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 31 ++++++
12
5 files changed, 331 insertions(+)
13
14
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/cpu.h
17
+++ b/target/arm/cpu.h
18
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
19
20
#ifdef TARGET_AARCH64
21
/* Store FFR as pregs[16] to make it easier to treat as any other. */
22
+#define FFR_PRED_NUM 16
23
ARMPredicateReg pregs[17];
24
/* Scratch space for aa64 sve predicate temporary. */
25
ARMPredicateReg preg_tmp;
26
@@ -XXX,XX +XXX,XX @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
27
return &env->vfp.zregs[regno].d[0];
28
}
29
30
+/* Shared between translate-sve.c and sve_helper.c. */
31
+extern const uint64_t pred_esz_masks[4];
32
+
33
#endif
34
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
35
index XXXXXXX..XXXXXXX 100644
36
--- a/target/arm/helper-sve.h
37
+++ b/target/arm/helper-sve.h
38
@@ -XXX,XX +XXX,XX @@
39
DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64)
40
DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
41
42
+DEF_HELPER_FLAGS_3(sve_pfirst, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_3(sve_pnext, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
44
+
45
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
46
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
47
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/sve_helper.c
51
+++ b/target/arm/sve_helper.c
52
@@ -XXX,XX +XXX,XX @@ LOGICAL_PPPP(sve_nand_pppp, DO_NAND)
53
#undef DO_NAND
54
#undef DO_SEL
55
#undef LOGICAL_PPPP
56
+
57
+/* Similar to the ARM LastActiveElement pseudocode function, except the
58
+ result is multiplied by the element size. This includes the not found
59
+ indication; e.g. not found for esz=3 is -8. */
60
+static intptr_t last_active_element(uint64_t *g, intptr_t words, intptr_t esz)
61
+{
62
+ uint64_t mask = pred_esz_masks[esz];
63
+ intptr_t i = words;
64
+
65
+ do {
66
+ uint64_t this_g = g[--i] & mask;
67
+ if (this_g) {
68
+ return i * 64 + (63 - clz64(this_g));
69
+ }
70
+ } while (i > 0);
71
+ return (intptr_t)-1 << esz;
72
+}
73
+
74
+uint32_t HELPER(sve_pfirst)(void *vd, void *vg, uint32_t words)
75
+{
76
+ uint32_t flags = PREDTEST_INIT;
77
+ uint64_t *d = vd, *g = vg;
78
+ intptr_t i = 0;
79
+
80
+ do {
81
+ uint64_t this_d = d[i];
82
+ uint64_t this_g = g[i];
83
+
84
+ if (this_g) {
85
+ if (!(flags & 4)) {
86
+ /* Set in D the first bit of G. */
87
+ this_d |= this_g & -this_g;
88
+ d[i] = this_d;
89
+ }
90
+ flags = iter_predtest_fwd(this_d, this_g, flags);
91
+ }
92
+ } while (++i < words);
93
+
94
+ return flags;
95
+}
96
+
97
+uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc)
98
+{
99
+ intptr_t words = extract32(pred_desc, 0, SIMD_OPRSZ_BITS);
100
+ intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
101
+ uint32_t flags = PREDTEST_INIT;
102
+ uint64_t *d = vd, *g = vg, esz_mask;
103
+ intptr_t i, next;
104
+
105
+ next = last_active_element(vd, words, esz) + (1 << esz);
106
+ esz_mask = pred_esz_masks[esz];
107
+
108
+ /* Similar to the pseudocode for pnext, but scaled by ESZ
109
+ so that we find the correct bit. */
110
+ if (next < words * 64) {
111
+ uint64_t mask = -1;
112
+
113
+ if (next & 63) {
114
+ mask = ~((1ull << (next & 63)) - 1);
115
+ next &= -64;
116
+ }
117
+ do {
118
+ uint64_t this_g = g[next / 64] & esz_mask & mask;
119
+ if (this_g != 0) {
120
+ next = (next & -64) + ctz64(this_g);
121
+ break;
122
+ }
123
+ next += 64;
124
+ mask = -1;
125
+ } while (next < words * 64);
126
+ }
127
+
128
+ i = 0;
129
+ do {
130
+ uint64_t this_d = 0;
131
+ if (i == next / 64) {
132
+ this_d = 1ull << (next & 63);
133
+ }
134
+ d[i] = this_d;
135
+ flags = iter_predtest_fwd(this_d, g[i] & esz_mask, flags);
136
+ } while (++i < words);
137
+
138
+ return flags;
139
+}
140
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
141
index XXXXXXX..XXXXXXX 100644
142
--- a/target/arm/translate-sve.c
143
+++ b/target/arm/translate-sve.c
144
@@ -XXX,XX +XXX,XX @@
145
#include "exec/exec-all.h"
146
#include "tcg-op.h"
147
#include "tcg-op-gvec.h"
148
+#include "tcg-gvec-desc.h"
149
#include "qemu/log.h"
150
#include "arm_ldst.h"
151
#include "translate.h"
152
@@ -XXX,XX +XXX,XX @@ static void do_predtest(DisasContext *s, int dofs, int gofs, int words)
153
tcg_temp_free_i32(t);
154
}
155
156
+/* For each element size, the bits within a predicate word that are active. */
157
+const uint64_t pred_esz_masks[4] = {
158
+ 0xffffffffffffffffull, 0x5555555555555555ull,
159
+ 0x1111111111111111ull, 0x0101010101010101ull
160
+};
161
+
162
/*
163
*** SVE Logical - Unpredicated Group
164
*/
165
@@ -XXX,XX +XXX,XX @@ static bool trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn)
166
return true;
167
}
168
169
+/* See the ARM pseudocode DecodePredCount. */
170
+static unsigned decode_pred_count(unsigned fullsz, int pattern, int esz)
171
+{
172
+ unsigned elements = fullsz >> esz;
173
+ unsigned bound;
174
+
175
+ switch (pattern) {
176
+ case 0x0: /* POW2 */
177
+ return pow2floor(elements);
178
+ case 0x1: /* VL1 */
179
+ case 0x2: /* VL2 */
180
+ case 0x3: /* VL3 */
181
+ case 0x4: /* VL4 */
182
+ case 0x5: /* VL5 */
183
+ case 0x6: /* VL6 */
184
+ case 0x7: /* VL7 */
185
+ case 0x8: /* VL8 */
186
+ bound = pattern;
187
+ break;
188
+ case 0x9: /* VL16 */
189
+ case 0xa: /* VL32 */
190
+ case 0xb: /* VL64 */
191
+ case 0xc: /* VL128 */
192
+ case 0xd: /* VL256 */
193
+ bound = 16 << (pattern - 9);
194
+ break;
195
+ case 0x1d: /* MUL4 */
196
+ return elements - elements % 4;
197
+ case 0x1e: /* MUL3 */
198
+ return elements - elements % 3;
199
+ case 0x1f: /* ALL */
200
+ return elements;
201
+ default: /* #uimm5 */
202
+ return 0;
203
+ }
204
+ return elements >= bound ? bound : 0;
205
+}
206
+
207
+/* This handles all of the predicate initialization instructions,
208
+ * PTRUE, PFALSE, SETFFR. For PFALSE, we will have set PAT == 32
209
+ * so that decode_pred_count returns 0. For SETFFR, we will have
210
+ * set RD == 16 == FFR.
211
+ */
212
+static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
213
+{
214
+ if (!sve_access_check(s)) {
215
+ return true;
216
+ }
217
+
218
+ unsigned fullsz = vec_full_reg_size(s);
219
+ unsigned ofs = pred_full_reg_offset(s, rd);
220
+ unsigned numelem, setsz, i;
221
+ uint64_t word, lastword;
222
+ TCGv_i64 t;
223
+
224
+ numelem = decode_pred_count(fullsz, pat, esz);
225
+
226
+ /* Determine what we must store into each bit, and how many. */
227
+ if (numelem == 0) {
228
+ lastword = word = 0;
229
+ setsz = fullsz;
230
+ } else {
231
+ setsz = numelem << esz;
232
+ lastword = word = pred_esz_masks[esz];
233
+ if (setsz % 64) {
234
+ lastword &= ~(-1ull << (setsz % 64));
235
+ }
236
+ }
237
+
238
+ t = tcg_temp_new_i64();
239
+ if (fullsz <= 64) {
240
+ tcg_gen_movi_i64(t, lastword);
241
+ tcg_gen_st_i64(t, cpu_env, ofs);
242
+ goto done;
243
+ }
244
+
245
+ if (word == lastword) {
246
+ unsigned maxsz = size_for_gvec(fullsz / 8);
247
+ unsigned oprsz = size_for_gvec(setsz / 8);
248
+
249
+ if (oprsz * 8 == setsz) {
250
+ tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
251
+ goto done;
252
+ }
253
+ if (oprsz * 8 == setsz + 8) {
254
+ tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
255
+ tcg_gen_movi_i64(t, 0);
256
+ tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8);
257
+ goto done;
258
+ }
259
+ }
260
+
261
+ setsz /= 8;
262
+ fullsz /= 8;
263
+
264
+ tcg_gen_movi_i64(t, word);
265
+ for (i = 0; i < setsz; i += 8) {
266
+ tcg_gen_st_i64(t, cpu_env, ofs + i);
267
+ }
268
+ if (lastword != word) {
269
+ tcg_gen_movi_i64(t, lastword);
270
+ tcg_gen_st_i64(t, cpu_env, ofs + i);
271
+ i += 8;
272
+ }
273
+ if (i < fullsz) {
274
+ tcg_gen_movi_i64(t, 0);
275
+ for (; i < fullsz; i += 8) {
276
+ tcg_gen_st_i64(t, cpu_env, ofs + i);
277
+ }
278
+ }
279
+
280
+ done:
281
+ tcg_temp_free_i64(t);
282
+
283
+ /* PTRUES */
284
+ if (setflag) {
285
+ tcg_gen_movi_i32(cpu_NF, -(word != 0));
286
+ tcg_gen_movi_i32(cpu_CF, word == 0);
287
+ tcg_gen_movi_i32(cpu_VF, 0);
288
+ tcg_gen_mov_i32(cpu_ZF, cpu_NF);
289
+ }
290
+ return true;
291
+}
292
+
293
+static bool trans_PTRUE(DisasContext *s, arg_PTRUE *a, uint32_t insn)
294
+{
295
+ return do_predset(s, a->esz, a->rd, a->pat, a->s);
296
+}
297
+
298
+static bool trans_SETFFR(DisasContext *s, arg_SETFFR *a, uint32_t insn)
299
+{
300
+ /* Note pat == 31 is #all, to set all elements. */
301
+ return do_predset(s, 0, FFR_PRED_NUM, 31, false);
302
+}
303
+
304
+static bool trans_PFALSE(DisasContext *s, arg_PFALSE *a, uint32_t insn)
305
+{
306
+ /* Note pat == 32 is #unimp, to set no elements. */
307
+ return do_predset(s, 0, a->rd, 32, false);
308
+}
309
+
310
+static bool trans_RDFFR_p(DisasContext *s, arg_RDFFR_p *a, uint32_t insn)
311
+{
312
+ /* The path through do_pppp_flags is complicated enough to want to avoid
313
+ * duplication. Frob the arguments into the form of a predicated AND.
314
+ */
315
+ arg_rprr_s alt_a = {
316
+ .rd = a->rd, .pg = a->pg, .s = a->s,
317
+ .rn = FFR_PRED_NUM, .rm = FFR_PRED_NUM,
318
+ };
319
+ return trans_AND_pppp(s, &alt_a, insn);
320
+}
321
+
322
+static bool trans_RDFFR(DisasContext *s, arg_RDFFR *a, uint32_t insn)
323
+{
324
+ return do_mov_p(s, a->rd, FFR_PRED_NUM);
325
+}
326
+
327
+static bool trans_WRFFR(DisasContext *s, arg_WRFFR *a, uint32_t insn)
328
+{
329
+ return do_mov_p(s, FFR_PRED_NUM, a->rn);
330
+}
331
+
332
+static bool do_pfirst_pnext(DisasContext *s, arg_rr_esz *a,
333
+ void (*gen_fn)(TCGv_i32, TCGv_ptr,
334
+ TCGv_ptr, TCGv_i32))
335
+{
336
+ if (!sve_access_check(s)) {
337
+ return true;
338
+ }
339
+
340
+ TCGv_ptr t_pd = tcg_temp_new_ptr();
341
+ TCGv_ptr t_pg = tcg_temp_new_ptr();
342
+ TCGv_i32 t;
343
+ unsigned desc;
344
+
345
+ desc = DIV_ROUND_UP(pred_full_reg_size(s), 8);
346
+ desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
347
+
348
+ tcg_gen_addi_ptr(t_pd, cpu_env, pred_full_reg_offset(s, a->rd));
349
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->rn));
350
+ t = tcg_const_i32(desc);
351
+
352
+ gen_fn(t, t_pd, t_pg, t);
353
+ tcg_temp_free_ptr(t_pd);
354
+ tcg_temp_free_ptr(t_pg);
355
+
356
+ do_pred_flags(t);
357
+ tcg_temp_free_i32(t);
358
+ return true;
359
+}
360
+
361
+static bool trans_PFIRST(DisasContext *s, arg_rr_esz *a, uint32_t insn)
362
+{
363
+ return do_pfirst_pnext(s, a, gen_helper_sve_pfirst);
364
+}
365
+
366
+static bool trans_PNEXT(DisasContext *s, arg_rr_esz *a, uint32_t insn)
367
+{
368
+ return do_pfirst_pnext(s, a, gen_helper_sve_pnext);
369
+}
370
+
371
/*
372
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
373
*/
374
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
375
index XXXXXXX..XXXXXXX 100644
376
--- a/target/arm/sve.decode
377
+++ b/target/arm/sve.decode
378
@@ -XXX,XX +XXX,XX @@
379
# when creating helpers common to those for the individual
380
# instruction patterns.
381
382
+&rr_esz rd rn esz
383
&rri rd rn imm
384
&rrr_esz rd rn rm esz
385
&rprr_s rd pg rn rm s
386
@@ -XXX,XX +XXX,XX @@
387
# Named instruction formats. These are generally used to
388
# reduce the amount of duplication between instruction patterns.
389
390
+# Two operand with unused vector element size
391
+@pd_pn_e0 ........ ........ ....... rn:4 . rd:4 &rr_esz esz=0
392
+
393
+# Two operand
394
+@pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz
395
+
396
# Three operand with unused vector element size
397
@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
398
399
@@ -XXX,XX +XXX,XX @@ NAND_pppp 00100101 1. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s
400
# SVE predicate test
401
PTEST 00100101 01 010000 11 pg:4 0 rn:4 0 0000
402
403
+# SVE predicate initialize
404
+PTRUE 00100101 esz:2 01100 s:1 111000 pat:5 0 rd:4
405
+
406
+# SVE initialize FFR
407
+SETFFR 00100101 0010 1100 1001 0000 0000 0000
408
+
409
+# SVE zero predicate register
410
+PFALSE 00100101 0001 1000 1110 0100 0000 rd:4
411
+
412
+# SVE predicate read from FFR (predicated)
413
+RDFFR_p 00100101 0 s:1 0110001111000 pg:4 0 rd:4
414
+
415
+# SVE predicate read from FFR (unpredicated)
416
+RDFFR 00100101 0001 1001 1111 0000 0000 rd:4
417
+
418
+# SVE FFR write from predicate (WRFFR)
419
+WRFFR 00100101 0010 1000 1001 000 rn:4 00000
420
+
421
+# SVE predicate first active
422
+PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0
423
+
424
+# SVE predicate next active
425
+PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn
426
+
427
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
428
429
# SVE load predicate register
430
--
431
2.17.0
432
433
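Reviewer's sketch (not part of the patch): a standalone model of how do_predset in the patch above turns a PTRUE pattern into predicate bits. pred_count follows decode_pred_count from the patch, with QEMU's pow2floor replaced by a local stand-in, and last_pred_word reproduces the lastword masking. The names esz_masks, floor_pow2, pred_count and last_pred_word are hypothetical. As in the patch, fullsz is assumed to be the vector length in bytes, so a predicate holds one bit per vector byte.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as pred_esz_masks in the patch: the active bit positions
 * within a predicate word for 1-, 2-, 4- and 8-byte elements. */
static const uint64_t esz_masks[4] = {
    0xffffffffffffffffull, 0x5555555555555555ull,
    0x1111111111111111ull, 0x0101010101010101ull
};

/* Largest power of two <= x, or 0 for x == 0 (stand-in for pow2floor). */
static unsigned floor_pow2(unsigned x)
{
    unsigned p = 0;
    if (x) {
        p = 1;
        while (p <= x / 2) {
            p *= 2;
        }
    }
    return p;
}

/* Number of elements selected by a 5-bit PTRUE pattern. */
static unsigned pred_count(unsigned fullsz, int pattern, int esz)
{
    unsigned elements, bound;

    assert(esz >= 0 && esz <= 3);
    elements = fullsz >> esz;

    if (pattern == 0x0) {                               /* POW2 */
        return floor_pow2(elements);
    } else if (pattern >= 0x1 && pattern <= 0x8) {      /* VL1..VL8 */
        bound = pattern;
    } else if (pattern >= 0x9 && pattern <= 0xd) {      /* VL16..VL256 */
        bound = 16u << (pattern - 9);
    } else if (pattern == 0x1d) {                       /* MUL4 */
        return elements - elements % 4;
    } else if (pattern == 0x1e) {                       /* MUL3 */
        return elements - elements % 3;
    } else if (pattern == 0x1f) {                       /* ALL */
        return elements;
    } else {                                            /* unallocated */
        return 0;
    }
    /* A fixed VLn pattern selects nothing if the vector is too short. */
    return elements >= bound ? bound : 0;
}

/* Value of the final, possibly partial, predicate word for numelem
 * active elements of size 1 << esz bytes. */
static uint64_t last_pred_word(unsigned numelem, int esz)
{
    unsigned setsz = numelem << esz;    /* predicate bits covered */
    uint64_t w = esz_masks[esz];

    if (setsz % 64) {
        w &= ~(-1ull << (setsz % 64));
    }
    return w;
}
```

With a 256-bit vector (fullsz == 32) and byte elements, the VL16 pattern selects 16 elements, while VL64 selects none because the vector is too short. The PFALSE case relies on the same fallthrough: pattern 32 is outside every allocated range, so the count is 0.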
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20180516223007.10256-9-richard.henderson@linaro.org
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
---
7
target/arm/helper-sve.h | 145 +++++++++++++++++++++++++++
8
target/arm/sve_helper.c | 194 +++++++++++++++++++++++++++++++++++++
9
target/arm/translate-sve.c | 68 +++++++++++++
10
target/arm/sve.decode | 42 ++++++++
11
4 files changed, 449 insertions(+)
12
13
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-sve.h
16
+++ b/target/arm/helper-sve.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
18
DEF_HELPER_FLAGS_3(sve_pfirst, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_3(sve_pnext, TCG_CALL_NO_WG, i32, ptr, ptr, i32)
20
21
+DEF_HELPER_FLAGS_5(sve_and_zpzz_b, TCG_CALL_NO_RWG,
22
+ void, ptr, ptr, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_5(sve_and_zpzz_h, TCG_CALL_NO_RWG,
24
+ void, ptr, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_5(sve_and_zpzz_s, TCG_CALL_NO_RWG,
26
+ void, ptr, ptr, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_5(sve_and_zpzz_d, TCG_CALL_NO_RWG,
28
+ void, ptr, ptr, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_5(sve_eor_zpzz_b, TCG_CALL_NO_RWG,
31
+ void, ptr, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_5(sve_eor_zpzz_h, TCG_CALL_NO_RWG,
33
+ void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_5(sve_eor_zpzz_s, TCG_CALL_NO_RWG,
35
+ void, ptr, ptr, ptr, ptr, i32)
36
+DEF_HELPER_FLAGS_5(sve_eor_zpzz_d, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, i32)
38
+
39
+DEF_HELPER_FLAGS_5(sve_orr_zpzz_b, TCG_CALL_NO_RWG,
40
+ void, ptr, ptr, ptr, ptr, i32)
41
+DEF_HELPER_FLAGS_5(sve_orr_zpzz_h, TCG_CALL_NO_RWG,
42
+ void, ptr, ptr, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_5(sve_orr_zpzz_s, TCG_CALL_NO_RWG,
44
+ void, ptr, ptr, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_5(sve_orr_zpzz_d, TCG_CALL_NO_RWG,
46
+ void, ptr, ptr, ptr, ptr, i32)
47
+
48
+DEF_HELPER_FLAGS_5(sve_bic_zpzz_b, TCG_CALL_NO_RWG,
49
+ void, ptr, ptr, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_5(sve_bic_zpzz_h, TCG_CALL_NO_RWG,
51
+ void, ptr, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_5(sve_bic_zpzz_s, TCG_CALL_NO_RWG,
53
+ void, ptr, ptr, ptr, ptr, i32)
54
+DEF_HELPER_FLAGS_5(sve_bic_zpzz_d, TCG_CALL_NO_RWG,
55
+ void, ptr, ptr, ptr, ptr, i32)
56
+
57
+DEF_HELPER_FLAGS_5(sve_add_zpzz_b, TCG_CALL_NO_RWG,
58
+ void, ptr, ptr, ptr, ptr, i32)
59
+DEF_HELPER_FLAGS_5(sve_add_zpzz_h, TCG_CALL_NO_RWG,
60
+ void, ptr, ptr, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_5(sve_add_zpzz_s, TCG_CALL_NO_RWG,
62
+ void, ptr, ptr, ptr, ptr, i32)
63
+DEF_HELPER_FLAGS_5(sve_add_zpzz_d, TCG_CALL_NO_RWG,
64
+ void, ptr, ptr, ptr, ptr, i32)
65
+
66
+DEF_HELPER_FLAGS_5(sve_sub_zpzz_b, TCG_CALL_NO_RWG,
67
+ void, ptr, ptr, ptr, ptr, i32)
68
+DEF_HELPER_FLAGS_5(sve_sub_zpzz_h, TCG_CALL_NO_RWG,
69
+ void, ptr, ptr, ptr, ptr, i32)
70
+DEF_HELPER_FLAGS_5(sve_sub_zpzz_s, TCG_CALL_NO_RWG,
71
+ void, ptr, ptr, ptr, ptr, i32)
72
+DEF_HELPER_FLAGS_5(sve_sub_zpzz_d, TCG_CALL_NO_RWG,
73
+ void, ptr, ptr, ptr, ptr, i32)
74
+
75
+DEF_HELPER_FLAGS_5(sve_smax_zpzz_b, TCG_CALL_NO_RWG,
76
+ void, ptr, ptr, ptr, ptr, i32)
77
+DEF_HELPER_FLAGS_5(sve_smax_zpzz_h, TCG_CALL_NO_RWG,
78
+ void, ptr, ptr, ptr, ptr, i32)
79
+DEF_HELPER_FLAGS_5(sve_smax_zpzz_s, TCG_CALL_NO_RWG,
80
+ void, ptr, ptr, ptr, ptr, i32)
81
+DEF_HELPER_FLAGS_5(sve_smax_zpzz_d, TCG_CALL_NO_RWG,
82
+ void, ptr, ptr, ptr, ptr, i32)
83
+
84
+DEF_HELPER_FLAGS_5(sve_umax_zpzz_b, TCG_CALL_NO_RWG,
85
+ void, ptr, ptr, ptr, ptr, i32)
86
+DEF_HELPER_FLAGS_5(sve_umax_zpzz_h, TCG_CALL_NO_RWG,
87
+ void, ptr, ptr, ptr, ptr, i32)
88
+DEF_HELPER_FLAGS_5(sve_umax_zpzz_s, TCG_CALL_NO_RWG,
89
+ void, ptr, ptr, ptr, ptr, i32)
90
+DEF_HELPER_FLAGS_5(sve_umax_zpzz_d, TCG_CALL_NO_RWG,
91
+ void, ptr, ptr, ptr, ptr, i32)
92
+
93
+DEF_HELPER_FLAGS_5(sve_smin_zpzz_b, TCG_CALL_NO_RWG,
94
+ void, ptr, ptr, ptr, ptr, i32)
95
+DEF_HELPER_FLAGS_5(sve_smin_zpzz_h, TCG_CALL_NO_RWG,
96
+ void, ptr, ptr, ptr, ptr, i32)
97
+DEF_HELPER_FLAGS_5(sve_smin_zpzz_s, TCG_CALL_NO_RWG,
98
+ void, ptr, ptr, ptr, ptr, i32)
99
+DEF_HELPER_FLAGS_5(sve_smin_zpzz_d, TCG_CALL_NO_RWG,
100
+ void, ptr, ptr, ptr, ptr, i32)
101
+
102
+DEF_HELPER_FLAGS_5(sve_umin_zpzz_b, TCG_CALL_NO_RWG,
103
+ void, ptr, ptr, ptr, ptr, i32)
104
+DEF_HELPER_FLAGS_5(sve_umin_zpzz_h, TCG_CALL_NO_RWG,
105
+ void, ptr, ptr, ptr, ptr, i32)
106
+DEF_HELPER_FLAGS_5(sve_umin_zpzz_s, TCG_CALL_NO_RWG,
107
+ void, ptr, ptr, ptr, ptr, i32)
108
+DEF_HELPER_FLAGS_5(sve_umin_zpzz_d, TCG_CALL_NO_RWG,
109
+ void, ptr, ptr, ptr, ptr, i32)
110
+
111
+DEF_HELPER_FLAGS_5(sve_sabd_zpzz_b, TCG_CALL_NO_RWG,
112
+ void, ptr, ptr, ptr, ptr, i32)
113
+DEF_HELPER_FLAGS_5(sve_sabd_zpzz_h, TCG_CALL_NO_RWG,
114
+ void, ptr, ptr, ptr, ptr, i32)
115
+DEF_HELPER_FLAGS_5(sve_sabd_zpzz_s, TCG_CALL_NO_RWG,
116
+ void, ptr, ptr, ptr, ptr, i32)
117
+DEF_HELPER_FLAGS_5(sve_sabd_zpzz_d, TCG_CALL_NO_RWG,
118
+ void, ptr, ptr, ptr, ptr, i32)
119
+
120
+DEF_HELPER_FLAGS_5(sve_uabd_zpzz_b, TCG_CALL_NO_RWG,
121
+ void, ptr, ptr, ptr, ptr, i32)
122
+DEF_HELPER_FLAGS_5(sve_uabd_zpzz_h, TCG_CALL_NO_RWG,
123
+ void, ptr, ptr, ptr, ptr, i32)
124
+DEF_HELPER_FLAGS_5(sve_uabd_zpzz_s, TCG_CALL_NO_RWG,
125
+ void, ptr, ptr, ptr, ptr, i32)
126
+DEF_HELPER_FLAGS_5(sve_uabd_zpzz_d, TCG_CALL_NO_RWG,
127
+ void, ptr, ptr, ptr, ptr, i32)
128
+
129
+DEF_HELPER_FLAGS_5(sve_mul_zpzz_b, TCG_CALL_NO_RWG,
130
+ void, ptr, ptr, ptr, ptr, i32)
131
+DEF_HELPER_FLAGS_5(sve_mul_zpzz_h, TCG_CALL_NO_RWG,
132
+ void, ptr, ptr, ptr, ptr, i32)
133
+DEF_HELPER_FLAGS_5(sve_mul_zpzz_s, TCG_CALL_NO_RWG,
134
+ void, ptr, ptr, ptr, ptr, i32)
135
+DEF_HELPER_FLAGS_5(sve_mul_zpzz_d, TCG_CALL_NO_RWG,
136
+ void, ptr, ptr, ptr, ptr, i32)
137
+
138
+DEF_HELPER_FLAGS_5(sve_smulh_zpzz_b, TCG_CALL_NO_RWG,
139
+ void, ptr, ptr, ptr, ptr, i32)
140
+DEF_HELPER_FLAGS_5(sve_smulh_zpzz_h, TCG_CALL_NO_RWG,
141
+ void, ptr, ptr, ptr, ptr, i32)
142
+DEF_HELPER_FLAGS_5(sve_smulh_zpzz_s, TCG_CALL_NO_RWG,
143
+ void, ptr, ptr, ptr, ptr, i32)
144
+DEF_HELPER_FLAGS_5(sve_smulh_zpzz_d, TCG_CALL_NO_RWG,
145
+ void, ptr, ptr, ptr, ptr, i32)
146
+
147
+DEF_HELPER_FLAGS_5(sve_umulh_zpzz_b, TCG_CALL_NO_RWG,
148
+ void, ptr, ptr, ptr, ptr, i32)
149
+DEF_HELPER_FLAGS_5(sve_umulh_zpzz_h, TCG_CALL_NO_RWG,
150
+ void, ptr, ptr, ptr, ptr, i32)
151
+DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG,
152
+ void, ptr, ptr, ptr, ptr, i32)
153
+DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG,
154
+ void, ptr, ptr, ptr, ptr, i32)
155
+
156
+DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG,
157
+ void, ptr, ptr, ptr, ptr, i32)
158
+DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG,
159
+ void, ptr, ptr, ptr, ptr, i32)
160
+
161
+DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG,
162
+ void, ptr, ptr, ptr, ptr, i32)
163
+DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG,
164
+ void, ptr, ptr, ptr, ptr, i32)
165
+
166
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
167
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
168
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
169
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
170
index XXXXXXX..XXXXXXX 100644
171
--- a/target/arm/sve_helper.c
172
+++ b/target/arm/sve_helper.c
173
@@ -XXX,XX +XXX,XX @@
174
#include "tcg/tcg-gvec-desc.h"
175
176
177
+/* Note that vector data is stored in host-endian 64-bit chunks,
178
+ so addressing units smaller than that need a host-endian fixup. */
179
+#ifdef HOST_WORDS_BIGENDIAN
180
+#define H1(x) ((x) ^ 7)
181
+#define H1_2(x) ((x) ^ 6)
182
+#define H1_4(x) ((x) ^ 4)
183
+#define H2(x) ((x) ^ 3)
184
+#define H4(x) ((x) ^ 1)
185
+#else
186
+#define H1(x) (x)
187
+#define H1_2(x) (x)
188
+#define H1_4(x) (x)
189
+#define H2(x) (x)
190
+#define H4(x) (x)
191
+#endif
192
+
193
/* Return a value for NZCV as per the ARM PredTest pseudofunction.
194
*
195
* The return value has bit 31 set if N is set, bit 1 set if Z is clear,
196
@@ -XXX,XX +XXX,XX @@ LOGICAL_PPPP(sve_nand_pppp, DO_NAND)
197
#undef DO_SEL
198
#undef LOGICAL_PPPP
199
200
+/* Fully general three-operand expander, controlled by a predicate.
201
+ * This is complicated by the host-endian storage of the register file.
202
+ */
203
+/* ??? I don't expect the compiler could ever vectorize this itself.
204
+ * With some tables we can convert bit masks to byte masks, and with
205
+ * extra care wrt byte/word ordering we could use gcc generic vectors
206
+ * and do 16 bytes at a time.
207
+ */
208
+#define DO_ZPZZ(NAME, TYPE, H, OP) \
209
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
210
+{ \
211
+ intptr_t i, opr_sz = simd_oprsz(desc); \
212
+ for (i = 0; i < opr_sz; ) { \
213
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
214
+ do { \
215
+ if (pg & 1) { \
216
+ TYPE nn = *(TYPE *)(vn + H(i)); \
217
+ TYPE mm = *(TYPE *)(vm + H(i)); \
218
+ *(TYPE *)(vd + H(i)) = OP(nn, mm); \
219
+ } \
220
+ i += sizeof(TYPE), pg >>= sizeof(TYPE); \
221
+ } while (i & 15); \
222
+ } \
223
+}
224
+
225
+/* Similarly, specialized for 64-bit operands. */
226
+#define DO_ZPZZ_D(NAME, TYPE, OP) \
227
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
228
+{ \
229
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
230
+ TYPE *d = vd, *n = vn, *m = vm; \
231
+ uint8_t *pg = vg; \
232
+ for (i = 0; i < opr_sz; i += 1) { \
233
+ if (pg[H1(i)] & 1) { \
234
+ TYPE nn = n[i], mm = m[i]; \
235
+ d[i] = OP(nn, mm); \
236
+ } \
237
+ } \
238
+}
239
+
240
+#define DO_AND(N, M) (N & M)
241
+#define DO_EOR(N, M) (N ^ M)
242
+#define DO_ORR(N, M) (N | M)
243
+#define DO_BIC(N, M) (N & ~M)
244
+#define DO_ADD(N, M) (N + M)
245
+#define DO_SUB(N, M) (N - M)
246
+#define DO_MAX(N, M) ((N) >= (M) ? (N) : (M))
247
+#define DO_MIN(N, M) ((N) >= (M) ? (M) : (N))
248
+#define DO_ABD(N, M) ((N) >= (M) ? (N) - (M) : (M) - (N))
249
+#define DO_MUL(N, M) (N * M)
250
+#define DO_DIV(N, M) (M ? N / M : 0)
251
+
252
+DO_ZPZZ(sve_and_zpzz_b, uint8_t, H1, DO_AND)
253
+DO_ZPZZ(sve_and_zpzz_h, uint16_t, H1_2, DO_AND)
254
+DO_ZPZZ(sve_and_zpzz_s, uint32_t, H1_4, DO_AND)
255
+DO_ZPZZ_D(sve_and_zpzz_d, uint64_t, DO_AND)
256
+
257
+DO_ZPZZ(sve_orr_zpzz_b, uint8_t, H1, DO_ORR)
258
+DO_ZPZZ(sve_orr_zpzz_h, uint16_t, H1_2, DO_ORR)
259
+DO_ZPZZ(sve_orr_zpzz_s, uint32_t, H1_4, DO_ORR)
260
+DO_ZPZZ_D(sve_orr_zpzz_d, uint64_t, DO_ORR)
261
+
262
+DO_ZPZZ(sve_eor_zpzz_b, uint8_t, H1, DO_EOR)
263
+DO_ZPZZ(sve_eor_zpzz_h, uint16_t, H1_2, DO_EOR)
264
+DO_ZPZZ(sve_eor_zpzz_s, uint32_t, H1_4, DO_EOR)
265
+DO_ZPZZ_D(sve_eor_zpzz_d, uint64_t, DO_EOR)
266
+
267
+DO_ZPZZ(sve_bic_zpzz_b, uint8_t, H1, DO_BIC)
268
+DO_ZPZZ(sve_bic_zpzz_h, uint16_t, H1_2, DO_BIC)
269
+DO_ZPZZ(sve_bic_zpzz_s, uint32_t, H1_4, DO_BIC)
270
+DO_ZPZZ_D(sve_bic_zpzz_d, uint64_t, DO_BIC)
271
+
272
+DO_ZPZZ(sve_add_zpzz_b, uint8_t, H1, DO_ADD)
273
+DO_ZPZZ(sve_add_zpzz_h, uint16_t, H1_2, DO_ADD)
274
+DO_ZPZZ(sve_add_zpzz_s, uint32_t, H1_4, DO_ADD)
275
+DO_ZPZZ_D(sve_add_zpzz_d, uint64_t, DO_ADD)
276
+
277
+DO_ZPZZ(sve_sub_zpzz_b, uint8_t, H1, DO_SUB)
278
+DO_ZPZZ(sve_sub_zpzz_h, uint16_t, H1_2, DO_SUB)
279
+DO_ZPZZ(sve_sub_zpzz_s, uint32_t, H1_4, DO_SUB)
280
+DO_ZPZZ_D(sve_sub_zpzz_d, uint64_t, DO_SUB)
281
+
282
+DO_ZPZZ(sve_smax_zpzz_b, int8_t, H1, DO_MAX)
283
+DO_ZPZZ(sve_smax_zpzz_h, int16_t, H1_2, DO_MAX)
284
+DO_ZPZZ(sve_smax_zpzz_s, int32_t, H1_4, DO_MAX)
285
+DO_ZPZZ_D(sve_smax_zpzz_d, int64_t, DO_MAX)
286
+
287
+DO_ZPZZ(sve_umax_zpzz_b, uint8_t, H1, DO_MAX)
288
+DO_ZPZZ(sve_umax_zpzz_h, uint16_t, H1_2, DO_MAX)
289
+DO_ZPZZ(sve_umax_zpzz_s, uint32_t, H1_4, DO_MAX)
290
+DO_ZPZZ_D(sve_umax_zpzz_d, uint64_t, DO_MAX)
291
+
292
+DO_ZPZZ(sve_smin_zpzz_b, int8_t, H1, DO_MIN)
293
+DO_ZPZZ(sve_smin_zpzz_h, int16_t, H1_2, DO_MIN)
294
+DO_ZPZZ(sve_smin_zpzz_s, int32_t, H1_4, DO_MIN)
295
+DO_ZPZZ_D(sve_smin_zpzz_d, int64_t, DO_MIN)
296
+
297
+DO_ZPZZ(sve_umin_zpzz_b, uint8_t, H1, DO_MIN)
298
+DO_ZPZZ(sve_umin_zpzz_h, uint16_t, H1_2, DO_MIN)
299
+DO_ZPZZ(sve_umin_zpzz_s, uint32_t, H1_4, DO_MIN)
300
+DO_ZPZZ_D(sve_umin_zpzz_d, uint64_t, DO_MIN)
301
+
302
+DO_ZPZZ(sve_sabd_zpzz_b, int8_t, H1, DO_ABD)
303
+DO_ZPZZ(sve_sabd_zpzz_h, int16_t, H1_2, DO_ABD)
304
+DO_ZPZZ(sve_sabd_zpzz_s, int32_t, H1_4, DO_ABD)
305
+DO_ZPZZ_D(sve_sabd_zpzz_d, int64_t, DO_ABD)
306
+
307
+DO_ZPZZ(sve_uabd_zpzz_b, uint8_t, H1, DO_ABD)
308
+DO_ZPZZ(sve_uabd_zpzz_h, uint16_t, H1_2, DO_ABD)
309
+DO_ZPZZ(sve_uabd_zpzz_s, uint32_t, H1_4, DO_ABD)
310
+DO_ZPZZ_D(sve_uabd_zpzz_d, uint64_t, DO_ABD)
311
+
312
+/* Because the computation type is at least twice as large as required,
313
+ these work for both signed and unsigned source types. */
314
+static inline uint8_t do_mulh_b(int32_t n, int32_t m)
315
+{
316
+ return (n * m) >> 8;
317
+}
318
+
319
+static inline uint16_t do_mulh_h(int32_t n, int32_t m)
320
+{
321
+ return (n * m) >> 16;
322
+}
323
+
324
+static inline uint32_t do_mulh_s(int64_t n, int64_t m)
325
+{
326
+ return (n * m) >> 32;
327
+}
328
+
329
+static inline uint64_t do_smulh_d(uint64_t n, uint64_t m)
330
+{
331
+ uint64_t lo, hi;
332
+ muls64(&lo, &hi, n, m);
333
+ return hi;
334
+}
335
+
336
+static inline uint64_t do_umulh_d(uint64_t n, uint64_t m)
337
+{
338
+ uint64_t lo, hi;
339
+ mulu64(&lo, &hi, n, m);
340
+ return hi;
341
+}
342
+
343
+DO_ZPZZ(sve_mul_zpzz_b, uint8_t, H1, DO_MUL)
344
+DO_ZPZZ(sve_mul_zpzz_h, uint16_t, H1_2, DO_MUL)
345
+DO_ZPZZ(sve_mul_zpzz_s, uint32_t, H1_4, DO_MUL)
346
+DO_ZPZZ_D(sve_mul_zpzz_d, uint64_t, DO_MUL)
347
+
348
+DO_ZPZZ(sve_smulh_zpzz_b, int8_t, H1, do_mulh_b)
349
+DO_ZPZZ(sve_smulh_zpzz_h, int16_t, H1_2, do_mulh_h)
350
+DO_ZPZZ(sve_smulh_zpzz_s, int32_t, H1_4, do_mulh_s)
351
+DO_ZPZZ_D(sve_smulh_zpzz_d, uint64_t, do_smulh_d)
352
+
353
+DO_ZPZZ(sve_umulh_zpzz_b, uint8_t, H1, do_mulh_b)
354
+DO_ZPZZ(sve_umulh_zpzz_h, uint16_t, H1_2, do_mulh_h)
355
+DO_ZPZZ(sve_umulh_zpzz_s, uint32_t, H1_4, do_mulh_s)
356
+DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d)
357
+
358
+DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV)
359
+DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
360
+
361
+DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
362
+DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
363
+
364
+#undef DO_ZPZZ
365
+#undef DO_ZPZZ_D
366
+#undef DO_AND
367
+#undef DO_ORR
368
+#undef DO_EOR
369
+#undef DO_BIC
370
+#undef DO_ADD
371
+#undef DO_SUB
372
+#undef DO_MAX
373
+#undef DO_MIN
374
+#undef DO_ABD
375
+#undef DO_MUL
376
+#undef DO_DIV
377
+
378
/* Similar to the ARM LastActiveElement pseudocode function, except the
379
result is multiplied by the element size. This includes the not found
380
indication; e.g. not found for esz=3 is -8. */
381
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
382
index XXXXXXX..XXXXXXX 100644
383
--- a/target/arm/translate-sve.c
384
+++ b/target/arm/translate-sve.c
385
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
386
return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
387
}
388
389
+/*
390
+ *** SVE Integer Arithmetic - Binary Predicated Group
391
+ */
392
+
393
+static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn)
394
+{
395
+ unsigned vsz = vec_full_reg_size(s);
396
+ if (fn == NULL) {
397
+ return false;
398
+ }
399
+ if (sve_access_check(s)) {
400
+ tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
401
+ vec_full_reg_offset(s, a->rn),
402
+ vec_full_reg_offset(s, a->rm),
403
+ pred_full_reg_offset(s, a->pg),
404
+ vsz, vsz, 0, fn);
405
+ }
406
+ return true;
407
+}
408
+
409
+#define DO_ZPZZ(NAME, name) \
410
+static bool trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \
411
+ uint32_t insn) \
412
+{ \
413
+ static gen_helper_gvec_4 * const fns[4] = { \
414
+ gen_helper_sve_##name##_zpzz_b, gen_helper_sve_##name##_zpzz_h, \
415
+ gen_helper_sve_##name##_zpzz_s, gen_helper_sve_##name##_zpzz_d, \
416
+ }; \
417
+ return do_zpzz_ool(s, a, fns[a->esz]); \
418
+}
419
+
420
+DO_ZPZZ(AND, and)
421
+DO_ZPZZ(EOR, eor)
422
+DO_ZPZZ(ORR, orr)
423
+DO_ZPZZ(BIC, bic)
424
+
425
+DO_ZPZZ(ADD, add)
426
+DO_ZPZZ(SUB, sub)
427
+
428
+DO_ZPZZ(SMAX, smax)
429
+DO_ZPZZ(UMAX, umax)
430
+DO_ZPZZ(SMIN, smin)
431
+DO_ZPZZ(UMIN, umin)
432
+DO_ZPZZ(SABD, sabd)
433
+DO_ZPZZ(UABD, uabd)
434
+
435
+DO_ZPZZ(MUL, mul)
436
+DO_ZPZZ(SMULH, smulh)
437
+DO_ZPZZ(UMULH, umulh)
438
+
439
+static bool trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
440
+{
441
+ static gen_helper_gvec_4 * const fns[4] = {
442
+ NULL, NULL, gen_helper_sve_sdiv_zpzz_s, gen_helper_sve_sdiv_zpzz_d
443
+ };
444
+ return do_zpzz_ool(s, a, fns[a->esz]);
445
+}
446
+
447
+static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
448
+{
449
+ static gen_helper_gvec_4 * const fns[4] = {
450
+ NULL, NULL, gen_helper_sve_udiv_zpzz_s, gen_helper_sve_udiv_zpzz_d
451
+ };
452
+ return do_zpzz_ool(s, a, fns[a->esz]);
453
+}
454
+
455
+#undef DO_ZPZZ
456
+
457
/*
458
*** SVE Predicate Logical Operations Group
459
*/
460
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
461
index XXXXXXX..XXXXXXX 100644
462
--- a/target/arm/sve.decode
463
+++ b/target/arm/sve.decode
464
@@ -XXX,XX +XXX,XX @@
465
466
%imm9_16_10 16:s6 10:3
467
468
+# Either a copy of rd (at bit 0), or a different source
469
+# as propagated via the MOVPRFX instruction.
470
+%reg_movprfx 0:5
471
+
472
###########################################################################
473
# Named attribute sets. These are used to make nice(er) names
474
# when creating helpers common to those for the individual
475
@@ -XXX,XX +XXX,XX @@
476
&rri rd rn imm
477
&rrr_esz rd rn rm esz
478
&rprr_s rd pg rn rm s
479
+&rprr_esz rd pg rn rm esz
480
481
###########################################################################
482
# Named instruction formats. These are generally used to
483
@@ -XXX,XX +XXX,XX @@
484
# Three predicate operand, with governing predicate, flag setting
485
@pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s
486
487
+# Two register operand, with governing predicate, vector element size
488
+@rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \
489
+ &rprr_esz rn=%reg_movprfx
490
+@rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
491
+ &rprr_esz rm=%reg_movprfx
492
+
493
# Basic Load/Store with 9-bit immediate offset
494
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
495
&rri imm=%imm9_16_10
496
@@ -XXX,XX +XXX,XX @@
497
###########################################################################
498
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
499
500
+### SVE Integer Arithmetic - Binary Predicated Group
501
+
502
+# SVE bitwise logical vector operations (predicated)
503
+ORR_zpzz 00000100 .. 011 000 000 ... ..... ..... @rdn_pg_rm
504
+EOR_zpzz 00000100 .. 011 001 000 ... ..... ..... @rdn_pg_rm
505
+AND_zpzz 00000100 .. 011 010 000 ... ..... ..... @rdn_pg_rm
506
+BIC_zpzz 00000100 .. 011 011 000 ... ..... ..... @rdn_pg_rm
507
+
508
+# SVE integer add/subtract vectors (predicated)
509
+ADD_zpzz 00000100 .. 000 000 000 ... ..... ..... @rdn_pg_rm
510
+SUB_zpzz 00000100 .. 000 001 000 ... ..... ..... @rdn_pg_rm
511
+SUB_zpzz 00000100 .. 000 011 000 ... ..... ..... @rdm_pg_rn # SUBR
512
+
513
+# SVE integer min/max/difference (predicated)
514
+SMAX_zpzz 00000100 .. 001 000 000 ... ..... ..... @rdn_pg_rm
515
+UMAX_zpzz 00000100 .. 001 001 000 ... ..... ..... @rdn_pg_rm
516
+SMIN_zpzz 00000100 .. 001 010 000 ... ..... ..... @rdn_pg_rm
517
+UMIN_zpzz 00000100 .. 001 011 000 ... ..... ..... @rdn_pg_rm
518
+SABD_zpzz 00000100 .. 001 100 000 ... ..... ..... @rdn_pg_rm
519
+UABD_zpzz 00000100 .. 001 101 000 ... ..... ..... @rdn_pg_rm
520
+
521
+# SVE integer multiply/divide (predicated)
522
+MUL_zpzz 00000100 .. 010 000 000 ... ..... ..... @rdn_pg_rm
523
+SMULH_zpzz 00000100 .. 010 010 000 ... ..... ..... @rdn_pg_rm
524
+UMULH_zpzz 00000100 .. 010 011 000 ... ..... ..... @rdn_pg_rm
525
+# Note that divide requires size >= 2; below 2 is unallocated.
526
+SDIV_zpzz 00000100 .. 010 100 000 ... ..... ..... @rdn_pg_rm
527
+UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm
528
+SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn # SDIVR
529
+UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... @rdm_pg_rn # UDIVR
530
+
531
### SVE Logical - Unpredicated Group
532
533
# SVE bitwise logical operations (unpredicated)
534
--
535
2.17.0
536
537
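As a rough illustration of the predicated expander in the patch above, the following stand-alone sketch adds 16-bit elements under a governing predicate. It is not QEMU code: the name sketch_add_zpzz_h and its typed-pointer signature are hypothetical, and the predicate is read byte by byte so that the example behaves the same on any host, whereas the real helper loads a host-endian uint16_t and relies on the H1_2 fixup. As in the helper, one predicate bit corresponds to each vector byte, and only the lowest bit belonging to each element is tested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone analogue of DO_ZPZZ for uint16_t elements.
 * opr_sz is the vector length in bytes and must be a multiple of 16.
 * pg_bytes holds one predicate bit per vector byte (opr_sz / 8 bytes).
 */
static void sketch_add_zpzz_h(uint16_t *d, const uint16_t *n,
                              const uint16_t *m, const uint8_t *pg_bytes,
                              size_t opr_sz)
{
    assert(opr_sz % 16 == 0);
    for (size_t i = 0; i < opr_sz; ) {
        /* Fetch the 16 predicate bits that govern the next 16 vector bytes. */
        uint16_t pg = (uint16_t)(pg_bytes[i >> 3] |
                                 (pg_bytes[(i >> 3) + 1] << 8));
        do {
            if (pg & 1) {
                /* Active element: compute and store the result. */
                d[i / 2] = (uint16_t)(n[i / 2] + m[i / 2]);
            }
            /* Inactive elements leave the destination untouched. */
            i += sizeof(uint16_t);
            pg >>= sizeof(uint16_t);
        } while (i & 15);
    }
}
```

With predicate bytes {0x05, 0x10}, bits 0, 2 and 12 are set, so elements 0, 1 and 6 are written and the rest of the destination is left unchanged.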
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
This covers the SVE Integer Reduction group, excepting MOVPRFX, which
4
isn't a reduction. Presumably MOVPRFX is placed within the group because of its encoding.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20180516223007.10256-10-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/helper-sve.h | 44 ++++++++++++++++++
12
target/arm/sve_helper.c | 91 ++++++++++++++++++++++++++++++++++++++
13
target/arm/translate-sve.c | 68 ++++++++++++++++++++++++++++
14
target/arm/sve.decode | 22 +++++++++
15
4 files changed, 225 insertions(+)
16
17
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-sve.h
20
+++ b/target/arm/helper-sve.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG,
22
DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG,
23
void, ptr, ptr, ptr, ptr, i32)
24
25
+DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_3(sve_orv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_3(sve_eorv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_3(sve_eorv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_3(sve_eorv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_3(sve_eorv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
34
+
35
+DEF_HELPER_FLAGS_3(sve_andv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
36
+DEF_HELPER_FLAGS_3(sve_andv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
37
+DEF_HELPER_FLAGS_3(sve_andv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_3(sve_andv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
39
+
40
+DEF_HELPER_FLAGS_3(sve_saddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
41
+DEF_HELPER_FLAGS_3(sve_saddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
42
+DEF_HELPER_FLAGS_3(sve_saddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
43
+
44
+DEF_HELPER_FLAGS_3(sve_uaddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_3(sve_uaddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_3(sve_uaddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_3(sve_uaddv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
48
+
49
+DEF_HELPER_FLAGS_3(sve_smaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_3(sve_smaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
51
+DEF_HELPER_FLAGS_3(sve_smaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_3(sve_smaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
53
+
54
+DEF_HELPER_FLAGS_3(sve_umaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
55
+DEF_HELPER_FLAGS_3(sve_umaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
56
+DEF_HELPER_FLAGS_3(sve_umaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
57
+DEF_HELPER_FLAGS_3(sve_umaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
58
+
59
+DEF_HELPER_FLAGS_3(sve_sminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
60
+DEF_HELPER_FLAGS_3(sve_sminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_3(sve_sminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
62
+DEF_HELPER_FLAGS_3(sve_sminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
63
+
64
+DEF_HELPER_FLAGS_3(sve_uminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
65
+DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
66
+DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
67
+DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
68
+
69
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
70
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
71
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
72
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
73
index XXXXXXX..XXXXXXX 100644
74
--- a/target/arm/sve_helper.c
75
+++ b/target/arm/sve_helper.c
76
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
77
78
#undef DO_ZPZZ
79
#undef DO_ZPZZ_D
80
+
81
+/* Two-operand reduction expander, controlled by a predicate.
82
+ * The difference between TYPERED and TYPERET has to do with
83
+ * sign-extension. E.g. for SMAX, TYPERED must be signed,
84
+ * but TYPERET must be unsigned so that e.g. a 32-bit value
85
+ * is not sign-extended to the ABI uint64_t return type.
86
+ */
87
+/* ??? If we were to vectorize this by hand the reduction ordering
88
+ * would change. For integer operands, this is perfectly fine.
89
+ */
90
+#define DO_VPZ(NAME, TYPEELT, TYPERED, TYPERET, H, INIT, OP) \
91
+uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc) \
92
+{ \
93
+ intptr_t i, opr_sz = simd_oprsz(desc); \
94
+ TYPERED ret = INIT; \
95
+ for (i = 0; i < opr_sz; ) { \
96
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
97
+ do { \
98
+ if (pg & 1) { \
99
+ TYPEELT nn = *(TYPEELT *)(vn + H(i)); \
100
+ ret = OP(ret, nn); \
101
+ } \
102
+ i += sizeof(TYPEELT), pg >>= sizeof(TYPEELT); \
103
+ } while (i & 15); \
104
+ } \
105
+ return (TYPERET)ret; \
106
+}
107
+
108
+#define DO_VPZ_D(NAME, TYPEE, TYPER, INIT, OP) \
109
+uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc) \
110
+{ \
111
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
112
+ TYPEE *n = vn; \
113
+ uint8_t *pg = vg; \
114
+ TYPER ret = INIT; \
115
+ for (i = 0; i < opr_sz; i += 1) { \
116
+ if (pg[H1(i)] & 1) { \
117
+ TYPEE nn = n[i]; \
118
+ ret = OP(ret, nn); \
119
+ } \
120
+ } \
121
+ return ret; \
122
+}
123
+
124
+DO_VPZ(sve_orv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_ORR)
125
+DO_VPZ(sve_orv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_ORR)
126
+DO_VPZ(sve_orv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_ORR)
127
+DO_VPZ_D(sve_orv_d, uint64_t, uint64_t, 0, DO_ORR)
128
+
129
+DO_VPZ(sve_eorv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_EOR)
130
+DO_VPZ(sve_eorv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_EOR)
131
+DO_VPZ(sve_eorv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_EOR)
132
+DO_VPZ_D(sve_eorv_d, uint64_t, uint64_t, 0, DO_EOR)
133
+
134
+DO_VPZ(sve_andv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_AND)
135
+DO_VPZ(sve_andv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_AND)
136
+DO_VPZ(sve_andv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_AND)
137
+DO_VPZ_D(sve_andv_d, uint64_t, uint64_t, -1, DO_AND)
138
+
139
+DO_VPZ(sve_saddv_b, int8_t, uint64_t, uint64_t, H1, 0, DO_ADD)
140
+DO_VPZ(sve_saddv_h, int16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD)
141
+DO_VPZ(sve_saddv_s, int32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD)
142
+
143
+DO_VPZ(sve_uaddv_b, uint8_t, uint64_t, uint64_t, H1, 0, DO_ADD)
144
+DO_VPZ(sve_uaddv_h, uint16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD)
145
+DO_VPZ(sve_uaddv_s, uint32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD)
146
+DO_VPZ_D(sve_uaddv_d, uint64_t, uint64_t, 0, DO_ADD)
147
+
148
+DO_VPZ(sve_smaxv_b, int8_t, int8_t, uint8_t, H1, INT8_MIN, DO_MAX)
149
+DO_VPZ(sve_smaxv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MIN, DO_MAX)
150
+DO_VPZ(sve_smaxv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MIN, DO_MAX)
151
+DO_VPZ_D(sve_smaxv_d, int64_t, int64_t, INT64_MIN, DO_MAX)
152
+
153
+DO_VPZ(sve_umaxv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_MAX)
154
+DO_VPZ(sve_umaxv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_MAX)
155
+DO_VPZ(sve_umaxv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_MAX)
156
+DO_VPZ_D(sve_umaxv_d, uint64_t, uint64_t, 0, DO_MAX)
157
+
158
+DO_VPZ(sve_sminv_b, int8_t, int8_t, uint8_t, H1, INT8_MAX, DO_MIN)
159
+DO_VPZ(sve_sminv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MAX, DO_MIN)
160
+DO_VPZ(sve_sminv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MAX, DO_MIN)
161
+DO_VPZ_D(sve_sminv_d, int64_t, int64_t, INT64_MAX, DO_MIN)
162
+
163
+DO_VPZ(sve_uminv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_MIN)
164
+DO_VPZ(sve_uminv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_MIN)
165
+DO_VPZ(sve_uminv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_MIN)
166
+DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
167
+
168
+#undef DO_VPZ
169
+#undef DO_VPZ_D
170
+
171
#undef DO_AND
172
#undef DO_ORR
173
#undef DO_EOR
174
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
175
index XXXXXXX..XXXXXXX 100644
176
--- a/target/arm/translate-sve.c
177
+++ b/target/arm/translate-sve.c
178
@@ -XXX,XX +XXX,XX @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
179
180
#undef DO_ZPZZ
181
182
+/*
183
+ *** SVE Integer Reduction Group
184
+ */
185
+
186
+typedef void gen_helper_gvec_reduc(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_i32);
187
+static bool do_vpz_ool(DisasContext *s, arg_rpr_esz *a,
188
+ gen_helper_gvec_reduc *fn)
189
+{
190
+ unsigned vsz = vec_full_reg_size(s);
191
+ TCGv_ptr t_zn, t_pg;
192
+ TCGv_i32 desc;
193
+ TCGv_i64 temp;
194
+
195
+ if (fn == NULL) {
196
+ return false;
197
+ }
198
+ if (!sve_access_check(s)) {
199
+ return true;
200
+ }
201
+
202
+ desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
203
+ temp = tcg_temp_new_i64();
204
+ t_zn = tcg_temp_new_ptr();
205
+ t_pg = tcg_temp_new_ptr();
206
+
207
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn));
208
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
209
+ fn(temp, t_zn, t_pg, desc);
210
+ tcg_temp_free_ptr(t_zn);
211
+ tcg_temp_free_ptr(t_pg);
212
+ tcg_temp_free_i32(desc);
213
+
214
+ write_fp_dreg(s, a->rd, temp);
215
+ tcg_temp_free_i64(temp);
216
+ return true;
217
+}
218
+
219
+#define DO_VPZ(NAME, name) \
220
+static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
221
+{ \
222
+ static gen_helper_gvec_reduc * const fns[4] = { \
223
+ gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \
224
+ gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \
225
+ }; \
226
+ return do_vpz_ool(s, a, fns[a->esz]); \
227
+}
228
+
229
+DO_VPZ(ORV, orv)
230
+DO_VPZ(ANDV, andv)
231
+DO_VPZ(EORV, eorv)
232
+
233
+DO_VPZ(UADDV, uaddv)
234
+DO_VPZ(SMAXV, smaxv)
235
+DO_VPZ(UMAXV, umaxv)
236
+DO_VPZ(SMINV, sminv)
237
+DO_VPZ(UMINV, uminv)
238
+
239
+static bool trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
240
+{
241
+ static gen_helper_gvec_reduc * const fns[4] = {
242
+ gen_helper_sve_saddv_b, gen_helper_sve_saddv_h,
243
+ gen_helper_sve_saddv_s, NULL
244
+ };
245
+ return do_vpz_ool(s, a, fns[a->esz]);
246
+}
247
+
248
+#undef DO_VPZ
249
+
250
/*
251
*** SVE Predicate Logical Operations Group
252
*/
253
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
254
index XXXXXXX..XXXXXXX 100644
255
--- a/target/arm/sve.decode
256
+++ b/target/arm/sve.decode
257
@@ -XXX,XX +XXX,XX @@
258
&rr_esz rd rn esz
259
&rri rd rn imm
260
&rrr_esz rd rn rm esz
261
+&rpr_esz rd pg rn esz
262
&rprr_s rd pg rn rm s
263
&rprr_esz rd pg rn rm esz
264
265
@@ -XXX,XX +XXX,XX @@
266
@rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
267
&rprr_esz rm=%reg_movprfx
268
269
+# One register operand, with governing predicate, vector element size
270
+@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
271
+
272
# Basic Load/Store with 9-bit immediate offset
273
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
274
&rri imm=%imm9_16_10
275
@@ -XXX,XX +XXX,XX @@ UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm
276
SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn # SDIVR
277
UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... @rdm_pg_rn # UDIVR
278
279
+### SVE Integer Reduction Group
280
+
281
+# SVE bitwise logical reduction (predicated)
282
+ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn
283
+EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn
284
+ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn
285
+
286
+# SVE integer add reduction (predicated)
287
+# Note that saddv requires size != 3.
288
+UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn
289
+SADDV 00000100 .. 000 000 001 ... ..... ..... @rd_pg_rn
290
+
291
+# SVE integer min/max reduction (predicated)
292
+SMAXV 00000100 .. 001 000 001 ... ..... ..... @rd_pg_rn
293
+UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn
294
+SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn
295
+UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn
296
+
297
### SVE Logical - Unpredicated Group
298
299
# SVE bitwise logical operations (unpredicated)
300
--
301
2.17.0
302
303
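The TYPERED/TYPERET distinction described for the reduction expander can be seen in a small stand-alone sketch of a signed 32-bit maximum reduction. This is an illustration under stated assumptions rather than the helper itself: sketch_smaxv_s and its signature are hypothetical, and the predicate is read byte by byte to stay host-independent. The accumulator is signed so that comparisons are signed, and the final cast through uint32_t keeps a negative result from being sign-extended into the upper half of the 64-bit return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone analogue of DO_VPZ for SMAXV on int32_t.
 * opr_sz is the vector length in bytes and must be a multiple of 16.
 * pg_bytes holds one predicate bit per vector byte (opr_sz / 8 bytes).
 */
static uint64_t sketch_smaxv_s(const int32_t *n, const uint8_t *pg_bytes,
                               size_t opr_sz)
{
    int32_t ret = INT32_MIN;   /* identity value for a signed maximum */

    assert(opr_sz % 16 == 0);
    for (size_t i = 0; i < opr_sz; ) {
        /* Fetch the 16 predicate bits that govern the next 16 vector bytes. */
        uint16_t pg = (uint16_t)(pg_bytes[i >> 3] |
                                 (pg_bytes[(i >> 3) + 1] << 8));
        do {
            if (pg & 1) {
                int32_t nn = n[i / 4];
                ret = ret >= nn ? ret : nn;
            }
            i += sizeof(int32_t);
            pg >>= sizeof(int32_t);
        } while (i & 15);
    }
    /* Cast via the unsigned 32-bit type so the result is zero-extended. */
    return (uint32_t)ret;
}
```

Returning ret directly would convert a negative int32_t to uint64_t by sign extension, which would set the upper 32 bits of the destination register; the unsigned cast avoids that.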
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-11-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 25 ++++
9
target/arm/sve_helper.c | 264 +++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 130 ++++++++++++++++++
11
target/arm/sve.decode | 26 ++++
12
4 files changed, 445 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_3(sve_clr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
26
+
27
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+
32
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
41
+
42
+DEF_HELPER_FLAGS_4(sve_asrd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
+DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
46
+
47
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
49
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
50
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
51
index XXXXXXX..XXXXXXX 100644
52
--- a/target/arm/sve_helper.c
53
+++ b/target/arm/sve_helper.c
54
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words)
55
return flags;
56
}
57
58
+/* Expand active predicate bits to bytes, for byte elements.
59
+ * for (i = 0; i < 256; ++i) {
60
+ * unsigned long m = 0;
61
+ * for (j = 0; j < 8; j++) {
62
+ * if ((i >> j) & 1) {
63
+ * m |= 0xfful << (j << 3);
64
+ * }
65
+ * }
66
+ * printf("0x%016lx,\n", m);
67
+ * }
68
+ */
69
+static inline uint64_t expand_pred_b(uint8_t byte)
70
+{
71
+ static const uint64_t word[256] = {
72
+ 0x0000000000000000, 0x00000000000000ff, 0x000000000000ff00,
73
+ 0x000000000000ffff, 0x0000000000ff0000, 0x0000000000ff00ff,
74
+ 0x0000000000ffff00, 0x0000000000ffffff, 0x00000000ff000000,
75
+ 0x00000000ff0000ff, 0x00000000ff00ff00, 0x00000000ff00ffff,
76
+ 0x00000000ffff0000, 0x00000000ffff00ff, 0x00000000ffffff00,
77
+ 0x00000000ffffffff, 0x000000ff00000000, 0x000000ff000000ff,
78
+ 0x000000ff0000ff00, 0x000000ff0000ffff, 0x000000ff00ff0000,
79
+ 0x000000ff00ff00ff, 0x000000ff00ffff00, 0x000000ff00ffffff,
80
+ 0x000000ffff000000, 0x000000ffff0000ff, 0x000000ffff00ff00,
81
+ 0x000000ffff00ffff, 0x000000ffffff0000, 0x000000ffffff00ff,
82
+ 0x000000ffffffff00, 0x000000ffffffffff, 0x0000ff0000000000,
83
+ 0x0000ff00000000ff, 0x0000ff000000ff00, 0x0000ff000000ffff,
84
+ 0x0000ff0000ff0000, 0x0000ff0000ff00ff, 0x0000ff0000ffff00,
85
+ 0x0000ff0000ffffff, 0x0000ff00ff000000, 0x0000ff00ff0000ff,
86
+ 0x0000ff00ff00ff00, 0x0000ff00ff00ffff, 0x0000ff00ffff0000,
87
+ 0x0000ff00ffff00ff, 0x0000ff00ffffff00, 0x0000ff00ffffffff,
88
+ 0x0000ffff00000000, 0x0000ffff000000ff, 0x0000ffff0000ff00,
89
+ 0x0000ffff0000ffff, 0x0000ffff00ff0000, 0x0000ffff00ff00ff,
90
+ 0x0000ffff00ffff00, 0x0000ffff00ffffff, 0x0000ffffff000000,
91
+ 0x0000ffffff0000ff, 0x0000ffffff00ff00, 0x0000ffffff00ffff,
92
+ 0x0000ffffffff0000, 0x0000ffffffff00ff, 0x0000ffffffffff00,
93
+ 0x0000ffffffffffff, 0x00ff000000000000, 0x00ff0000000000ff,
94
+ 0x00ff00000000ff00, 0x00ff00000000ffff, 0x00ff000000ff0000,
95
+ 0x00ff000000ff00ff, 0x00ff000000ffff00, 0x00ff000000ffffff,
96
+ 0x00ff0000ff000000, 0x00ff0000ff0000ff, 0x00ff0000ff00ff00,
97
+ 0x00ff0000ff00ffff, 0x00ff0000ffff0000, 0x00ff0000ffff00ff,
98
+ 0x00ff0000ffffff00, 0x00ff0000ffffffff, 0x00ff00ff00000000,
99
+ 0x00ff00ff000000ff, 0x00ff00ff0000ff00, 0x00ff00ff0000ffff,
100
+ 0x00ff00ff00ff0000, 0x00ff00ff00ff00ff, 0x00ff00ff00ffff00,
101
+ 0x00ff00ff00ffffff, 0x00ff00ffff000000, 0x00ff00ffff0000ff,
102
+ 0x00ff00ffff00ff00, 0x00ff00ffff00ffff, 0x00ff00ffffff0000,
103
+ 0x00ff00ffffff00ff, 0x00ff00ffffffff00, 0x00ff00ffffffffff,
104
+ 0x00ffff0000000000, 0x00ffff00000000ff, 0x00ffff000000ff00,
105
+ 0x00ffff000000ffff, 0x00ffff0000ff0000, 0x00ffff0000ff00ff,
106
+ 0x00ffff0000ffff00, 0x00ffff0000ffffff, 0x00ffff00ff000000,
107
+ 0x00ffff00ff0000ff, 0x00ffff00ff00ff00, 0x00ffff00ff00ffff,
108
+ 0x00ffff00ffff0000, 0x00ffff00ffff00ff, 0x00ffff00ffffff00,
109
+ 0x00ffff00ffffffff, 0x00ffffff00000000, 0x00ffffff000000ff,
110
+ 0x00ffffff0000ff00, 0x00ffffff0000ffff, 0x00ffffff00ff0000,
111
+ 0x00ffffff00ff00ff, 0x00ffffff00ffff00, 0x00ffffff00ffffff,
112
+ 0x00ffffffff000000, 0x00ffffffff0000ff, 0x00ffffffff00ff00,
113
+ 0x00ffffffff00ffff, 0x00ffffffffff0000, 0x00ffffffffff00ff,
114
+ 0x00ffffffffffff00, 0x00ffffffffffffff, 0xff00000000000000,
115
+ 0xff000000000000ff, 0xff0000000000ff00, 0xff0000000000ffff,
116
+ 0xff00000000ff0000, 0xff00000000ff00ff, 0xff00000000ffff00,
117
+ 0xff00000000ffffff, 0xff000000ff000000, 0xff000000ff0000ff,
118
+ 0xff000000ff00ff00, 0xff000000ff00ffff, 0xff000000ffff0000,
119
+ 0xff000000ffff00ff, 0xff000000ffffff00, 0xff000000ffffffff,
120
+ 0xff0000ff00000000, 0xff0000ff000000ff, 0xff0000ff0000ff00,
121
+ 0xff0000ff0000ffff, 0xff0000ff00ff0000, 0xff0000ff00ff00ff,
122
+ 0xff0000ff00ffff00, 0xff0000ff00ffffff, 0xff0000ffff000000,
123
+ 0xff0000ffff0000ff, 0xff0000ffff00ff00, 0xff0000ffff00ffff,
124
+ 0xff0000ffffff0000, 0xff0000ffffff00ff, 0xff0000ffffffff00,
125
+ 0xff0000ffffffffff, 0xff00ff0000000000, 0xff00ff00000000ff,
126
+ 0xff00ff000000ff00, 0xff00ff000000ffff, 0xff00ff0000ff0000,
127
+ 0xff00ff0000ff00ff, 0xff00ff0000ffff00, 0xff00ff0000ffffff,
128
+ 0xff00ff00ff000000, 0xff00ff00ff0000ff, 0xff00ff00ff00ff00,
129
+ 0xff00ff00ff00ffff, 0xff00ff00ffff0000, 0xff00ff00ffff00ff,
130
+ 0xff00ff00ffffff00, 0xff00ff00ffffffff, 0xff00ffff00000000,
131
+ 0xff00ffff000000ff, 0xff00ffff0000ff00, 0xff00ffff0000ffff,
132
+ 0xff00ffff00ff0000, 0xff00ffff00ff00ff, 0xff00ffff00ffff00,
133
+ 0xff00ffff00ffffff, 0xff00ffffff000000, 0xff00ffffff0000ff,
134
+ 0xff00ffffff00ff00, 0xff00ffffff00ffff, 0xff00ffffffff0000,
135
+ 0xff00ffffffff00ff, 0xff00ffffffffff00, 0xff00ffffffffffff,
136
+ 0xffff000000000000, 0xffff0000000000ff, 0xffff00000000ff00,
137
+ 0xffff00000000ffff, 0xffff000000ff0000, 0xffff000000ff00ff,
138
+ 0xffff000000ffff00, 0xffff000000ffffff, 0xffff0000ff000000,
139
+ 0xffff0000ff0000ff, 0xffff0000ff00ff00, 0xffff0000ff00ffff,
140
+ 0xffff0000ffff0000, 0xffff0000ffff00ff, 0xffff0000ffffff00,
141
+ 0xffff0000ffffffff, 0xffff00ff00000000, 0xffff00ff000000ff,
142
+ 0xffff00ff0000ff00, 0xffff00ff0000ffff, 0xffff00ff00ff0000,
143
+ 0xffff00ff00ff00ff, 0xffff00ff00ffff00, 0xffff00ff00ffffff,
144
+ 0xffff00ffff000000, 0xffff00ffff0000ff, 0xffff00ffff00ff00,
145
+ 0xffff00ffff00ffff, 0xffff00ffffff0000, 0xffff00ffffff00ff,
146
+ 0xffff00ffffffff00, 0xffff00ffffffffff, 0xffffff0000000000,
147
+ 0xffffff00000000ff, 0xffffff000000ff00, 0xffffff000000ffff,
148
+ 0xffffff0000ff0000, 0xffffff0000ff00ff, 0xffffff0000ffff00,
149
+ 0xffffff0000ffffff, 0xffffff00ff000000, 0xffffff00ff0000ff,
150
+ 0xffffff00ff00ff00, 0xffffff00ff00ffff, 0xffffff00ffff0000,
151
+ 0xffffff00ffff00ff, 0xffffff00ffffff00, 0xffffff00ffffffff,
152
+ 0xffffffff00000000, 0xffffffff000000ff, 0xffffffff0000ff00,
153
+ 0xffffffff0000ffff, 0xffffffff00ff0000, 0xffffffff00ff00ff,
154
+ 0xffffffff00ffff00, 0xffffffff00ffffff, 0xffffffffff000000,
155
+ 0xffffffffff0000ff, 0xffffffffff00ff00, 0xffffffffff00ffff,
156
+ 0xffffffffffff0000, 0xffffffffffff00ff, 0xffffffffffffff00,
157
+ 0xffffffffffffffff,
158
+ };
159
+ return word[byte];
160
+}
161
+
162
+/* Similarly for half-word elements.
163
+ * for (i = 0; i < 256; ++i) {
164
+ * unsigned long m = 0;
165
+ * if (i & 0xaa) {
166
+ * continue;
167
+ * }
168
+ * for (j = 0; j < 8; j += 2) {
169
+ * if ((i >> j) & 1) {
170
+ * m |= 0xfffful << (j << 3);
171
+ * }
172
+ * }
173
+ * printf("[0x%x] = 0x%016lx,\n", i, m);
174
+ * }
175
+ */
176
+static inline uint64_t expand_pred_h(uint8_t byte)
177
+{
178
+ static const uint64_t word[] = {
179
+ [0x01] = 0x000000000000ffff, [0x04] = 0x00000000ffff0000,
180
+ [0x05] = 0x00000000ffffffff, [0x10] = 0x0000ffff00000000,
181
+ [0x11] = 0x0000ffff0000ffff, [0x14] = 0x0000ffffffff0000,
182
+ [0x15] = 0x0000ffffffffffff, [0x40] = 0xffff000000000000,
183
+ [0x41] = 0xffff00000000ffff, [0x44] = 0xffff0000ffff0000,
184
+ [0x45] = 0xffff0000ffffffff, [0x50] = 0xffffffff00000000,
185
+ [0x51] = 0xffffffff0000ffff, [0x54] = 0xffffffffffff0000,
186
+ [0x55] = 0xffffffffffffffff,
187
+ };
188
+ return word[byte & 0x55];
189
+}
190
+
191
+/* Similarly for single word elements. */
192
+static inline uint64_t expand_pred_s(uint8_t byte)
193
+{
194
+ static const uint64_t word[] = {
195
+ [0x01] = 0x00000000ffffffffull,
196
+ [0x10] = 0xffffffff00000000ull,
197
+ [0x11] = 0xffffffffffffffffull,
198
+ };
199
+ return word[byte & 0x11];
200
+}
201
+
202
#define LOGICAL_PPPP(NAME, FUNC) \
203
void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
204
{ \
205
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc)
206
207
return flags;
208
}
209
+
210
+/* Store zero into every active element of Zd. We will use this for two
211
+ * and three-operand predicated instructions for which logic dictates a
212
+ * zero result. In particular, logical shift by element size, which is
213
+ * otherwise undefined on the host.
214
+ *
215
+ * For element sizes smaller than uint64_t, we use tables to expand
216
+ * the N bits of the controlling predicate to a byte mask, and clear
217
+ * those bytes.
218
+ */
219
+void HELPER(sve_clr_b)(void *vd, void *vg, uint32_t desc)
220
+{
221
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
222
+ uint64_t *d = vd;
223
+ uint8_t *pg = vg;
224
+ for (i = 0; i < opr_sz; i += 1) {
225
+ d[i] &= ~expand_pred_b(pg[H1(i)]);
226
+ }
227
+}
228
+
229
+void HELPER(sve_clr_h)(void *vd, void *vg, uint32_t desc)
230
+{
231
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
232
+ uint64_t *d = vd;
233
+ uint8_t *pg = vg;
234
+ for (i = 0; i < opr_sz; i += 1) {
235
+ d[i] &= ~expand_pred_h(pg[H1(i)]);
236
+ }
237
+}
238
+
239
+void HELPER(sve_clr_s)(void *vd, void *vg, uint32_t desc)
240
+{
241
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
242
+ uint64_t *d = vd;
243
+ uint8_t *pg = vg;
244
+ for (i = 0; i < opr_sz; i += 1) {
245
+ d[i] &= ~expand_pred_s(pg[H1(i)]);
246
+ }
247
+}
248
+
249
+void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc)
250
+{
251
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
252
+ uint64_t *d = vd;
253
+ uint8_t *pg = vg;
254
+ for (i = 0; i < opr_sz; i += 1) {
255
+ if (pg[H1(i)] & 1) {
256
+ d[i] = 0;
257
+ }
258
+ }
259
+}
260
+
261
+/* Three-operand expander, immediate operand, controlled by a predicate.
262
+ */
263
+#define DO_ZPZI(NAME, TYPE, H, OP) \
264
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
265
+{ \
266
+ intptr_t i, opr_sz = simd_oprsz(desc); \
267
+ TYPE imm = simd_data(desc); \
268
+ for (i = 0; i < opr_sz; ) { \
269
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
270
+ do { \
271
+ if (pg & 1) { \
272
+ TYPE nn = *(TYPE *)(vn + H(i)); \
273
+ *(TYPE *)(vd + H(i)) = OP(nn, imm); \
274
+ } \
275
+ i += sizeof(TYPE), pg >>= sizeof(TYPE); \
276
+ } while (i & 15); \
277
+ } \
278
+}
279
+
280
+/* Similarly, specialized for 64-bit operands. */
281
+#define DO_ZPZI_D(NAME, TYPE, OP) \
282
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
283
+{ \
284
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
285
+ TYPE *d = vd, *n = vn; \
286
+ TYPE imm = simd_data(desc); \
287
+ uint8_t *pg = vg; \
288
+ for (i = 0; i < opr_sz; i += 1) { \
289
+ if (pg[H1(i)] & 1) { \
290
+ TYPE nn = n[i]; \
291
+ d[i] = OP(nn, imm); \
292
+ } \
293
+ } \
294
+}
295
+
296
+#define DO_SHR(N, M) (N >> M)
297
+#define DO_SHL(N, M) (N << M)
298
+
299
+/* Arithmetic shift right for division. This rounds negative numbers
300
+ toward zero as per signed division. Therefore before shifting,
301
+ when N is negative, add 2**M-1. */
302
+#define DO_ASRD(N, M) ((N + (N < 0 ? ((__typeof(N))1 << M) - 1 : 0)) >> M)
303
+
304
+DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR)
305
+DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR)
306
+DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR)
307
+DO_ZPZI_D(sve_asr_zpzi_d, int64_t, DO_SHR)
308
+
309
+DO_ZPZI(sve_lsr_zpzi_b, uint8_t, H1, DO_SHR)
310
+DO_ZPZI(sve_lsr_zpzi_h, uint16_t, H1_2, DO_SHR)
311
+DO_ZPZI(sve_lsr_zpzi_s, uint32_t, H1_4, DO_SHR)
312
+DO_ZPZI_D(sve_lsr_zpzi_d, uint64_t, DO_SHR)
313
+
314
+DO_ZPZI(sve_lsl_zpzi_b, uint8_t, H1, DO_SHL)
315
+DO_ZPZI(sve_lsl_zpzi_h, uint16_t, H1_2, DO_SHL)
316
+DO_ZPZI(sve_lsl_zpzi_s, uint32_t, H1_4, DO_SHL)
317
+DO_ZPZI_D(sve_lsl_zpzi_d, uint64_t, DO_SHL)
318
+
319
+DO_ZPZI(sve_asrd_b, int8_t, H1, DO_ASRD)
320
+DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD)
321
+DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD)
322
+DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD)
323
+
324
+#undef DO_SHR
325
+#undef DO_SHL
326
+#undef DO_ASRD
327
+#undef DO_ZPZI
328
+#undef DO_ZPZI_D
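
Note on DO_ASRD above: the macro adds 2**M - 1 to negative inputs before shifting, so the result rounds toward zero as signed division does. The following is a minimal standalone sketch of that idea for 32-bit elements, not code from the patch. The name asrd32 is hypothetical, the 64-bit intermediate is an addition the macro does not have, and the sketch assumes the host compiler implements >> on negative signed values as an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone counterpart of DO_ASRD for int32_t elements.
 * Assumes '>>' on a negative signed value is an arithmetic shift. */
static int32_t asrd32(int32_t n, unsigned m)
{
    /* A shift by the full element size is handled elsewhere (as a clear). */
    assert(m < 32);

    /* For negative n, add 2**m - 1 so the shift rounds toward zero.
     * The 64-bit intermediate avoids overflow in (1 << m) and in the sum. */
    int64_t bias = n < 0 ? (((int64_t)1 << m) - 1) : 0;
    return (int32_t)(((int64_t)n + bias) >> m);
}
```

For example, asrd32(-7, 1) gives -3, matching -7 / 2 in C, whereas a plain arithmetic shift of -7 by 1 would give -4.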
329
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
330
index XXXXXXX..XXXXXXX 100644
331
--- a/target/arm/translate-sve.c
332
+++ b/target/arm/translate-sve.c
333
@@ -XXX,XX +XXX,XX @@
334
#include "trace-tcg.h"
335
#include "translate-a64.h"
336
337
+/*
338
+ * Helpers for extracting complex instruction fields.
339
+ */
340
+
341
+/* See e.g. ASR (immediate, predicated).
342
+ * Returns -1 for unallocated encoding; diagnose later.
343
+ */
344
+static int tszimm_esz(int x)
345
+{
346
+ x >>= 3; /* discard imm3 */
347
+ return 31 - clz32(x);
348
+}
349
+
350
+static int tszimm_shr(int x)
351
+{
352
+ return (16 << tszimm_esz(x)) - x;
353
+}
354
+
355
+/* See e.g. LSL (immediate, predicated). */
356
+static int tszimm_shl(int x)
357
+{
358
+ return x - (8 << tszimm_esz(x));
359
+}
360
+
361
/*
362
* Include the generated decoder.
363
*/
364
@@ -XXX,XX +XXX,XX @@ static bool trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
365
366
#undef DO_VPZ
367
368
+/*
369
+ *** SVE Shift by Immediate - Predicated Group
370
+ */
371
+
372
+/* Store zero into every active element of Zd. We will use this for two
373
+ * and three-operand predicated instructions for which logic dictates a
374
+ * zero result.
375
+ */
376
+static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz)
377
+{
378
+ static gen_helper_gvec_2 * const fns[4] = {
379
+ gen_helper_sve_clr_b, gen_helper_sve_clr_h,
380
+ gen_helper_sve_clr_s, gen_helper_sve_clr_d,
381
+ };
382
+ if (sve_access_check(s)) {
383
+ unsigned vsz = vec_full_reg_size(s);
384
+ tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd),
385
+ pred_full_reg_offset(s, pg),
386
+ vsz, vsz, 0, fns[esz]);
387
+ }
388
+ return true;
389
+}
390
+
391
+static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
392
+ gen_helper_gvec_3 *fn)
393
+{
394
+ if (sve_access_check(s)) {
395
+ unsigned vsz = vec_full_reg_size(s);
396
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
397
+ vec_full_reg_offset(s, a->rn),
398
+ pred_full_reg_offset(s, a->pg),
399
+ vsz, vsz, a->imm, fn);
400
+ }
401
+ return true;
402
+}
403
+
404
+static bool trans_ASR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
405
+{
406
+ static gen_helper_gvec_3 * const fns[4] = {
407
+ gen_helper_sve_asr_zpzi_b, gen_helper_sve_asr_zpzi_h,
408
+ gen_helper_sve_asr_zpzi_s, gen_helper_sve_asr_zpzi_d,
409
+ };
410
+ if (a->esz < 0) {
411
+ /* Invalid tsz encoding -- see tszimm_esz. */
412
+ return false;
413
+ }
414
+ /* Shift by element size is architecturally valid. For
415
+ arithmetic right-shift, it's the same as by one less. */
416
+ a->imm = MIN(a->imm, (8 << a->esz) - 1);
417
+ return do_zpzi_ool(s, a, fns[a->esz]);
418
+}
419
+
420
+static bool trans_LSR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
421
+{
422
+ static gen_helper_gvec_3 * const fns[4] = {
423
+ gen_helper_sve_lsr_zpzi_b, gen_helper_sve_lsr_zpzi_h,
424
+ gen_helper_sve_lsr_zpzi_s, gen_helper_sve_lsr_zpzi_d,
425
+ };
426
+ if (a->esz < 0) {
427
+ return false;
428
+ }
429
+ /* Shift by element size is architecturally valid.
430
+ For logical shifts, it is a zeroing operation. */
431
+ if (a->imm >= (8 << a->esz)) {
432
+ return do_clr_zp(s, a->rd, a->pg, a->esz);
433
+ } else {
434
+ return do_zpzi_ool(s, a, fns[a->esz]);
435
+ }
436
+}
437
+
438
+static bool trans_LSL_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
439
+{
440
+ static gen_helper_gvec_3 * const fns[4] = {
441
+ gen_helper_sve_lsl_zpzi_b, gen_helper_sve_lsl_zpzi_h,
442
+ gen_helper_sve_lsl_zpzi_s, gen_helper_sve_lsl_zpzi_d,
443
+ };
444
+ if (a->esz < 0) {
445
+ return false;
446
+ }
447
+ /* Shift by element size is architecturally valid.
448
+ For logical shifts, it is a zeroing operation. */
449
+ if (a->imm >= (8 << a->esz)) {
450
+ return do_clr_zp(s, a->rd, a->pg, a->esz);
451
+ } else {
452
+ return do_zpzi_ool(s, a, fns[a->esz]);
453
+ }
454
+}
455
+
456
+static bool trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
457
+{
458
+ static gen_helper_gvec_3 * const fns[4] = {
459
+ gen_helper_sve_asrd_b, gen_helper_sve_asrd_h,
460
+ gen_helper_sve_asrd_s, gen_helper_sve_asrd_d,
461
+ };
462
+ if (a->esz < 0) {
463
+ return false;
464
+ }
465
+ /* Shift by element size is architecturally valid. For arithmetic
466
+ right shift for division, it is a zeroing operation. */
467
+ if (a->imm >= (8 << a->esz)) {
468
+ return do_clr_zp(s, a->rd, a->pg, a->esz);
469
+ } else {
470
+ return do_zpzi_ool(s, a, fns[a->esz]);
471
+ }
472
+}
473
+
474
/*
475
*** SVE Predicate Logical Operations Group
476
*/
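
Note on the tszimm_* helpers in this hunk: the 7-bit tsz:imm3 field encodes both the element size (position of the highest set bit of tsz) and the shift count. The sketch below restates that decoding in standalone C so the arithmetic can be checked by hand. The names clz32_sketch, esz_from_tszimm, shr_from_tszimm and shl_from_tszimm are hypothetical stand-ins, and unlike the patch the shift helpers assert on an unallocated encoding instead of deferring the diagnosis.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for QEMU's clz32(); returns 32 for a zero input. */
static int clz32_sketch(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* x is the 7-bit tsz:imm3 value.  Returns log2 of the element size in
 * bytes (0..3), or -1 when tsz is zero (unallocated encoding). */
static int esz_from_tszimm(int x)
{
    assert(x >= 0 && x < 128);
    return 31 - clz32_sketch((uint32_t)x >> 3);   /* discard imm3 */
}

/* Right-shift count: (2 * esize) - tsz:imm3, giving 1..esize. */
static int shr_from_tszimm(int x)
{
    int esz = esz_from_tszimm(x);

    assert(esz >= 0);
    return (16 << esz) - x;
}

/* Left-shift count: tsz:imm3 - esize, giving 0..esize-1. */
static int shl_from_tszimm(int x)
{
    int esz = esz_from_tszimm(x);

    assert(esz >= 0);
    return x - (8 << esz);
}
```

With byte elements, tsz:imm3 ranges over 8..15, so the right-shift count runs from 8 down to 1 and the left-shift count from 0 up to 7.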
477
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
478
index XXXXXXX..XXXXXXX 100644
479
--- a/target/arm/sve.decode
480
+++ b/target/arm/sve.decode
481
@@ -XXX,XX +XXX,XX @@
482
###########################################################################
483
# Named fields. These are primarily for disjoint fields.
484
485
+%imm6_22_5 22:1 5:5
486
%imm9_16_10 16:s6 10:3
487
488
+# A combination of tsz:imm3 -- extract esize.
489
+%tszimm_esz 22:2 5:5 !function=tszimm_esz
490
+# A combination of tsz:imm3 -- extract (2 * esize) - (tsz:imm3)
491
+%tszimm_shr 22:2 5:5 !function=tszimm_shr
492
+# A combination of tsz:imm3 -- extract (tsz:imm3) - esize
493
+%tszimm_shl 22:2 5:5 !function=tszimm_shl
494
+
495
# Either a copy of rd (at bit 0), or a different source
496
# as propagated via the MOVPRFX instruction.
497
%reg_movprfx 0:5
498
@@ -XXX,XX +XXX,XX @@
499
&rpr_esz rd pg rn esz
500
&rprr_s rd pg rn rm s
501
&rprr_esz rd pg rn rm esz
502
+&rpri_esz rd pg rn imm esz
503
504
###########################################################################
505
# Named instruction formats. These are generally used to
506
@@ -XXX,XX +XXX,XX @@
507
# One register operand, with governing predicate, vector element size
508
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
509
510
+# Two register operand, one immediate operand, with predicate,
511
+# element size encoded as TSZHL. User must fill in imm.
512
+@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \
513
+ &rpri_esz rn=%reg_movprfx esz=%tszimm_esz
514
+
515
# Basic Load/Store with 9-bit immediate offset
516
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
517
&rri imm=%imm9_16_10
518
@@ -XXX,XX +XXX,XX @@ UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn
519
SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn
520
UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn
521
522
+### SVE Shift by Immediate - Predicated Group
523
+
524
+# SVE bitwise shift by immediate (predicated)
525
+ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... \
526
+ @rdn_pg_tszimm imm=%tszimm_shr
527
+LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... \
528
+ @rdn_pg_tszimm imm=%tszimm_shr
529
+LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \
530
+ @rdn_pg_tszimm imm=%tszimm_shl
531
+ASRD 00000100 .. 000 100 100 ... .. ... ..... \
532
+ @rdn_pg_tszimm imm=%tszimm_shr
533
+
534
### SVE Logical - Unpredicated Group
535
536
# SVE bitwise logical operations (unpredicated)
537
--
538
2.17.0
539
540
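
Note on the expand_pred_* tables in this patch: each table maps 8 predicate bits to a 64-bit byte mask, and the sve_clr_* helpers clear the masked bytes. The loop-based sketch below computes the same masks without a table, which may help when checking individual table entries. It is not the patch's implementation: the names expand_pred_sketch and clr_active_sketch are hypothetical, and the sketch indexes the predicate directly, which corresponds to a little-endian host where H1() is the identity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand 8 predicate bits into a 64-bit byte mask for elements of
 * (1 << esz) bytes, esz in 0..2.  Only the lowest predicate bit of
 * each element is significant. */
static uint64_t expand_pred_sketch(uint8_t pred, int esz)
{
    assert(esz >= 0 && esz <= 2);

    unsigned elt_bytes = 1u << esz;
    uint64_t elt_mask = (1ull << (8 * elt_bytes)) - 1;
    uint64_t mask = 0;

    for (unsigned j = 0; j < 8; j += elt_bytes) {
        if ((pred >> j) & 1) {
            mask |= elt_mask << (j * 8);
        }
    }
    return mask;
}

/* Zero every active element of d; one predicate byte governs each
 * 64-bit word.  Assumes a little-endian host layout. */
static void clr_active_sketch(uint64_t *d, const uint8_t *pg,
                              size_t words, int esz)
{
    for (size_t i = 0; i < words; i++) {
        d[i] &= ~expand_pred_sketch(pg[i], esz);
    }
}
```

The table form in the patch trades a small amount of read-only data for a single load per 64-bit word; the loop form above is only meant as a cross-check.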
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-12-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 27 +++++++++++++++++++++++++++
9
target/arm/sve_helper.c | 25 +++++++++++++++++++++++++
10
target/arm/translate-sve.c | 4 ++++
11
target/arm/sve.decode | 8 ++++++++
12
4 files changed, 64 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG,
19
DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG,
20
void, ptr, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_b, TCG_CALL_NO_RWG,
23
+ void, ptr, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_h, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_s, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_d, TCG_CALL_NO_RWG,
29
+ void, ptr, ptr, ptr, ptr, i32)
30
+
31
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_b, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_h, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_s, TCG_CALL_NO_RWG,
36
+ void, ptr, ptr, ptr, ptr, i32)
37
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_d, TCG_CALL_NO_RWG,
38
+ void, ptr, ptr, ptr, ptr, i32)
39
+
40
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_b, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, ptr, i32)
42
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_h, TCG_CALL_NO_RWG,
43
+ void, ptr, ptr, ptr, ptr, i32)
44
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
45
+ void, ptr, ptr, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
47
+ void, ptr, ptr, ptr, ptr, i32)
48
+
49
DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
50
DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
51
DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
52
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/sve_helper.c
55
+++ b/target/arm/sve_helper.c
56
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
57
DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
58
DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
59
60
+/* Note that all bits of the shift are significant
61
+ and not modulo the element size. */
62
+#define DO_ASR(N, M) (N >> MIN(M, sizeof(N) * 8 - 1))
63
+#define DO_LSR(N, M) (M < sizeof(N) * 8 ? N >> M : 0)
64
+#define DO_LSL(N, M) (M < sizeof(N) * 8 ? N << M : 0)
65
+
66
+DO_ZPZZ(sve_asr_zpzz_b, int8_t, H1, DO_ASR)
67
+DO_ZPZZ(sve_lsr_zpzz_b, uint8_t, H1, DO_LSR)
68
+DO_ZPZZ(sve_lsl_zpzz_b, uint8_t, H1, DO_LSL)
69
+
70
+DO_ZPZZ(sve_asr_zpzz_h, int16_t, H1_2, DO_ASR)
71
+DO_ZPZZ(sve_lsr_zpzz_h, uint16_t, H1_2, DO_LSR)
72
+DO_ZPZZ(sve_lsl_zpzz_h, uint16_t, H1_2, DO_LSL)
73
+
74
+DO_ZPZZ(sve_asr_zpzz_s, int32_t, H1_4, DO_ASR)
75
+DO_ZPZZ(sve_lsr_zpzz_s, uint32_t, H1_4, DO_LSR)
76
+DO_ZPZZ(sve_lsl_zpzz_s, uint32_t, H1_4, DO_LSL)
77
+
78
+DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR)
79
+DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR)
80
+DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL)
81
+
82
#undef DO_ZPZZ
83
#undef DO_ZPZZ_D
84
85
@@ -XXX,XX +XXX,XX @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
86
#undef DO_ABD
87
#undef DO_MUL
88
#undef DO_DIV
89
+#undef DO_ASR
90
+#undef DO_LSR
91
+#undef DO_LSL
92
93
/* Similar to the ARM LastActiveElement pseudocode function, except the
94
result is multiplied by the element size. This includes the not found
95
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
96
index XXXXXXX..XXXXXXX 100644
97
--- a/target/arm/translate-sve.c
98
+++ b/target/arm/translate-sve.c
99
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ(MUL, mul)
100
DO_ZPZZ(SMULH, smulh)
101
DO_ZPZZ(UMULH, umulh)
102
103
+DO_ZPZZ(ASR, asr)
104
+DO_ZPZZ(LSR, lsr)
105
+DO_ZPZZ(LSL, lsl)
106
+
107
static bool trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
108
{
109
static gen_helper_gvec_4 * const fns[4] = {
110
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
111
index XXXXXXX..XXXXXXX 100644
112
--- a/target/arm/sve.decode
113
+++ b/target/arm/sve.decode
114
@@ -XXX,XX +XXX,XX @@ LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \
115
ASRD 00000100 .. 000 100 100 ... .. ... ..... \
116
@rdn_pg_tszimm imm=%tszimm_shr
117
118
+# SVE bitwise shift by vector (predicated)
119
+ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm
120
+LSR_zpzz 00000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm
121
+LSL_zpzz 00000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm
122
+ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rdm_pg_rn # ASRR
123
+LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn # LSRR
124
+LSL_zpzz 00000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # LSLR
125
+
126
### SVE Logical - Unpredicated Group
127
128
# SVE bitwise logical operations (unpredicated)
129
--
130
2.17.0
131
132
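
Note on DO_ASR, DO_LSR and DO_LSL in this patch: the shift count comes from a vector element, and every bit of it is significant, so counts at or above the element width must be handled explicitly instead of being passed to the host shift (where they would be undefined). A minimal sketch for byte elements follows; the function names asr8, lsr8 and lsl8 are hypothetical, and the sketch assumes the host compiler implements >> on negative signed values as an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right: clamp the count to width - 1, so large
 * counts yield 0 or -1 depending on the sign of n. */
static int8_t asr8(int8_t n, uint8_t m)
{
    assert((-1 >> 1) == -1);   /* sketch relies on arithmetic '>>' */
    return (int8_t)(n >> (m < 7 ? m : 7));
}

/* Logical shifts: any count >= the element width yields zero. */
static uint8_t lsr8(uint8_t n, uint8_t m)
{
    return m < 8 ? (uint8_t)(n >> m) : 0;
}

static uint8_t lsl8(uint8_t n, uint8_t m)
{
    return m < 8 ? (uint8_t)(n << m) : 0;
}
```

A host that takes the count modulo the width would return 0x80 for a logical right shift of 0x80 by 8; the guarded form returns 0, which is the behaviour the patch comment describes.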
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-13-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 21 +++++++++++++++++++++
9
target/arm/sve_helper.c | 35 +++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 24 ++++++++++++++++++++++++
11
target/arm/sve.decode | 6 ++++++
12
4 files changed, 86 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
19
DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
20
void, ptr, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
23
+ void, ptr, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(sve_asr_zpzw_s, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_5(sve_lsr_zpzw_b, TCG_CALL_NO_RWG,
30
+ void, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_5(sve_lsr_zpzw_h, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_5(sve_lsr_zpzw_s, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, i32)
35
+
36
+DEF_HELPER_FLAGS_5(sve_lsl_zpzw_b, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_5(sve_lsl_zpzw_h, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_5(sve_lsl_zpzw_s, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, ptr, i32)
42
+
43
DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
44
DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
45
DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
46
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/sve_helper.c
49
+++ b/target/arm/sve_helper.c
50
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL)
51
#undef DO_ZPZZ
52
#undef DO_ZPZZ_D
53
54
+/* Three-operand expander, controlled by a predicate, in which the
55
+ * third operand is "wide". That is, for D = N op M, the same 64-bit
56
+ * value of M is used with all of the narrower values of N.
57
+ */
58
+#define DO_ZPZW(NAME, TYPE, TYPEW, H, OP) \
59
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
60
+{ \
61
+ intptr_t i, opr_sz = simd_oprsz(desc); \
62
+ for (i = 0; i < opr_sz; ) { \
63
+ uint8_t pg = *(uint8_t *)(vg + H1(i >> 3)); \
64
+ TYPEW mm = *(TYPEW *)(vm + i); \
65
+ do { \
66
+ if (pg & 1) { \
67
+ TYPE nn = *(TYPE *)(vn + H(i)); \
68
+ *(TYPE *)(vd + H(i)) = OP(nn, mm); \
69
+ } \
70
+ i += sizeof(TYPE), pg >>= sizeof(TYPE); \
71
+ } while (i & 7); \
72
+ } \
73
+}
74
+
75
+DO_ZPZW(sve_asr_zpzw_b, int8_t, uint64_t, H1, DO_ASR)
76
+DO_ZPZW(sve_lsr_zpzw_b, uint8_t, uint64_t, H1, DO_LSR)
77
+DO_ZPZW(sve_lsl_zpzw_b, uint8_t, uint64_t, H1, DO_LSL)
78
+
79
+DO_ZPZW(sve_asr_zpzw_h, int16_t, uint64_t, H1_2, DO_ASR)
80
+DO_ZPZW(sve_lsr_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSR)
81
+DO_ZPZW(sve_lsl_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSL)
82
+
83
+DO_ZPZW(sve_asr_zpzw_s, int32_t, uint64_t, H1_4, DO_ASR)
84
+DO_ZPZW(sve_lsr_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSR)
85
+DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
86
+
87
+#undef DO_ZPZW
88
+
89
/* Two-operand reduction expander, controlled by a predicate.
90
* The difference between TYPERED and TYPERET has to do with
91
* sign-extension. E.g. for SMAX, TYPERED must be signed,
92
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
93
index XXXXXXX..XXXXXXX 100644
94
--- a/target/arm/translate-sve.c
95
+++ b/target/arm/translate-sve.c
96
@@ -XXX,XX +XXX,XX @@ static bool trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
97
}
98
}
99
100
+/*
101
+ *** SVE Bitwise Shift - Predicated Group
102
+ */
103
+
104
+#define DO_ZPZW(NAME, name) \
105
+static bool trans_##NAME##_zpzw(DisasContext *s, arg_rprr_esz *a, \
106
+ uint32_t insn) \
107
+{ \
108
+ static gen_helper_gvec_4 * const fns[3] = { \
109
+ gen_helper_sve_##name##_zpzw_b, gen_helper_sve_##name##_zpzw_h, \
110
+ gen_helper_sve_##name##_zpzw_s, \
111
+ }; \
112
+ if (a->esz < 0 || a->esz >= 3) { \
113
+ return false; \
114
+ } \
115
+ return do_zpzz_ool(s, a, fns[a->esz]); \
116
+}
117
+
118
+DO_ZPZW(ASR, asr)
119
+DO_ZPZW(LSR, lsr)
120
+DO_ZPZW(LSL, lsl)
121
+
122
+#undef DO_ZPZW
123
+
124
/*
125
*** SVE Predicate Logical Operations Group
126
*/
127
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
128
index XXXXXXX..XXXXXXX 100644
129
--- a/target/arm/sve.decode
130
+++ b/target/arm/sve.decode
131
@@ -XXX,XX +XXX,XX @@ ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rdm_pg_rn # ASRR
132
LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn # LSRR
133
LSL_zpzz 00000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # LSLR
134
135
+# SVE bitwise shift by wide elements (predicated)
136
+# Note these require size != 3.
137
+ASR_zpzw 00000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm
138
+LSR_zpzw 00000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm
139
+LSL_zpzw 00000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm
140
+
141
### SVE Logical - Unpredicated Group
142
143
# SVE bitwise logical operations (unpredicated)
144
--
145
2.17.0
146
147
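
Note on DO_ZPZW in this patch: for the "wide" forms, one 64-bit element of the second operand supplies the shift count for all of the narrower elements that share its 8-byte lane. The sketch below shows that lane structure for a logical right shift of byte elements. It is a simplification under stated assumptions: the name lsr_wide_b is hypothetical, the operands are flat arrays rather than register-file pointers, and the indexing corresponds to a little-endian host (no H macros).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Logical shift right of byte elements by the 64-bit "wide" element
 * sharing their 8-byte lane.  pg holds one predicate bit per vector
 * byte, i.e. one predicate byte per lane.  Inactive elements of d are
 * left unchanged. */
static void lsr_wide_b(uint8_t *d, const uint8_t *n, const uint64_t *m,
                       const uint8_t *pg, size_t bytes)
{
    assert(bytes % 8 == 0);

    for (size_t lane = 0; lane < bytes / 8; lane++) {
        uint64_t shift = m[lane];   /* one shift count per lane */
        uint8_t p = pg[lane];

        for (size_t j = 0; j < 8; j++) {
            if ((p >> j) & 1) {
                size_t i = lane * 8 + j;
                /* All 64 bits of the count are significant. */
                d[i] = shift < 8 ? (uint8_t)(n[i] >> shift) : 0;
            }
        }
    }
}
```

For halfword and word elements the inner loop would step by the element size, which is what the `pg >>= sizeof(TYPE)` step in the macro achieves.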
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-14-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 60 ++++++++++++++++++
9
target/arm/sve_helper.c | 127 +++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 113 +++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 23 +++++++
12
4 files changed, 323 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_cls_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_4(sve_cls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_cls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_4(sve_cls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+
27
+DEF_HELPER_FLAGS_4(sve_clz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(sve_clz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(sve_clz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(sve_clz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+
32
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_4(sve_cnot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(sve_cnot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(sve_cnot_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(sve_cnot_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
41
+
42
+DEF_HELPER_FLAGS_4(sve_fabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_4(sve_fabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
+DEF_HELPER_FLAGS_4(sve_fabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
45
+
46
+DEF_HELPER_FLAGS_4(sve_fneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(sve_fneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_4(sve_fneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_4(sve_not_zpz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
51
+DEF_HELPER_FLAGS_4(sve_not_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_4(sve_not_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
53
+DEF_HELPER_FLAGS_4(sve_not_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
54
+
55
+DEF_HELPER_FLAGS_4(sve_sxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
56
+DEF_HELPER_FLAGS_4(sve_sxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
57
+DEF_HELPER_FLAGS_4(sve_sxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
58
+
59
+DEF_HELPER_FLAGS_4(sve_uxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
60
+DEF_HELPER_FLAGS_4(sve_uxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_4(sve_uxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
62
+
63
+DEF_HELPER_FLAGS_4(sve_sxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
64
+DEF_HELPER_FLAGS_4(sve_sxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
65
+
66
+DEF_HELPER_FLAGS_4(sve_uxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
67
+DEF_HELPER_FLAGS_4(sve_uxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
68
+
69
+DEF_HELPER_FLAGS_4(sve_sxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
70
+DEF_HELPER_FLAGS_4(sve_uxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
71
+
72
+DEF_HELPER_FLAGS_4(sve_abs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
73
+DEF_HELPER_FLAGS_4(sve_abs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
74
+DEF_HELPER_FLAGS_4(sve_abs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
75
+DEF_HELPER_FLAGS_4(sve_abs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
76
+
77
+DEF_HELPER_FLAGS_4(sve_neg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
78
+DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
79
+DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
80
+DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
81
+
82
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
83
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
84
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
85
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
86
index XXXXXXX..XXXXXXX 100644
87
--- a/target/arm/sve_helper.c
88
+++ b/target/arm/sve_helper.c
89
@@ -XXX,XX +XXX,XX @@ DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
90
91
#undef DO_ZPZW
92
93
+/* Fully general two-operand expander, controlled by a predicate.
94
+ */
95
+#define DO_ZPZ(NAME, TYPE, H, OP) \
96
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
97
+{ \
98
+ intptr_t i, opr_sz = simd_oprsz(desc); \
99
+ for (i = 0; i < opr_sz; ) { \
100
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
101
+ do { \
102
+ if (pg & 1) { \
103
+ TYPE nn = *(TYPE *)(vn + H(i)); \
104
+ *(TYPE *)(vd + H(i)) = OP(nn); \
105
+ } \
106
+ i += sizeof(TYPE), pg >>= sizeof(TYPE); \
107
+ } while (i & 15); \
108
+ } \
109
+}
110
+
111
+/* Similarly, specialized for 64-bit operands. */
112
+#define DO_ZPZ_D(NAME, TYPE, OP) \
113
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *n = vn; \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE nn = n[i]; \
+            d[i] = OP(nn); \
+        } \
+    } \
+}
+
+#define DO_CLS_B(N) (clrsb32(N) - 24)
+#define DO_CLS_H(N) (clrsb32(N) - 16)
+
+DO_ZPZ(sve_cls_b, int8_t, H1, DO_CLS_B)
+DO_ZPZ(sve_cls_h, int16_t, H1_2, DO_CLS_H)
+DO_ZPZ(sve_cls_s, int32_t, H1_4, clrsb32)
+DO_ZPZ_D(sve_cls_d, int64_t, clrsb64)
+
+#define DO_CLZ_B(N) (clz32(N) - 24)
+#define DO_CLZ_H(N) (clz32(N) - 16)
+
+DO_ZPZ(sve_clz_b, uint8_t, H1, DO_CLZ_B)
+DO_ZPZ(sve_clz_h, uint16_t, H1_2, DO_CLZ_H)
+DO_ZPZ(sve_clz_s, uint32_t, H1_4, clz32)
+DO_ZPZ_D(sve_clz_d, uint64_t, clz64)
+
+DO_ZPZ(sve_cnt_zpz_b, uint8_t, H1, ctpop8)
+DO_ZPZ(sve_cnt_zpz_h, uint16_t, H1_2, ctpop16)
+DO_ZPZ(sve_cnt_zpz_s, uint32_t, H1_4, ctpop32)
+DO_ZPZ_D(sve_cnt_zpz_d, uint64_t, ctpop64)
+
+#define DO_CNOT(N) (N == 0)
+
+DO_ZPZ(sve_cnot_b, uint8_t, H1, DO_CNOT)
+DO_ZPZ(sve_cnot_h, uint16_t, H1_2, DO_CNOT)
+DO_ZPZ(sve_cnot_s, uint32_t, H1_4, DO_CNOT)
+DO_ZPZ_D(sve_cnot_d, uint64_t, DO_CNOT)
+
+#define DO_FABS(N) (N & ((__typeof(N))-1 >> 1))
+
+DO_ZPZ(sve_fabs_h, uint16_t, H1_2, DO_FABS)
+DO_ZPZ(sve_fabs_s, uint32_t, H1_4, DO_FABS)
+DO_ZPZ_D(sve_fabs_d, uint64_t, DO_FABS)
+
+#define DO_FNEG(N) (N ^ ~((__typeof(N))-1 >> 1))
+
+DO_ZPZ(sve_fneg_h, uint16_t, H1_2, DO_FNEG)
+DO_ZPZ(sve_fneg_s, uint32_t, H1_4, DO_FNEG)
+DO_ZPZ_D(sve_fneg_d, uint64_t, DO_FNEG)
+
+#define DO_NOT(N) (~N)
+
+DO_ZPZ(sve_not_zpz_b, uint8_t, H1, DO_NOT)
+DO_ZPZ(sve_not_zpz_h, uint16_t, H1_2, DO_NOT)
+DO_ZPZ(sve_not_zpz_s, uint32_t, H1_4, DO_NOT)
+DO_ZPZ_D(sve_not_zpz_d, uint64_t, DO_NOT)
+
+#define DO_SXTB(N) ((int8_t)N)
+#define DO_SXTH(N) ((int16_t)N)
+#define DO_SXTS(N) ((int32_t)N)
+#define DO_UXTB(N) ((uint8_t)N)
+#define DO_UXTH(N) ((uint16_t)N)
+#define DO_UXTS(N) ((uint32_t)N)
+
+DO_ZPZ(sve_sxtb_h, uint16_t, H1_2, DO_SXTB)
+DO_ZPZ(sve_sxtb_s, uint32_t, H1_4, DO_SXTB)
+DO_ZPZ(sve_sxth_s, uint32_t, H1_4, DO_SXTH)
+DO_ZPZ_D(sve_sxtb_d, uint64_t, DO_SXTB)
+DO_ZPZ_D(sve_sxth_d, uint64_t, DO_SXTH)
+DO_ZPZ_D(sve_sxtw_d, uint64_t, DO_SXTS)
+
+DO_ZPZ(sve_uxtb_h, uint16_t, H1_2, DO_UXTB)
+DO_ZPZ(sve_uxtb_s, uint32_t, H1_4, DO_UXTB)
+DO_ZPZ(sve_uxth_s, uint32_t, H1_4, DO_UXTH)
+DO_ZPZ_D(sve_uxtb_d, uint64_t, DO_UXTB)
+DO_ZPZ_D(sve_uxth_d, uint64_t, DO_UXTH)
+DO_ZPZ_D(sve_uxtw_d, uint64_t, DO_UXTS)
+
+#define DO_ABS(N) (N < 0 ? -N : N)
+
+DO_ZPZ(sve_abs_b, int8_t, H1, DO_ABS)
+DO_ZPZ(sve_abs_h, int16_t, H1_2, DO_ABS)
+DO_ZPZ(sve_abs_s, int32_t, H1_4, DO_ABS)
+DO_ZPZ_D(sve_abs_d, int64_t, DO_ABS)
+
+#define DO_NEG(N) (-N)
+
+DO_ZPZ(sve_neg_b, uint8_t, H1, DO_NEG)
+DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG)
+DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG)
+DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG)
+
+#undef DO_CLS_B
+#undef DO_CLS_H
+#undef DO_CLZ_B
+#undef DO_CLZ_H
+#undef DO_CNOT
+#undef DO_FABS
+#undef DO_FNEG
+#undef DO_ABS
+#undef DO_NEG
+#undef DO_ZPZ
+#undef DO_ZPZ_D
+
 /* Two-operand reduction expander, controlled by a predicate.
  * The difference between TYPERED and TYPERET has to do with
  * sign-extension. E.g. for SMAX, TYPERED must be signed,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)

 #undef DO_ZPZZ

+/*
+ *** SVE Integer Arithmetic - Unary Predicated Group
+ */
+
+static bool do_zpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_3 *fn)
+{
+    if (fn == NULL) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           pred_full_reg_offset(s, a->pg),
+                           vsz, vsz, 0, fn);
+    }
+    return true;
+}
+
+#define DO_ZPZ(NAME, name) \
+static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_3 * const fns[4] = { \
+        gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \
+    }; \
+    return do_zpz_ool(s, a, fns[a->esz]); \
+}
+
+DO_ZPZ(CLS, cls)
+DO_ZPZ(CLZ, clz)
+DO_ZPZ(CNT_zpz, cnt_zpz)
+DO_ZPZ(CNOT, cnot)
+DO_ZPZ(NOT_zpz, not_zpz)
+DO_ZPZ(ABS, abs)
+DO_ZPZ(NEG, neg)
+
+static bool trans_FABS(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_fabs_h,
+        gen_helper_sve_fabs_s,
+        gen_helper_sve_fabs_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_FNEG(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_fneg_h,
+        gen_helper_sve_fneg_s,
+        gen_helper_sve_fneg_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_SXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_sxtb_h,
+        gen_helper_sve_sxtb_s,
+        gen_helper_sve_sxtb_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_UXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_uxtb_h,
+        gen_helper_sve_uxtb_s,
+        gen_helper_sve_uxtb_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_SXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL,
+        gen_helper_sve_sxth_s,
+        gen_helper_sve_sxth_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_UXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL,
+        gen_helper_sve_uxth_s,
+        gen_helper_sve_uxth_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_SXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_sxtw_d : NULL);
+}
+
+static bool trans_UXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_uxtw_d : NULL);
+}
+
+#undef DO_ZPZ
+
 /*
  *** SVE Integer Reduction Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ ASR_zpzw 00000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm
 LSR_zpzw        00000100 .. 011 001 100 ... ..... .....     @rdn_pg_rm
 LSL_zpzw        00000100 .. 011 011 100 ... ..... .....     @rdn_pg_rm

+### SVE Integer Arithmetic - Unary Predicated Group
+
+# SVE unary bit operations (predicated)
+# Note esz != 0 for FABS and FNEG.
+CLS             00000100 .. 011 000 101 ... ..... .....     @rd_pg_rn
+CLZ             00000100 .. 011 001 101 ... ..... .....     @rd_pg_rn
+CNT_zpz         00000100 .. 011 010 101 ... ..... .....     @rd_pg_rn
+CNOT            00000100 .. 011 011 101 ... ..... .....     @rd_pg_rn
+NOT_zpz         00000100 .. 011 110 101 ... ..... .....     @rd_pg_rn
+FABS            00000100 .. 011 100 101 ... ..... .....     @rd_pg_rn
+FNEG            00000100 .. 011 101 101 ... ..... .....     @rd_pg_rn
+
+# SVE integer unary operations (predicated)
+# Note esz > original size for extensions.
+ABS             00000100 .. 010 110 101 ... ..... .....     @rd_pg_rn
+NEG             00000100 .. 010 111 101 ... ..... .....     @rd_pg_rn
+SXTB            00000100 .. 010 000 101 ... ..... .....     @rd_pg_rn
+UXTB            00000100 .. 010 001 101 ... ..... .....     @rd_pg_rn
+SXTH            00000100 .. 010 010 101 ... ..... .....     @rd_pg_rn
+UXTH            00000100 .. 010 011 101 ... ..... .....     @rd_pg_rn
+SXTW            00000100 .. 010 100 101 ... ..... .....     @rd_pg_rn
+UXTW            00000100 .. 010 101 101 ... ..... .....     @rd_pg_rn
+
 ### SVE Logical - Unpredicated Group

 # SVE bitwise logical operations (unpredicated)
--
2.17.0

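Note on the helpers in the patch above: DO_FABS and DO_FNEG operate on the raw
bit pattern of each element, and the governing predicate decides which elements
are written at all. The following is a minimal standalone sketch of that
pattern for 64-bit elements, not the QEMU code itself. The names sketch_fabs_d
and sketch_fneg_d are hypothetical, and the sketch assumes one predicate byte
per 64-bit element with no host-endian adjustment (the H1 macro is left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every bit of a 64-bit element except the IEEE sign bit. */
#define SKETCH_ABS_MASK (UINT64_MAX >> 1)

/* Hypothetical helper: clear the sign bit of each active element. */
static void sketch_fabs_d(uint64_t *d, const uint64_t *n,
                          const uint8_t *pg, size_t elems)
{
    assert(d != NULL && n != NULL && pg != NULL);
    for (size_t i = 0; i < elems; i++) {
        if (pg[i] & 1) {
            d[i] = n[i] & SKETCH_ABS_MASK;
        }
        /* Inactive elements keep whatever d already held (merging). */
    }
}

/* Hypothetical helper: flip the sign bit of each active element. */
static void sketch_fneg_d(uint64_t *d, const uint64_t *n,
                          const uint8_t *pg, size_t elems)
{
    assert(d != NULL && n != NULL && pg != NULL);
    for (size_t i = 0; i < elems; i++) {
        if (pg[i] & 1) {
            d[i] = n[i] ^ ~SKETCH_ABS_MASK;
        }
    }
}
```

Because only the sign bit is touched, NaN payloads and infinities pass through
unchanged, which is presumably why the patch can treat these as plain integer
operations with no float_status argument.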
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180516223007.10256-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    | 18 ++++++++++++
 target/arm/sve_helper.c    | 57 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 34 +++++++++++++++++++++++
 target/arm/sve.decode      | 17 ++++++++++++
 4 files changed, 126 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_6(sve_mla_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_mls_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD)
 #undef DO_ASRD
 #undef DO_ZPZI
 #undef DO_ZPZI_D
+
+/* Fully general four-operand expander, controlled by a predicate.
+ */
+#define DO_ZPZZZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \
+                  void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                TYPE mm = *(TYPE *)(vm + H(i)); \
+                TYPE aa = *(TYPE *)(va + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(aa, nn, mm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands. */
+#define DO_ZPZZZ_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \
+                  void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *a = va, *n = vn, *m = vm; \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE aa = a[i], nn = n[i], mm = m[i]; \
+            d[i] = OP(aa, nn, mm); \
+        } \
+    } \
+}
+
+#define DO_MLA(A, N, M) (A + N * M)
+#define DO_MLS(A, N, M) (A - N * M)
+
+DO_ZPZZZ(sve_mla_b, uint8_t, H1, DO_MLA)
+DO_ZPZZZ(sve_mls_b, uint8_t, H1, DO_MLS)
+
+DO_ZPZZZ(sve_mla_h, uint16_t, H1_2, DO_MLA)
+DO_ZPZZZ(sve_mls_h, uint16_t, H1_2, DO_MLS)
+
+DO_ZPZZZ(sve_mla_s, uint32_t, H1_4, DO_MLA)
+DO_ZPZZZ(sve_mls_s, uint32_t, H1_4, DO_MLS)
+
+DO_ZPZZZ_D(sve_mla_d, uint64_t, DO_MLA)
+DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS)
+
+#undef DO_MLA
+#undef DO_MLS
+#undef DO_ZPZZZ
+#undef DO_ZPZZZ_D
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_ZPZW(LSL, lsl)

 #undef DO_ZPZW

+/*
+ *** SVE Integer Multiply-Add Group
+ */
+
+static bool do_zpzzz_ool(DisasContext *s, arg_rprrr_esz *a,
+                         gen_helper_gvec_5 *fn)
+{
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_5_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->ra),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           pred_full_reg_offset(s, a->pg),
+                           vsz, vsz, 0, fn);
+    }
+    return true;
+}
+
+#define DO_ZPZZZ(NAME, name) \
+static bool trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_5 * const fns[4] = { \
+        gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \
+    }; \
+    return do_zpzzz_ool(s, a, fns[a->esz]); \
+}
+
+DO_ZPZZZ(MLA, mla)
+DO_ZPZZZ(MLS, mls)
+
+#undef DO_ZPZZZ
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
 &rpr_esz        rd pg rn esz
 &rprr_s         rd pg rn rm s
 &rprr_esz       rd pg rn rm esz
+&rprrr_esz      rd pg rn rm ra esz
 &rpri_esz       rd pg rn imm esz

 ###########################################################################
@@ -XXX,XX +XXX,XX @@
 @rdm_pg_rn      ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
                 &rprr_esz rm=%reg_movprfx

+# Three register operand, with governing predicate, vector element size
+@rda_pg_rn_rm   ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
+                &rprrr_esz ra=%reg_movprfx
+@rdn_pg_ra_rm   ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \
+                &rprrr_esz rn=%reg_movprfx
+
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn       ........ esz:2 ... ... ... pg:3 rn:5 rd:5     &rpr_esz

@@ -XXX,XX +XXX,XX @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn
 SXTW            00000100 .. 010 100 101 ... ..... .....     @rd_pg_rn
 UXTW            00000100 .. 010 101 101 ... ..... .....     @rd_pg_rn

+### SVE Integer Multiply-Add Group
+
+# SVE integer multiply-add writing addend (predicated)
+MLA             00000100 .. 0 ..... 010 ... ..... .....     @rda_pg_rn_rm
+MLS             00000100 .. 0 ..... 011 ... ..... .....     @rda_pg_rn_rm
+
+# SVE integer multiply-add writing multiplicand (predicated)
+MLA             00000100 .. 0 ..... 110 ... ..... .....     @rdn_pg_ra_rm # MAD
+MLS             00000100 .. 0 ..... 111 ... ..... .....     @rdn_pg_ra_rm # MSB
+
 ### SVE Logical - Unpredicated Group

 # SVE bitwise logical operations (unpredicated)
--
2.17.0

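Note on the predicate walk in DO_ZPZZZ above: the predicate carries one bit per
byte of vector data, so a 16-bit chunk of predicate covers 16 bytes of vector,
and an element of sizeof(TYPE) bytes consumes sizeof(TYPE) predicate bits, of
which only the lowest is tested. The sketch below replays that loop for 16-bit
multiply-add outside QEMU. sketch_mla_h, sketch_load16 and sketch_store16 are
hypothetical names, the predicate is taken as a plain byte array (bit 0 of
byte 0 governs vector byte 0), and the host-endian H macros are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Alignment-safe 16-bit load and store in host byte order. */
static uint16_t sketch_load16(const uint8_t *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

static void sketch_store16(uint8_t *p, uint16_t v)
{
    memcpy(p, &v, sizeof(v));
}

/*
 * Hypothetical predicated multiply-add on 16-bit elements:
 * d = a + n * m for active elements, d unchanged otherwise.
 * opr_sz is the vector size in bytes; vg holds opr_sz / 8 bytes.
 */
static void sketch_mla_h(uint8_t *vd, const uint8_t *va, const uint8_t *vn,
                         const uint8_t *vm, const uint8_t *vg, size_t opr_sz)
{
    assert(opr_sz % 16 == 0);
    for (size_t i = 0; i < opr_sz; ) {
        /* 16 predicate bits govern the next 16 bytes of vector data. */
        unsigned pg = vg[i >> 3] | ((unsigned)vg[(i >> 3) + 1] << 8);
        do {
            if (pg & 1) {
                uint32_t aa = sketch_load16(va + i);
                uint32_t nn = sketch_load16(vn + i);
                uint32_t mm = sketch_load16(vm + i);
                /* Unsigned 32-bit arithmetic, then truncate to 16 bits. */
                sketch_store16(vd + i, (uint16_t)(aa + nn * mm));
            }
            i += sizeof(uint16_t);
            pg >>= sizeof(uint16_t);
        } while (i & 15);
    }
}
```

With a predicate byte of 0x11, bits 0 and 4 are set, which selects 16-bit
elements 0 and 2; the odd-numbered bits in between are ignored for this
element size.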
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180516223007.10256-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 34 ++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 13 +++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
     return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
 }

+/*
+ *** SVE Integer Arithmetic - Unpredicated Group
+ */
+
+static bool trans_ADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_vector3_z(s, tcg_gen_gvec_add, a->esz, a->rd, a->rn, a->rm);
+}
+
+static bool trans_SUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_vector3_z(s, tcg_gen_gvec_sub, a->esz, a->rd, a->rn, a->rm);
+}
+
+static bool trans_SQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_vector3_z(s, tcg_gen_gvec_ssadd, a->esz, a->rd, a->rn, a->rm);
+}
+
+static bool trans_SQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_vector3_z(s, tcg_gen_gvec_sssub, a->esz, a->rd, a->rn, a->rm);
+}
+
+static bool trans_UQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_vector3_z(s, tcg_gen_gvec_usadd, a->esz, a->rd, a->rn, a->rm);
+}
+
+static bool trans_UQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_vector3_z(s, tcg_gen_gvec_ussub, a->esz, a->rd, a->rn, a->rm);
+}
+
 /*
  *** SVE Integer Arithmetic - Binary Predicated Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
 # Three predicate operand, with governing predicate, flag setting
 @pd_pg_pn_pm_s  ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4     &rprr_s

+# Three operand, vector element size
+@rd_rn_rm       ........ esz:2 . rm:5 ... ... rn:5 rd:5     &rrr_esz
+
 # Two register operand, with governing predicate, vector element size
 @rdn_pg_rm      ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \
                 &rprr_esz rn=%reg_movprfx
@@ -XXX,XX +XXX,XX @@ MLS 00000100 .. 0 ..... 011 ... ..... ..... @rda_pg_rn_rm
 MLA             00000100 .. 0 ..... 110 ... ..... .....     @rdn_pg_ra_rm # MAD
 MLS             00000100 .. 0 ..... 111 ... ..... .....     @rdn_pg_ra_rm # MSB

+### SVE Integer Arithmetic - Unpredicated Group
+
+# SVE integer add/subtract vectors (unpredicated)
+ADD_zzz         00000100 .. 1 ..... 000 000 ..... .....     @rd_rn_rm
+SUB_zzz         00000100 .. 1 ..... 000 001 ..... .....     @rd_rn_rm
+SQADD_zzz       00000100 .. 1 ..... 000 100 ..... .....     @rd_rn_rm
+UQADD_zzz       00000100 .. 1 ..... 000 101 ..... .....     @rd_rn_rm
+SQSUB_zzz       00000100 .. 1 ..... 000 110 ..... .....     @rd_rn_rm
+UQSUB_zzz       00000100 .. 1 ..... 000 111 ..... .....     @rd_rn_rm
+
 ### SVE Logical - Unpredicated Group

 # SVE bitwise logical operations (unpredicated)
--
2.17.0

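Note on the saturating forms above: SQADD/SQSUB and UQADD/UQSUB differ from
ADD/SUB only in that a result outside the element's range is clamped to the
nearest representable value instead of wrapping. The patch delegates this to
the generic gvec expanders; the sketch below only illustrates the per-element
arithmetic for 8-bit elements. All function names in it are hypothetical and
it does not reflect how TCG implements the operations.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating add: clamp to [INT8_MIN, INT8_MAX]. */
static int8_t sketch_sqadd8(int8_t a, int8_t b)
{
    int r = a + b;          /* cannot overflow in int */
    if (r > INT8_MAX) {
        return INT8_MAX;
    }
    if (r < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)r;
}

/* Signed saturating subtract. */
static int8_t sketch_sqsub8(int8_t a, int8_t b)
{
    int r = a - b;
    if (r > INT8_MAX) {
        return INT8_MAX;
    }
    if (r < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)r;
}

/* Unsigned saturating add: clamp to UINT8_MAX. */
static uint8_t sketch_uqadd8(uint8_t a, uint8_t b)
{
    unsigned r = (unsigned)a + b;
    return r > UINT8_MAX ? UINT8_MAX : (uint8_t)r;
}

/* Unsigned saturating subtract: clamp to zero. */
static uint8_t sketch_uqsub8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Worked examples, written as assertions. */
void sketch_saturation_examples(void)
{
    assert(sketch_sqadd8(100, 100) == 127);
    assert(sketch_sqsub8(-100, 100) == -128);
    assert(sketch_uqadd8(200, 100) == 255);
    assert(sketch_uqsub8(5, 10) == 0);
}
```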
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180516223007.10256-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    |  5 +++
 target/arm/sve_helper.c    | 40 +++++++++++++++++++
 target/arm/translate-sve.c | 79 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 14 +++++++
 4 files changed, 138 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_4(sve_index_b, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32)
+DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32)
+DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32)
+DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/sve_helper.c
33
+++ b/target/arm/sve_helper.c
34
@@ -XXX,XX +XXX,XX @@ DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS)
35
#undef DO_MLS
36
#undef DO_ZPZZZ
37
#undef DO_ZPZZZ_D
38
+
39
+void HELPER(sve_index_b)(void *vd, uint32_t start,
40
+ uint32_t incr, uint32_t desc)
41
+{
42
+ intptr_t i, opr_sz = simd_oprsz(desc);
43
+ uint8_t *d = vd;
44
+ for (i = 0; i < opr_sz; i += 1) {
45
+ d[H1(i)] = start + i * incr;
46
+ }
47
+}
48
+
49
+void HELPER(sve_index_h)(void *vd, uint32_t start,
50
+ uint32_t incr, uint32_t desc)
51
+{
52
+ intptr_t i, opr_sz = simd_oprsz(desc) / 2;
53
+ uint16_t *d = vd;
54
+ for (i = 0; i < opr_sz; i += 1) {
55
+ d[H2(i)] = start + i * incr;
56
+ }
57
+}
58
+
59
+void HELPER(sve_index_s)(void *vd, uint32_t start,
60
+ uint32_t incr, uint32_t desc)
61
+{
62
+ intptr_t i, opr_sz = simd_oprsz(desc) / 4;
63
+ uint32_t *d = vd;
64
+ for (i = 0; i < opr_sz; i += 1) {
65
+ d[H4(i)] = start + i * incr;
66
+ }
67
+}
68
+
69
+void HELPER(sve_index_d)(void *vd, uint64_t start,
70
+ uint64_t incr, uint32_t desc)
71
+{
72
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
73
+ uint64_t *d = vd;
74
+ for (i = 0; i < opr_sz; i += 1) {
75
+ d[i] = start + i * incr;
76
+ }
77
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_ZPZZZ(MLS, mls)

 #undef DO_ZPZZZ

+/*
+ *** SVE Index Generation Group
+ */
+
+static void do_index(DisasContext *s, int esz, int rd,
+                     TCGv_i64 start, TCGv_i64 incr)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    TCGv_ptr t_zd = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd));
+    if (esz == 3) {
+        gen_helper_sve_index_d(t_zd, start, incr, desc);
+    } else {
+        typedef void index_fn(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32);
+        static index_fn * const fns[3] = {
+            gen_helper_sve_index_b,
+            gen_helper_sve_index_h,
+            gen_helper_sve_index_s,
+        };
+        TCGv_i32 s32 = tcg_temp_new_i32();
+        TCGv_i32 i32 = tcg_temp_new_i32();
+
+        tcg_gen_extrl_i64_i32(s32, start);
+        tcg_gen_extrl_i64_i32(i32, incr);
+        fns[esz](t_zd, s32, i32, desc);
+
+        tcg_temp_free_i32(s32);
+        tcg_temp_free_i32(i32);
+    }
+    tcg_temp_free_ptr(t_zd);
+    tcg_temp_free_i32(desc);
+}
+
+static bool trans_INDEX_ii(DisasContext *s, arg_INDEX_ii *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        TCGv_i64 start = tcg_const_i64(a->imm1);
+        TCGv_i64 incr = tcg_const_i64(a->imm2);
+        do_index(s, a->esz, a->rd, start, incr);
+        tcg_temp_free_i64(start);
+        tcg_temp_free_i64(incr);
+    }
+    return true;
+}
+
+static bool trans_INDEX_ir(DisasContext *s, arg_INDEX_ir *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        TCGv_i64 start = tcg_const_i64(a->imm);
+        TCGv_i64 incr = cpu_reg(s, a->rm);
+        do_index(s, a->esz, a->rd, start, incr);
+        tcg_temp_free_i64(start);
+    }
+    return true;
+}
+
+static bool trans_INDEX_ri(DisasContext *s, arg_INDEX_ri *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        TCGv_i64 start = cpu_reg(s, a->rn);
+        TCGv_i64 incr = tcg_const_i64(a->imm);
+        do_index(s, a->esz, a->rd, start, incr);
+        tcg_temp_free_i64(incr);
+    }
+    return true;
+}
+
+static bool trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        TCGv_i64 start = cpu_reg(s, a->rn);
+        TCGv_i64 incr = cpu_reg(s, a->rm);
+        do_index(s, a->esz, a->rd, start, incr);
+    }
+    return true;
+}
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
 EOR_zzz         00000100 10 1 ..... 001 100 ..... .....     @rd_rn_rm_e0
 BIC_zzz         00000100 11 1 ..... 001 100 ..... .....     @rd_rn_rm_e0

+### SVE Index Generation Group
+
+# SVE index generation (immediate start, immediate increment)
+INDEX_ii        00000100 esz:2 1 imm2:s5 010000 imm1:s5 rd:5
+
+# SVE index generation (immediate start, register increment)
+INDEX_ir        00000100 esz:2 1 rm:5 010010 imm:s5 rd:5
+
+# SVE index generation (register start, immediate increment)
+INDEX_ri        00000100 esz:2 1 imm:s5 010001 rn:5 rd:5
+
+# SVE index generation (register start, register increment)
+INDEX_rr        00000100 .. 1 ..... 010011 ..... .....     @rd_rn_rm
+
 ### SVE Predicate Logical Operations Group

 # SVE predicate logical operations
--
2.17.0

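Note on the INDEX helpers above: each element i receives start + i * incr,
computed in a wider unsigned type and then truncated to the element size, so
large starts and negative increments simply wrap. A minimal sketch for byte
elements follows. sketch_index_b is a hypothetical name, it takes an element
count where the real helper takes a simd descriptor, and the H1 host-endian
index adjustment is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical byte-element index generator.
 * A negative increment is passed in two's complement form,
 * e.g. (uint32_t)-1 for a step of -1.
 */
static void sketch_index_b(uint8_t *d, uint32_t start, uint32_t incr,
                           size_t elems)
{
    assert(d != NULL);
    for (size_t i = 0; i < elems; i++) {
        /* 32-bit modular arithmetic, then keep the low 8 bits. */
        d[i] = (uint8_t)(start + (uint32_t)i * incr);
    }
}
```

For example, a start of 250 with a step of 3 yields 250, 253, 0, 3 in the
first four byte elements, because 256 and 259 are reduced modulo 256.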
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180516223007.10256-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 27 +++++++++++++++++++++++++++
 target/arm/sve.decode      | 12 ++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, uint32_t insn)
     return true;
 }

+/*
+ *** SVE Stack Allocation Group
+ */
+
+static bool trans_ADDVL(DisasContext *s, arg_ADDVL *a, uint32_t insn)
+{
+    TCGv_i64 rd = cpu_reg_sp(s, a->rd);
+    TCGv_i64 rn = cpu_reg_sp(s, a->rn);
+    tcg_gen_addi_i64(rd, rn, a->imm * vec_full_reg_size(s));
+    return true;
+}
+
+static bool trans_ADDPL(DisasContext *s, arg_ADDPL *a, uint32_t insn)
+{
+    TCGv_i64 rd = cpu_reg_sp(s, a->rd);
+    TCGv_i64 rn = cpu_reg_sp(s, a->rn);
+    tcg_gen_addi_i64(rd, rn, a->imm * pred_full_reg_size(s));
+    return true;
+}
+
+static bool trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t insn)
+{
+    TCGv_i64 reg = cpu_reg(s, a->rd);
+    tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s));
+    return true;
+}
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn       ........ esz:2 ... ... ... pg:3 rn:5 rd:5     &rpr_esz

+# Two register operands with a 6-bit signed immediate.
+@rd_rn_i6       ........ ... rn:5 ..... imm:s6 rd:5     &rri
+
 # Two register operand, one immediate operand, with predicate,
 # element size encoded as TSZHL. User must fill in imm.
 @rdn_pg_tszimm  ........ .. ... ... ... pg:3 ..... rd:5 \
@@ -XXX,XX +XXX,XX @@ INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5
 # SVE index generation (register start, register increment)
 INDEX_rr        00000100 .. 1 ..... 010011 ..... .....     @rd_rn_rm

+### SVE Stack Allocation Group
+
+# SVE stack frame adjustment
+ADDVL           00000100 001 ..... 01010 ...... .....     @rd_rn_i6
+ADDPL           00000100 011 ..... 01010 ...... .....     @rd_rn_i6
+
+# SVE stack frame size
+RDVL            00000100 101 11111 01010 imm:s6 rd:5
+
 ### SVE Predicate Logical Operations Group

 # SVE predicate logical operations
--
2.17.0

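Note on the stack allocation instructions above: all three scale a signed
6-bit immediate by a length that depends on the current vector size. ADDVL and
RDVL multiply by the vector register size in bytes; ADDPL multiplies by the
predicate register size, which is one eighth of that because a predicate holds
one bit per vector byte. The sketch below shows just the arithmetic. The
function names are hypothetical, the vector length is passed in explicitly
rather than read from a DisasContext, and sketch_sext6 stands in for the
decoder's handling of the imm:s6 field.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 6-bit field, giving a value in [-32, 31]. */
static int sketch_sext6(unsigned field)
{
    int v = (int)(field & 0x3f);
    return (v ^ 0x20) - 0x20;
}

/* ADDVL: rn + imm * (vector length in bytes), wrapping at 64 bits. */
static uint64_t sketch_addvl(uint64_t rn, int imm, unsigned vl_bytes)
{
    /* SVE vector lengths are multiples of 128 bits, up to 2048 bits. */
    assert(vl_bytes >= 16 && vl_bytes <= 256 && vl_bytes % 16 == 0);
    return rn + (uint64_t)((int64_t)imm * (int64_t)vl_bytes);
}

/* ADDPL: rn + imm * (predicate length in bytes). */
static uint64_t sketch_addpl(uint64_t rn, int imm, unsigned vl_bytes)
{
    assert(vl_bytes >= 16 && vl_bytes <= 256 && vl_bytes % 16 == 0);
    return rn + (uint64_t)((int64_t)imm * (int64_t)(vl_bytes / 8));
}

/* RDVL: imm * (vector length in bytes), with no base register. */
static uint64_t sketch_rdvl(int imm, unsigned vl_bytes)
{
    return sketch_addvl(0, imm, vl_bytes);
}
```

With a 256-bit vector (32 bytes), an ADDVL by -2 lowers the base by 64 bytes,
which is how a function would reserve stack space for two Z registers without
knowing the vector length at compile time.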
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180516223007.10256-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    | 12 ++++++
 target/arm/sve_helper.c    | 30 ++++++++++++++
 target/arm/translate-sve.c | 85 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 26 ++++++++++++
 4 files changed, 153 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32)
 DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32)
 DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32)

+DEF_HELPER_FLAGS_4(sve_asr_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_lsr_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG)
 DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG)
 DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG)

+/* Three-operand expander, unpredicated, in which the third operand is "wide".
+ */
+#define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        TYPEW mm = *(TYPEW *)(vm + i); \
+        do { \
+            TYPE nn = *(TYPE *)(vn + H(i)); \
+            *(TYPE *)(vd + H(i)) = OP(nn, mm); \
+            i += sizeof(TYPE); \
+        } while (i & 7); \
+    } \
+}
+
+DO_ZZW(sve_asr_zzw_b, int8_t, uint64_t, H1, DO_ASR)
+DO_ZZW(sve_lsr_zzw_b, uint8_t, uint64_t, H1, DO_LSR)
+DO_ZZW(sve_lsl_zzw_b, uint8_t, uint64_t, H1, DO_LSL)
+
+DO_ZZW(sve_asr_zzw_h, int16_t, uint64_t, H1_2, DO_ASR)
+DO_ZZW(sve_lsr_zzw_h, uint16_t, uint64_t, H1_2, DO_LSR)
+DO_ZZW(sve_lsl_zzw_h, uint16_t, uint64_t, H1_2, DO_LSL)
+
+DO_ZZW(sve_asr_zzw_s, int32_t, uint64_t, H1_4, DO_ASR)
+DO_ZZW(sve_lsr_zzw_s, uint32_t, uint64_t, H1_4, DO_LSR)
+DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
+
+#undef DO_ZZW
+
 #undef DO_CLS_B
 #undef DO_CLS_H
 #undef DO_CLZ_B
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
79
index XXXXXXX..XXXXXXX 100644
80
--- a/target/arm/translate-sve.c
81
+++ b/target/arm/translate-sve.c
82
@@ -XXX,XX +XXX,XX @@ static bool do_mov_z(DisasContext *s, int rd, int rn)
83
return do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn);
84
}
85
86
+/* Initialize a Zreg with replications of a 64-bit immediate. */
87
+static void do_dupi_z(DisasContext *s, int rd, uint64_t word)
88
+{
89
+ unsigned vsz = vec_full_reg_size(s);
90
+ tcg_gen_gvec_dup64i(vec_full_reg_offset(s, rd), vsz, vsz, word);
91
+}
92
+
93
/* Invoke a vector expander on two Pregs. */
94
static bool do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn,
95
int esz, int rd, int rn)
96
@@ -XXX,XX +XXX,XX @@ DO_ZPZW(LSL, lsl)
97
98
#undef DO_ZPZW
99
100
+/*
101
+ *** SVE Bitwise Shift - Unpredicated Group
102
+ */
103
+
104
+static bool do_shift_imm(DisasContext *s, arg_rri_esz *a, bool asr,
105
+ void (*gvec_fn)(unsigned, uint32_t, uint32_t,
106
+ int64_t, uint32_t, uint32_t))
107
+{
108
+ if (a->esz < 0) {
109
+ /* Invalid tsz encoding -- see tszimm_esz. */
110
+ return false;
111
+ }
112
+ if (sve_access_check(s)) {
113
+ unsigned vsz = vec_full_reg_size(s);
114
+ /* Shift by element size is architecturally valid. For
115
+ arithmetic right-shift, it's the same as by one less.
116
+ Otherwise it is a zeroing operation. */
117
+ if (a->imm >= 8 << a->esz) {
118
+ if (asr) {
119
+ a->imm = (8 << a->esz) - 1;
120
+ } else {
121
+ do_dupi_z(s, a->rd, 0);
122
+ return true;
123
+ }
124
+ }
125
+ gvec_fn(a->esz, vec_full_reg_offset(s, a->rd),
126
+ vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz);
127
+ }
128
+ return true;
129
+}
130
+
131
+static bool trans_ASR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
132
+{
133
+ return do_shift_imm(s, a, true, tcg_gen_gvec_sari);
134
+}
135
+
136
+static bool trans_LSR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
137
+{
138
+ return do_shift_imm(s, a, false, tcg_gen_gvec_shri);
139
+}
140
+
141
+static bool trans_LSL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
142
+{
143
+ return do_shift_imm(s, a, false, tcg_gen_gvec_shli);
144
+}
145
+
146
+static bool do_zzw_ool(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 *fn)
147
+{
148
+ if (fn == NULL) {
149
+ return false;
150
+ }
151
+ if (sve_access_check(s)) {
152
+ unsigned vsz = vec_full_reg_size(s);
153
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
154
+ vec_full_reg_offset(s, a->rn),
155
+ vec_full_reg_offset(s, a->rm),
156
+ vsz, vsz, 0, fn);
157
+ }
158
+ return true;
159
+}
160
+
161
+#define DO_ZZW(NAME, name) \
162
+static bool trans_##NAME##_zzw(DisasContext *s, arg_rrr_esz *a, \
163
+ uint32_t insn) \
164
+{ \
165
+ static gen_helper_gvec_3 * const fns[4] = { \
166
+ gen_helper_sve_##name##_zzw_b, gen_helper_sve_##name##_zzw_h, \
167
+ gen_helper_sve_##name##_zzw_s, NULL \
168
+ }; \
169
+ return do_zzw_ool(s, a, fns[a->esz]); \
170
+}
171
+
172
+DO_ZZW(ASR, asr)
173
+DO_ZZW(LSR, lsr)
174
+DO_ZZW(LSL, lsl)
175
+
176
+#undef DO_ZZW
177
+
178
/*
179
*** SVE Integer Multiply-Add Group
180
*/
181
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
182
index XXXXXXX..XXXXXXX 100644
183
--- a/target/arm/sve.decode
184
+++ b/target/arm/sve.decode
185
@@ -XXX,XX +XXX,XX @@
186
# A combination of tsz:imm3 -- extract (tsz:imm3) - esize
187
%tszimm_shl 22:2 5:5 !function=tszimm_shl
188
189
+# Similarly for the tszh/tszl pair at 22/16 for zzi
190
+%tszimm16_esz 22:2 16:5 !function=tszimm_esz
191
+%tszimm16_shr 22:2 16:5 !function=tszimm_shr
192
+%tszimm16_shl 22:2 16:5 !function=tszimm_shl
193
+
194
# Either a copy of rd (at bit 0), or a different source
195
# as propagated via the MOVPRFX instruction.
196
%reg_movprfx 0:5
197
@@ -XXX,XX +XXX,XX @@
198
199
&rr_esz rd rn esz
200
&rri rd rn imm
201
+&rri_esz rd rn imm esz
202
&rrr_esz rd rn rm esz
203
&rpr_esz rd pg rn esz
204
&rprr_s rd pg rn rm s
205
@@ -XXX,XX +XXX,XX @@
206
@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \
207
&rpri_esz rn=%reg_movprfx esz=%tszimm_esz
208
209
+# Similarly without predicate.
210
+@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \
211
+ &rri_esz esz=%tszimm16_esz
212
+
213
# Basic Load/Store with 9-bit immediate offset
214
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
215
&rri imm=%imm9_16_10
216
@@ -XXX,XX +XXX,XX @@ ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6
217
# SVE stack frame size
218
RDVL 00000100 101 11111 01010 imm:s6 rd:5
219
220
+### SVE Bitwise Shift - Unpredicated Group
221
+
222
+# SVE bitwise shift by immediate (unpredicated)
223
+ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... \
224
+ @rd_rn_tszimm imm=%tszimm16_shr
225
+LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... \
226
+ @rd_rn_tszimm imm=%tszimm16_shr
227
+LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... \
228
+ @rd_rn_tszimm imm=%tszimm16_shl
229
+
230
+# SVE bitwise shift by wide elements (unpredicated)
231
+# Note esz != 3
232
+ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_rn_rm
233
+LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm
234
+LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm
235
+
236
### SVE Predicate Logical Operations Group
237
238
# SVE predicate logical operations
239
--
240
2.17.0
241
242
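Reviewer note on the out-of-range immediate handling in do_shift_imm above. The following is a minimal scalar sketch of what one lane computes, assuming the semantics stated in the patch comment: a shift by the full element size behaves like a shift by one less for ASR, and zeroes the lane for LSR and LSL. The names lane_bits, lane_mask, model_asr, model_lsr and model_lsl are hypothetical and do not exist in QEMU.

```c
#include <stdint.h>

/* Hypothetical scalar model of one vector lane; not QEMU code. */

/* Lane width in bits for esz = 0..3 (8, 16, 32, 64). */
static unsigned lane_bits(unsigned esz)
{
    return 8u << esz;
}

/* Mask covering the low 'bits' bits of a 64-bit value. */
static uint64_t lane_mask(unsigned bits)
{
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* ASR: 'elt' is the sign-extended lane value. */
static int64_t model_asr(int64_t elt, unsigned esz, unsigned imm)
{
    unsigned bits = lane_bits(esz);

    if (imm >= bits) {
        /* Shifting by the element size gives the same result as
           shifting by one less: every bit becomes the sign bit. */
        imm = bits - 1;
    }
    /* Written so that no negative value is ever right-shifted. */
    return elt < 0 ? ~(~elt >> imm) : elt >> imm;
}

/* LSR: 'elt' is the zero-extended lane value. */
static uint64_t model_lsr(uint64_t elt, unsigned esz, unsigned imm)
{
    unsigned bits = lane_bits(esz);

    return imm >= bits ? 0 : (elt & lane_mask(bits)) >> imm;
}

/* LSL: the result is truncated back to the lane width. */
static uint64_t model_lsl(uint64_t elt, unsigned esz, unsigned imm)
{
    unsigned bits = lane_bits(esz);

    return imm >= bits ? 0 : (elt << imm) & lane_mask(bits);
}
```

This mirrors why the translator can emit a plain dup of zero for the logical shifts instead of calling the gvec expander with an out-of-range count.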
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-20-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 5 +++++
9
target/arm/sve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 36 ++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 12 ++++++++++++
12
4 files changed, 93 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_adr_p32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+
27
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/sve_helper.c
33
+++ b/target/arm/sve_helper.c
34
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_index_d)(void *vd, uint64_t start,
35
d[i] = start + i * incr;
36
}
37
}
38
+
39
+void HELPER(sve_adr_p32)(void *vd, void *vn, void *vm, uint32_t desc)
40
+{
41
+ intptr_t i, opr_sz = simd_oprsz(desc) / 4;
42
+ uint32_t sh = simd_data(desc);
43
+ uint32_t *d = vd, *n = vn, *m = vm;
44
+ for (i = 0; i < opr_sz; i += 1) {
45
+ d[i] = n[i] + (m[i] << sh);
46
+ }
47
+}
48
+
49
+void HELPER(sve_adr_p64)(void *vd, void *vn, void *vm, uint32_t desc)
50
+{
51
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
52
+ uint64_t sh = simd_data(desc);
53
+ uint64_t *d = vd, *n = vn, *m = vm;
54
+ for (i = 0; i < opr_sz; i += 1) {
55
+ d[i] = n[i] + (m[i] << sh);
56
+ }
57
+}
58
+
59
+void HELPER(sve_adr_s32)(void *vd, void *vn, void *vm, uint32_t desc)
60
+{
61
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
62
+ uint64_t sh = simd_data(desc);
63
+ uint64_t *d = vd, *n = vn, *m = vm;
64
+ for (i = 0; i < opr_sz; i += 1) {
65
+ d[i] = n[i] + ((uint64_t)(int32_t)m[i] << sh);
66
+ }
67
+}
68
+
69
+void HELPER(sve_adr_u32)(void *vd, void *vn, void *vm, uint32_t desc)
70
+{
71
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
72
+ uint64_t sh = simd_data(desc);
73
+ uint64_t *d = vd, *n = vn, *m = vm;
74
+ for (i = 0; i < opr_sz; i += 1) {
75
+ d[i] = n[i] + ((uint64_t)(uint32_t)m[i] << sh);
76
+ }
77
+}
78
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
79
index XXXXXXX..XXXXXXX 100644
80
--- a/target/arm/translate-sve.c
81
+++ b/target/arm/translate-sve.c
82
@@ -XXX,XX +XXX,XX @@ static bool trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t insn)
83
return true;
84
}
85
86
+/*
87
+ *** SVE Compute Vector Address Group
88
+ */
89
+
90
+static bool do_adr(DisasContext *s, arg_rrri *a, gen_helper_gvec_3 *fn)
91
+{
92
+ if (sve_access_check(s)) {
93
+ unsigned vsz = vec_full_reg_size(s);
94
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
95
+ vec_full_reg_offset(s, a->rn),
96
+ vec_full_reg_offset(s, a->rm),
97
+ vsz, vsz, a->imm, fn);
98
+ }
99
+ return true;
100
+}
101
+
102
+static bool trans_ADR_p32(DisasContext *s, arg_rrri *a, uint32_t insn)
103
+{
104
+ return do_adr(s, a, gen_helper_sve_adr_p32);
105
+}
106
+
107
+static bool trans_ADR_p64(DisasContext *s, arg_rrri *a, uint32_t insn)
108
+{
109
+ return do_adr(s, a, gen_helper_sve_adr_p64);
110
+}
111
+
112
+static bool trans_ADR_s32(DisasContext *s, arg_rrri *a, uint32_t insn)
113
+{
114
+ return do_adr(s, a, gen_helper_sve_adr_s32);
115
+}
116
+
117
+static bool trans_ADR_u32(DisasContext *s, arg_rrri *a, uint32_t insn)
118
+{
119
+ return do_adr(s, a, gen_helper_sve_adr_u32);
120
+}
121
+
122
/*
123
*** SVE Predicate Logical Operations Group
124
*/
125
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
126
index XXXXXXX..XXXXXXX 100644
127
--- a/target/arm/sve.decode
128
+++ b/target/arm/sve.decode
129
@@ -XXX,XX +XXX,XX @@
130
131
&rr_esz rd rn esz
132
&rri rd rn imm
133
+&rrri rd rn rm imm
134
&rri_esz rd rn imm esz
135
&rrr_esz rd rn rm esz
136
&rpr_esz rd pg rn esz
137
@@ -XXX,XX +XXX,XX @@
138
# Three operand, vector element size
139
@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz
140
141
+# Three operand with "memory" size, aka immediate left shift
142
+@rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri
143
+
144
# Two register operand, with governing predicate, vector element size
145
@rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \
146
&rprr_esz rn=%reg_movprfx
147
@@ -XXX,XX +XXX,XX @@ ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_rn_rm
148
LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm
149
LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm
150
151
+### SVE Compute Vector Address Group
152
+
153
+# SVE vector address generation
154
+ADR_s32 00000100 00 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
155
+ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
156
+ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
157
+ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
158
+
159
### SVE Predicate Logical Operations Group
160
161
# SVE predicate logical operations
162
--
163
2.17.0
164
165
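Reviewer note on the four ADR helpers above. This is a rough per-element sketch of the difference between the packed, sign-extended and zero-extended forms, assuming 'sh' is the 2-bit left-shift amount taken from the imm field. The function names below are made up for illustration and are not the QEMU helpers.

```c
#include <stdint.h>

/* Hypothetical per-element models of the ADR forms; not QEMU code.
   'sh' is assumed to be in the range 0..3. */

/* Packed 32-bit: base and offset are both 32-bit lanes. */
static uint32_t model_adr_p32(uint32_t n, uint32_t m, unsigned sh)
{
    return n + (m << sh);
}

/* Packed 64-bit: base and offset are both 64-bit lanes. */
static uint64_t model_adr_p64(uint64_t n, uint64_t m, unsigned sh)
{
    return n + (m << sh);
}

/* 64-bit base, offset is the low 32 bits of m, sign-extended.
   The narrowing cast relies on the usual two's-complement
   behaviour, as the patch itself does. */
static uint64_t model_adr_s32(uint64_t n, uint64_t m, unsigned sh)
{
    return n + ((uint64_t)(int32_t)m << sh);
}

/* 64-bit base, offset is the low 32 bits of m, zero-extended. */
static uint64_t model_adr_u32(uint64_t n, uint64_t m, unsigned sh)
{
    return n + ((uint64_t)(uint32_t)m << sh);
}
```

For example, with n = 0x1000, m = 0xffffffff and sh = 2, the signed form steps back by 4 bytes, while the unsigned form adds a large positive offset.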
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-21-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 4 ++
9
target/arm/sve_helper.c | 90 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 24 ++++++++++
11
target/arm/sve.decode | 7 +++
12
4 files changed, 125 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
+
26
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/sve_helper.c
32
+++ b/target/arm/sve_helper.c
33
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_adr_u32)(void *vd, void *vn, void *vm, uint32_t desc)
34
d[i] = n[i] + ((uint64_t)(uint32_t)m[i] << sh);
35
}
36
}
37
+
38
+void HELPER(sve_fexpa_h)(void *vd, void *vn, uint32_t desc)
39
+{
40
+ /* These constants are cut-and-paste directly from the ARM pseudocode. */
41
+ static const uint16_t coeff[] = {
42
+ 0x0000, 0x0016, 0x002d, 0x0045, 0x005d, 0x0075, 0x008e, 0x00a8,
43
+ 0x00c2, 0x00dc, 0x00f8, 0x0114, 0x0130, 0x014d, 0x016b, 0x0189,
44
+ 0x01a8, 0x01c8, 0x01e8, 0x0209, 0x022b, 0x024e, 0x0271, 0x0295,
45
+ 0x02ba, 0x02e0, 0x0306, 0x032e, 0x0356, 0x037f, 0x03a9, 0x03d4,
46
+ };
47
+ intptr_t i, opr_sz = simd_oprsz(desc) / 2;
48
+ uint16_t *d = vd, *n = vn;
49
+
50
+ for (i = 0; i < opr_sz; i++) {
51
+ uint16_t nn = n[i];
52
+ intptr_t idx = extract32(nn, 0, 5);
53
+ uint16_t exp = extract32(nn, 5, 5);
54
+ d[i] = coeff[idx] | (exp << 10);
55
+ }
56
+}
57
+
58
+void HELPER(sve_fexpa_s)(void *vd, void *vn, uint32_t desc)
59
+{
60
+ /* These constants are cut-and-paste directly from the ARM pseudocode. */
61
+ static const uint32_t coeff[] = {
62
+ 0x000000, 0x0164d2, 0x02cd87, 0x043a29,
63
+ 0x05aac3, 0x071f62, 0x08980f, 0x0a14d5,
64
+ 0x0b95c2, 0x0d1adf, 0x0ea43a, 0x1031dc,
65
+ 0x11c3d3, 0x135a2b, 0x14f4f0, 0x16942d,
66
+ 0x1837f0, 0x19e046, 0x1b8d3a, 0x1d3eda,
67
+ 0x1ef532, 0x20b051, 0x227043, 0x243516,
68
+ 0x25fed7, 0x27cd94, 0x29a15b, 0x2b7a3a,
69
+ 0x2d583f, 0x2f3b79, 0x3123f6, 0x3311c4,
70
+ 0x3504f3, 0x36fd92, 0x38fbaf, 0x3aff5b,
71
+ 0x3d08a4, 0x3f179a, 0x412c4d, 0x4346cd,
72
+ 0x45672a, 0x478d75, 0x49b9be, 0x4bec15,
73
+ 0x4e248c, 0x506334, 0x52a81e, 0x54f35b,
74
+ 0x5744fd, 0x599d16, 0x5bfbb8, 0x5e60f5,
75
+ 0x60ccdf, 0x633f89, 0x65b907, 0x68396a,
76
+ 0x6ac0c7, 0x6d4f30, 0x6fe4ba, 0x728177,
77
+ 0x75257d, 0x77d0df, 0x7a83b3, 0x7d3e0c,
78
+ };
79
+ intptr_t i, opr_sz = simd_oprsz(desc) / 4;
80
+ uint32_t *d = vd, *n = vn;
81
+
82
+ for (i = 0; i < opr_sz; i++) {
83
+ uint32_t nn = n[i];
84
+ intptr_t idx = extract32(nn, 0, 6);
85
+ uint32_t exp = extract32(nn, 6, 8);
86
+ d[i] = coeff[idx] | (exp << 23);
87
+ }
88
+}
89
+
90
+void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_t desc)
91
+{
92
+ /* These constants are cut-and-paste directly from the ARM pseudocode. */
93
+ static const uint64_t coeff[] = {
94
+ 0x0000000000000ull, 0x02C9A3E778061ull, 0x059B0D3158574ull,
95
+ 0x0874518759BC8ull, 0x0B5586CF9890Full, 0x0E3EC32D3D1A2ull,
96
+ 0x11301D0125B51ull, 0x1429AAEA92DE0ull, 0x172B83C7D517Bull,
97
+ 0x1A35BEB6FCB75ull, 0x1D4873168B9AAull, 0x2063B88628CD6ull,
98
+ 0x2387A6E756238ull, 0x26B4565E27CDDull, 0x29E9DF51FDEE1ull,
99
+ 0x2D285A6E4030Bull, 0x306FE0A31B715ull, 0x33C08B26416FFull,
100
+ 0x371A7373AA9CBull, 0x3A7DB34E59FF7ull, 0x3DEA64C123422ull,
101
+ 0x4160A21F72E2Aull, 0x44E086061892Dull, 0x486A2B5C13CD0ull,
102
+ 0x4BFDAD5362A27ull, 0x4F9B2769D2CA7ull, 0x5342B569D4F82ull,
103
+ 0x56F4736B527DAull, 0x5AB07DD485429ull, 0x5E76F15AD2148ull,
104
+ 0x6247EB03A5585ull, 0x6623882552225ull, 0x6A09E667F3BCDull,
105
+ 0x6DFB23C651A2Full, 0x71F75E8EC5F74ull, 0x75FEB564267C9ull,
106
+ 0x7A11473EB0187ull, 0x7E2F336CF4E62ull, 0x82589994CCE13ull,
107
+ 0x868D99B4492EDull, 0x8ACE5422AA0DBull, 0x8F1AE99157736ull,
108
+ 0x93737B0CDC5E5ull, 0x97D829FDE4E50ull, 0x9C49182A3F090ull,
109
+ 0xA0C667B5DE565ull, 0xA5503B23E255Dull, 0xA9E6B5579FDBFull,
110
+ 0xAE89F995AD3ADull, 0xB33A2B84F15FBull, 0xB7F76F2FB5E47ull,
111
+ 0xBCC1E904BC1D2ull, 0xC199BDD85529Cull, 0xC67F12E57D14Bull,
112
+ 0xCB720DCEF9069ull, 0xD072D4A07897Cull, 0xD5818DCFBA487ull,
113
+ 0xDA9E603DB3285ull, 0xDFC97337B9B5Full, 0xE502EE78B3FF6ull,
114
+ 0xEA4AFA2A490DAull, 0xEFA1BEE615A27ull, 0xF50765B6E4540ull,
115
+ 0xFA7C1819E90D8ull,
116
+ };
117
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
118
+ uint64_t *d = vd, *n = vn;
119
+
120
+ for (i = 0; i < opr_sz; i++) {
121
+ uint64_t nn = n[i];
122
+ intptr_t idx = extract32(nn, 0, 6);
123
+ uint64_t exp = extract32(nn, 6, 11);
124
+ d[i] = coeff[idx] | (exp << 52);
125
+ }
126
+}
127
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
128
index XXXXXXX..XXXXXXX 100644
129
--- a/target/arm/translate-sve.c
130
+++ b/target/arm/translate-sve.c
131
@@ -XXX,XX +XXX,XX @@ static bool trans_ADR_u32(DisasContext *s, arg_rrri *a, uint32_t insn)
132
return do_adr(s, a, gen_helper_sve_adr_u32);
133
}
134
135
+/*
136
+ *** SVE Integer Misc - Unpredicated Group
137
+ */
138
+
139
+static bool trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint32_t insn)
140
+{
141
+ static gen_helper_gvec_2 * const fns[4] = {
142
+ NULL,
143
+ gen_helper_sve_fexpa_h,
144
+ gen_helper_sve_fexpa_s,
145
+ gen_helper_sve_fexpa_d,
146
+ };
147
+ if (a->esz == 0) {
148
+ return false;
149
+ }
150
+ if (sve_access_check(s)) {
151
+ unsigned vsz = vec_full_reg_size(s);
152
+ tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd),
153
+ vec_full_reg_offset(s, a->rn),
154
+ vsz, vsz, 0, fns[a->esz]);
155
+ }
156
+ return true;
157
+}
158
+
159
/*
160
*** SVE Predicate Logical Operations Group
161
*/
162
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
163
index XXXXXXX..XXXXXXX 100644
164
--- a/target/arm/sve.decode
165
+++ b/target/arm/sve.decode
166
@@ -XXX,XX +XXX,XX @@
167
168
# Two operand
169
@pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz
170
+@rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz
171
172
# Three operand with unused vector element size
173
@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
174
@@ -XXX,XX +XXX,XX @@ ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
175
ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
176
ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
177
178
+### SVE Integer Misc - Unpredicated Group
179
+
180
+# SVE floating-point exponential accelerator
181
+# Note esz != 0
182
+FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn
183
+
184
### SVE Predicate Logical Operations Group
185
186
# SVE predicate logical operations
187
--
188
2.17.0
189
190
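Reviewer note on the FEXPA helpers above. The sketch below is a standalone model of the half-precision case, reusing the 32-entry table from the patch: the low 5 bits of the input index the fraction table and the next 5 bits are copied into the exponent field. The name model_fexpa_h is hypothetical, and the reading of the table as fractions of 2^(i/32) is my interpretation, not something the patch states.

```c
#include <stdint.h>

/* Fraction table copied from sve_fexpa_h in the patch above. */
static const uint16_t fexpa_h_coeff[32] = {
    0x0000, 0x0016, 0x002d, 0x0045, 0x005d, 0x0075, 0x008e, 0x00a8,
    0x00c2, 0x00dc, 0x00f8, 0x0114, 0x0130, 0x014d, 0x016b, 0x0189,
    0x01a8, 0x01c8, 0x01e8, 0x0209, 0x022b, 0x024e, 0x0271, 0x0295,
    0x02ba, 0x02e0, 0x0306, 0x032e, 0x0356, 0x037f, 0x03a9, 0x03d4,
};

/* Hypothetical model of one half-precision lane; not QEMU code. */
static uint16_t model_fexpa_h(uint16_t n)
{
    unsigned idx = n & 0x1f;         /* bits [4:0]: table index */
    unsigned exp = (n >> 5) & 0x1f;  /* bits [9:5]: biased exponent */

    /* Place the exponent above the 10-bit fraction. */
    return fexpa_h_coeff[idx] | (exp << 10);
}
```

With a biased exponent of 15 and index 0 this yields 0x3c00, the half-precision encoding of 1.0. Index 16 with the same exponent yields 0x3da8, which is about 1.414.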
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-22-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 4 ++++
9
target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 21 +++++++++++++++++++
11
target/arm/sve.decode | 4 ++++
12
4 files changed, 72 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+
26
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/sve_helper.c
32
+++ b/target/arm/sve_helper.c
33
@@ -XXX,XX +XXX,XX @@
34
#include "exec/cpu_ldst.h"
35
#include "exec/helper-proto.h"
36
#include "tcg/tcg-gvec-desc.h"
37
+#include "fpu/softfloat.h"
38
39
40
/* Note that vector data is stored in host-endian 64-bit chunks,
41
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_t desc)
42
d[i] = coeff[idx] | (exp << 52);
43
}
44
}
45
+
46
+void HELPER(sve_ftssel_h)(void *vd, void *vn, void *vm, uint32_t desc)
47
+{
48
+ intptr_t i, opr_sz = simd_oprsz(desc) / 2;
49
+ uint16_t *d = vd, *n = vn, *m = vm;
50
+ for (i = 0; i < opr_sz; i += 1) {
51
+ uint16_t nn = n[i];
52
+ uint16_t mm = m[i];
53
+ if (mm & 1) {
54
+ nn = float16_one;
55
+ }
56
+ d[i] = nn ^ (mm & 2) << 14;
57
+ }
58
+}
59
+
60
+void HELPER(sve_ftssel_s)(void *vd, void *vn, void *vm, uint32_t desc)
61
+{
62
+ intptr_t i, opr_sz = simd_oprsz(desc) / 4;
63
+ uint32_t *d = vd, *n = vn, *m = vm;
64
+ for (i = 0; i < opr_sz; i += 1) {
65
+ uint32_t nn = n[i];
66
+ uint32_t mm = m[i];
67
+ if (mm & 1) {
68
+ nn = float32_one;
69
+ }
70
+ d[i] = nn ^ (mm & 2) << 30;
71
+ }
72
+}
73
+
74
+void HELPER(sve_ftssel_d)(void *vd, void *vn, void *vm, uint32_t desc)
75
+{
76
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
77
+ uint64_t *d = vd, *n = vn, *m = vm;
78
+ for (i = 0; i < opr_sz; i += 1) {
79
+ uint64_t nn = n[i];
80
+ uint64_t mm = m[i];
81
+ if (mm & 1) {
82
+ nn = float64_one;
83
+ }
84
+ d[i] = nn ^ (mm & 2) << 62;
85
+ }
86
+}
87
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
88
index XXXXXXX..XXXXXXX 100644
89
--- a/target/arm/translate-sve.c
90
+++ b/target/arm/translate-sve.c
91
@@ -XXX,XX +XXX,XX @@ static bool trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint32_t insn)
92
return true;
93
}
94
95
+static bool trans_FTSSEL(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
96
+{
97
+ static gen_helper_gvec_3 * const fns[4] = {
98
+ NULL,
99
+ gen_helper_sve_ftssel_h,
100
+ gen_helper_sve_ftssel_s,
101
+ gen_helper_sve_ftssel_d,
102
+ };
103
+ if (a->esz == 0) {
104
+ return false;
105
+ }
106
+ if (sve_access_check(s)) {
107
+ unsigned vsz = vec_full_reg_size(s);
108
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
109
+ vec_full_reg_offset(s, a->rn),
110
+ vec_full_reg_offset(s, a->rm),
111
+ vsz, vsz, 0, fns[a->esz]);
112
+ }
113
+ return true;
114
+}
115
+
116
/*
117
*** SVE Predicate Logical Operations Group
118
*/
119
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
120
index XXXXXXX..XXXXXXX 100644
121
--- a/target/arm/sve.decode
122
+++ b/target/arm/sve.decode
123
@@ -XXX,XX +XXX,XX @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
124
# Note esz != 0
125
FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn
126
127
+# SVE floating-point trig select coefficient
128
+# Note esz != 0
129
+FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm
130
+
131
### SVE Predicate Logical Operations Group
132
133
# SVE predicate logical operations
134
--
135
2.17.0
136
137
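Reviewer note on the FTSSEL helpers above. This is a small sketch of the single-precision case, assuming the behaviour visible in the helper: bit 0 of the second operand replaces the value with 1.0, and bit 1 flips the sign. The name model_ftssel_s is hypothetical, and the literal 0x3f800000 stands in for softfloat's float32_one.

```c
#include <stdint.h>

/* Hypothetical model of one single-precision lane; not QEMU code.
   'n' and the result are raw IEEE 754 binary32 bit patterns. */
static uint32_t model_ftssel_s(uint32_t n, uint32_t m)
{
    if (m & 1) {
        n = 0x3f800000u;  /* bit pattern of 1.0f */
    }
    /* Move bit 1 of m into the sign position (bit 31) and toggle. */
    return n ^ ((m & 2) << 30);
}
```

The half and double precision helpers follow the same pattern, with shift counts of 14 and 62 so that bit 1 lands on bit 15 and bit 63 respectively.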
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-23-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 11 ++
9
target/arm/sve_helper.c | 136 ++++++++++++++++++
10
target/arm/translate-sve.c | 288 +++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 31 +++-
12
4 files changed, 465 insertions(+), 1 deletion(-)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_sqaddi_b, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32)
23
+DEF_HELPER_FLAGS_4(sve_sqaddi_h, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32)
24
+DEF_HELPER_FLAGS_4(sve_sqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32)
25
+DEF_HELPER_FLAGS_4(sve_sqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32)
26
+
27
+DEF_HELPER_FLAGS_4(sve_uqaddi_b, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32)
28
+DEF_HELPER_FLAGS_4(sve_uqaddi_h, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32)
29
+DEF_HELPER_FLAGS_4(sve_uqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32)
30
+DEF_HELPER_FLAGS_4(sve_uqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
31
+DEF_HELPER_FLAGS_4(sve_uqsubi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
32
+
33
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
36
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/sve_helper.c
39
+++ b/target/arm/sve_helper.c
40
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftssel_d)(void *vd, void *vn, void *vm, uint32_t desc)
41
d[i] = nn ^ (mm & 2) << 62;
42
}
43
}
44
+
45
+/*
46
+ * Signed saturating addition with scalar operand.
47
+ */
48
+
49
+void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc)
50
+{
51
+ intptr_t i, oprsz = simd_oprsz(desc);
52
+
53
+ for (i = 0; i < oprsz; i += sizeof(int8_t)) {
54
+ int r = *(int8_t *)(a + i) + b;
55
+ if (r > INT8_MAX) {
56
+ r = INT8_MAX;
57
+ } else if (r < INT8_MIN) {
58
+ r = INT8_MIN;
59
+ }
60
+ *(int8_t *)(d + i) = r;
61
+ }
62
+}
63
+
64
+void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc)
65
+{
66
+ intptr_t i, oprsz = simd_oprsz(desc);
67
+
68
+ for (i = 0; i < oprsz; i += sizeof(int16_t)) {
69
+ int r = *(int16_t *)(a + i) + b;
70
+ if (r > INT16_MAX) {
71
+ r = INT16_MAX;
72
+ } else if (r < INT16_MIN) {
73
+ r = INT16_MIN;
74
+ }
75
+ *(int16_t *)(d + i) = r;
76
+ }
77
+}
78
+
79
+void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc)
80
+{
81
+ intptr_t i, oprsz = simd_oprsz(desc);
82
+
83
+ for (i = 0; i < oprsz; i += sizeof(int32_t)) {
84
+ int64_t r = *(int32_t *)(a + i) + b;
85
+ if (r > INT32_MAX) {
86
+ r = INT32_MAX;
87
+ } else if (r < INT32_MIN) {
88
+ r = INT32_MIN;
89
+ }
90
+ *(int32_t *)(d + i) = r;
91
+ }
92
+}
93
+
94
+void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc)
95
+{
96
+ intptr_t i, oprsz = simd_oprsz(desc);
97
+
98
+ for (i = 0; i < oprsz; i += sizeof(int64_t)) {
99
+ int64_t ai = *(int64_t *)(a + i);
100
+ int64_t r = ai + b;
101
+ if (((r ^ ai) & ~(ai ^ b)) < 0) {
102
+ /* Signed overflow. */
103
+ r = (r < 0 ? INT64_MAX : INT64_MIN);
104
+ }
105
+ *(int64_t *)(d + i) = r;
106
+ }
107
+}
108
+
109
+/*
110
+ * Unsigned saturating addition with scalar operand.
111
+ */
112
+
113
+void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc)
114
+{
115
+ intptr_t i, oprsz = simd_oprsz(desc);
116
+
117
+ for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
118
+ int r = *(uint8_t *)(a + i) + b;
119
+ if (r > UINT8_MAX) {
120
+ r = UINT8_MAX;
121
+ } else if (r < 0) {
122
+ r = 0;
123
+ }
124
+ *(uint8_t *)(d + i) = r;
125
+ }
126
+}
127
+
128
+void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc)
129
+{
130
+ intptr_t i, oprsz = simd_oprsz(desc);
131
+
132
+ for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
133
+ int r = *(uint16_t *)(a + i) + b;
134
+ if (r > UINT16_MAX) {
135
+ r = UINT16_MAX;
136
+ } else if (r < 0) {
137
+ r = 0;
138
+ }
139
+ *(uint16_t *)(d + i) = r;
140
+ }
141
+}
142
+
143
+void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc)
144
+{
145
+ intptr_t i, oprsz = simd_oprsz(desc);
146
+
147
+ for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
148
+ int64_t r = *(uint32_t *)(a + i) + b;
149
+ if (r > UINT32_MAX) {
150
+ r = UINT32_MAX;
151
+ } else if (r < 0) {
152
+ r = 0;
153
+ }
154
+ *(uint32_t *)(d + i) = r;
155
+ }
156
+}
157
+
158
+void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc)
159
+{
160
+ intptr_t i, oprsz = simd_oprsz(desc);
161
+
162
+ for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
163
+ uint64_t r = *(uint64_t *)(a + i) + b;
164
+ if (r < b) {
165
+ r = UINT64_MAX;
166
+ }
167
+ *(uint64_t *)(d + i) = r;
168
+ }
169
+}
170
+
171
+void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc)
172
+{
173
+ intptr_t i, oprsz = simd_oprsz(desc);
174
+
175
+ for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
176
+ uint64_t ai = *(uint64_t *)(a + i);
177
+ *(uint64_t *)(d + i) = (ai < b ? 0 : ai - b);
178
+ }
179
+}
180
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
181
index XXXXXXX..XXXXXXX 100644
182
--- a/target/arm/translate-sve.c
183
+++ b/target/arm/translate-sve.c
184
@@ -XXX,XX +XXX,XX @@ static int tszimm_shl(int x)
185
return x - (8 << tszimm_esz(x));
186
}
187
188
+static inline int plus1(int x)
189
+{
190
+ return x + 1;
191
+}
192
+
193
/*
194
* Include the generated decoder.
195
*/
196
@@ -XXX,XX +XXX,XX @@ static bool trans_PNEXT(DisasContext *s, arg_rr_esz *a, uint32_t insn)
197
return do_pfirst_pnext(s, a, gen_helper_sve_pnext);
198
}
199
200
+/*
201
+ *** SVE Element Count Group
202
+ */
203
+
204
+/* Perform an inline saturating addition of a 32-bit value within
205
+ * a 64-bit register. The second operand is known to be positive,
206
+ * which halves the comparisons we must perform to bound the result.
207
+ */
208
+static void do_sat_addsub_32(TCGv_i64 reg, TCGv_i64 val, bool u, bool d)
209
+{
210
+ int64_t ibound;
211
+ TCGv_i64 bound;
212
+ TCGCond cond;
213
+
214
+ /* Use normal 64-bit arithmetic to detect 32-bit overflow. */
215
+ if (u) {
216
+ tcg_gen_ext32u_i64(reg, reg);
217
+ } else {
218
+ tcg_gen_ext32s_i64(reg, reg);
219
+ }
220
+ if (d) {
221
+ tcg_gen_sub_i64(reg, reg, val);
222
+ ibound = (u ? 0 : INT32_MIN);
223
+ cond = TCG_COND_LT;
224
+ } else {
225
+ tcg_gen_add_i64(reg, reg, val);
226
+ ibound = (u ? UINT32_MAX : INT32_MAX);
227
+ cond = TCG_COND_GT;
228
+ }
229
+ bound = tcg_const_i64(ibound);
230
+ tcg_gen_movcond_i64(cond, reg, reg, bound, bound, reg);
231
+ tcg_temp_free_i64(bound);
232
+}
233
+
234
+/* Similarly with 64-bit values. */
235
+static void do_sat_addsub_64(TCGv_i64 reg, TCGv_i64 val, bool u, bool d)
236
+{
237
+ TCGv_i64 t0 = tcg_temp_new_i64();
238
+ TCGv_i64 t1 = tcg_temp_new_i64();
239
+ TCGv_i64 t2;
240
+
241
+ if (u) {
242
+ if (d) {
243
+ tcg_gen_sub_i64(t0, reg, val);
244
+ tcg_gen_movi_i64(t1, 0);
245
+ tcg_gen_movcond_i64(TCG_COND_LTU, reg, reg, val, t1, t0);
246
+ } else {
247
+ tcg_gen_add_i64(t0, reg, val);
248
+ tcg_gen_movi_i64(t1, -1);
249
+ tcg_gen_movcond_i64(TCG_COND_LTU, reg, t0, reg, t1, t0);
250
+ }
251
+ } else {
252
+ if (d) {
253
+ /* Detect signed overflow for subtraction. */
254
+ tcg_gen_xor_i64(t0, reg, val);
255
+ tcg_gen_sub_i64(t1, reg, val);
256
+ tcg_gen_xor_i64(reg, reg, t1);
257
+ tcg_gen_and_i64(t0, t0, reg);
258
+
259
+ /* Bound the result. */
260
+ tcg_gen_movi_i64(reg, INT64_MIN);
261
+ t2 = tcg_const_i64(0);
262
+ tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, t2, reg, t1);
263
+ } else {
264
+ /* Detect signed overflow for addition. */
265
+ tcg_gen_xor_i64(t0, reg, val);
266
+ tcg_gen_add_i64(reg, reg, val);
267
+ tcg_gen_xor_i64(t1, reg, val);
268
+ tcg_gen_andc_i64(t0, t1, t0);
269
+
270
+ /* Bound the result. */
271
+ tcg_gen_movi_i64(t1, INT64_MAX);
272
+ t2 = tcg_const_i64(0);
273
+ tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, t2, t1, reg);
274
+ }
275
+ tcg_temp_free_i64(t2);
276
+ }
277
+ tcg_temp_free_i64(t0);
278
+ tcg_temp_free_i64(t1);
279
+}
280
+
281
+/* Similarly with a vector and a scalar operand. */
282
+static void do_sat_addsub_vec(DisasContext *s, int esz, int rd, int rn,
283
+ TCGv_i64 val, bool u, bool d)
284
+{
285
+ unsigned vsz = vec_full_reg_size(s);
286
+ TCGv_ptr dptr, nptr;
287
+ TCGv_i32 t32, desc;
288
+ TCGv_i64 t64;
289
+
290
+ dptr = tcg_temp_new_ptr();
291
+ nptr = tcg_temp_new_ptr();
292
+ tcg_gen_addi_ptr(dptr, cpu_env, vec_full_reg_offset(s, rd));
293
+ tcg_gen_addi_ptr(nptr, cpu_env, vec_full_reg_offset(s, rn));
294
+ desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
295
+
296
+ switch (esz) {
297
+ case MO_8:
298
+ t32 = tcg_temp_new_i32();
299
+ tcg_gen_extrl_i64_i32(t32, val);
300
+ if (d) {
301
+ tcg_gen_neg_i32(t32, t32);
302
+ }
303
+ if (u) {
304
+ gen_helper_sve_uqaddi_b(dptr, nptr, t32, desc);
305
+ } else {
306
+ gen_helper_sve_sqaddi_b(dptr, nptr, t32, desc);
307
+ }
308
+ tcg_temp_free_i32(t32);
309
+ break;
310
+
311
+ case MO_16:
312
+ t32 = tcg_temp_new_i32();
313
+ tcg_gen_extrl_i64_i32(t32, val);
314
+ if (d) {
315
+ tcg_gen_neg_i32(t32, t32);
316
+ }
317
+ if (u) {
318
+ gen_helper_sve_uqaddi_h(dptr, nptr, t32, desc);
319
+ } else {
320
+ gen_helper_sve_sqaddi_h(dptr, nptr, t32, desc);
321
+ }
322
+ tcg_temp_free_i32(t32);
323
+ break;
324
+
325
+ case MO_32:
326
+ t64 = tcg_temp_new_i64();
327
+ if (d) {
328
+ tcg_gen_neg_i64(t64, val);
329
+ } else {
330
+ tcg_gen_mov_i64(t64, val);
331
+ }
332
+ if (u) {
333
+ gen_helper_sve_uqaddi_s(dptr, nptr, t64, desc);
334
+ } else {
335
+ gen_helper_sve_sqaddi_s(dptr, nptr, t64, desc);
336
+ }
337
+ tcg_temp_free_i64(t64);
338
+ break;
339
+
340
+ case MO_64:
341
+ if (u) {
342
+ if (d) {
343
+ gen_helper_sve_uqsubi_d(dptr, nptr, val, desc);
344
+ } else {
345
+ gen_helper_sve_uqaddi_d(dptr, nptr, val, desc);
346
+ }
347
+ } else if (d) {
348
+ t64 = tcg_temp_new_i64();
349
+ tcg_gen_neg_i64(t64, val);
350
+ gen_helper_sve_sqaddi_d(dptr, nptr, t64, desc);
351
+ tcg_temp_free_i64(t64);
352
+ } else {
353
+ gen_helper_sve_sqaddi_d(dptr, nptr, val, desc);
354
+ }
355
+ break;
356
+
357
+ default:
358
+ g_assert_not_reached();
359
+ }
360
+
361
+ tcg_temp_free_ptr(dptr);
362
+ tcg_temp_free_ptr(nptr);
363
+ tcg_temp_free_i32(desc);
364
+}
365
+
366
+static bool trans_CNT_r(DisasContext *s, arg_CNT_r *a, uint32_t insn)
367
+{
368
+ if (sve_access_check(s)) {
369
+ unsigned fullsz = vec_full_reg_size(s);
370
+ unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz);
371
+ tcg_gen_movi_i64(cpu_reg(s, a->rd), numelem * a->imm);
372
+ }
373
+ return true;
374
+}
375
+
376
+static bool trans_INCDEC_r(DisasContext *s, arg_incdec_cnt *a, uint32_t insn)
377
+{
378
+ if (sve_access_check(s)) {
379
+ unsigned fullsz = vec_full_reg_size(s);
380
+ unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz);
381
+ int inc = numelem * a->imm * (a->d ? -1 : 1);
382
+ TCGv_i64 reg = cpu_reg(s, a->rd);
383
+
384
+ tcg_gen_addi_i64(reg, reg, inc);
385
+ }
386
+ return true;
387
+}
388
+
389
+static bool trans_SINCDEC_r_32(DisasContext *s, arg_incdec_cnt *a,
390
+ uint32_t insn)
391
+{
392
+ if (!sve_access_check(s)) {
393
+ return true;
394
+ }
395
+
396
+ unsigned fullsz = vec_full_reg_size(s);
397
+ unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz);
398
+ int inc = numelem * a->imm;
399
+ TCGv_i64 reg = cpu_reg(s, a->rd);
400
+
401
+ /* Use normal 64-bit arithmetic to detect 32-bit overflow. */
402
+ if (inc == 0) {
403
+ if (a->u) {
404
+ tcg_gen_ext32u_i64(reg, reg);
405
+ } else {
406
+ tcg_gen_ext32s_i64(reg, reg);
407
+ }
408
+ } else {
409
+ TCGv_i64 t = tcg_const_i64(inc);
410
+ do_sat_addsub_32(reg, t, a->u, a->d);
411
+ tcg_temp_free_i64(t);
412
+ }
413
+ return true;
414
+}
415
+
416
+static bool trans_SINCDEC_r_64(DisasContext *s, arg_incdec_cnt *a,
417
+ uint32_t insn)
418
+{
419
+ if (!sve_access_check(s)) {
420
+ return true;
421
+ }
422
+
423
+ unsigned fullsz = vec_full_reg_size(s);
424
+ unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz);
425
+ int inc = numelem * a->imm;
426
+ TCGv_i64 reg = cpu_reg(s, a->rd);
427
+
428
+ if (inc != 0) {
429
+ TCGv_i64 t = tcg_const_i64(inc);
430
+ do_sat_addsub_64(reg, t, a->u, a->d);
431
+ tcg_temp_free_i64(t);
432
+ }
433
+ return true;
434
+}
435
+
436
+static bool trans_INCDEC_v(DisasContext *s, arg_incdec2_cnt *a, uint32_t insn)
437
+{
438
+ if (a->esz == 0) {
439
+ return false;
440
+ }
441
+
442
+ unsigned fullsz = vec_full_reg_size(s);
443
+ unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz);
444
+ int inc = numelem * a->imm;
445
+
446
+ if (inc != 0) {
447
+ if (sve_access_check(s)) {
448
+ TCGv_i64 t = tcg_const_i64(a->d ? -inc : inc);
449
+ tcg_gen_gvec_adds(a->esz, vec_full_reg_offset(s, a->rd),
450
+ vec_full_reg_offset(s, a->rn),
451
+ t, fullsz, fullsz);
452
+ tcg_temp_free_i64(t);
453
+ }
454
+ } else {
455
+ do_mov_z(s, a->rd, a->rn);
456
+ }
457
+ return true;
458
+}
459
+
460
+static bool trans_SINCDEC_v(DisasContext *s, arg_incdec2_cnt *a,
461
+ uint32_t insn)
462
+{
463
+ if (a->esz == 0) {
464
+ return false;
465
+ }
466
+
467
+ unsigned fullsz = vec_full_reg_size(s);
468
+ unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz);
469
+ int inc = numelem * a->imm;
470
+
471
+ if (inc != 0) {
472
+ if (sve_access_check(s)) {
473
+ TCGv_i64 t = tcg_const_i64(inc);
474
+ do_sat_addsub_vec(s, a->esz, a->rd, a->rn, t, a->u, a->d);
475
+ tcg_temp_free_i64(t);
476
+ }
477
+ } else {
478
+ do_mov_z(s, a->rd, a->rn);
479
+ }
480
+ return true;
481
+}
482
+
483
/*
484
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
485
*/
486
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
487
index XXXXXXX..XXXXXXX 100644
488
--- a/target/arm/sve.decode
489
+++ b/target/arm/sve.decode
490
@@ -XXX,XX +XXX,XX @@
491
###########################################################################
492
# Named fields. These are primarily for disjoint fields.
493
494
+%imm4_16_p1 16:4 !function=plus1
495
%imm6_22_5 22:1 5:5
496
%imm9_16_10 16:s6 10:3
497
498
@@ -XXX,XX +XXX,XX @@
499
&rprr_esz rd pg rn rm esz
500
&rprrr_esz rd pg rn rm ra esz
501
&rpri_esz rd pg rn imm esz
502
+&ptrue rd esz pat s
503
+&incdec_cnt rd pat esz imm d u
504
+&incdec2_cnt rd rn pat esz imm d u
505
506
###########################################################################
507
# Named instruction formats. These are generally used to
508
@@ -XXX,XX +XXX,XX @@
509
@rd_rn_i9 ........ ........ ...... rn:5 rd:5 \
510
&rri imm=%imm9_16_10
511
512
+# One register, pattern, and uint4+1.
513
+# User must fill in U and D.
514
+@incdec_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \
515
+ &incdec_cnt imm=%imm4_16_p1
516
+@incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \
517
+ &incdec2_cnt imm=%imm4_16_p1 rn=%reg_movprfx
518
+
519
###########################################################################
520
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
521
522
@@ -XXX,XX +XXX,XX @@ FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn
523
# Note esz != 0
524
FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm
525
526
-### SVE Predicate Logical Operations Group
527
+### SVE Element Count Group
528
+
529
+# SVE element count
530
+CNT_r 00000100 .. 10 .... 1110 0 0 ..... ..... @incdec_cnt d=0 u=1
531
+
532
+# SVE inc/dec register by element count
533
+INCDEC_r 00000100 .. 11 .... 1110 0 d:1 ..... ..... @incdec_cnt u=1
534
+
535
+# SVE saturating inc/dec register by element count
536
+SINCDEC_r_32 00000100 .. 10 .... 1111 d:1 u:1 ..... ..... @incdec_cnt
537
+SINCDEC_r_64 00000100 .. 11 .... 1111 d:1 u:1 ..... ..... @incdec_cnt
538
+
539
+# SVE inc/dec vector by element count
540
+# Note this requires esz != 0.
541
+INCDEC_v 00000100 .. 1 1 .... 1100 0 d:1 ..... ..... @incdec2_cnt u=1
542
+
543
+# SVE saturating inc/dec vector by element count
544
+# Note these require esz != 0.
545
+SINCDEC_v 00000100 .. 1 0 .... 1100 d:1 u:1 ..... ..... @incdec2_cnt
546
547
# SVE predicate logical operations
548
AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s
549
--
550
2.17.0
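
Note on the saturating forms in this patch: trans_SINCDEC_r_32 widens the 32-bit operand and does the arithmetic in 64 bits, so overflow shows up as a value outside the 32-bit range. The following is a minimal host-side sketch of that idea, not the TCG code itself. The name sat_addsub_32 and its clamp-at-the-end structure are assumptions for illustration; val is assumed to be a small non-negative count, as produced by numelem * imm.

```c
#include <stdint.h>

/* Hypothetical model: saturating 32-bit add/sub computed in 64 bits.
 * u selects unsigned saturation, d selects decrement.
 * val is assumed small and non-negative, so the 64-bit sum cannot wrap.
 */
static int64_t sat_addsub_32(int64_t reg, int64_t val, int u, int d)
{
    int64_t lo, hi, r;

    if (u) {
        reg = (uint32_t)reg;            /* zero-extend the low 32 bits */
        lo = 0;
        hi = UINT32_MAX;
    } else {
        reg = (int32_t)(uint32_t)reg;   /* sign-extend the low 32 bits */
        lo = INT32_MIN;
        hi = INT32_MAX;
    }

    r = d ? reg - val : reg + val;

    /* Any result outside [lo, hi] is a 32-bit overflow: clamp it. */
    if (r < lo) {
        r = lo;
    }
    if (r > hi) {
        r = hi;
    }
    return r;
}
```

With inc == 0 the model reduces to a plain zero- or sign-extension, which matches the special case handled separately in trans_SINCDEC_r_32.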
551
552
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-25-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
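
Note on the sve_cpy_m_* and sve_cpy_z_* helpers in this patch: each one processes 64 bits of the vector at a time, expanding the governing predicate bits into a byte mask and then selecting between the replicated immediate and the old contents (merging) or zero (zeroing). The following is a rough sketch of the byte-element case for a single 64-bit chunk. The names expand_pred_bytes, dup_byte, cpy_merge_b and cpy_zero_b are hypothetical, and the loop-based mask expansion is an assumption for illustration; it is not claimed to be how expand_pred_b is implemented.

```c
#include <stdint.h>

/* Hypothetical: turn each of the 8 predicate bits into a 0xff/0x00 byte. */
static uint64_t expand_pred_bytes(uint8_t pg)
{
    uint64_t mask = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if (pg & (1u << i)) {
            mask |= (uint64_t)0xff << (i * 8);
        }
    }
    return mask;
}

/* Hypothetical: replicate one byte across all 8 byte lanes. */
static uint64_t dup_byte(uint8_t v)
{
    return v * 0x0101010101010101ull;
}

/* Merging copy: active lanes take the immediate, inactive keep nn. */
static uint64_t cpy_merge_b(uint64_t nn, uint8_t imm, uint8_t pg)
{
    uint64_t mm = dup_byte(imm);
    uint64_t pp = expand_pred_bytes(pg);

    return (mm & pp) | (nn & ~pp);
}

/* Zeroing copy: active lanes take the immediate, inactive become 0. */
static uint64_t cpy_zero_b(uint8_t imm, uint8_t pg)
{
    return dup_byte(imm) & expand_pred_bytes(pg);
}
```

The 64-bit element helpers need no mask at all, since one predicate bit governs the whole chunk; that is why sve_cpy_m_d and sve_cpy_z_d test only bit 0 of the predicate byte.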
8
target/arm/helper-sve.h | 10 ++++
9
target/arm/sve_helper.c | 108 +++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 88 ++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 19 ++++++-
12
4 files changed, 224 insertions(+), 1 deletion(-)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_uqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32)
19
DEF_HELPER_FLAGS_4(sve_uqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
20
DEF_HELPER_FLAGS_4(sve_uqsubi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
21
22
+DEF_HELPER_FLAGS_5(sve_cpy_m_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32)
23
+DEF_HELPER_FLAGS_5(sve_cpy_m_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32)
24
+DEF_HELPER_FLAGS_5(sve_cpy_m_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32)
25
+DEF_HELPER_FLAGS_5(sve_cpy_m_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32)
26
+
27
+DEF_HELPER_FLAGS_4(sve_cpy_z_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
28
+DEF_HELPER_FLAGS_4(sve_cpy_z_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
29
+DEF_HELPER_FLAGS_4(sve_cpy_z_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
30
+DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
31
+
32
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
36
index XXXXXXX..XXXXXXX 100644
37
--- a/target/arm/sve_helper.c
38
+++ b/target/arm/sve_helper.c
39
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc)
40
*(uint64_t *)(d + i) = (ai < b ? 0 : ai - b);
41
}
42
}
43
+
44
+/* Two operand predicated copy immediate with merge. All valid immediates
45
+ * can fit within 17 signed bits in the simd_data field.
46
+ */
47
+void HELPER(sve_cpy_m_b)(void *vd, void *vn, void *vg,
48
+ uint64_t mm, uint32_t desc)
49
+{
50
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
51
+ uint64_t *d = vd, *n = vn;
52
+ uint8_t *pg = vg;
53
+
54
+ mm = dup_const(MO_8, mm);
55
+ for (i = 0; i < opr_sz; i += 1) {
56
+ uint64_t nn = n[i];
57
+ uint64_t pp = expand_pred_b(pg[H1(i)]);
58
+ d[i] = (mm & pp) | (nn & ~pp);
59
+ }
60
+}
61
+
62
+void HELPER(sve_cpy_m_h)(void *vd, void *vn, void *vg,
63
+ uint64_t mm, uint32_t desc)
64
+{
65
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
66
+ uint64_t *d = vd, *n = vn;
67
+ uint8_t *pg = vg;
68
+
69
+ mm = dup_const(MO_16, mm);
70
+ for (i = 0; i < opr_sz; i += 1) {
71
+ uint64_t nn = n[i];
72
+ uint64_t pp = expand_pred_h(pg[H1(i)]);
73
+ d[i] = (mm & pp) | (nn & ~pp);
74
+ }
75
+}
76
+
77
+void HELPER(sve_cpy_m_s)(void *vd, void *vn, void *vg,
78
+ uint64_t mm, uint32_t desc)
79
+{
80
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
81
+ uint64_t *d = vd, *n = vn;
82
+ uint8_t *pg = vg;
83
+
84
+ mm = dup_const(MO_32, mm);
85
+ for (i = 0; i < opr_sz; i += 1) {
86
+ uint64_t nn = n[i];
87
+ uint64_t pp = expand_pred_s(pg[H1(i)]);
88
+ d[i] = (mm & pp) | (nn & ~pp);
89
+ }
90
+}
91
+
92
+void HELPER(sve_cpy_m_d)(void *vd, void *vn, void *vg,
93
+ uint64_t mm, uint32_t desc)
94
+{
95
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
96
+ uint64_t *d = vd, *n = vn;
97
+ uint8_t *pg = vg;
98
+
99
+ for (i = 0; i < opr_sz; i += 1) {
100
+ uint64_t nn = n[i];
101
+ d[i] = (pg[H1(i)] & 1 ? mm : nn);
102
+ }
103
+}
104
+
105
+void HELPER(sve_cpy_z_b)(void *vd, void *vg, uint64_t val, uint32_t desc)
106
+{
107
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
108
+ uint64_t *d = vd;
109
+ uint8_t *pg = vg;
110
+
111
+ val = dup_const(MO_8, val);
112
+ for (i = 0; i < opr_sz; i += 1) {
113
+ d[i] = val & expand_pred_b(pg[H1(i)]);
114
+ }
115
+}
116
+
117
+void HELPER(sve_cpy_z_h)(void *vd, void *vg, uint64_t val, uint32_t desc)
118
+{
119
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
120
+ uint64_t *d = vd;
121
+ uint8_t *pg = vg;
122
+
123
+ val = dup_const(MO_16, val);
124
+ for (i = 0; i < opr_sz; i += 1) {
125
+ d[i] = val & expand_pred_h(pg[H1(i)]);
126
+ }
127
+}
128
+
129
+void HELPER(sve_cpy_z_s)(void *vd, void *vg, uint64_t val, uint32_t desc)
130
+{
131
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
132
+ uint64_t *d = vd;
133
+ uint8_t *pg = vg;
134
+
135
+ val = dup_const(MO_32, val);
136
+ for (i = 0; i < opr_sz; i += 1) {
137
+ d[i] = val & expand_pred_s(pg[H1(i)]);
138
+ }
139
+}
140
+
141
+void HELPER(sve_cpy_z_d)(void *vd, void *vg, uint64_t val, uint32_t desc)
142
+{
143
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
144
+ uint64_t *d = vd;
145
+ uint8_t *pg = vg;
146
+
147
+ for (i = 0; i < opr_sz; i += 1) {
148
+ d[i] = (pg[H1(i)] & 1 ? val : 0);
149
+ }
150
+}
151
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
152
index XXXXXXX..XXXXXXX 100644
153
--- a/target/arm/translate-sve.c
154
+++ b/target/arm/translate-sve.c
155
@@ -XXX,XX +XXX,XX @@ static inline int plus1(int x)
156
return x + 1;
157
}
158
159
+/* The SH bit is in bit 8. Extract the low 8 and shift. */
160
+static inline int expand_imm_sh8s(int x)
161
+{
162
+ return (int8_t)x << (x & 0x100 ? 8 : 0);
163
+}
164
+
165
/*
166
* Include the generated decoder.
167
*/
168
@@ -XXX,XX +XXX,XX @@ static bool trans_DUPM(DisasContext *s, arg_DUPM *a, uint32_t insn)
169
return true;
170
}
171
172
+/*
173
+ *** SVE Integer Wide Immediate - Predicated Group
174
+ */
175
+
176
+/* Implement all merging copies. This is used for CPY (immediate),
177
+ * FCPY, CPY (scalar), CPY (SIMD&FP scalar).
178
+ */
179
+static void do_cpy_m(DisasContext *s, int esz, int rd, int rn, int pg,
180
+ TCGv_i64 val)
181
+{
182
+ typedef void gen_cpy(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32);
183
+ static gen_cpy * const fns[4] = {
184
+ gen_helper_sve_cpy_m_b, gen_helper_sve_cpy_m_h,
185
+ gen_helper_sve_cpy_m_s, gen_helper_sve_cpy_m_d,
186
+ };
187
+ unsigned vsz = vec_full_reg_size(s);
188
+ TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
189
+ TCGv_ptr t_zd = tcg_temp_new_ptr();
190
+ TCGv_ptr t_zn = tcg_temp_new_ptr();
191
+ TCGv_ptr t_pg = tcg_temp_new_ptr();
192
+
193
+ tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd));
194
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, rn));
195
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
196
+
197
+ fns[esz](t_zd, t_zn, t_pg, val, desc);
198
+
199
+ tcg_temp_free_ptr(t_zd);
200
+ tcg_temp_free_ptr(t_zn);
201
+ tcg_temp_free_ptr(t_pg);
202
+ tcg_temp_free_i32(desc);
203
+}
204
+
205
+static bool trans_FCPY(DisasContext *s, arg_FCPY *a, uint32_t insn)
206
+{
207
+ if (a->esz == 0) {
208
+ return false;
209
+ }
210
+ if (sve_access_check(s)) {
211
+ /* Decode the VFP immediate. */
212
+ uint64_t imm = vfp_expand_imm(a->esz, a->imm);
213
+ TCGv_i64 t_imm = tcg_const_i64(imm);
214
+ do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm);
215
+ tcg_temp_free_i64(t_imm);
216
+ }
217
+ return true;
218
+}
219
+
220
+static bool trans_CPY_m_i(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
221
+{
222
+ if (a->esz == 0 && extract32(insn, 13, 1)) {
223
+ return false;
224
+ }
225
+ if (sve_access_check(s)) {
226
+ TCGv_i64 t_imm = tcg_const_i64(a->imm);
227
+ do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm);
228
+ tcg_temp_free_i64(t_imm);
229
+ }
230
+ return true;
231
+}
232
+
233
+static bool trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a, uint32_t insn)
234
+{
235
+ static gen_helper_gvec_2i * const fns[4] = {
236
+ gen_helper_sve_cpy_z_b, gen_helper_sve_cpy_z_h,
237
+ gen_helper_sve_cpy_z_s, gen_helper_sve_cpy_z_d,
238
+ };
239
+
240
+ if (a->esz == 0 && extract32(insn, 13, 1)) {
241
+ return false;
242
+ }
243
+ if (sve_access_check(s)) {
244
+ unsigned vsz = vec_full_reg_size(s);
245
+ TCGv_i64 t_imm = tcg_const_i64(a->imm);
246
+ tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd),
247
+ pred_full_reg_offset(s, a->pg),
248
+ t_imm, vsz, vsz, 0, fns[a->esz]);
249
+ tcg_temp_free_i64(t_imm);
250
+ }
251
+ return true;
252
+}
253
+
254
/*
255
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
256
*/
257
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
258
index XXXXXXX..XXXXXXX 100644
259
--- a/target/arm/sve.decode
260
+++ b/target/arm/sve.decode
261
@@ -XXX,XX +XXX,XX @@
262
###########################################################################
263
# Named fields. These are primarily for disjoint fields.
264
265
-%imm4_16_p1 16:4 !function=plus1
266
+%imm4_16_p1 16:4 !function=plus1
267
%imm6_22_5 22:1 5:5
268
%imm9_16_10 16:s6 10:3
269
270
@@ -XXX,XX +XXX,XX @@
271
%tszimm16_shr 22:2 16:5 !function=tszimm_shr
272
%tszimm16_shl 22:2 16:5 !function=tszimm_shl
273
274
+# Signed 8-bit immediate, optionally shifted left by 8.
275
+%sh8_i8s 5:9 !function=expand_imm_sh8s
276
+
277
# Either a copy of rd (at bit 0), or a different source
278
# as propagated via the MOVPRFX instruction.
279
%reg_movprfx 0:5
280
@@ -XXX,XX +XXX,XX @@
281
@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \
282
&rri_esz esz=%tszimm16_esz
283
284
+# Two register operand, one immediate operand, with 4-bit predicate.
285
+# User must fill in imm.
286
+@rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \
287
+ &rpri_esz rn=%reg_movprfx
288
+
289
# Two register operand, one encoded bitmask.
290
@rdn_dbm ........ .. .... dbm:13 rd:5 \
291
&rr_dbm rn=%reg_movprfx
292
@@ -XXX,XX +XXX,XX @@ AND_zzi 00000101 10 0000 ............. ..... @rdn_dbm
293
# SVE broadcast bitmask immediate
294
DUPM 00000101 11 0000 dbm:13 rd:5
295
296
+### SVE Integer Wide Immediate - Predicated Group
297
+
298
+# SVE copy floating-point immediate (predicated)
299
+FCPY 00000101 .. 01 .... 110 imm:8 ..... @rdn_pg4
300
+
301
+# SVE copy integer immediate (predicated)
302
+CPY_m_i 00000101 .. 01 .... 01 . ........ ..... @rdn_pg4 imm=%sh8_i8s
303
+CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s
304
+
305
### SVE Predicate Logical Operations Group
306
307
# SVE predicate logical operations
308
--
309
2.17.0
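
Note on the %sh8_i8s field used by CPY_m_i and CPY_z_i above: the 9-bit field holds a signed 8-bit immediate in its low bits and the SH flag in bit 8, which selects an optional left shift by 8. The following is a small standalone sketch of that decoding. The name expand_sh8_imm is hypothetical, and the sketch multiplies by 256 instead of shifting so that negative immediates stay well-defined in strict ISO C; that choice is an assumption of this example, not a statement about the patch.

```c
#include <stdint.h>

/* Hypothetical model of the sh8_i8s decode:
 * bits [7:0] are a signed 8-bit immediate, bit 8 is the SH flag.
 */
static int expand_sh8_imm(int x)
{
    int imm8 = (int8_t)(x & 0xff);      /* sign-extend the low 8 bits */

    return (x & 0x100) ? imm8 * 256 : imm8;
}
```

The resulting range is -128..127 without SH and -32768..32512 in steps of 256 with SH. A shifted immediate cannot fit in a byte element, which is consistent with the translators rejecting esz == 0 when insn bit 13 is set.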
310
311
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180516223007.10256-26-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
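
Note on the EXT helper in this patch: the result is the top (vsz - n_ofs) bytes of Zn followed by the low n_ofs bytes of Zm, i.e. a byte-granular extract from the concatenation of the two vectors. The following is a reference model of that data movement only. The name ext_ref is hypothetical, and the sketch assumes a little-endian byte layout, a destination that overlaps neither source, and n_ofs <= vsz; the aliasing cases and big-endian hosts are what sve_ext and swap_memmove handle in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reference model of EXT for non-overlapping buffers.
 * dst receives n[n_ofs .. vsz-1] followed by m[0 .. n_ofs-1].
 */
static void ext_ref(unsigned char *dst, const unsigned char *n,
                    const unsigned char *m, size_t vsz, size_t n_ofs)
{
    size_t n_siz = vsz - n_ofs;     /* bytes taken from the first source */

    memcpy(dst, n + n_ofs, n_siz);
    memcpy(dst + n_siz, m, n_ofs);
}
```

When the destination aliases one of the sources, the order of the two copies matters, and when it aliases both a temporary is needed; the three branches of sve_ext correspond to those cases.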
8
target/arm/helper-sve.h | 2 +
9
target/arm/sve_helper.c | 81 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 34 ++++++++++++++++
11
target/arm/sve.decode | 7 ++++
12
4 files changed, 124 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_cpy_z_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
19
DEF_HELPER_FLAGS_4(sve_cpy_z_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
20
DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+
24
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/sve_helper.c
30
+++ b/target/arm/sve_helper.c
31
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_cpy_z_d)(void *vd, void *vg, uint64_t val, uint32_t desc)
32
d[i] = (pg[H1(i)] & 1 ? val : 0);
33
}
34
}
35
+
36
+/* Big-endian hosts need to frob the byte indices. If the copy
37
+ * happens to be 8-byte aligned, then no frobbing necessary.
38
+ */
39
+static void swap_memmove(void *vd, void *vs, size_t n)
40
+{
41
+ uintptr_t d = (uintptr_t)vd;
42
+ uintptr_t s = (uintptr_t)vs;
43
+ uintptr_t o = (d | s | n) & 7;
44
+ size_t i;
45
+
46
+#ifndef HOST_WORDS_BIGENDIAN
47
+ o = 0;
48
+#endif
49
+ switch (o) {
50
+ case 0:
51
+ memmove(vd, vs, n);
52
+ break;
53
+
54
+ case 4:
55
+ if (d < s || d >= s + n) {
56
+ for (i = 0; i < n; i += 4) {
57
+ *(uint32_t *)H1_4(d + i) = *(uint32_t *)H1_4(s + i);
58
+ }
59
+ } else {
60
+ for (i = n; i > 0; ) {
61
+ i -= 4;
62
+ *(uint32_t *)H1_4(d + i) = *(uint32_t *)H1_4(s + i);
63
+ }
64
+ }
65
+ break;
66
+
67
+ case 2:
68
+ case 6:
69
+ if (d < s || d >= s + n) {
70
+ for (i = 0; i < n; i += 2) {
71
+ *(uint16_t *)H1_2(d + i) = *(uint16_t *)H1_2(s + i);
72
+ }
73
+ } else {
74
+ for (i = n; i > 0; ) {
75
+ i -= 2;
76
+ *(uint16_t *)H1_2(d + i) = *(uint16_t *)H1_2(s + i);
77
+ }
78
+ }
79
+ break;
80
+
81
+ default:
82
+ if (d < s || d >= s + n) {
83
+ for (i = 0; i < n; i++) {
84
+ *(uint8_t *)H1(d + i) = *(uint8_t *)H1(s + i);
85
+ }
86
+ } else {
87
+ for (i = n; i > 0; ) {
88
+ i -= 1;
89
+ *(uint8_t *)H1(d + i) = *(uint8_t *)H1(s + i);
90
+ }
91
+ }
92
+ break;
93
+ }
94
+}
95
+
96
+void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc)
97
+{
98
+ intptr_t opr_sz = simd_oprsz(desc);
99
+ size_t n_ofs = simd_data(desc);
100
+ size_t n_siz = opr_sz - n_ofs;
101
+
102
+ if (vd != vm) {
103
+ swap_memmove(vd, vn + n_ofs, n_siz);
104
+ swap_memmove(vd + n_siz, vm, n_ofs);
105
+ } else if (vd != vn) {
106
+ swap_memmove(vd + n_siz, vd, n_ofs);
107
+ swap_memmove(vd, vn + n_ofs, n_siz);
108
+ } else {
109
+ /* vd == vn == vm. Need temp space. */
110
+ ARMVectorReg tmp;
111
+ swap_memmove(&tmp, vm, n_ofs);
112
+ swap_memmove(vd, vd + n_ofs, n_siz);
113
+ memcpy(vd + n_siz, &tmp, n_ofs);
114
+ }
115
+}
116
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
117
index XXXXXXX..XXXXXXX 100644
118
--- a/target/arm/translate-sve.c
119
+++ b/target/arm/translate-sve.c
120
@@ -XXX,XX +XXX,XX @@ static bool trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a, uint32_t insn)
121
return true;
122
}
123
124
+/*
125
+ *** SVE Permute Extract Group
126
+ */
127
+
128
+static bool trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn)
129
+{
130
+ if (!sve_access_check(s)) {
131
+ return true;
132
+ }
133
+
134
+ unsigned vsz = vec_full_reg_size(s);
135
+ unsigned n_ofs = a->imm >= vsz ? 0 : a->imm;
136
+ unsigned n_siz = vsz - n_ofs;
137
+ unsigned d = vec_full_reg_offset(s, a->rd);
138
+ unsigned n = vec_full_reg_offset(s, a->rn);
139
+ unsigned m = vec_full_reg_offset(s, a->rm);
140
+
141
+ /* Use host vector move insns if we have appropriate sizes
142
+ * and no unfortunate overlap.
143
+ */
144
+ if (m != d
145
+ && n_ofs == size_for_gvec(n_ofs)
146
+ && n_siz == size_for_gvec(n_siz)
147
+ && (d != n || n_siz <= n_ofs)) {
148
+ tcg_gen_gvec_mov(0, d, n + n_ofs, n_siz, n_siz);
149
+ if (n_ofs != 0) {
150
+ tcg_gen_gvec_mov(0, d + n_siz, m, n_ofs, n_ofs);
151
+ }
152
+ } else {
153
+ tcg_gen_gvec_3_ool(d, n, m, vsz, vsz, n_ofs, gen_helper_sve_ext);
154
+ }
155
+ return true;
156
+}
157
+
158
/*
159
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
160
*/
161
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
162
index XXXXXXX..XXXXXXX 100644
163
--- a/target/arm/sve.decode
164
+++ b/target/arm/sve.decode
165
@@ -XXX,XX +XXX,XX @@
166
167
%imm4_16_p1 16:4 !function=plus1
168
%imm6_22_5 22:1 5:5
169
+%imm8_16_10 16:5 10:3
170
%imm9_16_10 16:s6 10:3
171
172
# A combination of tsz:imm3 -- extract esize.
173
@@ -XXX,XX +XXX,XX @@ FCPY 00000101 .. 01 .... 110 imm:8 ..... @rdn_pg4
174
CPY_m_i 00000101 .. 01 .... 01 . ........ ..... @rdn_pg4 imm=%sh8_i8s
175
CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s
176
177
+### SVE Permute - Extract Group
178
+
179
+# SVE extract vector (immediate offset)
180
+EXT 00000101 001 ..... 000 ... rm:5 rd:5 \
181
+ &rrri rn=%reg_movprfx imm=%imm8_16_10
182
+
183
### SVE Predicate Logical Operations Group
184
185
# SVE predicate logical operations
186
--
187
2.17.0
188
189
diff view generated by jsdifflib