Hopefully last target-arm queue before softfreeze;
this one's largest part is the remainder of the SVE patches,
but there are a selection of other minor things too.

thanks
-- PMM

The following changes since commit 109b25045b3651f9c5d02c3766c0b3ff63e6d193:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2018-06-29 12:30:29 +0100)

are available in the Git repository at:

  git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180629

for you to fetch changes up to 802abf4024d23e48d45373ac3f2b580124b54b47:

  target/arm: Add ID_ISAR6 (2018-06-29 15:30:54 +0100)

----------------------------------------------------------------
target-arm queue:
 * last of the SVE patches; SVE is now enabled for aarch64 linux-user
 * sd: Don't trace SDRequest crc field (coverity bugfix)
 * target/arm: Mark PMINTENSET accesses as possibly doing IO
 * clean up v7VE feature bit handling
 * i.mx7d: minor cleanups
 * target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space
 * target/arm: Implement ARMv8.2-DotProd
 * virt: add addresses to dt node names (which stops dtc from
   complaining that they're not correctly named)
 * cleanups: replace error_setg(&error_fatal) by error_report() + exit()

----------------------------------------------------------------
Aaron Lindsay (3):
      target/arm: Add ARM_FEATURE_V7VE for v7 Virtualization Extensions
      target/arm: Remove redundant DIV detection for KVM
      target/arm: Mark PMINTENSET accesses as possibly doing IO

Alex Bennée (1):
      target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space

Eric Auger (3):
      device_tree: Add qemu_fdt_node_unit_path
      hw/arm/virt: Silence dtc /intc warnings
      hw/arm/virt: Silence dtc /memory warning

Jean-Christophe Dubois (3):
      i.mx7d: Remove unused header files
      i.mx7d: Change SRC unimplemented device name from sdma to src
      i.mx7d: Change IRQ number type from hwaddr to int

Peter Maydell (1):
      sd: Don't trace SDRequest crc field

Philippe Mathieu-Daudé (4):
      hw/block/fdc: Replace error_setg(&error_abort) by assert()
      hw/arm/sysbus-fdt: Replace error_setg(&error_fatal) by error_report() + exit()
      device_tree: Replace error_setg(&error_fatal) by error_report() + exit()
      sdcard: Use the ldst API

Richard Henderson (40):
      target/arm: Implement SVE Memory Contiguous Load Group
      target/arm: Implement SVE Contiguous Load, first-fault and no-fault
      target/arm: Implement SVE Memory Contiguous Store Group
      target/arm: Implement SVE load and broadcast quadword
      target/arm: Implement SVE integer convert to floating-point
      target/arm: Implement SVE floating-point arithmetic (predicated)
      target/arm: Implement SVE FP Multiply-Add Group
      target/arm: Implement SVE Floating Point Accumulating Reduction Group
      target/arm: Implement SVE load and broadcast element
      target/arm: Implement SVE store vector/predicate register
      target/arm: Implement SVE scatter stores
      target/arm: Implement SVE prefetches
      target/arm: Implement SVE gather loads
      target/arm: Implement SVE first-fault gather loads
      target/arm: Implement SVE scatter store vector immediate
      target/arm: Implement SVE floating-point compare vectors
      target/arm: Implement SVE floating-point arithmetic with immediate
      target/arm: Implement SVE Floating Point Multiply Indexed Group
      target/arm: Implement SVE FP Fast Reduction Group
      target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group
      target/arm: Implement SVE FP Compare with Zero Group
      target/arm: Implement SVE floating-point trig multiply-add coefficient
      target/arm: Implement SVE floating-point convert precision
      target/arm: Implement SVE floating-point convert to integer
      target/arm: Implement SVE floating-point round to integral value
      target/arm: Implement SVE floating-point unary operations
      target/arm: Implement SVE MOVPRFX
      target/arm: Implement SVE floating-point complex add
      target/arm: Implement SVE fp complex multiply add
      target/arm: Pass index to AdvSIMD FCMLA (indexed)
      target/arm: Implement SVE fp complex multiply add (indexed)
      target/arm: Implement SVE dot product (vectors)
      target/arm: Implement SVE dot product (indexed)
      target/arm: Enable SVE for aarch64-linux-user
      target/arm: Implement ARMv8.2-DotProd
      target/arm: Fix SVE signed division vs x86 overflow exception
      target/arm: Fix SVE system register access checks
      target/arm: Prune a57 features from max
      target/arm: Prune a15 features from max
      target/arm: Add ID_ISAR6

 include/sysemu/device_tree.h | 16 +
 target/arm/cpu.h | 3 +
 target/arm/helper-sve.h | 682 +++++++++++++++
 target/arm/helper.h | 44 +-
 device_tree.c | 78 +-
 hw/arm/boot.c | 41 +-
 hw/arm/fsl-imx7.c | 8 +-
 hw/arm/mcimx7d-sabre.c | 2 -
 hw/arm/sysbus-fdt.c | 53 +-
 hw/arm/virt.c | 70 +-
 hw/block/fdc.c | 9 +-
 hw/sd/bcm2835_sdhost.c | 13 +-
 hw/sd/core.c | 2 +-
 hw/sd/milkymist-memcard.c | 3 +-
 hw/sd/omap_mmc.c | 6 +-
 hw/sd/pl181.c | 11 +-
 hw/sd/sdhci.c | 15 +-
 hw/sd/ssi-sd.c | 6 +-
 linux-user/elfload.c | 2 +
 target/arm/cpu.c | 36 +-
 target/arm/cpu64.c | 13 +-
 target/arm/helper.c | 44 +-
 target/arm/kvm32.c | 27 +-
 target/arm/sve_helper.c | 1875 +++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-a64.c | 62 +-
 target/arm/translate-sve.c | 1688 ++++++++++++++++++++++++++++++++++++-
 target/arm/translate.c | 102 ++-
 target/arm/vec_helper.c | 311 ++++++-
 hw/sd/trace-events | 2 +-
 target/arm/sve.decode | 427 ++++++++++
 30 files changed, 5394 insertions(+), 257 deletions(-)


Hi; here's the latest arm pullreq...

thanks
-- PMM

The following changes since commit 03a3a62fbd0aa5227e978eef3c67d3978aec9e5f:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2023-09-07 10:29:06 -0400)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230908

for you to fetch changes up to c8f2eb5d414b788420b938f2ffdde891aa6c3ae8:

  arm/kvm: Enable support for KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE (2023-09-08 16:41:36 +0100)

----------------------------------------------------------------
target-arm queue:
 * New CPU type: cortex-a710
 * Implement new architectural features:
    - FEAT_PACQARMA3
    - FEAT_EPAC
    - FEAT_Pauth2
    - FEAT_FPAC
    - FEAT_FPACCOMBINE
    - FEAT_TIDCP1
 * Xilinx Versal: Model the CFU/CFI
 * Implement RMR_ELx registers
 * Implement handling of HCR_EL2.TIDCP trap bit
 * arm/kvm: Enable support for KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
 * hw/intc/arm_gicv3_its: Avoid maybe-uninitialized error in get_vte()
 * target/arm: Do not use gen_mte_checkN in trans_STGP
 * arm64: Restore trapless ptimer access

----------------------------------------------------------------
Aaron Lindsay (6):
      target/arm: Add ID_AA64ISAR2_EL1
      target/arm: Add feature detection for FEAT_Pauth2 and extensions
      target/arm: Implement FEAT_EPAC
      target/arm: Implement FEAT_Pauth2
      target/arm: Inform helpers whether a PAC instruction is 'combined'
      target/arm: Implement FEAT_FPAC and FEAT_FPACCOMBINE

Colton Lewis (1):
      arm64: Restore trapless ptimer access

Francisco Iglesias (8):
      hw/misc: Introduce the Xilinx CFI interface
      hw/misc: Introduce a model of Xilinx Versal's CFU_APB
      hw/misc/xlnx-versal-cfu: Introduce a model of Xilinx Versal CFU_FDRO
      hw/misc/xlnx-versal-cfu: Introduce a model of Xilinx Versal's CFU_SFR
      hw/misc: Introduce a model of Xilinx Versal's CFRAME_REG
      hw/misc: Introduce a model of Xilinx Versal's CFRAME_BCAST_REG
      hw/arm/xlnx-versal: Connect the CFU_APB, CFU_FDRO and CFU_SFR
      hw/arm/versal: Connect the CFRAME_REG and CFRAME_BCAST_REG

Philippe Mathieu-Daudé (1):
      hw/intc/arm_gicv3_its: Avoid maybe-uninitialized error in get_vte()

Richard Henderson (9):
      tests/tcg/aarch64: Adjust pauth tests for FEAT_FPAC
      target/arm: Don't change pauth features when changing algorithm
      target/arm: Implement FEAT_PACQARMA3
      target/arm: Do not use gen_mte_checkN in trans_STGP
      target/arm: Implement RMR_ELx
      target/arm: Implement cortex-a710
      target/arm: Implement HCR_EL2.TIDCP
      target/arm: Implement FEAT_TIDCP1
      target/arm: Enable SCTLR_EL1.TIDCP for user-only

Shameer Kolothum (1):
      arm/kvm: Enable support for KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE

 MAINTAINERS | 10 +
 docs/system/arm/cpu-features.rst | 21 +-
 docs/system/arm/emulation.rst | 8 +
 docs/system/arm/virt.rst | 1 +
 include/hw/arm/xlnx-versal.h | 85 +++
 include/hw/misc/xlnx-cfi-if.h | 59 +++
 include/hw/misc/xlnx-versal-cframe-reg.h | 303 +++++++++++
 include/hw/misc/xlnx-versal-cfu.h | 258 ++++++++++
 include/sysemu/kvm_int.h | 1 +
 target/arm/cpu.h | 54 +-
 target/arm/helper.h | 2 +
 target/arm/syndrome.h | 7 +
 target/arm/tcg/helper-a64.h | 4 +
 tests/tcg/aarch64/pauth.h | 23 +
 accel/kvm/kvm-all.c | 1 +
 hw/arm/virt.c | 1 +
 hw/arm/xlnx-versal.c | 155 +++++-
 hw/intc/arm_gicv3_its.c | 15 +-
 hw/misc/xlnx-cfi-if.c | 34 ++
 hw/misc/xlnx-versal-cframe-reg.c | 858 +++++++++++++++++++++++++++++++
 hw/misc/xlnx-versal-cfu.c | 563 ++++++++++++++++++++
 target/arm/arm-qmp-cmds.c | 2 +-
 target/arm/cpu.c | 4 +
 target/arm/cpu64.c | 86 +++-
 target/arm/helper.c | 68 ++-
 target/arm/hvf/hvf.c | 1 +
 target/arm/kvm.c | 61 +++
 target/arm/kvm64.c | 3 +
 target/arm/tcg/cpu64.c | 215 ++++++++
 target/arm/tcg/op_helper.c | 33 ++
 target/arm/tcg/pauth_helper.c | 180 +++++--
 target/arm/tcg/translate-a64.c | 74 +--
 target/arm/tcg/translate.c | 33 ++
 tests/qtest/arm-cpu-features.c | 12 +-
 tests/tcg/aarch64/pauth-2.c | 54 +-
 tests/tcg/aarch64/pauth-4.c | 18 +-
 tests/tcg/aarch64/pauth-5.c | 10 +
 hw/misc/meson.build | 3 +
 qemu-options.hx | 15 +
 tests/tcg/aarch64/Makefile.target | 6 +-
 40 files changed, 3184 insertions(+), 157 deletions(-)
 create mode 100644 include/hw/misc/xlnx-cfi-if.h
 create mode 100644 include/hw/misc/xlnx-versal-cframe-reg.h
 create mode 100644 include/hw/misc/xlnx-versal-cfu.h
 create mode 100644 tests/tcg/aarch64/pauth.h
 create mode 100644 hw/misc/xlnx-cfi-if.c
 create mode 100644 hw/misc/xlnx-versal-cframe-reg.c
 create mode 100644 hw/misc/xlnx-versal-cfu.c
diff view generated by jsdifflib
Deleted patch
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Use assert() instead of error_setg(&error_abort),
as suggested by the "qapi/error.h" documentation:

  Please don't error_setg(&error_fatal, ...), use error_report() and
  exit(), because that's more obvious.
  Likewise, don't error_setg(&error_abort, ...), use assert().

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 20180625165749.3910-2-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/block/fdc.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -XXX,XX +XXX,XX @@ static int pick_geometry(FDrive *drv)
                       nb_sectors,
                       FloppyDriveType_str(parse->drive));
         }
+        assert(type_match != -1 && "misconfigured fd_format");
         match = type_match;
     }
-
-    /* No match of any kind found -- fd_format is misconfigured, abort. */
-    if (match == -1) {
-        error_setg(&error_abort, "No candidate geometries present in table "
-                   " for floppy drive type '%s'",
-                   FloppyDriveType_str(drv->drive));
-    }
-
     parse = &(fd_formats[match]);

 out:
--
2.17.1
diff view generated by jsdifflib
Deleted patch
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Use error_report() + exit() instead of error_setg(&error_fatal),
as suggested by the "qapi/error.h" documentation:

  Please don't error_setg(&error_fatal, ...), use error_report() and
  exit(), because that's more obvious.

This fixes CID 1352173:
"Passing null pointer dt_name to qemu_fdt_node_path, which dereferences it."

And this also fixes:

  hw/arm/sysbus-fdt.c:322:9: warning: Array access (from variable 'node_path') results in a null pointer dereference
      if (node_path[1]) {
          ^~~~~~~~~~~~

Fixes: Coverity CID 1352173 (Dereference after null check)
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180625165749.3910-3-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/sysbus-fdt.c | 53 +++++++++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 23 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -XXX,XX +XXX,XX @@ static void copy_properties_from_host(HostProperty *props, int nb_props,
         r = qemu_fdt_getprop(host_fdt, node_path,
                              props[i].name,
                              &prop_len,
-                             props[i].optional ? &err : &error_fatal);
+                             &err);
         if (r) {
             qemu_fdt_setprop(guest_fdt, nodename,
                              props[i].name, r, prop_len);
         } else {
-            if (prop_len != -FDT_ERR_NOTFOUND) {
-                /* optional property not returned although property exists */
-                error_report_err(err);
-            } else {
+            if (props[i].optional && prop_len == -FDT_ERR_NOTFOUND) {
+                /* optional property does not exist */
                 error_free(err);
+            } else {
+                error_report_err(err);
+            }
+            if (!props[i].optional) {
+                /* mandatory property not found: bail out */
+                exit(1);
             }
         }
     }
@@ -XXX,XX +XXX,XX @@ static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,

     node_offset = fdt_node_offset_by_phandle(host_fdt, host_phandle);
     if (node_offset <= 0) {
-        error_setg(&error_fatal,
-                   "not able to locate clock handle %d in host device tree",
-                   host_phandle);
+        error_report("not able to locate clock handle %d in host device tree",
+                     host_phandle);
+        exit(1);
     }
     node_path = g_malloc(path_len);
     while ((ret = fdt_get_path(host_fdt, node_offset, node_path, path_len))
@@ -XXX,XX +XXX,XX @@ static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
         node_path = g_realloc(node_path, path_len);
     }
     if (ret < 0) {
-        error_setg(&error_fatal,
-                   "not able to retrieve node path for clock handle %d",
-                   host_phandle);
+        error_report("not able to retrieve node path for clock handle %d",
+                     host_phandle);
+        exit(1);
     }

     r = qemu_fdt_getprop(host_fdt, node_path, "compatible", &prop_len,
                          &error_fatal);
     if (strcmp(r, "fixed-clock")) {
-        error_setg(&error_fatal,
-                   "clock handle %d is not a fixed clock", host_phandle);
+        error_report("clock handle %d is not a fixed clock", host_phandle);
+        exit(1);
     }

     nodename = strrchr(node_path, '/');
@@ -XXX,XX +XXX,XX @@ static int add_amd_xgbe_fdt_node(SysBusDevice *sbdev, void *opaque)

     dt_name = sysfs_to_dt_name(vbasedev->name);
     if (!dt_name) {
-        error_setg(&error_fatal, "%s incorrect sysfs device name %s",
-                   __func__, vbasedev->name);
+        error_report("%s incorrect sysfs device name %s",
+                     __func__, vbasedev->name);
+        exit(1);
     }
     node_path = qemu_fdt_node_path(host_fdt, dt_name, vdev->compat,
                                    &error_fatal);
     if (!node_path || !node_path[0]) {
-        error_setg(&error_fatal, "%s unable to retrieve node path for %s/%s",
-                   __func__, dt_name, vdev->compat);
+        error_report("%s unable to retrieve node path for %s/%s",
+                     __func__, dt_name, vdev->compat);
+        exit(1);
     }

     if (node_path[1]) {
-        error_setg(&error_fatal, "%s more than one node matching %s/%s!",
-                   __func__, dt_name, vdev->compat);
+        error_report("%s more than one node matching %s/%s!",
+                     __func__, dt_name, vdev->compat);
+        exit(1);
     }

     g_free(dt_name);

     if (vbasedev->num_regions != 5) {
-        error_setg(&error_fatal, "%s Does the host dt node combine XGBE/PHY?",
-                   __func__);
+        error_report("%s Does the host dt node combine XGBE/PHY?", __func__);
+        exit(1);
     }

     /* generate nodes for DMA_CLK and PTP_CLK */
     r = qemu_fdt_getprop(host_fdt, node_path[0], "clocks",
                          &prop_len, &error_fatal);
     if (prop_len != 8) {
-        error_setg(&error_fatal, "%s clocks property should contain 2 handles",
-                   __func__);
+        error_report("%s clocks property should contain 2 handles", __func__);
+        exit(1);
     }
     host_clock_phandles = (uint32_t *)r;
     guest_clock_phandles[0] = qemu_fdt_alloc_phandle(guest_fdt);
--
2.17.1
diff view generated by jsdifflib
Deleted patch
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Use error_report() + exit() instead of error_setg(&error_fatal),
as suggested by the "qapi/error.h" documentation:

  Please don't error_setg(&error_fatal, ...), use error_report() and
  exit(), because that's more obvious.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 20180625165749.3910-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 device_tree.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index XXXXXXX..XXXXXXX 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -XXX,XX +XXX,XX @@ static void read_fstree(void *fdt, const char *dirname)
     const char *parent_node;

     if (strstr(dirname, root_dir) != dirname) {
-        error_setg(&error_fatal, "%s: %s must be searched within %s",
-                   __func__, dirname, root_dir);
+        error_report("%s: %s must be searched within %s",
+                     __func__, dirname, root_dir);
+        exit(1);
     }
     parent_node = &dirname[strlen(SYSFS_DT_BASEDIR)];

     d = opendir(dirname);
     if (!d) {
-        error_setg(&error_fatal, "%s cannot open %s", __func__, dirname);
-        return;
+        error_report("%s cannot open %s", __func__, dirname);
+        exit(1);
     }

     while ((de = readdir(d)) != NULL) {
@@ -XXX,XX +XXX,XX @@ static void read_fstree(void *fdt, const char *dirname)
         tmpnam = g_strdup_printf("%s/%s", dirname, de->d_name);

         if (lstat(tmpnam, &st) < 0) {
-            error_setg(&error_fatal, "%s cannot lstat %s", __func__, tmpnam);
+            error_report("%s cannot lstat %s", __func__, tmpnam);
+            exit(1);
         }

         if (S_ISREG(st.st_mode)) {
@@ -XXX,XX +XXX,XX @@ static void read_fstree(void *fdt, const char *dirname)
             gsize len;

             if (!g_file_get_contents(tmpnam, &val, &len, NULL)) {
-                error_setg(&error_fatal, "%s not able to extract info from %s",
-                           __func__, tmpnam);
+                error_report("%s not able to extract info from %s",
+                             __func__, tmpnam);
+                exit(1);
             }

             if (strlen(parent_node) > 0) {
@@ -XXX,XX +XXX,XX @@ void *load_device_tree_from_sysfs(void)
     host_fdt = create_device_tree(&host_fdt_size);
     read_fstree(host_fdt, SYSFS_DT_BASEDIR);
     if (fdt_check_header(host_fdt)) {
-        error_setg(&error_fatal,
-                   "%s host device tree extracted into memory is invalid",
-                   __func__);
+        error_report("%s host device tree extracted into memory is invalid",
+                     __func__);
+        exit(1);
     }
     return host_fdt;
 }
--
2.17.1
diff view generated by jsdifflib
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h | 7 +++
 target/arm/sve_helper.c | 100 +++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 24 +++++++++
 target/arm/sve.decode | 4 ++
 4 files changed, 135 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
     }
 }

+/*
+ * FP Complex Add
+ */
+
+void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
+                         void *vs, uint32_t desc)
+{
+    intptr_t j, i = simd_oprsz(desc);
+    uint64_t *g = vg;
+    float16 neg_imag = float16_set_sign(0, simd_data(desc));
+    float16 neg_real = float16_chs(neg_imag);
+
+    do {
+        uint64_t pg = g[(i - 1) >> 6];
+        do {
+            float16 e0, e1, e2, e3;
+
+            /* I holds the real index; J holds the imag index.  */
+            j = i - sizeof(float16);
+            i -= 2 * sizeof(float16);
+
+            e0 = *(float16 *)(vn + H1_2(i));
+            e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real;
+            e2 = *(float16 *)(vn + H1_2(j));
+            e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag;
+
+            if (likely((pg >> (i & 63)) & 1)) {
+                *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs);
+            }
+            if (likely((pg >> (j & 63)) & 1)) {
+                *(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs);
+            }
+        } while (i & 63);
+    } while (i != 0);
+}
+
+void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
+                         void *vs, uint32_t desc)
+{
+    intptr_t j, i = simd_oprsz(desc);
+    uint64_t *g = vg;
+    float32 neg_imag = float32_set_sign(0, simd_data(desc));
+    float32 neg_real = float32_chs(neg_imag);
+
+    do {
+        uint64_t pg = g[(i - 1) >> 6];
+        do {
+            float32 e0, e1, e2, e3;
+
+            /* I holds the real index; J holds the imag index.  */
+            j = i - sizeof(float32);
+            i -= 2 * sizeof(float32);
+
+            e0 = *(float32 *)(vn + H1_2(i));
+            e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real;
+            e2 = *(float32 *)(vn + H1_2(j));
+            e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag;
+
+            if (likely((pg >> (i & 63)) & 1)) {
+                *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs);
+            }
+            if (likely((pg >> (j & 63)) & 1)) {
+                *(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs);
+            }
+        } while (i & 63);
+    } while (i != 0);
+}
+
+void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
+                         void *vs, uint32_t desc)
+{
+    intptr_t j, i = simd_oprsz(desc);
+    uint64_t *g = vg;
+    float64 neg_imag = float64_set_sign(0, simd_data(desc));
+    float64 neg_real = float64_chs(neg_imag);
+
+    do {
+        uint64_t pg = g[(i - 1) >> 6];
+        do {
+            float64 e0, e1, e2, e3;
+
+            /* I holds the real index; J holds the imag index.  */
+            j = i - sizeof(float64);
+            i -= 2 * sizeof(float64);
+
+            e0 = *(float64 *)(vn + H1_2(i));
+            e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real;
+            e2 = *(float64 *)(vn + H1_2(j));
+            e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag;
+
+            if (likely((pg >> (i & 63)) & 1)) {
+                *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs);
+            }
+            if (likely((pg >> (j & 63)) & 1)) {
+                *(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs);
+            }
+        } while (i & 63);
+    } while (i != 0);
+}
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_FPCMP(FACGT, facgt)

 #undef DO_FPCMP

+static bool trans_FCADD(DisasContext *s, arg_FCADD *a, uint32_t insn)
+{
+    static gen_helper_gvec_4_ptr * const fns[3] = {
+        gen_helper_sve_fcadd_h,
+        gen_helper_sve_fcadd_s,
+        gen_helper_sve_fcadd_d
+    };
+
+    if (a->esz == 0) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+        tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           pred_full_reg_offset(s, a->pg),
+                           status, vsz, vsz, a->rot, fns[a->esz - 1]);
+        tcg_temp_free_ptr(status);
+    }
+    return true;
+}
+
 typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);

 static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
 # SVE integer multiply immediate (unpredicated)
 MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s

+# SVE floating-point complex add (predicated)
+FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
+      rn=%reg_movprfx
+
 ### SVE FP Multiply-Add Indexed Group

 # SVE floating-point multiply-add (indexed)
--
2.17.1


From: Richard Henderson <richard.henderson@linaro.org>

With FEAT_FPAC, AUT* instructions that fail authentication
do not produce an error value but instead fault.

For pauth-2, install a signal handler and verify it gets called.

For pauth-4 and pauth-5, we are explicitly testing the error value,
so there's nothing to test with FEAT_FPAC, so exit early.
Adjust the makefile to use -cpu neoverse-v1, which has FEAT_EPAC
but not FEAT_FPAC.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/tcg/aarch64/pauth.h | 23 +++++++++++
 tests/tcg/aarch64/pauth-2.c | 54 ++++++++++++++++++++++++-----
 tests/tcg/aarch64/pauth-4.c | 18 ++++++++---
 tests/tcg/aarch64/pauth-5.c | 10 ++++++
 tests/tcg/aarch64/Makefile.target | 6 +++-
 5 files changed, 98 insertions(+), 13 deletions(-)
 create mode 100644 tests/tcg/aarch64/pauth.h

diff --git a/tests/tcg/aarch64/pauth.h b/tests/tcg/aarch64/pauth.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tests/tcg/aarch64/pauth.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Helper for pauth test case
+ *
+ * Copyright (c) 2023 Linaro Ltd
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include <assert.h>
+#include <sys/auxv.h>
+
+static int get_pac_feature(void)
+{
+    unsigned long isar1, isar2;
+
+    assert(getauxval(AT_HWCAP) & HWCAP_CPUID);
+
+    asm("mrs %0, id_aa64isar1_el1" : "=r"(isar1));
+    asm("mrs %0, S3_0_C0_C6_2" : "=r"(isar2)); /* id_aa64isar2_el1 */
+
+    return ((isar1 >> 4) & 0xf)   /* APA */
+         | ((isar1 >> 8) & 0xf)   /* API */
+         | ((isar2 >> 12) & 0xf); /* APA3 */
+}
diff --git a/tests/tcg/aarch64/pauth-2.c b/tests/tcg/aarch64/pauth-2.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/pauth-2.c
+++ b/tests/tcg/aarch64/pauth-2.c
@@ -XXX,XX +XXX,XX @@
 #include <stdint.h>
+#include <signal.h>
+#include <stdlib.h>
 #include <assert.h>
+#include "pauth.h"
+
+
+static void sigill(int sig, siginfo_t *info, void *vuc)
+{
+    ucontext_t *uc = vuc;
+    uint64_t test;
+
+    /* There is only one insn below that is allowed to fault. */
+    asm volatile("adr %0, auth2_insn" : "=r"(test));
+    assert(test == uc->uc_mcontext.pc);
+    exit(0);
+}
+
+static int pac_feature;

 void do_test(uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ void do_test(uint64_t value)
      * An invalid salt usually fails authorization, but again there
      * is a chance of choosing another salt that works.
      * Iterate until we find another salt which does fail.
+     *
+     * With FEAT_FPAC, this will SIGILL instead of producing a result.
      */
     for (salt2 = salt1 + 1; ; salt2++) {
-        asm volatile("autda %0, %2" : "=r"(decode) : "0"(encode), "r"(salt2));
+        asm volatile("auth2_insn: autda %0, %2"
+                     : "=r"(decode) : "0"(encode), "r"(salt2));
         if (decode != value) {
             break;
         }
     }

+    assert(pac_feature < 4); /* No FEAT_FPAC */
+
     /* The VA bits, bit 55, and the TBI bits, should be unchanged. */
     assert(((decode ^ value) & 0xff80ffffffffffffull) == 0);

     /*
-     * Bits [54:53] are an error indicator based on the key used;
-     * the DA key above is keynumber 0, so error == 0b01.  Otherwise
-     * bit 55 of the original is sign-extended into the rest of the auth.
+     * Without FEAT_Pauth2, bits [54:53] are an error indicator based on
+     * the key used; the DA key above is keynumber 0, so error == 0b01.
+     * Otherwise, bit 55 of the original is sign-extended into the rest
+     * of the auth.
      */
-    if ((value >> 55) & 1) {
-        assert(((decode >> 48) & 0xff) == 0b10111111);
-    } else {
-        assert(((decode >> 48) & 0xff) == 0b00100000);
+    if (pac_feature < 3) {
+        if ((value >> 55) & 1) {
+            assert(((decode >> 48) & 0xff) == 0b10111111);
+        } else {
+            assert(((decode >> 48) & 0xff) == 0b00100000);
+        }
     }
 }

 int main()
 {
+    static const struct sigaction sa = {
+        .sa_sigaction = sigill,
+        .sa_flags = SA_SIGINFO
+    };
+
+    pac_feature = get_pac_feature();
+    assert(pac_feature != 0);
+
+    if (pac_feature >= 4) {
+        /* FEAT_FPAC */
+        sigaction(SIGILL, &sa, NULL);
+    }
+
     do_test(0);
     do_test(0xda004acedeadbeefull);
     return 0;
diff --git a/tests/tcg/aarch64/pauth-4.c b/tests/tcg/aarch64/pauth-4.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/pauth-4.c
+++ b/tests/tcg/aarch64/pauth-4.c
@@ -XXX,XX +XXX,XX @@
 #include <assert.h>
 #include <stdio.h>
 #include <stdlib.h>
+#include "pauth.h"

 #define TESTS 1000

 int main()
 {
+    char base[TESTS];
     int i, count = 0;
     float perc;
-    void *base = malloc(TESTS);
+    int pac_feature = get_pac_feature();
+
+    /*
+     * Exit if no PAuth or FEAT_FPAC, which will SIGILL on AUTIA failure
+     * rather than return an error for us to check below.
+     */
+    if (pac_feature == 0 || pac_feature >= 4) {
+        return 0;
+    }

     for (i = 0; i < TESTS; i++) {
         uintptr_t in, x, y;
@@ -XXX,XX +XXX,XX @@ int main()

         in = i + (uintptr_t) base;

         asm("mov %0, %[in]\n\t"
-            "pacia %0, sp\n\t" /* sigill if pauth not supported */
+            "pacia %0, sp\n\t"
             "eor %0, %0, #4\n\t" /* corrupt single bit */
             "mov %1, %0\n\t"
             "autia %1, sp\n\t" /* validate corrupted pointer */
@@ -XXX,XX +XXX,XX @@ int main()

         if (x != y) {
             count++;
         }
-
     }
+
     perc = (float) count / (float) TESTS;
-    printf("Checks Passed: %0.2f%%", perc * 100.0);
+    printf("Checks Passed: %0.2f%%\n", perc * 100.0);
     assert(perc > 0.95);
     return 0;
 }
diff --git a/tests/tcg/aarch64/pauth-5.c b/tests/tcg/aarch64/pauth-5.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/pauth-5.c
+++ b/tests/tcg/aarch64/pauth-5.c
@@ -XXX,XX +XXX,XX @@
 #include <assert.h>
+#include "pauth.h"

 static int x;

@@ -XXX,XX +XXX,XX @@ int main()
 {
     int *p0 = &x, *p1, *p2, *p3;
     unsigned long salt = 0;
+    int pac_feature = get_pac_feature();
+
+    /*
+     * Exit if no PAuth or FEAT_FPAC, which will SIGILL on AUTDA failure
+     * rather than return an error for us to check below.
+     */
+    if (pac_feature == 0 || pac_feature >= 4) {
+        return 0;
+    }

     /*
      * With TBI enabled and a 48-bit VA, there are 7 bits of auth, and so
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -XXX,XX +XXX,XX @@ endif
 ifneq ($(CROSS_CC_HAS_ARMV8_3),)
 AARCH64_TESTS += pauth-1 pauth-2 pauth-4 pauth-5
 pauth-%: CFLAGS += -march=armv8.3-a
-run-pauth-%: QEMU_OPTS += -cpu max
+run-pauth-1: QEMU_OPTS += -cpu max
+run-pauth-2: QEMU_OPTS += -cpu max
+# Choose a cpu with FEAT_Pauth but without FEAT_FPAC for pauth-[45].
+run-pauth-4: QEMU_OPTS += -cpu neoverse-v1
+run-pauth-5: QEMU_OPTS += -cpu neoverse-v1
 endif

 # BTI Tests
--
2.34.1
diff view generated by jsdifflib
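The skip condition shared by pauth-4.c and pauth-5.c above can be sketched in stand-alone C. This is illustrative code, not QEMU's: the enum and function names are hypothetical. The values model the APA/API/APA3 ID-field levels, where 0 means no pointer authentication at all, and from level 4 (FEAT_FPAC) onward a failed AUTIA/AUTDA raises SIGILL instead of returning a corrupted pointer for the test to compare.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the APA/API/APA3 feature levels. */
enum {
    PAC_NONE = 0,        /* no pointer authentication */
    PAC_BASE = 1,        /* FEAT_PAuth */
    PAC_EPAC = 2,        /* FEAT_EPAC */
    PAC_PAUTH2 = 3,      /* FEAT_Pauth2 */
    PAC_FPAC = 4,        /* FEAT_FPAC: failed AUT* faults */
    PAC_FPACCOMBINE = 5, /* FEAT_FPACCOMBINE */
};

/*
 * The tests rely on a failed authentication returning a corrupted
 * pointer they can compare; with no pauth there is nothing to test,
 * and from FEAT_FPAC onward the failure raises SIGILL instead.
 */
static bool must_skip_pauth_test(int pac_feature)
{
    return pac_feature == PAC_NONE || pac_feature >= PAC_FPAC;
}
```

This mirrors the `pac_feature == 0 || pac_feature >= 4` guard added to both tests, and explains the Makefile change: `-cpu neoverse-v1` provides FEAT_Pauth without FEAT_FPAC, so pauth-4 and pauth-5 still exercise the corrupted-pointer path.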
From: Aaron Lindsay <aaron@os.amperecomputing.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-3-richard.henderson@linaro.org
[PMM: drop the HVF part of the patch and just comment that
 we need to do something when the register appears in that API]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     | 1 +
 target/arm/helper.c  | 4 ++--
 target/arm/hvf/hvf.c | 1 +
 target/arm/kvm64.c   | 2 ++
 4 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
         uint32_t dbgdevid1;
         uint64_t id_aa64isar0;
         uint64_t id_aa64isar1;
+        uint64_t id_aa64isar2;
         uint64_t id_aa64pfr0;
         uint64_t id_aa64pfr1;
         uint64_t id_aa64mmfr0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa64_tid3,
               .resetvalue = cpu->isar.id_aa64isar1 },
-            { .name = "ID_AA64ISAR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
+            { .name = "ID_AA64ISAR2_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa64_tid3,
-              .resetvalue = 0 },
+              .resetvalue = cpu->isar.id_aa64isar2 },
             { .name = "ID_AA64ISAR3_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 3,
               .access = PL1_R, .type = ARM_CP_CONST,
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -XXX,XX +XXX,XX @@ static bool hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
         { HV_SYS_REG_ID_AA64DFR1_EL1, &host_isar.id_aa64dfr1 },
         { HV_SYS_REG_ID_AA64ISAR0_EL1, &host_isar.id_aa64isar0 },
         { HV_SYS_REG_ID_AA64ISAR1_EL1, &host_isar.id_aa64isar1 },
+        /* Add ID_AA64ISAR2_EL1 here when HVF supports it */
         { HV_SYS_REG_ID_AA64MMFR0_EL1, &host_isar.id_aa64mmfr0 },
         { HV_SYS_REG_ID_AA64MMFR1_EL1, &host_isar.id_aa64mmfr1 },
         { HV_SYS_REG_ID_AA64MMFR2_EL1, &host_isar.id_aa64mmfr2 },
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
                               ARM64_SYS_REG(3, 0, 0, 6, 0));
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64isar1,
                               ARM64_SYS_REG(3, 0, 0, 6, 1));
+        err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64isar2,
+                              ARM64_SYS_REG(3, 0, 0, 6, 2));
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64mmfr0,
                               ARM64_SYS_REG(3, 0, 0, 7, 0));
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64mmfr1,
--
2.34.1
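The new id_aa64isar2 state is consumed through QEMU's FIELD_EX64 accessors. As a rough stand-alone model of that access — the helper below is a simplification of QEMU's macro, and the bit offsets are my reading of the Arm ARM description of ID_AA64ISAR2_EL1 (GPA3 at bits [11:8], APA3 at bits [15:12]):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for QEMU's FIELD_EX64(reg, REG, FIELD). */
static uint64_t field_ex64(uint64_t reg, unsigned shift, unsigned length)
{
    return (reg >> shift) & ((1ull << length) - 1);
}

/* ID_AA64ISAR2_EL1 nibbles, per the Arm ARM (assumed offsets). */
enum {
    ISAR2_GPA3_SHIFT = 8,   /* GPA3, bits [11:8] */
    ISAR2_APA3_SHIFT = 12,  /* APA3, bits [15:12] */
};
```

A host register value of 0x3100 would thus report APA3 = 3 and GPA3 = 1, which is how the KVM/HVF host-feature probes added above feed into the later pauth predicates.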
From: Aaron Lindsay <aaron@os.amperecomputing.com>

Rename isar_feature_aa64_pauth_arch to isar_feature_aa64_pauth_qarma5
to distinguish the other architectural algorithm qarma3.

Add ARMPauthFeature and isar_feature_pauth_feature to cover the
other pauth conditions.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-4-richard.henderson@linaro.org
Message-Id: <20230609172324.982888-3-aaron@os.amperecomputing.com>
[rth: Add ARMPauthFeature and eliminate most other predicates]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h              | 47 +++++++++++++++++++++++++++++------
 target/arm/tcg/pauth_helper.c |  2 +-
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
 }
 
+/*
+ * These are the values from APA/API/APA3.
+ * In general these must be compared '>=', per the normal Arm ARM
+ * treatment of fields in ID registers.
+ */
+typedef enum {
+    PauthFeat_None = 0,
+    PauthFeat_1 = 1,
+    PauthFeat_EPAC = 2,
+    PauthFeat_2 = 3,
+    PauthFeat_FPAC = 4,
+    PauthFeat_FPACCOMBINED = 5,
+} ARMPauthFeature;
+
+static inline ARMPauthFeature
+isar_feature_pauth_feature(const ARMISARegisters *id)
+{
+    /*
+     * Architecturally, only one of {APA,API,APA3} may be active (non-zero)
+     * and the other two must be zero.  Thus we may avoid conditionals.
+     */
+    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, APA) |
+            FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, API) |
+            FIELD_EX64(id->id_aa64isar2, ID_AA64ISAR2, APA3));
+}
+
 static inline bool isar_feature_aa64_pauth(const ARMISARegisters *id)
 {
     /*
      * Return true if any form of pauth is enabled, as this
      * predicate controls migration of the 128-bit keys.
      */
-    return (id->id_aa64isar1 &
-            (FIELD_DP64(0, ID_AA64ISAR1, APA, 0xf) |
-             FIELD_DP64(0, ID_AA64ISAR1, API, 0xf) |
-             FIELD_DP64(0, ID_AA64ISAR1, GPA, 0xf) |
-             FIELD_DP64(0, ID_AA64ISAR1, GPI, 0xf))) != 0;
+    return isar_feature_pauth_feature(id) != PauthFeat_None;
 }
 
-static inline bool isar_feature_aa64_pauth_arch(const ARMISARegisters *id)
+static inline bool isar_feature_aa64_pauth_qarma5(const ARMISARegisters *id)
 {
     /*
-     * Return true if pauth is enabled with the architected QARMA algorithm.
-     * QEMU will always set APA+GPA to the same value.
+     * Return true if pauth is enabled with the architected QARMA5 algorithm.
+     * QEMU will always enable or disable both APA and GPA.
      */
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, APA) != 0;
 }
 
+static inline bool isar_feature_aa64_pauth_qarma3(const ARMISARegisters *id)
+{
+    /*
+     * Return true if pauth is enabled with the architected QARMA3 algorithm.
+     * QEMU will always enable or disable both APA3 and GPA3.
+     */
+    return FIELD_EX64(id->id_aa64isar2, ID_AA64ISAR2, APA3) != 0;
+}
+
 static inline bool isar_feature_aa64_tlbirange(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, TLB) == 2;
diff --git a/target/arm/tcg/pauth_helper.c b/target/arm/tcg/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/pauth_helper.c
+++ b/target/arm/tcg/pauth_helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac_impdef(uint64_t data, uint64_t modifier,
 static uint64_t pauth_computepac(CPUARMState *env, uint64_t data,
                                  uint64_t modifier, ARMPACKey key)
 {
-    if (cpu_isar_feature(aa64_pauth_arch, env_archcpu(env))) {
+    if (cpu_isar_feature(aa64_pauth_qarma5, env_archcpu(env))) {
         return pauth_computepac_architected(data, modifier, key);
     } else {
         return pauth_computepac_impdef(data, modifier, key);
--
2.34.1
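The two ideas in this patch — OR-combining the mutually exclusive APA/API/APA3 fields, and the '>=' comparison rule for ID-register fields — can be modelled in stand-alone C. This is an illustrative sketch, not QEMU code; the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    PauthFeat_None = 0,
    PauthFeat_1 = 1,
    PauthFeat_EPAC = 2,
    PauthFeat_2 = 3,
    PauthFeat_FPAC = 4,
    PauthFeat_FPACCOMBINED = 5,
} ARMPauthFeature;

/*
 * At most one of APA/API/APA3 is architecturally non-zero, so OR-ing
 * the three extracted fields yields the active level with no branches.
 */
static ARMPauthFeature pauth_feature(uint64_t apa, uint64_t api, uint64_t apa3)
{
    return (ARMPauthFeature)(apa | api | apa3);
}

/* ID fields compare '>=': each level implies all of the earlier ones. */
static int has_epac_or_better(ARMPauthFeature f)
{
    return f >= PauthFeat_EPAC;
}
```

The '>=' rule is why a single scalar level can replace the several boolean predicates the patch eliminates: FEAT_FPAC, for instance, implies FEAT_Pauth2, which implies FEAT_EPAC.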
From: Richard Henderson <richard.henderson@linaro.org>

We have cpu properties to adjust the pauth algorithm for the
purpose of speed of emulation. Retain the set of pauth features
supported by the cpu even as the algorithm changes.

This already affects the neoverse-v1 cpu, which has FEAT_EPAC.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu64.c     | 70 +++++++++++++++++++++++++++---------------
 target/arm/tcg/cpu64.c |  2 ++
 2 files changed, 47 insertions(+), 25 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void aarch64_add_sme_properties(Object *obj)
 
 void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
 {
-    int arch_val = 0, impdef_val = 0;
-    uint64_t t;
+    ARMPauthFeature features = cpu_isar_feature(pauth_feature, cpu);
+    uint64_t isar1;
 
-    /* Exit early if PAuth is enabled, and fall through to disable it */
-    if ((kvm_enabled() || hvf_enabled()) && cpu->prop_pauth) {
-        if (!cpu_isar_feature(aa64_pauth, cpu)) {
-            error_setg(errp, "'pauth' feature not supported by %s on this host",
-                       kvm_enabled() ? "KVM" : "hvf");
+    /*
+     * These properties enable or disable Pauth as a whole, or change
+     * the pauth algorithm, but do not change the set of features that
+     * are present.  We have saved a copy of those features above and
+     * will now place it into the field that chooses the algorithm.
+     *
+     * Begin by disabling all fields.
+     */
+    isar1 = cpu->isar.id_aa64isar1;
+    isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, 0);
+    isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 0);
+    isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, API, 0);
+    isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPI, 0);
+
+    if (kvm_enabled() || hvf_enabled()) {
+        /*
+         * Exit early if PAuth is enabled and fall through to disable it.
+         * The algorithm selection properties are not present.
+         */
+        if (cpu->prop_pauth) {
+            if (features == 0) {
+                error_setg(errp, "'pauth' feature not supported by "
+                           "%s on this host", current_accel_name());
+            }
+            return;
+        }
+    } else {
+        /* Pauth properties are only present when the model supports it. */
+        if (features == 0) {
+            assert(!cpu->prop_pauth);
+            return;
         }
 
-        return;
-    }
-
-    /* TODO: Handle HaveEnhancedPAC, HaveEnhancedPAC2, HaveFPAC. */
-    if (cpu->prop_pauth) {
-        if (cpu->prop_pauth_impdef) {
-            impdef_val = 1;
-        } else {
-            arch_val = 1;
+        if (cpu->prop_pauth) {
+            if (cpu->prop_pauth_impdef) {
+                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, API, features);
+                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPI, 1);
+            } else {
+                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, features);
+                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 1);
+            }
+        } else if (cpu->prop_pauth_impdef) {
+            error_setg(errp, "cannot enable pauth-impdef without pauth");
+            error_append_hint(errp, "Add pauth=on to the CPU property list.\n");
         }
-    } else if (cpu->prop_pauth_impdef) {
-        error_setg(errp, "cannot enable pauth-impdef without pauth");
-        error_append_hint(errp, "Add pauth=on to the CPU property list.\n");
     }
 
-    t = cpu->isar.id_aa64isar1;
-    t = FIELD_DP64(t, ID_AA64ISAR1, APA, arch_val);
-    t = FIELD_DP64(t, ID_AA64ISAR1, GPA, arch_val);
-    t = FIELD_DP64(t, ID_AA64ISAR1, API, impdef_val);
-    t = FIELD_DP64(t, ID_AA64ISAR1, GPI, impdef_val);
-    cpu->isar.id_aa64isar1 = t;
+    cpu->isar.id_aa64isar1 = isar1;
 }
 
 static Property arm_cpu_pauth_property =
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
 
     t = cpu->isar.id_aa64isar1;
     t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);          /* FEAT_DPB2 */
+    t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_1);
+    t = FIELD_DP64(t, ID_AA64ISAR1, API, 1);
     t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);        /* FEAT_JSCVT */
     t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);         /* FEAT_FCMA */
     t = FIELD_DP64(t, ID_AA64ISAR1, LRCPC, 2);        /* FEAT_LRCPC2 */
--
2.34.1
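The restructured arm_cpu_pauth_finalize amounts to: save the feature level, zero all four algorithm fields, then deposit the saved level back into the chosen APA/GPA or API/GPI pair. A stand-alone sketch under simplified assumptions — the helper and function names are illustrative, with APA at ID_AA64ISAR1[7:4], API at [11:8], GPA at [27:24] and GPI at [31:28], which is my reading of the Arm ARM field layout:

```c
#include <assert.h>
#include <stdint.h>

/* Deposit a 4-bit value at the given shift, like a fixed-width FIELD_DP64. */
static uint64_t field_dp64(uint64_t reg, unsigned shift, uint64_t val)
{
    return (reg & ~(0xfull << shift)) | ((val & 0xf) << shift);
}

/*
 * Sketch of the finalize step: whichever of APA/API currently holds the
 * feature level is saved, all algorithm fields are cleared, and the level
 * is re-deposited into the pair selected by the property.
 */
static uint64_t select_pauth_algorithm(uint64_t isar1, int use_impdef)
{
    uint64_t features = ((isar1 >> 4) | (isar1 >> 8)) & 0xf; /* APA | API */

    isar1 = field_dp64(isar1, 4, 0);   /* APA */
    isar1 = field_dp64(isar1, 8, 0);   /* API */
    isar1 = field_dp64(isar1, 24, 0);  /* GPA */
    isar1 = field_dp64(isar1, 28, 0);  /* GPI */

    if (use_impdef) {
        isar1 = field_dp64(isar1, 8, features);  /* API = saved level */
        isar1 = field_dp64(isar1, 28, 1);        /* GPI */
    } else {
        isar1 = field_dp64(isar1, 4, features);  /* APA = saved level */
        isar1 = field_dp64(isar1, 24, 1);        /* GPA */
    }
    return isar1;
}
```

Because the level travels unchanged between fields, a cpu such as neoverse-v1 keeps FEAT_EPAC (level 2) whether the guest uses QARMA5 or the faster impdef algorithm.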
From: Richard Henderson <richard.henderson@linaro.org>

Implement the QARMA3 cryptographic algorithm for PAC calculation.
Implement a cpu feature to select the algorithm and document it.

Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-6-richard.henderson@linaro.org
Message-Id: <20230609172324.982888-4-aaron@os.amperecomputing.com>
[rth: Merge cpu feature addition from another patch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/cpu-features.rst | 21 ++++++++-----
 docs/system/arm/emulation.rst    |  3 ++
 target/arm/cpu.h                 |  1 +
 target/arm/arm-qmp-cmds.c        |  2 +-
 target/arm/cpu64.c               | 24 ++++++++++++--
 target/arm/tcg/pauth_helper.c    | 54 ++++++++++++++++++++++++++------
 tests/qtest/arm-cpu-features.c   | 12 ++++++-
 7 files changed, 94 insertions(+), 23 deletions(-)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -XXX,XX +XXX,XX @@ TCG VCPU Features
 TCG VCPU features are CPU features that are specific to TCG.
 Below is the list of TCG VCPU features and their descriptions.
 
-``pauth-impdef``
-  When ``FEAT_Pauth`` is enabled, either the *impdef* (Implementation
-  Defined) algorithm is enabled or the *architected* QARMA algorithm
-  is enabled.  By default the impdef algorithm is disabled, and QARMA
-  is enabled.
+``pauth``
+  Enable or disable ``FEAT_Pauth`` entirely.
 
-  The architected QARMA algorithm has good cryptographic properties,
-  but can be quite slow to emulate.  The impdef algorithm used by QEMU
-  is non-cryptographic but significantly faster.
+``pauth-impdef``
+  When ``pauth`` is enabled, select the QEMU implementation defined algorithm.
+
+``pauth-qarma3``
+  When ``pauth`` is enabled, select the architected QARMA3 algorithm.
+
+Without either ``pauth-impdef`` or ``pauth-qarma3`` enabled,
+the architected QARMA5 algorithm is used.  The architected QARMA5
+and QARMA3 algorithms have good cryptographic properties, but can
+be quite slow to emulate.  The impdef algorithm used by QEMU is
+non-cryptographic but significantly faster.
 
 SVE CPU Properties
 ==================
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
 - FEAT_MTE (Memory Tagging Extension)
 - FEAT_MTE2 (Memory Tagging Extension)
 - FEAT_MTE3 (MTE Asymmetric Fault Handling)
+- FEAT_PACIMP (Pointer authentication - IMPLEMENTATION DEFINED algorithm)
+- FEAT_PACQARMA3 (Pointer authentication - QARMA3 algorithm)
+- FEAT_PACQARMA5 (Pointer authentication - QARMA5 algorithm)
 - FEAT_PAN (Privileged access never)
 - FEAT_PAN2 (AT S1E1R and AT S1E1W instruction variants affected by PSTATE.PAN)
 - FEAT_PAN3 (Support for SCTLR_ELx.EPAN)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
      */
     bool prop_pauth;
     bool prop_pauth_impdef;
+    bool prop_pauth_qarma3;
     bool prop_lpa2;
 
     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/arm-qmp-cmds.c
+++ b/target/arm/arm-qmp-cmds.c
@@ -XXX,XX +XXX,XX @@ static const char *cpu_model_advertised_features[] = {
     "sve640", "sve768", "sve896", "sve1024", "sve1152", "sve1280",
     "sve1408", "sve1536", "sve1664", "sve1792", "sve1920", "sve2048",
     "kvm-no-adjvtime", "kvm-steal-time",
-    "pauth", "pauth-impdef",
+    "pauth", "pauth-impdef", "pauth-qarma3",
     NULL
 };
 
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void aarch64_add_sme_properties(Object *obj)
 void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
 {
     ARMPauthFeature features = cpu_isar_feature(pauth_feature, cpu);
-    uint64_t isar1;
+    uint64_t isar1, isar2;
 
     /*
      * These properties enable or disable Pauth as a whole, or change
@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
     isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, API, 0);
     isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPI, 0);
 
+    isar2 = cpu->isar.id_aa64isar2;
+    isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, APA3, 0);
+    isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, GPA3, 0);
+
     if (kvm_enabled() || hvf_enabled()) {
         /*
          * Exit early if PAuth is enabled and fall through to disable it.
@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
         }
 
         if (cpu->prop_pauth) {
+            if (cpu->prop_pauth_impdef && cpu->prop_pauth_qarma3) {
+                error_setg(errp,
+                           "cannot enable both pauth-impdef and pauth-qarma3");
+                return;
+            }
+
             if (cpu->prop_pauth_impdef) {
                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, API, features);
                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPI, 1);
+            } else if (cpu->prop_pauth_qarma3) {
+                isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, APA3, features);
+                isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, GPA3, 1);
             } else {
                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, features);
                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 1);
             }
-        } else if (cpu->prop_pauth_impdef) {
-            error_setg(errp, "cannot enable pauth-impdef without pauth");
+        } else if (cpu->prop_pauth_impdef || cpu->prop_pauth_qarma3) {
+            error_setg(errp, "cannot enable pauth-impdef or "
+                       "pauth-qarma3 without pauth");
             error_append_hint(errp, "Add pauth=on to the CPU property list.\n");
         }
     }
 
     cpu->isar.id_aa64isar1 = isar1;
+    cpu->isar.id_aa64isar2 = isar2;
 }
 
 static Property arm_cpu_pauth_property =
     DEFINE_PROP_BOOL("pauth", ARMCPU, prop_pauth, true);
 static Property arm_cpu_pauth_impdef_property =
     DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
+static Property arm_cpu_pauth_qarma3_property =
+    DEFINE_PROP_BOOL("pauth-qarma3", ARMCPU, prop_pauth_qarma3, false);
 
 void aarch64_add_pauth_properties(Object *obj)
 {
@@ -XXX,XX +XXX,XX @@ void aarch64_add_pauth_properties(Object *obj)
         cpu->prop_pauth = cpu_isar_feature(aa64_pauth, cpu);
     } else {
         qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_impdef_property);
+        qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_qarma3_property);
     }
 }
 
diff --git a/target/arm/tcg/pauth_helper.c b/target/arm/tcg/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
173
--- a/target/arm/tcg/pauth_helper.c
174
+++ b/target/arm/tcg/pauth_helper.c
175
@@ -XXX,XX +XXX,XX @@ static uint64_t pac_sub(uint64_t i)
176
return o;
177
}
178
179
+static uint64_t pac_sub1(uint64_t i)
79
+{
180
+{
80
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
181
+ static const uint8_t sub1[16] = {
81
+ vec_full_reg_offset(s, rn),
182
+ 0xa, 0xd, 0xe, 0x6, 0xf, 0x7, 0x3, 0x5,
82
+ vec_full_reg_offset(s, rm),
183
+ 0x9, 0x8, 0x0, 0xc, 0xb, 0x1, 0x2, 0x4,
83
+ is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
184
+ };
185
+ uint64_t o = 0;
186
+ int b;
187
+
188
+ for (b = 0; b < 64; b += 4) {
189
+ o |= (uint64_t)sub1[(i >> b) & 0xf] << b;
190
+ }
191
+ return o;
84
+}
192
+}
85
+
193
+
86
/* Expand a 3-operand + env pointer operation using
194
static uint64_t pac_inv_sub(uint64_t i)
87
* an out-of-line helper.
195
{
88
*/
196
static const uint8_t inv_sub[16] = {
89
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
197
@@ -XXX,XX +XXX,XX @@ static uint64_t tweak_inv_shuffle(uint64_t i)
198
}
199
200
static uint64_t pauth_computepac_architected(uint64_t data, uint64_t modifier,
201
- ARMPACKey key)
202
+ ARMPACKey key, bool isqarma3)
203
{
204
static const uint64_t RC[5] = {
205
0x0000000000000000ull,
206
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac_architected(uint64_t data, uint64_t modifier,
207
0x452821E638D01377ull,
208
};
209
const uint64_t alpha = 0xC0AC29B7C97C50DDull;
210
+ int iterations = isqarma3 ? 2 : 4;
211
/*
212
* Note that in the ARM pseudocode, key0 contains bits <127:64>
213
* and key1 contains bits <63:0> of the 128-bit key.
214
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac_architected(uint64_t data, uint64_t modifier,
215
runningmod = modifier;
216
workingval = data ^ key0;
217
218
- for (i = 0; i <= 4; ++i) {
219
+ for (i = 0; i <= iterations; ++i) {
220
roundkey = key1 ^ runningmod;
221
workingval ^= roundkey;
222
workingval ^= RC[i];
223
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac_architected(uint64_t data, uint64_t modifier,
224
workingval = pac_cell_shuffle(workingval);
225
workingval = pac_mult(workingval);
90
}
226
}
91
feature = ARM_FEATURE_V8_RDM;
227
- workingval = pac_sub(workingval);
92
break;
228
+ if (isqarma3) {
93
+ case 0x02: /* SDOT (vector) */
229
+ workingval = pac_sub1(workingval);
94
+ case 0x12: /* UDOT (vector) */
230
+ } else {
95
+ if (size != MO_32) {
231
+ workingval = pac_sub(workingval);
96
+ unallocated_encoding(s);
97
+ return;
98
+ }
232
+ }
99
+ feature = ARM_FEATURE_V8_DOTPROD;
233
runningmod = tweak_shuffle(runningmod);
100
+ break;
234
}
101
case 0x8: /* FCMLA, #0 */
235
roundkey = modk0 ^ runningmod;
102
case 0x9: /* FCMLA, #90 */
236
workingval ^= roundkey;
103
case 0xa: /* FCMLA, #180 */
237
workingval = pac_cell_shuffle(workingval);
104
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
238
workingval = pac_mult(workingval);
239
- workingval = pac_sub(workingval);
240
+ if (isqarma3) {
241
+ workingval = pac_sub1(workingval);
242
+ } else {
243
+ workingval = pac_sub(workingval);
244
+ }
245
workingval = pac_cell_shuffle(workingval);
246
workingval = pac_mult(workingval);
247
workingval ^= key1;
248
workingval = pac_cell_inv_shuffle(workingval);
249
- workingval = pac_inv_sub(workingval);
250
+ if (isqarma3) {
251
+ workingval = pac_sub1(workingval);
252
+ } else {
253
+ workingval = pac_inv_sub(workingval);
254
+ }
255
workingval = pac_mult(workingval);
256
workingval = pac_cell_inv_shuffle(workingval);
257
workingval ^= key0;
258
workingval ^= runningmod;
259
- for (i = 0; i <= 4; ++i) {
260
- workingval = pac_inv_sub(workingval);
261
- if (i < 4) {
262
+ for (i = 0; i <= iterations; ++i) {
263
+ if (isqarma3) {
264
+ workingval = pac_sub1(workingval);
265
+ } else {
266
+ workingval = pac_inv_sub(workingval);
267
+ }
268
+ if (i < iterations) {
269
workingval = pac_mult(workingval);
270
workingval = pac_cell_inv_shuffle(workingval);
105
}
271
}
106
return;
272
runningmod = tweak_inv_shuffle(runningmod);
107
273
roundkey = key1 ^ runningmod;
108
+ case 0x2: /* SDOT / UDOT */
274
- workingval ^= RC[4 - i];
109
+ gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0,
275
+ workingval ^= RC[iterations - i];
110
+ u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b);
276
workingval ^= roundkey;
111
+ return;
277
workingval ^= alpha;
112
+
278
}
113
case 0x8: /* FCMLA, #0 */
279
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac(CPUARMState *env, uint64_t data,
114
case 0x9: /* FCMLA, #90 */
280
uint64_t modifier, ARMPACKey key)
115
case 0xa: /* FCMLA, #180 */
281
{
116
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
282
if (cpu_isar_feature(aa64_pauth_qarma5, env_archcpu(env))) {
117
return;
283
- return pauth_computepac_architected(data, modifier, key);
118
}
284
+ return pauth_computepac_architected(data, modifier, key, false);
119
break;
285
+ } else if (cpu_isar_feature(aa64_pauth_qarma3, env_archcpu(env))) {
120
+ case 0x0e: /* SDOT */
286
+ return pauth_computepac_architected(data, modifier, key, true);
121
+ case 0x1e: /* UDOT */
122
+ if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
123
+ unallocated_encoding(s);
124
+ return;
125
+ }
126
+ break;
127
case 0x11: /* FCMLA #0 */
128
case 0x13: /* FCMLA #90 */
129
case 0x15: /* FCMLA #180 */
130
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
131
}
132
133
switch (16 * u + opcode) {
134
+ case 0x0e: /* SDOT */
135
+ case 0x1e: /* UDOT */
136
+ gen_gvec_op3_ool(s, is_q, rd, rn, rm, index,
137
+ u ? gen_helper_gvec_udot_idx_b
138
+ : gen_helper_gvec_sdot_idx_b);
139
+ return;
140
case 0x11: /* FCMLA #0 */
141
case 0x13: /* FCMLA #90 */
142
case 0x15: /* FCMLA #180 */
143
diff --git a/target/arm/translate.c b/target/arm/translate.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/target/arm/translate.c
146
+++ b/target/arm/translate.c
147
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
148
*/
149
static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
150
{
151
- gen_helper_gvec_3_ptr *fn_gvec_ptr;
152
- int rd, rn, rm, rot, size, opr_sz;
153
- TCGv_ptr fpst;
154
+ gen_helper_gvec_3 *fn_gvec = NULL;
155
+ gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
156
+ int rd, rn, rm, opr_sz;
157
+ int data = 0;
158
bool q;
159
160
q = extract32(insn, 6, 1);
161
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
162
163
if ((insn & 0xfe200f10) == 0xfc200800) {
164
/* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
165
- size = extract32(insn, 20, 1);
166
- rot = extract32(insn, 23, 2);
167
+ int size = extract32(insn, 20, 1);
168
+ data = extract32(insn, 23, 2); /* rot */
169
if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
170
|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
171
return 1;
172
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
173
fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
174
} else if ((insn & 0xfea00f10) == 0xfc800800) {
175
/* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
176
- size = extract32(insn, 20, 1);
177
- rot = extract32(insn, 24, 1);
178
+ int size = extract32(insn, 20, 1);
179
+ data = extract32(insn, 24, 1); /* rot */
180
if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
181
|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
182
return 1;
183
}
184
fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
185
+ } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
186
+ /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
187
+ bool u = extract32(insn, 4, 1);
188
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
189
+ return 1;
190
+ }
191
+ fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
192
} else {
287
} else {
193
return 1;
288
return pauth_computepac_impdef(data, modifier, key);
194
}
289
}
195
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
290
diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
196
}
291
index XXXXXXX..XXXXXXX 100644
197
292
--- a/tests/qtest/arm-cpu-features.c
198
opr_sz = (1 + q) * 8;
293
+++ b/tests/qtest/arm-cpu-features.c
199
- fpst = get_fpstatus_ptr(1);
294
@@ -XXX,XX +XXX,XX @@ static void pauth_tests_default(QTestState *qts, const char *cpu_type)
200
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
295
{
201
- vfp_reg_offset(1, rn),
296
assert_has_feature_enabled(qts, cpu_type, "pauth");
202
- vfp_reg_offset(1, rm), fpst,
297
assert_has_feature_disabled(qts, cpu_type, "pauth-impdef");
203
- opr_sz, opr_sz, rot, fn_gvec_ptr);
298
+ assert_has_feature_disabled(qts, cpu_type, "pauth-qarma3");
204
- tcg_temp_free_ptr(fpst);
299
assert_set_feature(qts, cpu_type, "pauth", false);
205
+ if (fn_gvec_ptr) {
300
assert_set_feature(qts, cpu_type, "pauth", true);
206
+ TCGv_ptr fpst = get_fpstatus_ptr(1);
301
assert_set_feature(qts, cpu_type, "pauth-impdef", true);
207
+ tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
302
assert_set_feature(qts, cpu_type, "pauth-impdef", false);
208
+ vfp_reg_offset(1, rn),
303
- assert_error(qts, cpu_type, "cannot enable pauth-impdef without pauth",
209
+ vfp_reg_offset(1, rm), fpst,
304
+ assert_set_feature(qts, cpu_type, "pauth-qarma3", true);
210
+ opr_sz, opr_sz, data, fn_gvec_ptr);
305
+ assert_set_feature(qts, cpu_type, "pauth-qarma3", false);
211
+ tcg_temp_free_ptr(fpst);
306
+ assert_error(qts, cpu_type,
212
+ } else {
307
+ "cannot enable pauth-impdef or pauth-qarma3 without pauth",
213
+ tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
308
"{ 'pauth': false, 'pauth-impdef': true }");
214
+ vfp_reg_offset(1, rn),
309
+ assert_error(qts, cpu_type,
215
+ vfp_reg_offset(1, rm),
310
+ "cannot enable pauth-impdef or pauth-qarma3 without pauth",
216
+ opr_sz, opr_sz, data, fn_gvec);
311
+ "{ 'pauth': false, 'pauth-qarma3': true }");
217
+ }
312
+ assert_error(qts, cpu_type,
218
return 0;
313
+ "cannot enable both pauth-impdef and pauth-qarma3",
219
}
314
+ "{ 'pauth': true, 'pauth-impdef': true, 'pauth-qarma3': true }");
220
315
}
221
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
316
222
317
static void test_query_cpu_model_expansion(const void *data)
223
static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
224
{
225
- gen_helper_gvec_3_ptr *fn_gvec_ptr;
226
+ gen_helper_gvec_3 *fn_gvec = NULL;
227
+ gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
228
int rd, rn, rm, opr_sz, data;
229
- TCGv_ptr fpst;
230
bool q;
231
232
q = extract32(insn, 6, 1);
233
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
234
data = (index << 2) | rot;
235
fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
236
: gen_helper_gvec_fcmlah_idx);
237
+ } else if ((insn & 0xffb00f00) == 0xfe200d00) {
238
+ /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
239
+ int u = extract32(insn, 4, 1);
240
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
241
+ return 1;
242
+ }
243
+ fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
244
+ /* rm is just Vm, and index is M. */
245
+ data = extract32(insn, 5, 1); /* index */
246
+ rm = extract32(insn, 0, 4);
247
} else {
248
return 1;
249
}
250
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
251
}
252
253
opr_sz = (1 + q) * 8;
254
- fpst = get_fpstatus_ptr(1);
255
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
256
- vfp_reg_offset(1, rn),
257
- vfp_reg_offset(1, rm), fpst,
258
- opr_sz, opr_sz, data, fn_gvec_ptr);
259
- tcg_temp_free_ptr(fpst);
260
+ if (fn_gvec_ptr) {
261
+ TCGv_ptr fpst = get_fpstatus_ptr(1);
262
+ tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
263
+ vfp_reg_offset(1, rn),
264
+ vfp_reg_offset(1, rm), fpst,
265
+ opr_sz, opr_sz, data, fn_gvec_ptr);
266
+ tcg_temp_free_ptr(fpst);
267
+ } else {
268
+ tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
269
+ vfp_reg_offset(1, rn),
270
+ vfp_reg_offset(1, rm),
271
+ opr_sz, opr_sz, data, fn_gvec);
272
+ }
273
return 0;
274
}
275
276
--
2.17.1

--
2.34.1
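A useful property of the QARMA3 substitution box introduced above (pac_sub1) is that it is an involution: applying it twice returns the original value. That is why the QARMA3 paths in pauth_helper.c call pac_sub1() in both the forward and inverse rounds, where QARMA5 needs the separate pac_inv_sub table. A minimal Python sketch of the nibble-wise substitution, with the table copied from the pac_sub1() hunk:

```python
# SUB1 S-box from the pac_sub1() hunk above; it is applied to each
# 4-bit nibble of a 64-bit value independently.
SUB1 = [0xa, 0xd, 0xe, 0x6, 0xf, 0x7, 0x3, 0x5,
        0x9, 0x8, 0x0, 0xc, 0xb, 0x1, 0x2, 0x4]

def pac_sub1(i: int) -> int:
    o = 0
    for b in range(0, 64, 4):
        o |= SUB1[(i >> b) & 0xf] << b
    return o
```

Because SUB1[SUB1[x]] == x for every nibble, pac_sub1() undoes itself, which is what lets the QARMA3 decode path reuse it where QARMA5 uses pac_inv_sub(). (QARMA3 also halves the round count, via the `iterations = isqarma3 ? 2 : 4` change above.)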
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
2
2
3
For aa64 advsimd, we had been passing the pre-indexed vector.
3
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
4
However, sve applies the index to each 128-bit segment, so we
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
need to pass in the index separately.
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
7
For aa32 advsimd, the fp32 operation always has index 0, but
8
we failed to interpret the fp16 index correctly.
9
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20230829232335.965414-7-richard.henderson@linaro.org
12
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Message-Id: <20230609172324.982888-5-aaron@os.amperecomputing.com>
13
Message-id: 20180627043328.11531-31-richard.henderson@linaro.org
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
11
---
16
target/arm/translate-a64.c | 21 ++++++++++++---------
12
docs/system/arm/emulation.rst | 1 +
17
target/arm/translate.c | 32 +++++++++++++++++++++++---------
13
target/arm/tcg/cpu64.c | 2 +-
18
target/arm/vec_helper.c | 10 ++++++----
14
target/arm/tcg/pauth_helper.c | 16 +++++++++++-----
19
3 files changed, 41 insertions(+), 22 deletions(-)
15
3 files changed, 13 insertions(+), 6 deletions(-)
20
16
21
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
17
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
22
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/translate-a64.c
19
--- a/docs/system/arm/emulation.rst
24
+++ b/target/arm/translate-a64.c
20
+++ b/docs/system/arm/emulation.rst
25
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
21
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
26
case 0x13: /* FCMLA #90 */
22
- FEAT_DotProd (Advanced SIMD dot product instructions)
27
case 0x15: /* FCMLA #180 */
23
- FEAT_DoubleFault (Double Fault Extension)
28
case 0x17: /* FCMLA #270 */
24
- FEAT_E0PD (Preventing EL0 access to halves of address maps)
29
- tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
25
+- FEAT_EPAC (Enhanced pointer authentication)
30
- vec_full_reg_offset(s, rn),
26
- FEAT_ETS (Enhanced Translation Synchronization)
31
- vec_reg_offset(s, rm, index, size), fpst,
27
- FEAT_EVT (Enhanced Virtualization Traps)
32
- is_q ? 16 : 8, vec_full_reg_size(s),
28
- FEAT_FCMA (Floating-point complex number instructions)
33
- extract32(insn, 13, 2), /* rot */
29
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
34
- size == MO_64
30
index XXXXXXX..XXXXXXX 100644
35
- ? gen_helper_gvec_fcmlas_idx
31
--- a/target/arm/tcg/cpu64.c
36
- : gen_helper_gvec_fcmlah_idx);
32
+++ b/target/arm/tcg/cpu64.c
37
- tcg_temp_free_ptr(fpst);
33
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
38
+ {
34
39
+ int rot = extract32(insn, 13, 2);
35
t = cpu->isar.id_aa64isar1;
40
+ int data = (index << 2) | rot;
36
t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2); /* FEAT_DPB2 */
41
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
37
- t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_1);
42
+ vec_full_reg_offset(s, rn),
38
+ t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_EPAC);
43
+ vec_full_reg_offset(s, rm), fpst,
39
t = FIELD_DP64(t, ID_AA64ISAR1, API, 1);
44
+ is_q ? 16 : 8, vec_full_reg_size(s), data,
40
t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1); /* FEAT_JSCVT */
45
+ size == MO_64
41
t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1); /* FEAT_FCMA */
46
+ ? gen_helper_gvec_fcmlas_idx
42
diff --git a/target/arm/tcg/pauth_helper.c b/target/arm/tcg/pauth_helper.c
47
+ : gen_helper_gvec_fcmlah_idx);
43
index XXXXXXX..XXXXXXX 100644
48
+ tcg_temp_free_ptr(fpst);
44
--- a/target/arm/tcg/pauth_helper.c
45
+++ b/target/arm/tcg/pauth_helper.c
46
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac(CPUARMState *env, uint64_t data,
47
static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
48
ARMPACKey *key, bool data)
49
{
50
+ ARMCPU *cpu = env_archcpu(env);
51
ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
52
ARMVAParameters param = aa64_va_parameters(env, ptr, mmu_idx, data, false);
53
+ ARMPauthFeature pauth_feature = cpu_isar_feature(pauth_feature, cpu);
54
uint64_t pac, ext_ptr, ext, test;
55
int bot_bit, top_bit;
56
57
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
58
*/
59
test = sextract64(ptr, bot_bit, top_bit - bot_bit);
60
if (test != 0 && test != -1) {
61
- /*
62
- * Note that our top_bit is one greater than the pseudocode's
63
- * version, hence "- 2" here.
64
- */
65
- pac ^= MAKE_64BIT_MASK(top_bit - 2, 1);
66
+ if (pauth_feature == PauthFeat_EPAC) {
67
+ pac = 0;
68
+ } else {
69
+ /*
70
+ * Note that our top_bit is one greater than the pseudocode's
71
+ * version, hence "- 2" here.
72
+ */
73
+ pac ^= MAKE_64BIT_MASK(top_bit - 2, 1);
49
+ }
74
+ }
50
return;
51
}
75
}
52
76
53
diff --git a/target/arm/translate.c b/target/arm/translate.c
77
/*
54
index XXXXXXX..XXXXXXX 100644
55
--- a/target/arm/translate.c
56
+++ b/target/arm/translate.c
57
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
58
59
static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
60
{
61
- int rd, rn, rm, rot, size, opr_sz;
62
+ gen_helper_gvec_3_ptr *fn_gvec_ptr;
63
+ int rd, rn, rm, opr_sz, data;
64
TCGv_ptr fpst;
65
bool q;
66
67
q = extract32(insn, 6, 1);
68
VFP_DREG_D(rd, insn);
69
VFP_DREG_N(rn, insn);
70
- VFP_DREG_M(rm, insn);
71
if ((rd | rn) & q) {
72
return 1;
73
}
74
75
if ((insn & 0xff000f10) == 0xfe000800) {
76
/* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
77
- rot = extract32(insn, 20, 2);
78
- size = extract32(insn, 23, 1);
79
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
80
- || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
81
+ int rot = extract32(insn, 20, 2);
82
+ int size = extract32(insn, 23, 1);
83
+ int index;
84
+
85
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
86
return 1;
87
}
88
+ if (size == 0) {
89
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
90
+ return 1;
91
+ }
92
+ /* For fp16, rm is just Vm, and index is M. */
93
+ rm = extract32(insn, 0, 4);
94
+ index = extract32(insn, 5, 1);
95
+ } else {
96
+ /* For fp32, rm is the usual M:Vm, and index is 0. */
97
+ VFP_DREG_M(rm, insn);
98
+ index = 0;
99
+ }
100
+ data = (index << 2) | rot;
101
+ fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
102
+ : gen_helper_gvec_fcmlah_idx);
103
} else {
104
return 1;
105
}
106
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
107
tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
108
vfp_reg_offset(1, rn),
109
vfp_reg_offset(1, rm), fpst,
110
- opr_sz, opr_sz, rot,
111
- size ? gen_helper_gvec_fcmlas_idx
112
- : gen_helper_gvec_fcmlah_idx);
113
+ opr_sz, opr_sz, data, fn_gvec_ptr);
114
tcg_temp_free_ptr(fpst);
115
return 0;
116
}
117
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
118
index XXXXXXX..XXXXXXX 100644
119
--- a/target/arm/vec_helper.c
120
+++ b/target/arm/vec_helper.c
121
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
122
float_status *fpst = vfpst;
123
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
124
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
125
+ intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
126
uint32_t neg_real = flip ^ neg_imag;
127
uintptr_t i;
128
- float16 e1 = m[H2(flip)];
129
- float16 e3 = m[H2(1 - flip)];
130
+ float16 e1 = m[H2(2 * index + flip)];
131
+ float16 e3 = m[H2(2 * index + 1 - flip)];
132
133
/* Shift boolean to the sign bit so we can xor to negate. */
134
neg_real <<= 15;
135
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
136
float_status *fpst = vfpst;
137
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
138
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
139
+ intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
140
uint32_t neg_real = flip ^ neg_imag;
141
uintptr_t i;
142
- float32 e1 = m[H4(flip)];
143
- float32 e3 = m[H4(1 - flip)];
144
+ float32 e1 = m[H4(2 * index + flip)];
145
+ float32 e3 = m[H4(2 * index + 1 - flip)];
146
147
/* Shift boolean to the sign bit so we can xor to negate. */
148
neg_real <<= 31;
149
--
2.17.1

--
2.34.1
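The behavioural change FEAT_EPAC makes in pauth_addpac() above is small: when the pointer's extension bits are not canonical (neither all-zeros nor all-ones), the baseline architecture corrupts the PAC by flipping one bit, while EPAC forces the PAC to zero. A sketch of just that decision, using illustrative bot_bit/top_bit values rather than ones derived from real translation parameters:

```python
def sextract64(value, start, length):
    # sign-extending bitfield extract, as in QEMU's sextract64()
    mask = (1 << length) - 1
    field = (value >> start) & mask
    if field & (1 << (length - 1)):
        field -= 1 << length
    return field

def corrupt_pac(pac, ptr, bot_bit, top_bit, have_epac):
    test = sextract64(ptr, bot_bit, top_bit - bot_bit)
    if test != 0 and test != -1:
        if have_epac:
            return 0                        # EPAC: PAC forced to zero
        # baseline: flip one bit; top_bit is one greater than the
        # pseudocode's, hence "- 2" in the real code above
        return pac ^ (1 << (top_bit - 2))
    return pac                              # canonical: PAC unchanged
```

Either way an authentication of the resulting pointer fails, but EPAC makes the failure mode deterministic instead of depending on the computed PAC value.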
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
2
2
3
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20230829232335.965414-8-richard.henderson@linaro.org
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Message-Id: <20230609172324.982888-6-aaron@os.amperecomputing.com>
6
Message-id: 20180627043328.11531-8-richard.henderson@linaro.org
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
11
---
9
target/arm/helper-sve.h | 16 ++++
12
docs/system/arm/emulation.rst | 1 +
10
target/arm/sve_helper.c | 158 +++++++++++++++++++++++++++++++++++++
13
target/arm/tcg/cpu64.c | 2 +-
11
target/arm/translate-sve.c | 49 ++++++++++++
14
target/arm/tcg/pauth_helper.c | 21 +++++++++++++++++----
12
target/arm/sve.decode | 18 +++++
15
3 files changed, 19 insertions(+), 5 deletions(-)
13
4 files changed, 241 insertions(+)
14
16
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
17
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-sve.h
19
--- a/docs/system/arm/emulation.rst
18
+++ b/target/arm/helper-sve.h
20
+++ b/docs/system/arm/emulation.rst
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
21
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
20
DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
22
- FEAT_PAN2 (AT S1E1R and AT S1E1W instruction variants affected by PSTATE.PAN)
21
void, ptr, ptr, ptr, ptr, i32)
23
- FEAT_PAN3 (Support for SCTLR_ELx.EPAN)
22
24
- FEAT_PAuth (Pointer authentication)
23
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
25
+- FEAT_PAuth2 (Enhancements to pointer authentication)
24
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
26
- FEAT_PMULL (PMULL, PMULL2 instructions)
25
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
27
- FEAT_PMUv3p1 (PMU Extensions v3.1)
28
- FEAT_PMUv3p4 (PMU Extensions v3.4)
29
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/tcg/cpu64.c
32
+++ b/target/arm/tcg/cpu64.c
33
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
34
35
t = cpu->isar.id_aa64isar1;
36
t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2); /* FEAT_DPB2 */
37
- t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_EPAC);
38
+ t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_2);
39
t = FIELD_DP64(t, ID_AA64ISAR1, API, 1);
40
t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1); /* FEAT_JSCVT */
41
t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1); /* FEAT_FCMA */
42
diff --git a/target/arm/tcg/pauth_helper.c b/target/arm/tcg/pauth_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/tcg/pauth_helper.c
45
+++ b/target/arm/tcg/pauth_helper.c
46
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
47
*/
48
test = sextract64(ptr, bot_bit, top_bit - bot_bit);
49
if (test != 0 && test != -1) {
50
- if (pauth_feature == PauthFeat_EPAC) {
51
+ if (pauth_feature >= PauthFeat_2) {
52
+ /* No action required */
53
+ } else if (pauth_feature == PauthFeat_EPAC) {
54
pac = 0;
55
} else {
56
/*
57
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
58
* Preserve the determination between upper and lower at bit 55,
59
* and insert pointer authentication code.
60
*/
61
+ if (pauth_feature >= PauthFeat_2) {
62
+ pac ^= ptr;
63
+ }
64
if (param.tbi) {
65
ptr &= ~MAKE_64BIT_MASK(bot_bit, 55 - bot_bit + 1);
66
pac &= MAKE_64BIT_MASK(bot_bit, 54 - bot_bit + 1);
67
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
68
static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
69
ARMPACKey *key, bool data, int keynumber)
70
{
71
+ ARMCPU *cpu = env_archcpu(env);
72
ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
73
ARMVAParameters param = aa64_va_parameters(env, ptr, mmu_idx, data, false);
74
+ ARMPauthFeature pauth_feature = cpu_isar_feature(pauth_feature, cpu);
75
int bot_bit, top_bit;
76
- uint64_t pac, orig_ptr, test;
77
+ uint64_t pac, orig_ptr, cmp_mask;
78
79
orig_ptr = pauth_original_ptr(ptr, param);
80
pac = pauth_computepac(env, orig_ptr, modifier, *key);
81
bot_bit = 64 - param.tsz;
82
top_bit = 64 - 8 * param.tbi;
83
84
- test = (pac ^ ptr) & ~MAKE_64BIT_MASK(55, 1);
85
- if (unlikely(extract64(test, bot_bit, top_bit - bot_bit))) {
86
+ cmp_mask = MAKE_64BIT_MASK(bot_bit, top_bit - bot_bit);
87
+ cmp_mask &= ~MAKE_64BIT_MASK(55, 1);
26
+
88
+
27
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
89
+ if (pauth_feature >= PauthFeat_2) {
28
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
90
+ return ptr ^ (pac & cmp_mask);
29
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
30
+
31
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
32
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
33
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
34
+
35
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
36
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
37
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
38
+
39
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
40
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
41
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
42
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/sve_helper.c
45
+++ b/target/arm/sve_helper.c
46
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64)
47
48
#undef DO_ZPZ_FP
49
50
+/* 4-operand predicated multiply-add. This requires 7 operands to pass
51
+ * "properly", so we need to encode some of the registers into DESC.
52
+ */
53
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32);
54
+
55
+static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc,
56
+ uint16_t neg1, uint16_t neg3)
57
+{
58
+ intptr_t i = simd_oprsz(desc);
59
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
60
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
61
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
62
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
63
+ void *vd = &env->vfp.zregs[rd];
64
+    void *vn = &env->vfp.zregs[rn];
+    void *vm = &env->vfp.zregs[rm];
+    void *va = &env->vfp.zregs[ra];
+    uint64_t *g = vg;
+
+    do {
+        uint64_t pg = g[(i - 1) >> 6];
+        do {
+            i -= 2;
+            if (likely((pg >> (i & 63)) & 1)) {
+                float16 e1, e2, e3, r;
+
+                e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1;
+                e2 = *(uint16_t *)(vm + H1_2(i));
+                e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3;
+                r = float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
+                *(uint16_t *)(vd + H1_2(i)) = r;
+            }
+        } while (i & 63);
+    } while (i != 0);
+}
+
+void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_h(env, vg, desc, 0, 0);
+}
+
+void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0);
+}
+
+void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000);
+}
+
+void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000);
+}
+
+static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc,
+                            uint32_t neg1, uint32_t neg3)
+{
+    intptr_t i = simd_oprsz(desc);
+    unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+    unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+    unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+    unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+    void *vd = &env->vfp.zregs[rd];
+    void *vn = &env->vfp.zregs[rn];
+    void *vm = &env->vfp.zregs[rm];
+    void *va = &env->vfp.zregs[ra];
+    uint64_t *g = vg;
+
+    do {
+        uint64_t pg = g[(i - 1) >> 6];
+        do {
+            i -= 4;
+            if (likely((pg >> (i & 63)) & 1)) {
+                float32 e1, e2, e3, r;
+
+                e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1;
+                e2 = *(uint32_t *)(vm + H1_4(i));
+                e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3;
+                r = float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
+                *(uint32_t *)(vd + H1_4(i)) = r;
+            }
+        } while (i & 63);
+    } while (i != 0);
+}
+
+void HELPER(sve_fmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_s(env, vg, desc, 0, 0);
+}
+
+void HELPER(sve_fmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0);
+}
+
+void HELPER(sve_fnmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0x80000000);
+}
+
+void HELPER(sve_fnmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_s(env, vg, desc, 0, 0x80000000);
+}
+
+static void do_fmla_zpzzz_d(CPUARMState *env, void *vg, uint32_t desc,
+                            uint64_t neg1, uint64_t neg3)
+{
+    intptr_t i = simd_oprsz(desc);
+    unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+    unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+    unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+    unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+    void *vd = &env->vfp.zregs[rd];
+    void *vn = &env->vfp.zregs[rn];
+    void *vm = &env->vfp.zregs[rm];
+    void *va = &env->vfp.zregs[ra];
+    uint64_t *g = vg;
+
+    do {
+        uint64_t pg = g[(i - 1) >> 6];
+        do {
+            i -= 8;
+            if (likely((pg >> (i & 63)) & 1)) {
+                float64 e1, e2, e3, r;
+
+                e1 = *(uint64_t *)(vn + i) ^ neg1;
+                e2 = *(uint64_t *)(vm + i);
+                e3 = *(uint64_t *)(va + i) ^ neg3;
+                r = float64_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
+                *(uint64_t *)(vd + i) = r;
+            }
+        } while (i & 63);
+    } while (i != 0);
+}
+
+void HELPER(sve_fmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_d(env, vg, desc, 0, 0);
+}
+
+void HELPER(sve_fmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, 0);
+}
+
+void HELPER(sve_fnmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, INT64_MIN);
+}
+
+void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
+{
+    do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN);
+}
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_FP3(FMULX, fmulx)
 
 #undef DO_FP3
 
+typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
+
+static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
+{
+    if (fn == NULL) {
+        return false;
+    }
+    if (!sve_access_check(s)) {
+        return true;
+    }
+
+    unsigned vsz = vec_full_reg_size(s);
+    unsigned desc;
+    TCGv_i32 t_desc;
+    TCGv_ptr pg = tcg_temp_new_ptr();
+
+    /* We would need 7 operands to pass these arguments "properly".
+     * So we encode all the register numbers into the descriptor.
+     */
+    desc = deposit32(a->rd, 5, 5, a->rn);
+    desc = deposit32(desc, 10, 5, a->rm);
+    desc = deposit32(desc, 15, 5, a->ra);
+    desc = simd_desc(vsz, vsz, desc);
+
+    t_desc = tcg_const_i32(desc);
+    tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
+    fn(cpu_env, pg, t_desc);
+    tcg_temp_free_i32(t_desc);
+    tcg_temp_free_ptr(pg);
+    return true;
+}
+
+#define DO_FMLA(NAME, name) \
+static bool trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_sve_fmla * const fns[4] = { \
+        NULL, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
+    }; \
+    return do_fmla(s, a, fns[a->esz]); \
+}
+
+DO_FMLA(FMLA_zpzzz, fmla_zpzzz)
+DO_FMLA(FMLS_zpzzz, fmls_zpzzz)
+DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz)
+DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz)
+
+#undef DO_FMLA
+
 /*
 *** SVE Floating Point Unary Operations Predicated Group
 */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
                 &rprrr_esz ra=%reg_movprfx
 @rdn_pg_ra_rm   ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \
                 &rprrr_esz rn=%reg_movprfx
+@rdn_pg_rm_ra   ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \
+                &rprrr_esz rn=%reg_movprfx
 
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn       ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
@@ -XXX,XX +XXX,XX @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
 FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR
 FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm
 
+### SVE FP Multiply-Add Group
+
+# SVE floating-point multiply-accumulate writing addend
+FMLA_zpzzz 01100101 .. 1 ..... 000 ... ..... ..... @rda_pg_rn_rm
+FMLS_zpzzz 01100101 .. 1 ..... 001 ... ..... ..... @rda_pg_rn_rm
+FNMLA_zpzzz 01100101 .. 1 ..... 010 ... ..... ..... @rda_pg_rn_rm
+FNMLS_zpzzz 01100101 .. 1 ..... 011 ... ..... ..... @rda_pg_rn_rm
+
+# SVE floating-point multiply-accumulate writing multiplicand
+# Alter the operand extraction order and reuse the helpers from above.
+# FMAD, FMSB, FNMAD, FNMS
+FMLA_zpzzz 01100101 .. 1 ..... 100 ... ..... ..... @rdn_pg_rm_ra
+FMLS_zpzzz 01100101 .. 1 ..... 101 ... ..... ..... @rdn_pg_rm_ra
+FNMLA_zpzzz 01100101 .. 1 ..... 110 ... ..... ..... @rdn_pg_rm_ra
+FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra
+
 ### SVE FP Unary Operations Predicated Group
 
 # SVE integer convert to floating-point
--
2.17.1
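The neg1/neg3 constants threaded through the helpers above are IEEE sign-bit masks: XORing an operand with 0x8000 (fp16), 0x80000000 (fp32), or INT64_MIN (fp64) negates it, so a single predicated loop serves FMLA, FMLS, FNMLA and FNMLS. A standalone Python sketch of the float32 case follows; it is illustrative only, since Python's `*` and `+` are separate roundings rather than a fused multiply-add like `float32_muladd`:

```python
import struct

def fmla_zpzzz(zd, zn, zm, za, pg, neg1=0, neg3=0):
    """Predicated multiply-add on 32-bit lanes, mirroring the sve_helper.c
    loop: zd[i] = (zn[i] ^ neg1) * zm[i] + (za[i] ^ neg3) for active lanes.
    XORing 0x80000000 flips the IEEE single-precision sign bit, so the two
    masks select FMLA, FMLS, FNMLA or FNMLS from one loop body."""
    def flip(x, mask):
        bits = struct.unpack('<I', struct.pack('<f', x))[0] ^ mask
        return struct.unpack('<f', struct.pack('<I', bits))[0]
    out = list(zd)                      # inactive lanes keep their old value
    for i, active in enumerate(pg):
        if active:
            out[i] = flip(zn[i], neg1) * zm[i] + flip(za[i], neg3)
    return out

# FMLS: zd = za - zn * zm, obtained by negating the multiplicand
print(fmla_zpzzz([0.0], [2.0], [3.0], [10.0], [1], neg1=0x80000000))  # [4.0]
```

The same trick is why the fp64 variants pass INT64_MIN: as a 64-bit pattern it is exactly the sign bit.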
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    | 30 +++++++++++++
 target/arm/helper.h        | 12 +++---
 target/arm/helper.c        |  2 +-
 target/arm/sve_helper.c    | 88 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 70 ++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 16 +++++++
 6 files changed, 211 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(vfp_touid, i32, f64, ptr)
 DEF_HELPER_2(vfp_touizh, i32, f16, ptr)
 DEF_HELPER_2(vfp_touizs, i32, f32, ptr)
 DEF_HELPER_2(vfp_touizd, i32, f64, ptr)
-DEF_HELPER_2(vfp_tosih, i32, f16, ptr)
-DEF_HELPER_2(vfp_tosis, i32, f32, ptr)
-DEF_HELPER_2(vfp_tosid, i32, f64, ptr)
-DEF_HELPER_2(vfp_tosizh, i32, f16, ptr)
-DEF_HELPER_2(vfp_tosizs, i32, f32, ptr)
-DEF_HELPER_2(vfp_tosizd, i32, f64, ptr)
+DEF_HELPER_2(vfp_tosih, s32, f16, ptr)
+DEF_HELPER_2(vfp_tosis, s32, f32, ptr)
+DEF_HELPER_2(vfp_tosid, s32, f64, ptr)
+DEF_HELPER_2(vfp_tosizh, s32, f16, ptr)
+DEF_HELPER_2(vfp_tosizs, s32, f32, ptr)
+DEF_HELPER_2(vfp_tosizd, s32, f64, ptr)
 
 DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ ftype HELPER(name)(uint32_t x, void *fpstp) \
 }
 
 #define CONV_FTOI(name, ftype, fsz, sign, round) \
-uint32_t HELPER(name)(ftype x, void *fpstp) \
+sign##int32_t HELPER(name)(ftype x, void *fpstp) \
 { \
     float_status *fpst = fpstp; \
     if (float##fsz##_is_any_nan(x)) { \
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ static inline float16 sve_f64_to_f16(float64 f, float_status *fpst)
     return ret;
 }
 
+static inline int16_t vfp_float16_to_int16_rtz(float16 f, float_status *s)
+{
+    if (float16_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float16_to_int16_round_to_zero(f, s);
+}
+
+static inline int64_t vfp_float16_to_int64_rtz(float16 f, float_status *s)
+{
+    if (float16_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float16_to_int64_round_to_zero(f, s);
+}
+
+static inline int64_t vfp_float32_to_int64_rtz(float32 f, float_status *s)
+{
+    if (float32_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float32_to_int64_round_to_zero(f, s);
+}
+
+static inline int64_t vfp_float64_to_int64_rtz(float64 f, float_status *s)
+{
+    if (float64_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float64_to_int64_round_to_zero(f, s);
+}
+
+static inline uint16_t vfp_float16_to_uint16_rtz(float16 f, float_status *s)
+{
+    if (float16_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float16_to_uint16_round_to_zero(f, s);
+}
+
+static inline uint64_t vfp_float16_to_uint64_rtz(float16 f, float_status *s)
+{
+    if (float16_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float16_to_uint64_round_to_zero(f, s);
+}
+
+static inline uint64_t vfp_float32_to_uint64_rtz(float32 f, float_status *s)
+{
+    if (float32_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float32_to_uint64_round_to_zero(f, s);
+}
+
+static inline uint64_t vfp_float64_to_uint64_rtz(float64 f, float_status *s)
+{
+    if (float64_is_any_nan(f)) {
+        float_raise(float_flag_invalid, s);
+        return 0;
+    }
+    return float64_to_uint64_round_to_zero(f, s);
+}
+
 DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16)
 DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32)
 DO_ZPZ_FP(sve_fcvt_dh, uint64_t,     , sve_f64_to_f16)
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_fcvt_hd, uint64_t,     , sve_f16_to_f64)
 DO_ZPZ_FP(sve_fcvt_ds, uint64_t,     , float64_to_float32)
 DO_ZPZ_FP(sve_fcvt_sd, uint64_t,     , float32_to_float64)
 
+DO_ZPZ_FP(sve_fcvtzs_hh, uint16_t, H1_2, vfp_float16_to_int16_rtz)
+DO_ZPZ_FP(sve_fcvtzs_hs, uint32_t, H1_4, helper_vfp_tosizh)
+DO_ZPZ_FP(sve_fcvtzs_ss, uint32_t, H1_4, helper_vfp_tosizs)
+DO_ZPZ_FP(sve_fcvtzs_hd, uint64_t,     , vfp_float16_to_int64_rtz)
+DO_ZPZ_FP(sve_fcvtzs_sd, uint64_t,     , vfp_float32_to_int64_rtz)
+DO_ZPZ_FP(sve_fcvtzs_ds, uint64_t,     , helper_vfp_tosizd)
+DO_ZPZ_FP(sve_fcvtzs_dd, uint64_t,     , vfp_float64_to_int64_rtz)
+
+DO_ZPZ_FP(sve_fcvtzu_hh, uint16_t, H1_2, vfp_float16_to_uint16_rtz)
+DO_ZPZ_FP(sve_fcvtzu_hs, uint32_t, H1_4, helper_vfp_touizh)
+DO_ZPZ_FP(sve_fcvtzu_ss, uint32_t, H1_4, helper_vfp_touizs)
+DO_ZPZ_FP(sve_fcvtzu_hd, uint64_t,     , vfp_float16_to_uint64_rtz)
+DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t,     , vfp_float32_to_uint64_rtz)
+DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t,     , helper_vfp_touizd)
+DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t,     , vfp_float64_to_uint64_rtz)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd);
 }
 
+static bool trans_FCVTZS_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hh);
+}
+
+static bool trans_FCVTZU_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hh);
+}
+
+static bool trans_FCVTZS_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hs);
+}
+
+static bool trans_FCVTZU_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hs);
+}
+
+static bool trans_FCVTZS_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hd);
+}
+
+static bool trans_FCVTZU_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hd);
+}
+
+static bool trans_FCVTZS_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ss);
+}
+
+static bool trans_FCVTZU_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ss);
+}
+
+static bool trans_FCVTZS_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_sd);
+}
+
+static bool trans_FCVTZU_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_sd);
+}
+
+static bool trans_FCVTZS_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ds);
+}
+
+static bool trans_FCVTZU_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ds);
+}
+
+static bool trans_FCVTZS_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_dd);
+}
+
+static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd);
+}
+
 static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
     return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0
 FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0
 FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0
 
+# SVE floating-point convert to integer
+FCVTZS_hh 01100101 01 011 01 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_hh 01100101 01 011 01 1 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZS_hs 01100101 01 011 10 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_hs 01100101 01 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZS_hd 01100101 01 011 11 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_hd 01100101 01 011 11 1 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZS_ss 01100101 10 011 10 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_ss 01100101 10 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZS_ds 01100101 11 011 00 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_ds 01100101 11 011 00 1 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZS_sd 01100101 11 011 10 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0
+FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0
+
 # SVE integer convert to floating-point
 SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
--
2.17.1

From: Aaron Lindsay <aaron@os.amperecomputing.com>

An instruction is a 'combined' Pointer Authentication instruction
if it does something in addition to PAC -- for instance, branching
to or loading an address from the authenticated pointer.

Knowing whether a PAC operation is 'combined' is needed to
implement FEAT_FPACCOMBINE.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-9-richard.henderson@linaro.org
Message-Id: <20230609172324.982888-7-aaron@os.amperecomputing.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/helper-a64.h    |  4 ++
 target/arm/tcg/pauth_helper.c  | 71 +++++++++++++++++++++++++++-------
 target/arm/tcg/translate-a64.c | 12 +++---
 3 files changed, 68 insertions(+), 19 deletions(-)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(pacda, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(pacdb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(pacga, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(autia, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(autia_combined, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(autib, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(autib_combined, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(autda, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(autda_combined, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(autdb, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(autdb_combined, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_2(xpaci, TCG_CALL_NO_RWG_SE, i64, env, i64)
 DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
 
diff --git a/target/arm/tcg/pauth_helper.c b/target/arm/tcg/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/pauth_helper.c
+++ b/target/arm/tcg/pauth_helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
 }
 
 static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
-                           ARMPACKey *key, bool data, int keynumber)
+                           ARMPACKey *key, bool data, int keynumber,
+                           uintptr_t ra, bool is_combined)
 {
     ARMCPU *cpu = env_archcpu(env);
     ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(pacga)(CPUARMState *env, uint64_t x, uint64_t y)
     return pac & 0xffffffff00000000ull;
 }
 
-uint64_t HELPER(autia)(CPUARMState *env, uint64_t x, uint64_t y)
+static uint64_t pauth_autia(CPUARMState *env, uint64_t x, uint64_t y,
+                            uintptr_t ra, bool is_combined)
 {
     int el = arm_current_el(env);
     if (!pauth_key_enabled(env, el, SCTLR_EnIA)) {
         return x;
     }
-    pauth_check_trap(env, el, GETPC());
-    return pauth_auth(env, x, y, &env->keys.apia, false, 0);
+    pauth_check_trap(env, el, ra);
+    return pauth_auth(env, x, y, &env->keys.apia, false, 0, ra, is_combined);
 }
 
-uint64_t HELPER(autib)(CPUARMState *env, uint64_t x, uint64_t y)
+uint64_t HELPER(autia)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autia(env, x, y, GETPC(), false);
+}
+
+uint64_t HELPER(autia_combined)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autia(env, x, y, GETPC(), true);
+}
+
+static uint64_t pauth_autib(CPUARMState *env, uint64_t x, uint64_t y,
+                            uintptr_t ra, bool is_combined)
 {
     int el = arm_current_el(env);
     if (!pauth_key_enabled(env, el, SCTLR_EnIB)) {
         return x;
     }
-    pauth_check_trap(env, el, GETPC());
-    return pauth_auth(env, x, y, &env->keys.apib, false, 1);
+    pauth_check_trap(env, el, ra);
+    return pauth_auth(env, x, y, &env->keys.apib, false, 1, ra, is_combined);
 }
 
-uint64_t HELPER(autda)(CPUARMState *env, uint64_t x, uint64_t y)
+uint64_t HELPER(autib)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autib(env, x, y, GETPC(), false);
+}
+
+uint64_t HELPER(autib_combined)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autib(env, x, y, GETPC(), true);
+}
+
+static uint64_t pauth_autda(CPUARMState *env, uint64_t x, uint64_t y,
+                            uintptr_t ra, bool is_combined)
 {
     int el = arm_current_el(env);
     if (!pauth_key_enabled(env, el, SCTLR_EnDA)) {
         return x;
     }
-    pauth_check_trap(env, el, GETPC());
-    return pauth_auth(env, x, y, &env->keys.apda, true, 0);
+    pauth_check_trap(env, el, ra);
+    return pauth_auth(env, x, y, &env->keys.apda, true, 0, ra, is_combined);
 }
 
-uint64_t HELPER(autdb)(CPUARMState *env, uint64_t x, uint64_t y)
+uint64_t HELPER(autda)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autda(env, x, y, GETPC(), false);
+}
+
+uint64_t HELPER(autda_combined)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autda(env, x, y, GETPC(), true);
+}
+
+static uint64_t pauth_autdb(CPUARMState *env, uint64_t x, uint64_t y,
+                            uintptr_t ra, bool is_combined)
 {
     int el = arm_current_el(env);
     if (!pauth_key_enabled(env, el, SCTLR_EnDB)) {
         return x;
     }
-    pauth_check_trap(env, el, GETPC());
-    return pauth_auth(env, x, y, &env->keys.apdb, true, 1);
+    pauth_check_trap(env, el, ra);
+    return pauth_auth(env, x, y, &env->keys.apdb, true, 1, ra, is_combined);
+}
+
+uint64_t HELPER(autdb)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autdb(env, x, y, GETPC(), false);
+}
+
+uint64_t HELPER(autdb_combined)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+    return pauth_autdb(env, x, y, GETPC(), true);
 }
 
 uint64_t HELPER(xpaci)(CPUARMState *env, uint64_t a)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 auth_branch_target(DisasContext *s, TCGv_i64 dst,
 
     truedst = tcg_temp_new_i64();
     if (use_key_a) {
-        gen_helper_autia(truedst, cpu_env, dst, modifier);
+        gen_helper_autia_combined(truedst, cpu_env, dst, modifier);
     } else {
-        gen_helper_autib(truedst, cpu_env, dst, modifier);
+        gen_helper_autib_combined(truedst, cpu_env, dst, modifier);
     }
     return truedst;
 }
@@ -XXX,XX +XXX,XX @@ static bool trans_LDRA(DisasContext *s, arg_LDRA *a)
 
     if (s->pauth_active) {
         if (!a->m) {
-            gen_helper_autda(dirty_addr, cpu_env, dirty_addr,
-                             tcg_constant_i64(0));
+            gen_helper_autda_combined(dirty_addr, cpu_env, dirty_addr,
+                                      tcg_constant_i64(0));
         } else {
-            gen_helper_autdb(dirty_addr, cpu_env, dirty_addr,
-                             tcg_constant_i64(0));
+            gen_helper_autdb_combined(dirty_addr, cpu_env, dirty_addr,
+                                      tcg_constant_i64(0));
         }
     }
 
--
2.34.1
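The vfp_float*_to_int*_rtz helpers above all follow the same pattern: a NaN input raises the "invalid" flag and converts to 0, everything else is rounded toward zero (and, in softfloat, saturated to the destination range). A behavioural Python sketch, not the softfloat implementation, with the saturation bounds assumed from the destination type:

```python
import math

def float_to_int_rtz(x, bits=64, signed=True):
    """Round-to-zero float->int conversion as the *_rtz helpers behave:
    NaN -> 0 (the real code also raises float_flag_invalid), otherwise
    truncate toward zero and clamp to the destination type's range."""
    if math.isnan(x):
        return 0
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isinf(x):
        return hi if x > 0 else lo
    return max(lo, min(hi, math.trunc(x)))

print(float_to_int_rtz(-1.9))          # → -1 (toward zero, not -2)
print(float_to_int_rtz(float('nan')))  # → 0
```

The explicit NaN check is the whole point of these wrappers: the generic round-to-zero conversions would otherwise produce a saturated value for NaN rather than the architecturally required 0.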
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    | 35 ++++++++++++++++++++++
 target/arm/sve_helper.c    | 61 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 57 +++++++++++++++++++++++++++++
 target/arm/sve.decode      |  8 +++++
 4 files changed, 161 insertions(+)

From: Aaron Lindsay <aaron@os.amperecomputing.com>

Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-10-richard.henderson@linaro.org
Message-Id: <20230609172324.982888-8-aaron@os.amperecomputing.com>
[rth: Simplify fpac comparison, reusing cmp_mask]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/emulation.rst |  2 ++
 target/arm/syndrome.h         |  7 +++++++
 target/arm/tcg/cpu64.c        |  2 +-
 target/arm/tcg/pauth_helper.c | 18 +++++++++++++++++-
 4 files changed, 27 insertions(+), 2 deletions(-)
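The FEAT_FPAC patch below reports failed AUT* authentications through a new EC_PACFAIL syndrome, whose encoding appears in the syn_pacfail hunk further down: `error_code = (data << 1) | keynumber`, combined with the EC field and the IL bit. A small Python sketch of that packing (ESR field positions per the Arm architecture: EC in bits [31:26], IL at bit 25):

```python
EC_PACFAIL = 0x1c
ARM_EL_EC_SHIFT = 26
ARM_EL_IL = 1 << 25       # instruction-length bit, always set here

def syn_pacfail(data: bool, keynumber: int) -> int:
    """Mirror of the syn_pacfail() helper added in this patch: the low two
    syndrome bits say which key failed (data vs instruction, A vs B)."""
    error_code = (int(data) << 1) | keynumber
    return (EC_PACFAIL << ARM_EL_EC_SHIFT) | ARM_EL_IL | error_code

print(hex(syn_pacfail(False, 1)))   # AUTIB failure
```

This is a sketch for reading the patch, not QEMU code; the constant names match the syndrome.h hunk visible below.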
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
 - FEAT_FGT (Fine-Grained Traps)
 - FEAT_FHM (Floating-point half-precision multiplication instructions)
 - FEAT_FP16 (Half-precision floating-point data processing)
+- FEAT_FPAC (Faulting on AUT* instructions)
+- FEAT_FPACCOMBINE (Faulting on combined pointer authentication instructions)
 - FEAT_FRINTTS (Floating-point to integer instructions)
 - FEAT_FlagM (Flag manipulation instructions v2)
 - FEAT_FlagM2 (Enhancements to flag manipulation instructions)
diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
                    i64, i64, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
61
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/sve_helper.c
33
--- a/target/arm/syndrome.h
63
+++ b/target/arm/sve_helper.c
34
+++ b/target/arm/syndrome.h
64
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
35
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
65
return predtest_ones(d, oprsz, esz_mask);
36
EC_SYSTEMREGISTERTRAP = 0x18,
37
EC_SVEACCESSTRAP = 0x19,
38
EC_ERETTRAP = 0x1a,
39
+ EC_PACFAIL = 0x1c,
40
EC_SMETRAP = 0x1d,
41
EC_GPC = 0x1e,
42
EC_INSNABORT = 0x20,
43
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_smetrap(SMEExceptionType etype, bool is_16bit)
44
| (is_16bit ? 0 : ARM_EL_IL) | etype;
66
}
45
}
67
46
68
+/* Recursive reduction on a function;
47
+static inline uint32_t syn_pacfail(bool data, int keynumber)
69
+ * C.f. the ARM ARM function ReducePredicated.
48
+{
70
+ *
49
+ int error_code = (data << 1) | keynumber;
71
+ * While it would be possible to write this without the DATA temporary,
50
+ return (EC_PACFAIL << ARM_EL_EC_SHIFT) | ARM_EL_IL | error_code;
72
+ * it is much simpler to process the predicate register this way.
73
+ * The recursion is bounded to depth 7 (128 fp16 elements), so there's
74
+ * little to gain with a more complex non-recursive form.
75
+ */
76
+#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \
77
+static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \
78
+{ \
79
+ if (n == 1) { \
80
+ return *data; \
81
+ } else { \
82
+ uintptr_t half = n / 2; \
83
+ TYPE lo = NAME##_reduce(data, status, half); \
84
+ TYPE hi = NAME##_reduce(data + half, status, half); \
85
+ return TYPE##_##FUNC(lo, hi, status); \
86
+ } \
87
+} \
88
+uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \
89
+{ \
90
+ uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \
91
+ TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \
92
+ for (i = 0; i < oprsz; ) { \
93
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
94
+ do { \
95
+ TYPE nn = *(TYPE *)(vn + H(i)); \
96
+ *(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT); \
97
+ i += sizeof(TYPE), pg >>= sizeof(TYPE); \
98
+ } while (i & 15); \
99
+ } \
100
+ for (; i < maxsz; i += sizeof(TYPE)) { \
101
+ *(TYPE *)((void *)data + i) = IDENT; \
102
+ } \
103
+ return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \
104
+}
51
+}
105
+
52
+
106
+DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero)
53
static inline uint32_t syn_pactrap(void)
107
+DO_REDUCE(sve_faddv_s, float32, H1_4, add, float32_zero)
108
+DO_REDUCE(sve_faddv_d, float64, , add, float64_zero)
109
+
110
+/* Identity is floatN_default_nan, without the function call. */
111
+DO_REDUCE(sve_fminnmv_h, float16, H1_2, minnum, 0x7E00)
112
+DO_REDUCE(sve_fminnmv_s, float32, H1_4, minnum, 0x7FC00000)
113
+DO_REDUCE(sve_fminnmv_d, float64, , minnum, 0x7FF8000000000000ULL)
114
+
115
+DO_REDUCE(sve_fmaxnmv_h, float16, H1_2, maxnum, 0x7E00)
116
+DO_REDUCE(sve_fmaxnmv_s, float32, H1_4, maxnum, 0x7FC00000)
117
+DO_REDUCE(sve_fmaxnmv_d, float64, , maxnum, 0x7FF8000000000000ULL)
118
+
119
+DO_REDUCE(sve_fminv_h, float16, H1_2, min, float16_infinity)
120
+DO_REDUCE(sve_fminv_s, float32, H1_4, min, float32_infinity)
121
+DO_REDUCE(sve_fminv_d, float64, , min, float64_infinity)
122
+
123
+DO_REDUCE(sve_fmaxv_h, float16, H1_2, max, float16_chs(float16_infinity))
124
+DO_REDUCE(sve_fmaxv_s, float32, H1_4, max, float32_chs(float32_infinity))
125
+DO_REDUCE(sve_fmaxv_d, float64, , max, float64_chs(float64_infinity))
126
+
127
+#undef DO_REDUCE
128
+
129
uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
130
void *status, uint32_t desc)
131
{
54
{
132
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
55
return EC_PACTRAP << ARM_EL_EC_SHIFT;
56
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
133
index XXXXXXX..XXXXXXX 100644
57
index XXXXXXX..XXXXXXX 100644
134
--- a/target/arm/translate-sve.c
58
--- a/target/arm/tcg/cpu64.c
135
+++ b/target/arm/translate-sve.c
59
+++ b/target/arm/tcg/cpu64.c
136
@@ -XXX,XX +XXX,XX @@ static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn)
60
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
137
return true;
61
62
t = cpu->isar.id_aa64isar1;
63
t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2); /* FEAT_DPB2 */
64
- t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_2);
65
+ t = FIELD_DP64(t, ID_AA64ISAR1, APA, PauthFeat_FPACCOMBINED);
66
t = FIELD_DP64(t, ID_AA64ISAR1, API, 1);
67
t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1); /* FEAT_JSCVT */
68
t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1); /* FEAT_FCMA */
69
diff --git a/target/arm/tcg/pauth_helper.c b/target/arm/tcg/pauth_helper.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/tcg/pauth_helper.c
72
+++ b/target/arm/tcg/pauth_helper.c
73
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
74
}
138
}
75
}
139
76
140
+/*
77
+static G_NORETURN
141
+ *** SVE Floating Point Fast Reduction Group
78
+void pauth_fail_exception(CPUARMState *env, bool data,
142
+ */
79
+ int keynumber, uintptr_t ra)
143
+
144
+typedef void gen_helper_fp_reduce(TCGv_i64, TCGv_ptr, TCGv_ptr,
145
+ TCGv_ptr, TCGv_i32);
146
+
147
+static void do_reduce(DisasContext *s, arg_rpr_esz *a,
148
+ gen_helper_fp_reduce *fn)
149
+{
80
+{
150
+ unsigned vsz = vec_full_reg_size(s);
81
+ raise_exception_ra(env, EXCP_UDEF, syn_pacfail(data, keynumber),
151
+ unsigned p2vsz = pow2ceil(vsz);
82
+ exception_target_el(env), ra);
152
+ TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, p2vsz, 0));
153
+ TCGv_ptr t_zn, t_pg, status;
154
+ TCGv_i64 temp;
155
+
156
+ temp = tcg_temp_new_i64();
157
+ t_zn = tcg_temp_new_ptr();
158
+ t_pg = tcg_temp_new_ptr();
159
+
160
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn));
161
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
162
+ status = get_fpstatus_ptr(a->esz == MO_16);
163
+
164
+ fn(temp, t_zn, t_pg, status, t_desc);
165
+ tcg_temp_free_ptr(t_zn);
166
+ tcg_temp_free_ptr(t_pg);
167
+ tcg_temp_free_ptr(status);
168
+ tcg_temp_free_i32(t_desc);
169
+
170
+ write_fp_dreg(s, a->rd, temp);
171
+ tcg_temp_free_i64(temp);
172
+}
83
+}
173
+
84
+
174
+#define DO_VPZ(NAME, name) \
85
static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
175
+static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
86
ARMPACKey *key, bool data, int keynumber,
176
+{ \
87
uintptr_t ra, bool is_combined)
177
+ static gen_helper_fp_reduce * const fns[3] = { \
88
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
178
+ gen_helper_sve_##name##_h, \
89
cmp_mask &= ~MAKE_64BIT_MASK(55, 1);
179
+ gen_helper_sve_##name##_s, \
90
180
+ gen_helper_sve_##name##_d, \
91
if (pauth_feature >= PauthFeat_2) {
181
+ }; \
92
- return ptr ^ (pac & cmp_mask);
182
+ if (a->esz == 0) { \
93
+ ARMPauthFeature fault_feature =
183
+ return false; \
94
+ is_combined ? PauthFeat_FPACCOMBINED : PauthFeat_FPAC;
184
+ } \
95
+ uint64_t result = ptr ^ (pac & cmp_mask);
185
+ if (sve_access_check(s)) { \
186
+ do_reduce(s, a, fns[a->esz - 1]); \
187
+ } \
188
+ return true; \
189
+}
190
+
96
+
191
+DO_VPZ(FADDV, faddv)
97
+ if (pauth_feature >= fault_feature
192
+DO_VPZ(FMINNMV, fminnmv)
98
+ && ((result ^ sextract64(result, 55, 1)) & cmp_mask)) {
193
+DO_VPZ(FMAXNMV, fmaxnmv)
99
+ pauth_fail_exception(env, data, keynumber, ra);
194
+DO_VPZ(FMINV, fminv)
100
+ }
195
+DO_VPZ(FMAXV, fmaxv)
101
+ return result;
196
+
102
}
197
/*
103
198
*** SVE Floating Point Accumulating Reduction Group
104
if ((pac ^ ptr) & cmp_mask) {
199
*/
200
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
201
index XXXXXXX..XXXXXXX 100644
202
--- a/target/arm/sve.decode
203
+++ b/target/arm/sve.decode
204
@@ -XXX,XX +XXX,XX @@ FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \
205
FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2
206
FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3
207
208
+### SVE FP Fast Reduction Group
209
+
210
+FADDV 01100101 .. 000 000 001 ... ..... ..... @rd_pg_rn
211
+FMAXNMV 01100101 .. 000 100 001 ... ..... ..... @rd_pg_rn
212
+FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn
213
+FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn
214
+FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn
215
+
216
### SVE FP Accumulating Reduction Group
217
218
# SVE floating-point serial reduction (predicated)
219
--
105
--
220
2.17.1
106
2.34.1
221
222
From: Alex Bennée <alex.bennee@linaro.org>

Since kernel commit a86bd139f2 (arm64: arch_timer: Enable CNTVCT_EL0
trap..), released in kernel version v4.12, user-space has been able
to read these system registers. As we can't use QEMUTimer's in
linux-user mode we just directly call cpu_get_clock().

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180625160009.17437-2-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
};

#else
-/* In user-mode none of the generic timer registers are accessible,
- * and their implementation depends on QEMU_CLOCK_VIRTUAL and qdev gpio outputs,
- * so instead just don't register any of them.
+
+/* In user-mode most of the generic timer registers are inaccessible
+ * however modern kernels (4.12+) allow access to cntvct_el0
 */
+
+static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+ /* Currently we have no support for QEMUTimer in linux-user so we
+ * can't call gt_get_countervalue(env), instead we directly
+ * call the lower level functions.
+ */
+ return cpu_get_clock() / GTIMER_SCALE;
+}
+
static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
+ { .name = "CNTFRQ_EL0", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 0, .opc2 = 0,
+ .type = ARM_CP_CONST, .access = PL0_R /* no PL1_RW in linux-user */,
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_cntfrq),
+ .resetvalue = NANOSECONDS_PER_SECOND / GTIMER_SCALE,
+ },
+ { .name = "CNTVCT_EL0", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 0, .opc2 = 2,
+ .access = PL0_R, .type = ARM_CP_NO_RAW | ARM_CP_IO,
+ .readfn = gt_virt_cnt_read,
+ },
 REGINFO_SENTINEL
};

--
2.17.1


From: Philippe Mathieu-Daudé <philmd@linaro.org>

Fix when using GCC v11.4 (Ubuntu 11.4.0-1ubuntu1~22.04) with CFLAGS=-Og:

[4/6] Compiling C object libcommon.fa.p/hw_intc_arm_gicv3_its.c.o
FAILED: libcommon.fa.p/hw_intc_arm_gicv3_its.c.o
inlined from ‘lookup_vte’ at hw/intc/arm_gicv3_its.c:453:9,
inlined from ‘vmovp_callback’ at hw/intc/arm_gicv3_its.c:1039:14:
hw/intc/arm_gicv3_its.c:347:9: error: ‘vte.rdbase’ may be used uninitialized [-Werror=maybe-uninitialized]
347 | trace_gicv3_its_vte_read(vpeid, vte->valid, vte->vptsize,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
348 | vte->vptaddr, vte->rdbase);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/intc/arm_gicv3_its.c: In function ‘vmovp_callback’:
hw/intc/arm_gicv3_its.c:1036:13: note: ‘vte’ declared here
1036 | VTEntry vte;
| ^~~

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230831131348.69032-1-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/intc/arm_gicv3_its.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/hw/intc/arm_gicv3_its.c b/hw/intc/arm_gicv3_its.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_its.c
+++ b/hw/intc/arm_gicv3_its.c
@@ -XXX,XX +XXX,XX @@ static MemTxResult get_vte(GICv3ITSState *s, uint32_t vpeid, VTEntry *vte)
 if (entry_addr == -1) {
 /* No L2 table entry, i.e. no valid VTE, or a memory error */
 vte->valid = false;
- goto out;
+ trace_gicv3_its_vte_read_fault(vpeid);
+ return MEMTX_OK;
 }
 vteval = address_space_ldq_le(as, entry_addr, MEMTXATTRS_UNSPECIFIED, &res);
 if (res != MEMTX_OK) {
- goto out;
+ trace_gicv3_its_vte_read_fault(vpeid);
+ return res;
 }
 vte->valid = FIELD_EX64(vteval, VTE, VALID);
 vte->vptsize = FIELD_EX64(vteval, VTE, VPTSIZE);
 vte->vptaddr = FIELD_EX64(vteval, VTE, VPTADDR);
 vte->rdbase = FIELD_EX64(vteval, VTE, RDBASE);
-out:
- if (res != MEMTX_OK) {
- trace_gicv3_its_vte_read_fault(vpeid);
- } else {
- trace_gicv3_its_vte_read(vpeid, vte->valid, vte->vptsize,
- vte->vptaddr, vte->rdbase);
- }
+ trace_gicv3_its_vte_read(vpeid, vte->valid, vte->vptsize,
+ vte->vptaddr, vte->rdbase);
 return res;
}

--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate-sve.c | 52 ++++++++++++++++++++++++++++++++++++++
target/arm/sve.decode | 9 +++++++
2 files changed, 61 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
 return true;
}

+static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
+{
+ static gen_helper_gvec_mem * const fns[4] = {
+ gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r,
+ gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r,
+ };
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr t_pg;
+ TCGv_i32 desc;
+
+ /* Load the first quadword using the normal predicated load helpers. */
+ desc = tcg_const_i32(simd_desc(16, 16, zt));
+ t_pg = tcg_temp_new_ptr();
+
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+ fns[msz](cpu_env, t_pg, addr, desc);
+
+ tcg_temp_free_ptr(t_pg);
+ tcg_temp_free_i32(desc);
+
+ /* Replicate that first quadword. */
+ if (vsz > 16) {
+ unsigned dofs = vec_full_reg_offset(s, zt);
+ tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16);
+ }
+}
+
+static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
+{
+ if (a->rm == 31) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ int msz = dtype_msz(a->dtype);
+ TCGv_i64 addr = new_tmp_a64(s);
+ tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz);
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+ do_ldrq(s, a->rd, a->pg, addr, msz);
+ }
+ return true;
+}
+
+static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
+{
+ if (sve_access_check(s)) {
+ TCGv_i64 addr = new_tmp_a64(s);
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16);
+ do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype));
+ }
+ return true;
+}
+
static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
 int msz, int esz, int nreg)
{
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz
# LD2B, LD2H, LD2W, LD2D; etc.
LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz

+# SVE load and broadcast quadword (scalar plus scalar)
+LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \
+ @rprr_load_msz nreg=0
+
+# SVE load and broadcast quadword (scalar plus immediate)
+# LD1RQB, LD1RQH, LD1RQS, LD1RQD
+LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
+ @rpri_load_msz nreg=0
+
### SVE Memory Store Group

# SVE contiguous store (scalar plus immediate)
--
2.17.1


From: Francisco Iglesias <francisco.iglesias@amd.com>

Introduce the Xilinx Configuration Frame Interface (CFI) for transmitting
CFI data packets between the Xilinx Configuration Frame Unit models
(CFU_APB, CFU_FDRO and CFU_SFR), the Xilinx CFRAME controller (CFRAME_REG)
and the Xilinx CFRAME broadcast controller (CFRAME_BCAST_REG) models (when
emulating bitstream programming and readback).

Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com>
Acked-by: Edgar E. Iglesias <edgar@zeroasic.com>
Message-id: 20230831165701.2016397-2-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
MAINTAINERS | 6 ++++
include/hw/misc/xlnx-cfi-if.h | 59 +++++++++++++++++++++++++++++++++++
hw/misc/xlnx-cfi-if.c | 34 ++++++++++++++++++++
hw/misc/meson.build | 1 +
4 files changed, 100 insertions(+)
create mode 100644 include/hw/misc/xlnx-cfi-if.h
create mode 100644 hw/misc/xlnx-cfi-if.c

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ S: Maintained
F: hw/ssi/xlnx-versal-ospi.c
F: include/hw/ssi/xlnx-versal-ospi.h

+Xilinx Versal CFI
+M: Francisco Iglesias <francisco.iglesias@amd.com>
+S: Maintained
+F: hw/misc/xlnx-cfi-if.c
+F: include/hw/misc/xlnx-cfi-if.h
+
STM32F100
M: Alexandre Iooss <erdnaxe@crans.org>
L: qemu-arm@nongnu.org
diff --git a/include/hw/misc/xlnx-cfi-if.h b/include/hw/misc/xlnx-cfi-if.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/misc/xlnx-cfi-if.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Xilinx CFI interface
+ *
+ * Copyright (C) 2023, Advanced Micro Devices, Inc.
+ *
+ * Written by Francisco Iglesias <francisco.iglesias@amd.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#ifndef XLNX_CFI_IF_H
+#define XLNX_CFI_IF_H 1
+
+#include "qemu/help-texts.h"
+#include "hw/hw.h"
+#include "qom/object.h"
+
+#define TYPE_XLNX_CFI_IF "xlnx-cfi-if"
+typedef struct XlnxCfiIfClass XlnxCfiIfClass;
+DECLARE_CLASS_CHECKERS(XlnxCfiIfClass, XLNX_CFI_IF, TYPE_XLNX_CFI_IF)
+
+#define XLNX_CFI_IF(obj) \
+ INTERFACE_CHECK(XlnxCfiIf, (obj), TYPE_XLNX_CFI_IF)
+
+typedef enum {
+ PACKET_TYPE_CFU = 0x52,
+ PACKET_TYPE_CFRAME = 0xA1,
+} xlnx_cfi_packet_type;
+
+typedef enum {
+ CFRAME_FAR = 1,
+ CFRAME_SFR = 2,
+ CFRAME_FDRI = 4,
+ CFRAME_CMD = 6,
+} xlnx_cfi_reg_addr;
+
+typedef struct XlnxCfiPacket {
+ uint8_t reg_addr;
+ uint32_t data[4];
+} XlnxCfiPacket;
+
+typedef struct XlnxCfiIf {
+ Object Parent;
+} XlnxCfiIf;
+
+typedef struct XlnxCfiIfClass {
+ InterfaceClass parent;
+
+ void (*cfi_transfer_packet)(XlnxCfiIf *cfi_if, XlnxCfiPacket *pkt);
+} XlnxCfiIfClass;
+
+/**
+ * Transfer a XlnxCfiPacket.
+ *
+ * @cfi_if: the object implementing this interface
+ * @XlnxCfiPacket: a pointer to the XlnxCfiPacket to transfer
+ */
+void xlnx_cfi_transfer_packet(XlnxCfiIf *cfi_if, XlnxCfiPacket *pkt);
+
+#endif /* XLNX_CFI_IF_H */
diff --git a/hw/misc/xlnx-cfi-if.c b/hw/misc/xlnx-cfi-if.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/misc/xlnx-cfi-if.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Xilinx CFI interface
+ *
+ * Copyright (C) 2023, Advanced Micro Devices, Inc.
+ *
+ * Written by Francisco Iglesias <francisco.iglesias@amd.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include "qemu/osdep.h"
+#include "hw/misc/xlnx-cfi-if.h"
+
+void xlnx_cfi_transfer_packet(XlnxCfiIf *cfi_if, XlnxCfiPacket *pkt)
+{
+ XlnxCfiIfClass *xcic = XLNX_CFI_IF_GET_CLASS(cfi_if);
+
+ if (xcic->cfi_transfer_packet) {
+ xcic->cfi_transfer_packet(cfi_if, pkt);
+ }
+}
+
+static const TypeInfo xlnx_cfi_if_info = {
+ .name = TYPE_XLNX_CFI_IF,
+ .parent = TYPE_INTERFACE,
+ .class_size = sizeof(XlnxCfiIfClass),
+};
+
+static void xlnx_cfi_if_register_types(void)
+{
+ type_register_static(&xlnx_cfi_if_info);
+}
+
+type_init(xlnx_cfi_if_register_types)
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -XXX,XX +XXX,XX @@ specific_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files('xlnx-versal-crl.c'))
system_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files(
 'xlnx-versal-xramc.c',
 'xlnx-versal-pmc-iou-slcr.c',
+ 'xlnx-cfi-if.c',
))
system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: files('stm32f2xx_syscfg.c'))
system_ss.add(when: 'CONFIG_STM32F4XX_SYSCFG', if_true: files('stm32f4xx_syscfg.c'))
--
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Francisco Iglesias <francisco.iglesias@amd.com>
2
2
3
Introduce a model of the software programming interface (CFU_APB) of
4
Xilinx Versal's Configuration Frame Unit.
5
6
Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20230831165701.2016397-3-francisco.iglesias@amd.com
5
Message-id: 20180627043328.11531-11-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
target/arm/translate-sve.c | 103 +++++++++++++++++++++++++++++++++++++
11
MAINTAINERS | 2 +
9
target/arm/sve.decode | 6 +++
12
include/hw/misc/xlnx-versal-cfu.h | 231 ++++++++++++++++++
10
2 files changed, 109 insertions(+)
13
hw/misc/xlnx-versal-cfu.c | 380 ++++++++++++++++++++++++++++++
14
hw/misc/meson.build | 1 +
15
4 files changed, 614 insertions(+)
16
create mode 100644 include/hw/misc/xlnx-versal-cfu.h
17
create mode 100644 hw/misc/xlnx-versal-cfu.c
11
18
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
19
diff --git a/MAINTAINERS b/MAINTAINERS
13
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
21
--- a/MAINTAINERS
15
+++ b/target/arm/translate-sve.c
22
+++ b/MAINTAINERS
16
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
23
@@ -XXX,XX +XXX,XX @@ M: Francisco Iglesias <francisco.iglesias@amd.com>
17
tcg_temp_free_i64(t0);
24
S: Maintained
18
}
25
F: hw/misc/xlnx-cfi-if.c
19
26
F: include/hw/misc/xlnx-cfi-if.h
20
+/* Similarly for stores. */
27
+F: hw/misc/xlnx-versal-cfu.c
21
+static void do_str(DisasContext *s, uint32_t vofs, uint32_t len,
28
+F: include/hw/misc/xlnx-versal-cfu.h
22
+ int rn, int imm)
29
23
+{
30
STM32F100
24
+ uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
31
M: Alexandre Iooss <erdnaxe@crans.org>
25
+ uint32_t len_remain = len % 8;
32
diff --git a/include/hw/misc/xlnx-versal-cfu.h b/include/hw/misc/xlnx-versal-cfu.h
26
+ uint32_t nparts = len / 8 + ctpop8(len_remain);
33
new file mode 100644
27
+ int midx = get_mem_index(s);
34
index XXXXXXX..XXXXXXX
28
+ TCGv_i64 addr, t0;
35
--- /dev/null
29
+
36
+++ b/include/hw/misc/xlnx-versal-cfu.h
30
+ addr = tcg_temp_new_i64();
37
@@ -XXX,XX +XXX,XX @@
31
+ t0 = tcg_temp_new_i64();
38
+/*
32
+
39
+ * QEMU model of the CFU Configuration Unit.
33
+ /* Note that unpredicated load/store of vector/predicate registers
40
+ *
34
+ * are defined as a stream of bytes, which equates to little-endian
41
+ * Copyright (C) 2023, Advanced Micro Devices, Inc.
35
+ * operations on larger quantities. There is no nice way to force
42
+ *
36
+ * a little-endian store for aarch64_be-linux-user out of line.
43
+ * Written by Francisco Iglesias <francisco.iglesias@amd.com>
37
+ *
44
+ *
38
+ * Attempt to keep code expansion to a minimum by limiting the
45
+ * SPDX-License-Identifier: GPL-2.0-or-later
39
+ * amount of unrolling done.
46
+ *
40
+ */
47
+ * References:
41
+ if (nparts <= 4) {
48
+ * [1] Versal ACAP Technical Reference Manual,
42
+ int i;
49
+ * https://www.xilinx.com/support/documentation/architecture-manuals/am011-versal-acap-trm.pdf
43
+
50
+ *
44
+ for (i = 0; i < len_align; i += 8) {
51
+ * [2] Versal ACAP Register Reference,
45
+ tcg_gen_ld_i64(t0, cpu_env, vofs + i);
52
+ * https://www.xilinx.com/htmldocs/registers/am012/am012-versal-register-reference.html
46
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
53
+ */
47
+ tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
54
+#ifndef HW_MISC_XLNX_VERSAL_CFU_APB_H
55
+#define HW_MISC_XLNX_VERSAL_CFU_APB_H
56
+
57
+#include "hw/sysbus.h"
58
+#include "hw/register.h"
59
+#include "hw/misc/xlnx-cfi-if.h"
60
+
61
+#define TYPE_XLNX_VERSAL_CFU_APB "xlnx,versal-cfu-apb"
62
+OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFUAPB, XLNX_VERSAL_CFU_APB)
63
+
64
+REG32(CFU_ISR, 0x0)
65
+ FIELD(CFU_ISR, USR_GTS_EVENT, 9, 1)
66
+ FIELD(CFU_ISR, USR_GSR_EVENT, 8, 1)
67
+ FIELD(CFU_ISR, SLVERR, 7, 1)
68
+ FIELD(CFU_ISR, DECOMP_ERROR, 6, 1)
69
+ FIELD(CFU_ISR, BAD_CFI_PACKET, 5, 1)
70
+ FIELD(CFU_ISR, AXI_ALIGN_ERROR, 4, 1)
71
+ FIELD(CFU_ISR, CFI_ROW_ERROR, 3, 1)
72
+ FIELD(CFU_ISR, CRC32_ERROR, 2, 1)
73
+ FIELD(CFU_ISR, CRC8_ERROR, 1, 1)
74
+ FIELD(CFU_ISR, SEU_ENDOFCALIB, 0, 1)
75
+REG32(CFU_IMR, 0x4)
76
+ FIELD(CFU_IMR, USR_GTS_EVENT, 9, 1)
77
+ FIELD(CFU_IMR, USR_GSR_EVENT, 8, 1)
78
+ FIELD(CFU_IMR, SLVERR, 7, 1)
79
+ FIELD(CFU_IMR, DECOMP_ERROR, 6, 1)
80
+ FIELD(CFU_IMR, BAD_CFI_PACKET, 5, 1)
81
+ FIELD(CFU_IMR, AXI_ALIGN_ERROR, 4, 1)
82
+ FIELD(CFU_IMR, CFI_ROW_ERROR, 3, 1)
83
+ FIELD(CFU_IMR, CRC32_ERROR, 2, 1)
84
+ FIELD(CFU_IMR, CRC8_ERROR, 1, 1)
85
+ FIELD(CFU_IMR, SEU_ENDOFCALIB, 0, 1)
86
+REG32(CFU_IER, 0x8)
87
+ FIELD(CFU_IER, USR_GTS_EVENT, 9, 1)
88
+ FIELD(CFU_IER, USR_GSR_EVENT, 8, 1)
89
+ FIELD(CFU_IER, SLVERR, 7, 1)
90
+ FIELD(CFU_IER, DECOMP_ERROR, 6, 1)
91
+ FIELD(CFU_IER, BAD_CFI_PACKET, 5, 1)
92
+ FIELD(CFU_IER, AXI_ALIGN_ERROR, 4, 1)
93
+ FIELD(CFU_IER, CFI_ROW_ERROR, 3, 1)
94
+ FIELD(CFU_IER, CRC32_ERROR, 2, 1)
95
+ FIELD(CFU_IER, CRC8_ERROR, 1, 1)
96
+ FIELD(CFU_IER, SEU_ENDOFCALIB, 0, 1)
97
+REG32(CFU_IDR, 0xc)
98
+ FIELD(CFU_IDR, USR_GTS_EVENT, 9, 1)
99
+ FIELD(CFU_IDR, USR_GSR_EVENT, 8, 1)
100
+ FIELD(CFU_IDR, SLVERR, 7, 1)
101
+ FIELD(CFU_IDR, DECOMP_ERROR, 6, 1)
102
+ FIELD(CFU_IDR, BAD_CFI_PACKET, 5, 1)
103
+ FIELD(CFU_IDR, AXI_ALIGN_ERROR, 4, 1)
104
+ FIELD(CFU_IDR, CFI_ROW_ERROR, 3, 1)
105
+ FIELD(CFU_IDR, CRC32_ERROR, 2, 1)
106
+ FIELD(CFU_IDR, CRC8_ERROR, 1, 1)
107
+ FIELD(CFU_IDR, SEU_ENDOFCALIB, 0, 1)
108
+REG32(CFU_ITR, 0x10)
109
+ FIELD(CFU_ITR, USR_GTS_EVENT, 9, 1)
110
+ FIELD(CFU_ITR, USR_GSR_EVENT, 8, 1)
111
+ FIELD(CFU_ITR, SLVERR, 7, 1)
112
+ FIELD(CFU_ITR, DECOMP_ERROR, 6, 1)
113
+ FIELD(CFU_ITR, BAD_CFI_PACKET, 5, 1)
114
+ FIELD(CFU_ITR, AXI_ALIGN_ERROR, 4, 1)
115
+ FIELD(CFU_ITR, CFI_ROW_ERROR, 3, 1)
116
+ FIELD(CFU_ITR, CRC32_ERROR, 2, 1)
117
+ FIELD(CFU_ITR, CRC8_ERROR, 1, 1)
118
+ FIELD(CFU_ITR, SEU_ENDOFCALIB, 0, 1)
119
+REG32(CFU_PROTECT, 0x14)
120
+ FIELD(CFU_PROTECT, ACTIVE, 0, 1)
121
+REG32(CFU_FGCR, 0x18)
122
+ FIELD(CFU_FGCR, GCLK_CAL, 14, 1)
123
+ FIELD(CFU_FGCR, SC_HBC_TRIGGER, 13, 1)
124
+ FIELD(CFU_FGCR, GLOW, 12, 1)
125
+ FIELD(CFU_FGCR, GPWRDWN, 11, 1)
126
+ FIELD(CFU_FGCR, GCAP, 10, 1)
127
+ FIELD(CFU_FGCR, GSCWE, 9, 1)
128
+ FIELD(CFU_FGCR, GHIGH_B, 8, 1)
129
+ FIELD(CFU_FGCR, GMC_B, 7, 1)
130
+ FIELD(CFU_FGCR, GWE, 6, 1)
131
+ FIELD(CFU_FGCR, GRESTORE, 5, 1)
132
+ FIELD(CFU_FGCR, GTS_CFG_B, 4, 1)
133
+ FIELD(CFU_FGCR, GLUTMASK, 3, 1)
134
+ FIELD(CFU_FGCR, EN_GLOBS_B, 2, 1)
135
+ FIELD(CFU_FGCR, EOS, 1, 1)
136
+ FIELD(CFU_FGCR, INIT_COMPLETE, 0, 1)
137
+REG32(CFU_CTL, 0x1c)
138
+ FIELD(CFU_CTL, GSR_GSC, 15, 1)
139
+ FIELD(CFU_CTL, SLVERR_EN, 14, 1)
140
+ FIELD(CFU_CTL, CRC32_RESET, 13, 1)
141
+ FIELD(CFU_CTL, AXI_ERROR_EN, 12, 1)
142
+ FIELD(CFU_CTL, FLUSH_AXI, 11, 1)
143
+ FIELD(CFU_CTL, SSI_PER_SLR_PR, 10, 1)
144
+ FIELD(CFU_CTL, GCAP_CLK_EN, 9, 1)
145
+ FIELD(CFU_CTL, STATUS_SYNC_DISABLE, 8, 1)
146
+ FIELD(CFU_CTL, IGNORE_CFI_ERROR, 7, 1)
147
+ FIELD(CFU_CTL, CFRAME_DISABLE, 6, 1)
148
+ FIELD(CFU_CTL, QWORD_CNT_RESET, 5, 1)
149
+ FIELD(CFU_CTL, CRC8_DISABLE, 4, 1)
150
+ FIELD(CFU_CTL, CRC32_CHECK, 3, 1)
151
+ FIELD(CFU_CTL, DECOMPRESS, 2, 1)
152
+ FIELD(CFU_CTL, SEU_GO, 1, 1)
153
+ FIELD(CFU_CTL, CFI_LOCAL_RESET, 0, 1)
154
+REG32(CFU_CRAM_RW, 0x20)
155
+ FIELD(CFU_CRAM_RW, RFIFO_AFULL_DEPTH, 18, 9)
156
+ FIELD(CFU_CRAM_RW, RD_WAVE_CNT_LEFT, 12, 6)
157
+ FIELD(CFU_CRAM_RW, RD_WAVE_CNT, 6, 6)
158
+ FIELD(CFU_CRAM_RW, WR_WAVE_CNT, 0, 6)
159
+REG32(CFU_MASK, 0x28)
160
+REG32(CFU_CRC_EXPECT, 0x2c)
161
+REG32(CFU_CFRAME_LEFT_T0, 0x60)
162
+ FIELD(CFU_CFRAME_LEFT_T0, NUM, 0, 20)
163
+REG32(CFU_CFRAME_LEFT_T1, 0x64)
164
+ FIELD(CFU_CFRAME_LEFT_T1, NUM, 0, 20)
165
+REG32(CFU_CFRAME_LEFT_T2, 0x68)
166
+ FIELD(CFU_CFRAME_LEFT_T2, NUM, 0, 20)
167
+REG32(CFU_ROW_RANGE, 0x6c)
168
+ FIELD(CFU_ROW_RANGE, HALF_FSR, 5, 1)
169
+ FIELD(CFU_ROW_RANGE, NUM, 0, 5)
170
+REG32(CFU_STATUS, 0x100)
171
+ FIELD(CFU_STATUS, SEU_WRITE_ERROR, 30, 1)
172
+ FIELD(CFU_STATUS, FRCNT_ERROR, 29, 1)
173
+ FIELD(CFU_STATUS, RSVD_ERROR, 28, 1)
174
+ FIELD(CFU_STATUS, FDRO_ERROR, 27, 1)
175
+ FIELD(CFU_STATUS, FDRI_ERROR, 26, 1)
176
+ FIELD(CFU_STATUS, FDRI_READ_ERROR, 25, 1)
177
+ FIELD(CFU_STATUS, READ_FDRI_ERROR, 24, 1)
178
+ FIELD(CFU_STATUS, READ_SFR_ERROR, 23, 1)
179
+ FIELD(CFU_STATUS, READ_STREAM_ERROR, 22, 1)
180
+ FIELD(CFU_STATUS, UNKNOWN_STREAM_PKT, 21, 1)
181
+ FIELD(CFU_STATUS, USR_GTS, 20, 1)
182
+ FIELD(CFU_STATUS, USR_GSR, 19, 1)
183
+ FIELD(CFU_STATUS, AXI_BAD_WSTRB, 18, 1)
184
+ FIELD(CFU_STATUS, AXI_BAD_AR_SIZE, 17, 1)
185
+ FIELD(CFU_STATUS, AXI_BAD_AW_SIZE, 16, 1)
186
+ FIELD(CFU_STATUS, AXI_BAD_ARADDR, 15, 1)
187
+ FIELD(CFU_STATUS, AXI_BAD_AWADDR, 14, 1)
188
+ FIELD(CFU_STATUS, SCAN_CLEAR_PASS, 13, 1)
189
+ FIELD(CFU_STATUS, HC_SEC_ERROR, 12, 1)
190
+ FIELD(CFU_STATUS, GHIGH_B_ISHIGH, 11, 1)
191
+ FIELD(CFU_STATUS, GHIGH_B_ISLOW, 10, 1)
192
+ FIELD(CFU_STATUS, GMC_B_ISHIGH, 9, 1)
193
+ FIELD(CFU_STATUS, GMC_B_ISLOW, 8, 1)
194
+ FIELD(CFU_STATUS, GPWRDWN_B_ISHIGH, 7, 1)
195
+ FIELD(CFU_STATUS, CFI_SEU_CRC_ERROR, 6, 1)
196
+ FIELD(CFU_STATUS, CFI_SEU_ECC_ERROR, 5, 1)
197
+ FIELD(CFU_STATUS, CFI_SEU_HEARTBEAT, 4, 1)
198
+ FIELD(CFU_STATUS, SCAN_CLEAR_DONE, 3, 1)
199
+ FIELD(CFU_STATUS, HC_COMPLETE, 2, 1)
200
+ FIELD(CFU_STATUS, CFI_CFRAME_BUSY, 1, 1)
201
+ FIELD(CFU_STATUS, CFU_STREAM_BUSY, 0, 1)
202
+REG32(CFU_INTERNAL_STATUS, 0x104)
203
+ FIELD(CFU_INTERNAL_STATUS, SSI_EOS, 22, 1)
204
+ FIELD(CFU_INTERNAL_STATUS, SSI_GWE, 21, 1)
205
+ FIELD(CFU_INTERNAL_STATUS, RFIFO_EMPTY, 20, 1)
206
+ FIELD(CFU_INTERNAL_STATUS, RFIFO_FULL, 19, 1)
207
+ FIELD(CFU_INTERNAL_STATUS, SEL_SFR, 18, 1)
208
+ FIELD(CFU_INTERNAL_STATUS, STREAM_CFRAME, 17, 1)
209
+ FIELD(CFU_INTERNAL_STATUS, FDRI_PHASE, 16, 1)
210
+ FIELD(CFU_INTERNAL_STATUS, CFI_PIPE_EN, 15, 1)
211
+ FIELD(CFU_INTERNAL_STATUS, AWFIFO_DCNT, 10, 5)
212
+ FIELD(CFU_INTERNAL_STATUS, WFIFO_DCNT, 5, 5)
213
+ FIELD(CFU_INTERNAL_STATUS, REPAIR_BUSY, 4, 1)
214
+ FIELD(CFU_INTERNAL_STATUS, TRIMU_BUSY, 3, 1)
215
+ FIELD(CFU_INTERNAL_STATUS, TRIMB_BUSY, 2, 1)
216
+ FIELD(CFU_INTERNAL_STATUS, HCLEANR_BUSY, 1, 1)
217
+ FIELD(CFU_INTERNAL_STATUS, HCLEAN_BUSY, 0, 1)
218
+REG32(CFU_QWORD_CNT, 0x108)
219
+REG32(CFU_CRC_LIVE, 0x10c)
220
+REG32(CFU_PENDING_READ_CNT, 0x110)
221
+ FIELD(CFU_PENDING_READ_CNT, NUM, 0, 25)
222
+REG32(CFU_FDRI_CNT, 0x114)
223
+REG32(CFU_ECO1, 0x118)
224
+REG32(CFU_ECO2, 0x11c)
225
+
226
+#define R_MAX (R_CFU_ECO2 + 1)
227
+
228
+#define NUM_STREAM 2
229
+#define WFIFO_SZ 4
230
+
231
+struct XlnxVersalCFUAPB {
232
+ SysBusDevice parent_obj;
233
+ MemoryRegion iomem;
234
+ MemoryRegion iomem_stream[NUM_STREAM];
235
+ qemu_irq irq_cfu_imr;
236
+
237
+ /* 128-bit wfifo. */
238
+ uint32_t wfifo[WFIFO_SZ];
239
+
240
+ uint32_t regs[R_MAX];
241
+ RegisterInfo regs_info[R_MAX];
242
+
243
+ uint8_t fdri_row_addr;
244
+
245
+ struct {
246
+ XlnxCfiIf *cframe[15];
247
+ } cfg;
248
+};
249
+
250
+/**
251
+ * This is a helper function for updating a CFI data write fifo, an array of 4
252
+ * uint32_t and 128 bits of data that are allowed to be written through 4
253
+ * sequential 32 bit accesses. After the last index has been written into the
254
+ * write fifo (wfifo), the data is copied to and returned in a secondary fifo
255
+ * provided to the function (wfifo_ret), and the write fifo is cleared
256
+ * (zeroized).
257
+ *
258
+ * @addr: the address used when calculating the wfifo array index to update
259
+ * @value: the value to write into the wfifo array
260
+ * @wfifo: the wfifo to update
261
+ * @wfifo_out: will return the wfifo data when all 128 bits have been written
262
+ *
263
+ * @return: true if all 128 bits have been updated.
264
+ */
265
+bool update_wfifo(hwaddr addr, uint64_t value,
266
+ uint32_t *wfifo, uint32_t *wfifo_ret);
267
+
268
+#endif
269
diff --git a/hw/misc/xlnx-versal-cfu.c b/hw/misc/xlnx-versal-cfu.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/misc/xlnx-versal-cfu.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * QEMU model of the CFU Configuration Unit.
+ *
+ * Copyright (C) 2023, Advanced Micro Devices, Inc.
+ *
+ * Written by Edgar E. Iglesias <edgar.iglesias@gmail.com>,
+ * Sai Pavan Boddu <sai.pavan.boddu@amd.com>,
+ * Francisco Iglesias <francisco.iglesias@amd.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/register.h"
+#include "hw/irq.h"
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "qemu/units.h"
+#include "migration/vmstate.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/misc/xlnx-versal-cfu.h"
+
+#ifndef XLNX_VERSAL_CFU_APB_ERR_DEBUG
+#define XLNX_VERSAL_CFU_APB_ERR_DEBUG 0
+#endif
+
+#define KEYHOLE_STREAM_4K (4 * KiB)
+#define KEYHOLE_STREAM_256K (256 * KiB)
+#define CFRAME_BROADCAST_ROW 0x1F
+
+bool update_wfifo(hwaddr addr, uint64_t value,
+ uint32_t *wfifo, uint32_t *wfifo_ret)
+{
+ unsigned int idx = extract32(addr, 2, 2);
+
+ wfifo[idx] = value;
+
+ if (idx == 3) {
+ memcpy(wfifo_ret, wfifo, WFIFO_SZ * sizeof(uint32_t));
+ memset(wfifo, 0, WFIFO_SZ * sizeof(uint32_t));
+ return true;
+ }
+
+ return false;
+}
+
+static void cfu_imr_update_irq(XlnxVersalCFUAPB *s)
+{
+ bool pending = s->regs[R_CFU_ISR] & ~s->regs[R_CFU_IMR];
+ qemu_set_irq(s->irq_cfu_imr, pending);
+}
+
+static void cfu_isr_postw(RegisterInfo *reg, uint64_t val64)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(reg->opaque);
+ cfu_imr_update_irq(s);
+}
+
+static uint64_t cfu_ier_prew(RegisterInfo *reg, uint64_t val64)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(reg->opaque);
+ uint32_t val = val64;
+
+ s->regs[R_CFU_IMR] &= ~val;
+ cfu_imr_update_irq(s);
+ return 0;
+}
+
+static uint64_t cfu_idr_prew(RegisterInfo *reg, uint64_t val64)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(reg->opaque);
+ uint32_t val = val64;
+
+ s->regs[R_CFU_IMR] |= val;
+ cfu_imr_update_irq(s);
+ return 0;
+}
+
+static uint64_t cfu_itr_prew(RegisterInfo *reg, uint64_t val64)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(reg->opaque);
+ uint32_t val = val64;
+
+ s->regs[R_CFU_ISR] |= val;
+ cfu_imr_update_irq(s);
+ return 0;
+}
+
+static void cfu_fgcr_postw(RegisterInfo *reg, uint64_t val64)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(reg->opaque);
+ uint32_t val = (uint32_t)val64;
+
+ /* Do a scan. It always looks good. */
+ if (FIELD_EX32(val, CFU_FGCR, SC_HBC_TRIGGER)) {
+ ARRAY_FIELD_DP32(s->regs, CFU_STATUS, SCAN_CLEAR_PASS, 1);
+ ARRAY_FIELD_DP32(s->regs, CFU_STATUS, SCAN_CLEAR_DONE, 1);
+ }
+}
+
+static const RegisterAccessInfo cfu_apb_regs_info[] = {
+ { .name = "CFU_ISR", .addr = A_CFU_ISR,
+ .rsvd = 0xfffffc00,
+ .w1c = 0x3ff,
+ .post_write = cfu_isr_postw,
+ },{ .name = "CFU_IMR", .addr = A_CFU_IMR,
+ .reset = 0x3ff,
+ .rsvd = 0xfffffc00,
+ .ro = 0x3ff,
+ },{ .name = "CFU_IER", .addr = A_CFU_IER,
+ .rsvd = 0xfffffc00,
+ .pre_write = cfu_ier_prew,
+ },{ .name = "CFU_IDR", .addr = A_CFU_IDR,
+ .rsvd = 0xfffffc00,
+ .pre_write = cfu_idr_prew,
+ },{ .name = "CFU_ITR", .addr = A_CFU_ITR,
+ .rsvd = 0xfffffc00,
+ .pre_write = cfu_itr_prew,
+ },{ .name = "CFU_PROTECT", .addr = A_CFU_PROTECT,
+ .reset = 0x1,
+ },{ .name = "CFU_FGCR", .addr = A_CFU_FGCR,
+ .rsvd = 0xffff8000,
+ .post_write = cfu_fgcr_postw,
+ },{ .name = "CFU_CTL", .addr = A_CFU_CTL,
+ .rsvd = 0xffff0000,
+ },{ .name = "CFU_CRAM_RW", .addr = A_CFU_CRAM_RW,
+ .reset = 0x401f7d9,
+ .rsvd = 0xf8000000,
+ },{ .name = "CFU_MASK", .addr = A_CFU_MASK,
+ },{ .name = "CFU_CRC_EXPECT", .addr = A_CFU_CRC_EXPECT,
+ },{ .name = "CFU_CFRAME_LEFT_T0", .addr = A_CFU_CFRAME_LEFT_T0,
+ .rsvd = 0xfff00000,
+ },{ .name = "CFU_CFRAME_LEFT_T1", .addr = A_CFU_CFRAME_LEFT_T1,
+ .rsvd = 0xfff00000,
+ },{ .name = "CFU_CFRAME_LEFT_T2", .addr = A_CFU_CFRAME_LEFT_T2,
+ .rsvd = 0xfff00000,
+ },{ .name = "CFU_ROW_RANGE", .addr = A_CFU_ROW_RANGE,
+ .rsvd = 0xffffffc0,
+ .ro = 0x3f,
+ },{ .name = "CFU_STATUS", .addr = A_CFU_STATUS,
+ .rsvd = 0x80000000,
+ .ro = 0x7fffffff,
+ },{ .name = "CFU_INTERNAL_STATUS", .addr = A_CFU_INTERNAL_STATUS,
+ .rsvd = 0xff800000,
+ .ro = 0x7fffff,
+ },{ .name = "CFU_QWORD_CNT", .addr = A_CFU_QWORD_CNT,
+ .ro = 0xffffffff,
+ },{ .name = "CFU_CRC_LIVE", .addr = A_CFU_CRC_LIVE,
+ .ro = 0xffffffff,
+ },{ .name = "CFU_PENDING_READ_CNT", .addr = A_CFU_PENDING_READ_CNT,
+ .rsvd = 0xfe000000,
+ .ro = 0x1ffffff,
+ },{ .name = "CFU_FDRI_CNT", .addr = A_CFU_FDRI_CNT,
+ .ro = 0xffffffff,
+ },{ .name = "CFU_ECO1", .addr = A_CFU_ECO1,
+ },{ .name = "CFU_ECO2", .addr = A_CFU_ECO2,
+ }
+};
+
+static void cfu_apb_reset(DeviceState *dev)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(dev);
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(s->regs_info); ++i) {
+ register_reset(&s->regs_info[i]);
+ }
+ memset(s->wfifo, 0, WFIFO_SZ * sizeof(uint32_t));
+
+ s->regs[R_CFU_STATUS] |= R_CFU_STATUS_HC_COMPLETE_MASK;
+ cfu_imr_update_irq(s);
+}
+
+static const MemoryRegionOps cfu_apb_ops = {
+ .read = register_read_memory,
+ .write = register_write_memory,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 4,
+ },
+};
+
+static void cfu_transfer_cfi_packet(XlnxVersalCFUAPB *s, uint8_t row_addr,
+ XlnxCfiPacket *pkt)
+{
+ if (row_addr == CFRAME_BROADCAST_ROW) {
+ for (int i = 0; i < ARRAY_SIZE(s->cfg.cframe); i++) {
+ if (s->cfg.cframe[i]) {
+ xlnx_cfi_transfer_packet(s->cfg.cframe[i], pkt);
+ }
+ }
+ } else {
+ assert(row_addr < ARRAY_SIZE(s->cfg.cframe));
+
+ if (s->cfg.cframe[row_addr]) {
+ xlnx_cfi_transfer_packet(s->cfg.cframe[row_addr], pkt);
+ }
+ }
+}
+
+static uint64_t cfu_stream_read(void *opaque, hwaddr addr, unsigned size)
+{
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Unsupported read from addr=%"
+ HWADDR_PRIx "\n", __func__, addr);
+ return 0;
+}
+
+static void cfu_stream_write(void *opaque, hwaddr addr, uint64_t value,
+ unsigned size)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(opaque);
+ uint32_t wfifo[WFIFO_SZ];
+
+ if (update_wfifo(addr, value, s->wfifo, wfifo)) {
+ uint8_t packet_type, row_addr, reg_addr;
+
+ packet_type = extract32(wfifo[0], 24, 8);
+ row_addr = extract32(wfifo[0], 16, 5);
+ reg_addr = extract32(wfifo[0], 8, 6);
+
+ /* Compressed bitstreams are not supported yet. */
+ if (ARRAY_FIELD_EX32(s->regs, CFU_CTL, DECOMPRESS) == 0) {
+ if (s->regs[R_CFU_FDRI_CNT]) {
+ XlnxCfiPacket pkt = {
+ .reg_addr = CFRAME_FDRI,
+ .data[0] = wfifo[0],
+ .data[1] = wfifo[1],
+ .data[2] = wfifo[2],
+ .data[3] = wfifo[3]
+ };
+
+ cfu_transfer_cfi_packet(s, s->fdri_row_addr, &pkt);
+
+ s->regs[R_CFU_FDRI_CNT]--;
+
+ } else if (packet_type == PACKET_TYPE_CFU &&
+ reg_addr == CFRAME_FDRI) {
+
+ /* Load R_CFU_FDRI_CNT, must be multiple of 25 */
+ s->regs[R_CFU_FDRI_CNT] = wfifo[1];
+
+ /* Store target row_addr */
+ s->fdri_row_addr = row_addr;
+
+ if (wfifo[1] % 25 != 0) {
+ qemu_log_mask(LOG_GUEST_ERROR,
+ "CFU FDRI_CNT is not loaded with "
+ "a multiple of 25 value\n");
+ }
+
+ } else if (packet_type == PACKET_TYPE_CFRAME) {
+ XlnxCfiPacket pkt = {
+ .reg_addr = reg_addr,
+ .data[0] = wfifo[1],
+ .data[1] = wfifo[2],
+ .data[2] = wfifo[3],
+ };
+ cfu_transfer_cfi_packet(s, row_addr, &pkt);
+ }
+ }
+ }
+}
+
+static const MemoryRegionOps cfu_stream_ops = {
+ .read = cfu_stream_read,
+ .write = cfu_stream_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+};
+
+static void cfu_apb_init(Object *obj)
+{
+ XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(obj);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+ RegisterInfoArray *reg_array;
+ unsigned int i;
+ char *name;
+
+ memory_region_init(&s->iomem, obj, TYPE_XLNX_VERSAL_CFU_APB, R_MAX * 4);
+ reg_array =
+ register_init_block32(DEVICE(obj), cfu_apb_regs_info,
+ ARRAY_SIZE(cfu_apb_regs_info),
+ s->regs_info, s->regs,
+ &cfu_apb_ops,
+ XLNX_VERSAL_CFU_APB_ERR_DEBUG,
+ R_MAX * 4);
+ memory_region_add_subregion(&s->iomem,
+ 0x0,
+ &reg_array->mem);
+ sysbus_init_mmio(sbd, &s->iomem);
+ for (i = 0; i < NUM_STREAM; i++) {
+ name = g_strdup_printf(TYPE_XLNX_VERSAL_CFU_APB "-stream%d", i);
+ memory_region_init_io(&s->iomem_stream[i], obj, &cfu_stream_ops, s,
+ name, i == 0 ? KEYHOLE_STREAM_4K :
+ KEYHOLE_STREAM_256K);
+ sysbus_init_mmio(sbd, &s->iomem_stream[i]);
+ g_free(name);
+ }
+ sysbus_init_irq(sbd, &s->irq_cfu_imr);
+}
+
+static Property cfu_props[] = {
+ DEFINE_PROP_LINK("cframe0", XlnxVersalCFUAPB, cfg.cframe[0],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe1", XlnxVersalCFUAPB, cfg.cframe[1],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe2", XlnxVersalCFUAPB, cfg.cframe[2],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe3", XlnxVersalCFUAPB, cfg.cframe[3],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe4", XlnxVersalCFUAPB, cfg.cframe[4],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe5", XlnxVersalCFUAPB, cfg.cframe[5],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe6", XlnxVersalCFUAPB, cfg.cframe[6],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe7", XlnxVersalCFUAPB, cfg.cframe[7],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe8", XlnxVersalCFUAPB, cfg.cframe[8],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe9", XlnxVersalCFUAPB, cfg.cframe[9],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe10", XlnxVersalCFUAPB, cfg.cframe[10],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe11", XlnxVersalCFUAPB, cfg.cframe[11],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe12", XlnxVersalCFUAPB, cfg.cframe[12],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe13", XlnxVersalCFUAPB, cfg.cframe[13],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_LINK("cframe14", XlnxVersalCFUAPB, cfg.cframe[14],
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static const VMStateDescription vmstate_cfu_apb = {
+ .name = TYPE_XLNX_VERSAL_CFU_APB,
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .fields = (VMStateField[]) {
+ VMSTATE_UINT32_ARRAY(wfifo, XlnxVersalCFUAPB, 4),
+ VMSTATE_UINT32_ARRAY(regs, XlnxVersalCFUAPB, R_MAX),
+ VMSTATE_UINT8(fdri_row_addr, XlnxVersalCFUAPB),
+ VMSTATE_END_OF_LIST(),
+ }
+};
+
+static void cfu_apb_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->reset = cfu_apb_reset;
+ dc->vmsd = &vmstate_cfu_apb;
+ device_class_set_props(dc, cfu_props);
+}
+
+static const TypeInfo cfu_apb_info = {
+ .name = TYPE_XLNX_VERSAL_CFU_APB,
+ .parent = TYPE_SYS_BUS_DEVICE,
+ .instance_size = sizeof(XlnxVersalCFUAPB),
+ .class_init = cfu_apb_class_init,
+ .instance_init = cfu_apb_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_XLNX_CFI_IF },
+ { }
+ }
+};
+
+static void cfu_apb_register_types(void)
+{
+ type_register_static(&cfu_apb_info);
+}
+
+type_init(cfu_apb_register_types)
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -XXX,XX +XXX,XX @@ specific_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files('xlnx-versal-crl.c'))
 system_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files(
 'xlnx-versal-xramc.c',
 'xlnx-versal-pmc-iou-slcr.c',
+ 'xlnx-versal-cfu.c',
 'xlnx-cfi-if.c',
 ))
 system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: files('stm32f2xx_syscfg.c'))
--
2.34.1
From: Francisco Iglesias <francisco.iglesias@amd.com>

Introduce a model of Xilinx Versal's Configuration Frame Unit's data out
port (CFU_FDRO).

Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230831165701.2016397-4-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/misc/xlnx-versal-cfu.h | 12 ++++
 hw/misc/xlnx-versal-cfu.c | 96 +++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+)

diff --git a/include/hw/misc/xlnx-versal-cfu.h b/include/hw/misc/xlnx-versal-cfu.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/xlnx-versal-cfu.h
+++ b/include/hw/misc/xlnx-versal-cfu.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "hw/register.h"
 #include "hw/misc/xlnx-cfi-if.h"
+#include "qemu/fifo32.h"

 #define TYPE_XLNX_VERSAL_CFU_APB "xlnx,versal-cfu-apb"
 OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFUAPB, XLNX_VERSAL_CFU_APB)

+#define TYPE_XLNX_VERSAL_CFU_FDRO "xlnx,versal-cfu-fdro"
+OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFUFDRO, XLNX_VERSAL_CFU_FDRO)
+
 REG32(CFU_ISR, 0x0)
 FIELD(CFU_ISR, USR_GTS_EVENT, 9, 1)
 FIELD(CFU_ISR, USR_GSR_EVENT, 8, 1)
@@ -XXX,XX +XXX,XX @@ struct XlnxVersalCFUAPB {
 } cfg;
 };

+
+struct XlnxVersalCFUFDRO {
+ SysBusDevice parent_obj;
+ MemoryRegion iomem_fdro;
+
+ Fifo32 fdro_data;
+};
+
 /**
 * This is a helper function for updating a CFI data write fifo, an array of 4
 * uint32_t and 128 bits of data that are allowed to be written through 4
diff --git a/hw/misc/xlnx-versal-cfu.c b/hw/misc/xlnx-versal-cfu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/xlnx-versal-cfu.c
+++ b/hw/misc/xlnx-versal-cfu.c
@@ -XXX,XX +XXX,XX @@ static void cfu_stream_write(void *opaque, hwaddr addr, uint64_t value,
 }
 }

+static uint64_t cfu_fdro_read(void *opaque, hwaddr addr, unsigned size)
+{
+ XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(opaque);
+ uint64_t ret = 0;
+
+ if (!fifo32_is_empty(&s->fdro_data)) {
+ ret = fifo32_pop(&s->fdro_data);
+ }
+
+ return ret;
+}
+
+static void cfu_fdro_write(void *opaque, hwaddr addr, uint64_t value,
+ unsigned size)
+{
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Unsupported write from addr=%"
+ HWADDR_PRIx "\n", __func__, addr);
+}
+
 static const MemoryRegionOps cfu_stream_ops = {
 .read = cfu_stream_read,
 .write = cfu_stream_write,
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps cfu_stream_ops = {
 },
 };

+static const MemoryRegionOps cfu_fdro_ops = {
+ .read = cfu_fdro_read,
+ .write = cfu_fdro_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 4,
+ },
+};
+
 static void cfu_apb_init(Object *obj)
 {
 XlnxVersalCFUAPB *s = XLNX_VERSAL_CFU_APB(obj);
@@ -XXX,XX +XXX,XX @@ static void cfu_apb_init(Object *obj)
 sysbus_init_irq(sbd, &s->irq_cfu_imr);
 }

+static void cfu_fdro_init(Object *obj)
+{
+ XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(obj);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+
+ memory_region_init_io(&s->iomem_fdro, obj, &cfu_fdro_ops, s,
+ TYPE_XLNX_VERSAL_CFU_FDRO, KEYHOLE_STREAM_4K);
+ sysbus_init_mmio(sbd, &s->iomem_fdro);
+ fifo32_create(&s->fdro_data, 8 * KiB / sizeof(uint32_t));
+}
+
+static void cfu_fdro_reset_enter(Object *obj, ResetType type)
+{
+ XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(obj);
+
+ fifo32_reset(&s->fdro_data);
+}
+
+static void cfu_fdro_cfi_transfer_packet(XlnxCfiIf *cfi_if, XlnxCfiPacket *pkt)
+{
+ XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(cfi_if);
+
+ if (fifo32_num_free(&s->fdro_data) >= ARRAY_SIZE(pkt->data)) {
+ for (int i = 0; i < ARRAY_SIZE(pkt->data); i++) {
+ fifo32_push(&s->fdro_data, pkt->data[i]);
+ }
+ } else {
+ /* It is a programming error to fill the fifo. */
+ qemu_log_mask(LOG_GUEST_ERROR,
+ "CFU_FDRO: CFI data dropped due to full read fifo\n");
+ }
+}
+
 static Property cfu_props[] = {
 DEFINE_PROP_LINK("cframe0", XlnxVersalCFUAPB, cfg.cframe[0],
 TYPE_XLNX_CFI_IF, XlnxCfiIf *),
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_cfu_apb = {
 }
 };

+static const VMStateDescription vmstate_cfu_fdro = {
+ .name = TYPE_XLNX_VERSAL_CFU_FDRO,
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .fields = (VMStateField[]) {
+ VMSTATE_FIFO32(fdro_data, XlnxVersalCFUFDRO),
+ VMSTATE_END_OF_LIST(),
+ }
+};
+
 static void cfu_apb_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -XXX,XX +XXX,XX @@ static void cfu_apb_class_init(ObjectClass *klass, void *data)
 device_class_set_props(dc, cfu_props);
 }

+static void cfu_fdro_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
+ XlnxCfiIfClass *xcic = XLNX_CFI_IF_CLASS(klass);
+
+ dc->vmsd = &vmstate_cfu_fdro;
+ xcic->cfi_transfer_packet = cfu_fdro_cfi_transfer_packet;
+ rc->phases.enter = cfu_fdro_reset_enter;
+}
+
 static const TypeInfo cfu_apb_info = {
 .name = TYPE_XLNX_VERSAL_CFU_APB,
 .parent = TYPE_SYS_BUS_DEVICE,
@@ -XXX,XX +XXX,XX @@ static const TypeInfo cfu_apb_info = {
 }
 };

+static const TypeInfo cfu_fdro_info = {
+ .name = TYPE_XLNX_VERSAL_CFU_FDRO,
+ .parent = TYPE_SYS_BUS_DEVICE,
+ .instance_size = sizeof(XlnxVersalCFUFDRO),
+ .class_init = cfu_fdro_class_init,
+ .instance_init = cfu_fdro_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_XLNX_CFI_IF },
+ { }
+ }
+};
+
 static void cfu_apb_register_types(void)
 {
 type_register_static(&cfu_apb_info);
+ type_register_static(&cfu_fdro_info);
 }

 type_init(cfu_apb_register_types)
--
2.34.1
From: Francisco Iglesias <francisco.iglesias@amd.com>

Introduce a model of Xilinx Versal's Configuration Frame Unit's Single
Frame Read port (CFU_SFR).

Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230831165701.2016397-5-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/misc/xlnx-versal-cfu.h | 15 ++++++
 hw/misc/xlnx-versal-cfu.c | 87 +++++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+)

diff --git a/include/hw/misc/xlnx-versal-cfu.h b/include/hw/misc/xlnx-versal-cfu.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/xlnx-versal-cfu.h
+++ b/include/hw/misc/xlnx-versal-cfu.h
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFUAPB, XLNX_VERSAL_CFU_APB)
 #define TYPE_XLNX_VERSAL_CFU_FDRO "xlnx,versal-cfu-fdro"
 OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFUFDRO, XLNX_VERSAL_CFU_FDRO)

+#define TYPE_XLNX_VERSAL_CFU_SFR "xlnx,versal-cfu-sfr"
+OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFUSFR, XLNX_VERSAL_CFU_SFR)
+
 REG32(CFU_ISR, 0x0)
 FIELD(CFU_ISR, USR_GTS_EVENT, 9, 1)
 FIELD(CFU_ISR, USR_GSR_EVENT, 8, 1)
@@ -XXX,XX +XXX,XX @@ struct XlnxVersalCFUFDRO {
 Fifo32 fdro_data;
 };

+struct XlnxVersalCFUSFR {
+ SysBusDevice parent_obj;
+ MemoryRegion iomem_sfr;
+
+ /* 128-bit wfifo. */
+ uint32_t wfifo[WFIFO_SZ];
+
+ struct {
+ XlnxVersalCFUAPB *cfu;
+ } cfg;
+};
+
 /**
 * This is a helper function for updating a CFI data write fifo, an array of 4
 * uint32_t and 128 bits of data that are allowed to be written through 4
diff --git a/hw/misc/xlnx-versal-cfu.c b/hw/misc/xlnx-versal-cfu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/xlnx-versal-cfu.c
+++ b/hw/misc/xlnx-versal-cfu.c
@@ -XXX,XX +XXX,XX @@ static void cfu_stream_write(void *opaque, hwaddr addr, uint64_t value,
 }
 }

+static uint64_t cfu_sfr_read(void *opaque, hwaddr addr, unsigned size)
+{
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Unsupported read from addr=%"
+ HWADDR_PRIx "\n", __func__, addr);
+ return 0;
+}
+
+static void cfu_sfr_write(void *opaque, hwaddr addr, uint64_t value,
+ unsigned size)
+{
+ XlnxVersalCFUSFR *s = XLNX_VERSAL_CFU_SFR(opaque);
+ uint32_t wfifo[WFIFO_SZ];
+
+ if (update_wfifo(addr, value, s->wfifo, wfifo)) {
+ uint8_t row_addr = extract32(wfifo[0], 23, 5);
+ uint32_t frame_addr = extract32(wfifo[0], 0, 23);
+ XlnxCfiPacket pkt = { .reg_addr = CFRAME_SFR,
+ .data[0] = frame_addr };
+
+ if (s->cfg.cfu) {
+ cfu_transfer_cfi_packet(s->cfg.cfu, row_addr, &pkt);
+ }
+ }
+}
+
 static uint64_t cfu_fdro_read(void *opaque, hwaddr addr, unsigned size)
 {
 XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(opaque);
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps cfu_stream_ops = {
 },
 };

+static const MemoryRegionOps cfu_sfr_ops = {
+ .read = cfu_sfr_read,
+ .write = cfu_sfr_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 4,
+ },
+};
+
 static const MemoryRegionOps cfu_fdro_ops = {
 .read = cfu_fdro_read,
 .write = cfu_fdro_write,
@@ -XXX,XX +XXX,XX @@ static void cfu_apb_init(Object *obj)
 sysbus_init_irq(sbd, &s->irq_cfu_imr);
 }

+static void cfu_sfr_init(Object *obj)
+{
+ XlnxVersalCFUSFR *s = XLNX_VERSAL_CFU_SFR(obj);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
53
+ uint8_t *pg = vg;
109
+
54
+ for (i = 0; i < opr_sz; i += 1) {
110
+ memory_region_init_io(&s->iomem_sfr, obj, &cfu_sfr_ops, s,
55
+ d[i] = n[i] & expand_pred_h(pg[H1(i)]);
111
+ TYPE_XLNX_VERSAL_CFU_SFR, KEYHOLE_STREAM_4K);
56
+ }
112
+ sysbus_init_mmio(sbd, &s->iomem_sfr);
57
+}
113
+}
58
+
114
+
59
+void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc)
115
+static void cfu_sfr_reset_enter(Object *obj, ResetType type)
60
+{
116
+{
61
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
117
+ XlnxVersalCFUSFR *s = XLNX_VERSAL_CFU_SFR(obj);
62
+ uint64_t *d = vd, *n = vn;
118
+
63
+ uint8_t *pg = vg;
119
+ memset(s->wfifo, 0, WFIFO_SZ * sizeof(uint32_t));
64
+ for (i = 0; i < opr_sz; i += 1) {
65
+ d[i] = n[i] & expand_pred_s(pg[H1(i)]);
66
+ }
67
+}
120
+}
68
+
121
+
69
+void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc)
122
static void cfu_fdro_init(Object *obj)
123
{
124
XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(obj);
125
@@ -XXX,XX +XXX,XX @@ static Property cfu_props[] = {
126
DEFINE_PROP_END_OF_LIST(),
127
};
128
129
+static Property cfu_sfr_props[] = {
130
+ DEFINE_PROP_LINK("cfu", XlnxVersalCFUSFR, cfg.cfu,
131
+ TYPE_XLNX_VERSAL_CFU_APB, XlnxVersalCFUAPB *),
132
+ DEFINE_PROP_END_OF_LIST(),
133
+};
134
+
135
static const VMStateDescription vmstate_cfu_apb = {
136
.name = TYPE_XLNX_VERSAL_CFU_APB,
137
.version_id = 1,
138
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_cfu_fdro = {
139
}
140
};
141
142
+static const VMStateDescription vmstate_cfu_sfr = {
143
+ .name = TYPE_XLNX_VERSAL_CFU_SFR,
144
+ .version_id = 1,
145
+ .minimum_version_id = 1,
146
+ .fields = (VMStateField[]) {
147
+ VMSTATE_UINT32_ARRAY(wfifo, XlnxVersalCFUSFR, 4),
148
+ VMSTATE_END_OF_LIST(),
149
+ }
150
+};
151
+
152
static void cfu_apb_class_init(ObjectClass *klass, void *data)
153
{
154
DeviceClass *dc = DEVICE_CLASS(klass);
155
@@ -XXX,XX +XXX,XX @@ static void cfu_fdro_class_init(ObjectClass *klass, void *data)
156
rc->phases.enter = cfu_fdro_reset_enter;
157
}
158
159
+static void cfu_sfr_class_init(ObjectClass *klass, void *data)
70
+{
160
+{
71
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
161
+ DeviceClass *dc = DEVICE_CLASS(klass);
72
+ uint64_t *d = vd, *n = vn;
162
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
73
+ uint8_t *pg = vg;
163
+
74
+ for (i = 0; i < opr_sz; i += 1) {
164
+ device_class_set_props(dc, cfu_sfr_props);
75
+ d[i] = n[i] & -(uint64_t)(pg[H1(i)] & 1);
165
+ dc->vmsd = &vmstate_cfu_sfr;
76
+ }
166
+ rc->phases.enter = cfu_sfr_reset_enter;
77
+}
167
+}
78
+
168
+
79
/* Three-operand expander, immediate operand, controlled by a predicate.
169
static const TypeInfo cfu_apb_info = {
80
*/
170
.name = TYPE_XLNX_VERSAL_CFU_APB,
81
#define DO_ZPZI(NAME, TYPE, H, OP) \
171
.parent = TYPE_SYS_BUS_DEVICE,
82
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
172
@@ -XXX,XX +XXX,XX @@ static const TypeInfo cfu_fdro_info = {
83
index XXXXXXX..XXXXXXX 100644
173
}
84
--- a/target/arm/translate-sve.c
174
};
85
+++ b/target/arm/translate-sve.c
175
86
@@ -XXX,XX +XXX,XX @@ static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz)
176
+static const TypeInfo cfu_sfr_info = {
87
return true;
177
+ .name = TYPE_XLNX_VERSAL_CFU_SFR,
178
+ .parent = TYPE_SYS_BUS_DEVICE,
179
+ .instance_size = sizeof(XlnxVersalCFUSFR),
180
+ .class_init = cfu_sfr_class_init,
181
+ .instance_init = cfu_sfr_init,
182
+};
183
+
184
static void cfu_apb_register_types(void)
185
{
186
type_register_static(&cfu_apb_info);
187
type_register_static(&cfu_fdro_info);
188
+ type_register_static(&cfu_sfr_info);
88
}
189
}
89
190
90
+/* Copy Zn into Zd, storing zeros into inactive elements. */
191
type_init(cfu_apb_register_types)
91
+static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz)
92
+{
93
+ static gen_helper_gvec_3 * const fns[4] = {
94
+ gen_helper_sve_movz_b, gen_helper_sve_movz_h,
95
+ gen_helper_sve_movz_s, gen_helper_sve_movz_d,
96
+ };
97
+ unsigned vsz = vec_full_reg_size(s);
98
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
99
+ vec_full_reg_offset(s, rn),
100
+ pred_full_reg_offset(s, pg),
101
+ vsz, vsz, 0, fns[esz]);
102
+}
103
+
104
static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
105
gen_helper_gvec_3 *fn)
106
{
107
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
108
return true;
109
}
110
111
+/* Load and broadcast element. */
112
+static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
113
+{
114
+ if (!sve_access_check(s)) {
115
+ return true;
116
+ }
117
+
118
+ unsigned vsz = vec_full_reg_size(s);
119
+ unsigned psz = pred_full_reg_size(s);
120
+ unsigned esz = dtype_esz[a->dtype];
121
+ TCGLabel *over = gen_new_label();
122
+ TCGv_i64 temp;
123
+
124
+ /* If the guarding predicate has no bits set, no load occurs. */
125
+ if (psz <= 8) {
126
+ /* Reduce the pred_esz_masks value simply to reduce the
127
+ * size of the code generated here.
128
+ */
129
+ uint64_t psz_mask = MAKE_64BIT_MASK(0, psz * 8);
130
+ temp = tcg_temp_new_i64();
131
+ tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg));
132
+ tcg_gen_andi_i64(temp, temp, pred_esz_masks[esz] & psz_mask);
133
+ tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over);
134
+ tcg_temp_free_i64(temp);
135
+ } else {
136
+ TCGv_i32 t32 = tcg_temp_new_i32();
137
+ find_last_active(s, t32, esz, a->pg);
138
+ tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over);
139
+ tcg_temp_free_i32(t32);
140
+ }
141
+
142
+ /* Load the data. */
143
+ temp = tcg_temp_new_i64();
144
+ tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << esz);
145
+ tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s),
146
+ s->be_data | dtype_mop[a->dtype]);
147
+
148
+ /* Broadcast to *all* elements. */
149
+ tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd),
150
+ vsz, vsz, temp);
151
+ tcg_temp_free_i64(temp);
152
+
153
+ /* Zero the inactive elements. */
154
+ gen_set_label(over);
155
+ do_movz_zpz(s, a->rd, a->rd, a->pg, esz);
156
+ return true;
157
+}
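The LD1R flow above (skip the load entirely when no predicate bit is active, otherwise load one element, broadcast it to every lane, then zero the inactive lanes) can be sketched in plain Python; the function and parameter names here are illustrative, not QEMU APIs:

```python
def ld1r(memory, base, imm, pred, vlen_elts, esz):
    """Model of SVE LD1R: load-and-broadcast under a predicate.

    memory:    dict mapping address -> element value
    esz:       log2 of the element size in bytes (as in the decode)
    pred:      per-lane predicate bits
    """
    if not any(pred[:vlen_elts]):
        # No active lanes: no load occurs at all (memory is never touched),
        # matching the branch over the load in trans_LD1R_zpri.
        return [0] * vlen_elts
    # Load a single element from base + (imm scaled by element size).
    elt = memory[base + (imm << esz)]
    # Broadcast to all lanes, then zero inactive ones (do_movz_zpz).
    return [elt if p else 0 for p in pred[:vlen_elts]]
```

Note that the no-active-lanes case returns zeros without performing any access, which is why the generated code branches over the load rather than predicating it afterwards.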
158
+
159
static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
160
int msz, int esz, int nreg)
161
{
162
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
163
index XXXXXXX..XXXXXXX 100644
164
--- a/target/arm/sve.decode
165
+++ b/target/arm/sve.decode
166
@@ -XXX,XX +XXX,XX @@
167
%imm8_16_10 16:5 10:3
168
%imm9_16_10 16:s6 10:3
169
%size_23 23:2
170
+%dtype_23_13 23:2 13:2
171
172
# A combination of tsz:imm3 -- extract esize.
173
%tszimm_esz 22:2 5:5 !function=tszimm_esz
174
@@ -XXX,XX +XXX,XX @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9
175
# SVE load vector register
176
LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
177
178
+# SVE load and broadcast element
179
+LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \
180
+ &rpri_load dtype=%dtype_23_13 nreg=0
181
+
182
### SVE Memory Contiguous Load Group
183
184
# SVE contiguous load (scalar plus scalar)
185
--
2.17.1

--
2.34.1
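The sve_movz_* helpers in the patch above copy Zn to Zd while zeroing lanes whose predicate bit is clear, working 64 bits at a time by expanding predicate bits into a lane mask. A minimal Python sketch of the byte-element case (names illustrative, not QEMU code):

```python
def expand_pred_b(pred_byte):
    """Expand 8 predicate bits into a 64-bit mask of 0x00/0xff byte lanes."""
    mask = 0
    for i in range(8):
        if (pred_byte >> i) & 1:
            mask |= 0xFF << (8 * i)
    return mask

def sve_movz_b(n_words, pred_bytes):
    """Per 64-bit word: keep bytes with an active predicate bit, zero the rest."""
    return [n & expand_pred_b(p) for n, p in zip(n_words, pred_bytes)]
```

The _h/_s/_d variants differ only in lane width; for the 64-bit case the mask degenerates to all-ones or all-zeros per word, which is what `-(uint64_t)(pg[H1(i)] & 1)` computes.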
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Francisco Iglesias <francisco.iglesias@amd.com>
2
2
3
Introduce a model of Xilinx Versal's Configuration Frame controller
4
(CFRAME_REG).
5
6
Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
7
Message-id: 20230831165701.2016397-6-francisco.iglesias@amd.com
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-16-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
target/arm/translate-sve.c | 85 ++++++++++++++++++++++++++------------
11
MAINTAINERS | 2 +
9
target/arm/sve.decode | 11 +++++
12
include/hw/misc/xlnx-versal-cframe-reg.h | 286 ++++++++++
10
2 files changed, 70 insertions(+), 26 deletions(-)
13
hw/misc/xlnx-versal-cframe-reg.c | 697 +++++++++++++++++++++++
14
hw/misc/meson.build | 1 +
15
4 files changed, 986 insertions(+)
16
create mode 100644 include/hw/misc/xlnx-versal-cframe-reg.h
17
create mode 100644 hw/misc/xlnx-versal-cframe-reg.c
11
18
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
19
diff --git a/MAINTAINERS b/MAINTAINERS
13
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
21
--- a/MAINTAINERS
15
+++ b/target/arm/translate-sve.c
22
+++ b/MAINTAINERS
16
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
23
@@ -XXX,XX +XXX,XX @@ F: hw/misc/xlnx-cfi-if.c
17
return true;
24
F: include/hw/misc/xlnx-cfi-if.h
18
}
25
F: hw/misc/xlnx-versal-cfu.c
19
26
F: include/hw/misc/xlnx-versal-cfu.h
20
+/* Indexed by [xs][msz]. */
27
+F: hw/misc/xlnx-versal-cframe-reg.c
21
+static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = {
28
+F: include/hw/misc/xlnx-versal-cframe-reg.h
22
+ { gen_helper_sve_stbs_zsu,
29
23
+ gen_helper_sve_sths_zsu,
30
STM32F100
24
+ gen_helper_sve_stss_zsu, },
31
M: Alexandre Iooss <erdnaxe@crans.org>
25
+ { gen_helper_sve_stbs_zss,
32
diff --git a/include/hw/misc/xlnx-versal-cframe-reg.h b/include/hw/misc/xlnx-versal-cframe-reg.h
26
+ gen_helper_sve_sths_zss,
33
new file mode 100644
27
+ gen_helper_sve_stss_zss, },
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/include/hw/misc/xlnx-versal-cframe-reg.h
37
@@ -XXX,XX +XXX,XX @@
38
+/*
39
+ * QEMU model of the Configuration Frame Control module
40
+ *
41
+ * Copyright (C) 2023, Advanced Micro Devices, Inc.
42
+ *
43
+ * Written by Francisco Iglesias <francisco.iglesias@amd.com>
44
+ *
45
+ * SPDX-License-Identifier: GPL-2.0-or-later
46
+ *
47
+ * References:
48
+ * [1] Versal ACAP Technical Reference Manual,
49
+ * https://www.xilinx.com/support/documentation/architecture-manuals/am011-versal-acap-trm.pdf
50
+ *
51
+ * [2] Versal ACAP Register Reference,
52
+ * https://www.xilinx.com/htmldocs/registers/am012/am012-versal-register-reference.html
53
+ */
54
+#ifndef HW_MISC_XLNX_VERSAL_CFRAME_REG_H
55
+#define HW_MISC_XLNX_VERSAL_CFRAME_REG_H
56
+
57
+#include "hw/sysbus.h"
58
+#include "hw/register.h"
59
+#include "hw/misc/xlnx-cfi-if.h"
60
+#include "hw/misc/xlnx-versal-cfu.h"
61
+#include "qemu/fifo32.h"
62
+
63
+#define TYPE_XLNX_VERSAL_CFRAME_REG "xlnx,cframe-reg"
64
+OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFrameReg, XLNX_VERSAL_CFRAME_REG)
65
+
66
+/*
67
+ * The registers in this module are 128 bits wide but it is ok to write
68
+ * and read them through 4 sequential 32 bit accesses (address[3:2] = 0,
69
+ * 1, 2, 3).
70
+ */
71
+REG32(CRC0, 0x0)
72
+ FIELD(CRC, CRC, 0, 32)
73
+REG32(CRC1, 0x4)
74
+REG32(CRC2, 0x8)
75
+REG32(CRC3, 0xc)
76
+REG32(FAR0, 0x10)
77
+ FIELD(FAR0, SEGMENT, 23, 2)
78
+ FIELD(FAR0, BLOCKTYPE, 20, 3)
79
+ FIELD(FAR0, FRAME_ADDR, 0, 20)
80
+REG32(FAR1, 0x14)
81
+REG32(FAR2, 0x18)
82
+REG32(FAR3, 0x1c)
83
+REG32(FAR_SFR0, 0x20)
84
+ FIELD(FAR_SFR0, BLOCKTYPE, 20, 3)
85
+ FIELD(FAR_SFR0, FRAME_ADDR, 0, 20)
86
+REG32(FAR_SFR1, 0x24)
87
+REG32(FAR_SFR2, 0x28)
88
+REG32(FAR_SFR3, 0x2c)
89
+REG32(FDRI0, 0x40)
90
+REG32(FDRI1, 0x44)
91
+REG32(FDRI2, 0x48)
92
+REG32(FDRI3, 0x4c)
93
+REG32(FRCNT0, 0x50)
94
+ FIELD(FRCNT0, FRCNT, 0, 32)
95
+REG32(FRCNT1, 0x54)
96
+REG32(FRCNT2, 0x58)
97
+REG32(FRCNT3, 0x5c)
98
+REG32(CMD0, 0x60)
99
+ FIELD(CMD0, CMD, 0, 5)
100
+REG32(CMD1, 0x64)
101
+REG32(CMD2, 0x68)
102
+REG32(CMD3, 0x6c)
103
+REG32(CR_MASK0, 0x70)
104
+REG32(CR_MASK1, 0x74)
105
+REG32(CR_MASK2, 0x78)
106
+REG32(CR_MASK3, 0x7c)
107
+REG32(CTL0, 0x80)
108
+ FIELD(CTL, PER_FRAME_CRC, 0, 1)
109
+REG32(CTL1, 0x84)
110
+REG32(CTL2, 0x88)
111
+REG32(CTL3, 0x8c)
112
+REG32(CFRM_ISR0, 0x150)
113
+ FIELD(CFRM_ISR0, READ_BROADCAST_ERROR, 21, 1)
114
+ FIELD(CFRM_ISR0, CMD_MISSING_ERROR, 20, 1)
115
+ FIELD(CFRM_ISR0, RW_ROWOFF_ERROR, 19, 1)
116
+ FIELD(CFRM_ISR0, READ_REG_ADDR_ERROR, 18, 1)
117
+ FIELD(CFRM_ISR0, READ_BLK_TYPE_ERROR, 17, 1)
118
+ FIELD(CFRM_ISR0, READ_FRAME_ADDR_ERROR, 16, 1)
119
+ FIELD(CFRM_ISR0, WRITE_REG_ADDR_ERROR, 15, 1)
120
+ FIELD(CFRM_ISR0, WRITE_BLK_TYPE_ERROR, 13, 1)
121
+ FIELD(CFRM_ISR0, WRITE_FRAME_ADDR_ERROR, 12, 1)
122
+ FIELD(CFRM_ISR0, MFW_OVERRUN_ERROR, 11, 1)
123
+ FIELD(CFRM_ISR0, FAR_FIFO_UNDERFLOW, 10, 1)
124
+ FIELD(CFRM_ISR0, FAR_FIFO_OVERFLOW, 9, 1)
125
+ FIELD(CFRM_ISR0, PER_FRAME_SEQ_ERROR, 8, 1)
126
+ FIELD(CFRM_ISR0, CRC_ERROR, 7, 1)
127
+ FIELD(CFRM_ISR0, WRITE_OVERRUN_ERROR, 6, 1)
128
+ FIELD(CFRM_ISR0, READ_OVERRUN_ERROR, 5, 1)
129
+ FIELD(CFRM_ISR0, CMD_INTERRUPT_ERROR, 4, 1)
130
+ FIELD(CFRM_ISR0, WRITE_INTERRUPT_ERROR, 3, 1)
131
+ FIELD(CFRM_ISR0, READ_INTERRUPT_ERROR, 2, 1)
132
+ FIELD(CFRM_ISR0, SEU_CRC_ERROR, 1, 1)
133
+ FIELD(CFRM_ISR0, SEU_ECC_ERROR, 0, 1)
134
+REG32(CFRM_ISR1, 0x154)
135
+REG32(CFRM_ISR2, 0x158)
136
+REG32(CFRM_ISR3, 0x15c)
137
+REG32(CFRM_IMR0, 0x160)
138
+ FIELD(CFRM_IMR0, READ_BROADCAST_ERROR, 21, 1)
139
+ FIELD(CFRM_IMR0, CMD_MISSING_ERROR, 20, 1)
140
+ FIELD(CFRM_IMR0, RW_ROWOFF_ERROR, 19, 1)
141
+ FIELD(CFRM_IMR0, READ_REG_ADDR_ERROR, 18, 1)
142
+ FIELD(CFRM_IMR0, READ_BLK_TYPE_ERROR, 17, 1)
143
+ FIELD(CFRM_IMR0, READ_FRAME_ADDR_ERROR, 16, 1)
144
+ FIELD(CFRM_IMR0, WRITE_REG_ADDR_ERROR, 15, 1)
145
+ FIELD(CFRM_IMR0, WRITE_BLK_TYPE_ERROR, 13, 1)
146
+ FIELD(CFRM_IMR0, WRITE_FRAME_ADDR_ERROR, 12, 1)
147
+ FIELD(CFRM_IMR0, MFW_OVERRUN_ERROR, 11, 1)
148
+ FIELD(CFRM_IMR0, FAR_FIFO_UNDERFLOW, 10, 1)
149
+ FIELD(CFRM_IMR0, FAR_FIFO_OVERFLOW, 9, 1)
150
+ FIELD(CFRM_IMR0, PER_FRAME_SEQ_ERROR, 8, 1)
151
+ FIELD(CFRM_IMR0, CRC_ERROR, 7, 1)
152
+ FIELD(CFRM_IMR0, WRITE_OVERRUN_ERROR, 6, 1)
153
+ FIELD(CFRM_IMR0, READ_OVERRUN_ERROR, 5, 1)
154
+ FIELD(CFRM_IMR0, CMD_INTERRUPT_ERROR, 4, 1)
155
+ FIELD(CFRM_IMR0, WRITE_INTERRUPT_ERROR, 3, 1)
156
+ FIELD(CFRM_IMR0, READ_INTERRUPT_ERROR, 2, 1)
157
+ FIELD(CFRM_IMR0, SEU_CRC_ERROR, 1, 1)
158
+ FIELD(CFRM_IMR0, SEU_ECC_ERROR, 0, 1)
159
+REG32(CFRM_IMR1, 0x164)
160
+REG32(CFRM_IMR2, 0x168)
161
+REG32(CFRM_IMR3, 0x16c)
162
+REG32(CFRM_IER0, 0x170)
163
+ FIELD(CFRM_IER0, READ_BROADCAST_ERROR, 21, 1)
164
+ FIELD(CFRM_IER0, CMD_MISSING_ERROR, 20, 1)
165
+ FIELD(CFRM_IER0, RW_ROWOFF_ERROR, 19, 1)
166
+ FIELD(CFRM_IER0, READ_REG_ADDR_ERROR, 18, 1)
167
+ FIELD(CFRM_IER0, READ_BLK_TYPE_ERROR, 17, 1)
168
+ FIELD(CFRM_IER0, READ_FRAME_ADDR_ERROR, 16, 1)
169
+ FIELD(CFRM_IER0, WRITE_REG_ADDR_ERROR, 15, 1)
170
+ FIELD(CFRM_IER0, WRITE_BLK_TYPE_ERROR, 13, 1)
171
+ FIELD(CFRM_IER0, WRITE_FRAME_ADDR_ERROR, 12, 1)
172
+ FIELD(CFRM_IER0, MFW_OVERRUN_ERROR, 11, 1)
173
+ FIELD(CFRM_IER0, FAR_FIFO_UNDERFLOW, 10, 1)
174
+ FIELD(CFRM_IER0, FAR_FIFO_OVERFLOW, 9, 1)
175
+ FIELD(CFRM_IER0, PER_FRAME_SEQ_ERROR, 8, 1)
176
+ FIELD(CFRM_IER0, CRC_ERROR, 7, 1)
177
+ FIELD(CFRM_IER0, WRITE_OVERRUN_ERROR, 6, 1)
178
+ FIELD(CFRM_IER0, READ_OVERRUN_ERROR, 5, 1)
179
+ FIELD(CFRM_IER0, CMD_INTERRUPT_ERROR, 4, 1)
180
+ FIELD(CFRM_IER0, WRITE_INTERRUPT_ERROR, 3, 1)
181
+ FIELD(CFRM_IER0, READ_INTERRUPT_ERROR, 2, 1)
182
+ FIELD(CFRM_IER0, SEU_CRC_ERROR, 1, 1)
183
+ FIELD(CFRM_IER0, SEU_ECC_ERROR, 0, 1)
184
+REG32(CFRM_IER1, 0x174)
185
+REG32(CFRM_IER2, 0x178)
186
+REG32(CFRM_IER3, 0x17c)
187
+REG32(CFRM_IDR0, 0x180)
188
+ FIELD(CFRM_IDR0, READ_BROADCAST_ERROR, 21, 1)
189
+ FIELD(CFRM_IDR0, CMD_MISSING_ERROR, 20, 1)
190
+ FIELD(CFRM_IDR0, RW_ROWOFF_ERROR, 19, 1)
191
+ FIELD(CFRM_IDR0, READ_REG_ADDR_ERROR, 18, 1)
192
+ FIELD(CFRM_IDR0, READ_BLK_TYPE_ERROR, 17, 1)
193
+ FIELD(CFRM_IDR0, READ_FRAME_ADDR_ERROR, 16, 1)
194
+ FIELD(CFRM_IDR0, WRITE_REG_ADDR_ERROR, 15, 1)
195
+ FIELD(CFRM_IDR0, WRITE_BLK_TYPE_ERROR, 13, 1)
196
+ FIELD(CFRM_IDR0, WRITE_FRAME_ADDR_ERROR, 12, 1)
197
+ FIELD(CFRM_IDR0, MFW_OVERRUN_ERROR, 11, 1)
198
+ FIELD(CFRM_IDR0, FAR_FIFO_UNDERFLOW, 10, 1)
199
+ FIELD(CFRM_IDR0, FAR_FIFO_OVERFLOW, 9, 1)
200
+ FIELD(CFRM_IDR0, PER_FRAME_SEQ_ERROR, 8, 1)
201
+ FIELD(CFRM_IDR0, CRC_ERROR, 7, 1)
202
+ FIELD(CFRM_IDR0, WRITE_OVERRUN_ERROR, 6, 1)
203
+ FIELD(CFRM_IDR0, READ_OVERRUN_ERROR, 5, 1)
204
+ FIELD(CFRM_IDR0, CMD_INTERRUPT_ERROR, 4, 1)
205
+ FIELD(CFRM_IDR0, WRITE_INTERRUPT_ERROR, 3, 1)
206
+ FIELD(CFRM_IDR0, READ_INTERRUPT_ERROR, 2, 1)
207
+ FIELD(CFRM_IDR0, SEU_CRC_ERROR, 1, 1)
208
+ FIELD(CFRM_IDR0, SEU_ECC_ERROR, 0, 1)
209
+REG32(CFRM_IDR1, 0x184)
210
+REG32(CFRM_IDR2, 0x188)
211
+REG32(CFRM_IDR3, 0x18c)
212
+REG32(CFRM_ITR0, 0x190)
213
+ FIELD(CFRM_ITR0, READ_BROADCAST_ERROR, 21, 1)
214
+ FIELD(CFRM_ITR0, CMD_MISSING_ERROR, 20, 1)
215
+ FIELD(CFRM_ITR0, RW_ROWOFF_ERROR, 19, 1)
216
+ FIELD(CFRM_ITR0, READ_REG_ADDR_ERROR, 18, 1)
217
+ FIELD(CFRM_ITR0, READ_BLK_TYPE_ERROR, 17, 1)
218
+ FIELD(CFRM_ITR0, READ_FRAME_ADDR_ERROR, 16, 1)
219
+ FIELD(CFRM_ITR0, WRITE_REG_ADDR_ERROR, 15, 1)
220
+ FIELD(CFRM_ITR0, WRITE_BLK_TYPE_ERROR, 13, 1)
221
+ FIELD(CFRM_ITR0, WRITE_FRAME_ADDR_ERROR, 12, 1)
222
+ FIELD(CFRM_ITR0, MFW_OVERRUN_ERROR, 11, 1)
223
+ FIELD(CFRM_ITR0, FAR_FIFO_UNDERFLOW, 10, 1)
224
+ FIELD(CFRM_ITR0, FAR_FIFO_OVERFLOW, 9, 1)
225
+ FIELD(CFRM_ITR0, PER_FRAME_SEQ_ERROR, 8, 1)
226
+ FIELD(CFRM_ITR0, CRC_ERROR, 7, 1)
227
+ FIELD(CFRM_ITR0, WRITE_OVERRUN_ERROR, 6, 1)
228
+ FIELD(CFRM_ITR0, READ_OVERRUN_ERROR, 5, 1)
229
+ FIELD(CFRM_ITR0, CMD_INTERRUPT_ERROR, 4, 1)
230
+ FIELD(CFRM_ITR0, WRITE_INTERRUPT_ERROR, 3, 1)
231
+ FIELD(CFRM_ITR0, READ_INTERRUPT_ERROR, 2, 1)
232
+ FIELD(CFRM_ITR0, SEU_CRC_ERROR, 1, 1)
233
+ FIELD(CFRM_ITR0, SEU_ECC_ERROR, 0, 1)
234
+REG32(CFRM_ITR1, 0x194)
235
+REG32(CFRM_ITR2, 0x198)
236
+REG32(CFRM_ITR3, 0x19c)
237
+REG32(SEU_SYNDRM00, 0x1a0)
238
+REG32(SEU_SYNDRM01, 0x1a4)
239
+REG32(SEU_SYNDRM02, 0x1a8)
240
+REG32(SEU_SYNDRM03, 0x1ac)
241
+REG32(SEU_SYNDRM10, 0x1b0)
242
+REG32(SEU_SYNDRM11, 0x1b4)
243
+REG32(SEU_SYNDRM12, 0x1b8)
244
+REG32(SEU_SYNDRM13, 0x1bc)
245
+REG32(SEU_SYNDRM20, 0x1c0)
246
+REG32(SEU_SYNDRM21, 0x1c4)
247
+REG32(SEU_SYNDRM22, 0x1c8)
248
+REG32(SEU_SYNDRM23, 0x1cc)
249
+REG32(SEU_SYNDRM30, 0x1d0)
250
+REG32(SEU_SYNDRM31, 0x1d4)
251
+REG32(SEU_SYNDRM32, 0x1d8)
252
+REG32(SEU_SYNDRM33, 0x1dc)
253
+REG32(SEU_VIRTUAL_SYNDRM0, 0x1e0)
254
+REG32(SEU_VIRTUAL_SYNDRM1, 0x1e4)
255
+REG32(SEU_VIRTUAL_SYNDRM2, 0x1e8)
256
+REG32(SEU_VIRTUAL_SYNDRM3, 0x1ec)
257
+REG32(SEU_CRC0, 0x1f0)
258
+REG32(SEU_CRC1, 0x1f4)
259
+REG32(SEU_CRC2, 0x1f8)
260
+REG32(SEU_CRC3, 0x1fc)
261
+REG32(CFRAME_FAR_BOT0, 0x200)
262
+REG32(CFRAME_FAR_BOT1, 0x204)
263
+REG32(CFRAME_FAR_BOT2, 0x208)
264
+REG32(CFRAME_FAR_BOT3, 0x20c)
265
+REG32(CFRAME_FAR_TOP0, 0x210)
266
+REG32(CFRAME_FAR_TOP1, 0x214)
267
+REG32(CFRAME_FAR_TOP2, 0x218)
268
+REG32(CFRAME_FAR_TOP3, 0x21c)
269
+REG32(LAST_FRAME_BOT0, 0x220)
270
+ FIELD(LAST_FRAME_BOT0, BLOCKTYPE1_LAST_FRAME_LSB, 20, 12)
271
+ FIELD(LAST_FRAME_BOT0, BLOCKTYPE0_LAST_FRAME, 0, 20)
272
+REG32(LAST_FRAME_BOT1, 0x224)
273
+ FIELD(LAST_FRAME_BOT1, BLOCKTYPE3_LAST_FRAME_LSB, 28, 4)
274
+ FIELD(LAST_FRAME_BOT1, BLOCKTYPE2_LAST_FRAME, 8, 20)
275
+ FIELD(LAST_FRAME_BOT1, BLOCKTYPE1_LAST_FRAME_MSB, 0, 8)
276
+REG32(LAST_FRAME_BOT2, 0x228)
277
+ FIELD(LAST_FRAME_BOT2, BLOCKTYPE3_LAST_FRAME_MSB, 0, 16)
278
+REG32(LAST_FRAME_BOT3, 0x22c)
279
+REG32(LAST_FRAME_TOP0, 0x230)
280
+ FIELD(LAST_FRAME_TOP0, BLOCKTYPE5_LAST_FRAME_LSB, 20, 12)
281
+ FIELD(LAST_FRAME_TOP0, BLOCKTYPE4_LAST_FRAME, 0, 20)
282
+REG32(LAST_FRAME_TOP1, 0x234)
283
+ FIELD(LAST_FRAME_TOP1, BLOCKTYPE6_LAST_FRAME, 8, 20)
284
+ FIELD(LAST_FRAME_TOP1, BLOCKTYPE5_LAST_FRAME_MSB, 0, 8)
285
+REG32(LAST_FRAME_TOP2, 0x238)
286
+REG32(LAST_FRAME_TOP3, 0x23c)
287
+
288
+#define CFRAME_REG_R_MAX (R_LAST_FRAME_TOP3 + 1)
289
+
290
+#define FRAME_NUM_QWORDS 25
291
+#define FRAME_NUM_WORDS (FRAME_NUM_QWORDS * 4) /* 25 * 128 bits */
292
+
293
+typedef struct XlnxCFrame {
294
+ uint32_t data[FRAME_NUM_WORDS];
295
+} XlnxCFrame;
296
+
297
+struct XlnxVersalCFrameReg {
298
+ SysBusDevice parent_obj;
299
+ MemoryRegion iomem;
300
+ MemoryRegion iomem_fdri;
301
+ qemu_irq irq_cfrm_imr;
302
+
303
+ /* 128-bit wfifo. */
304
+ uint32_t wfifo[WFIFO_SZ];
305
+
306
+ uint32_t regs[CFRAME_REG_R_MAX];
307
+ RegisterInfo regs_info[CFRAME_REG_R_MAX];
308
+
309
+ bool rowon;
310
+ bool wcfg;
311
+ bool rcfg;
312
+
313
+ GTree *cframes;
314
+ Fifo32 new_f_data;
315
+
316
+ struct {
317
+ XlnxCfiIf *cfu_fdro;
318
+ uint32_t blktype_num_frames[7];
319
+ } cfg;
320
+ bool row_configured;
28
+};
321
+};
29
+
322
+
30
+/* Note that we overload xs=2 to indicate 64-bit offset. */
323
+#endif
31
+static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = {
324
diff --git a/hw/misc/xlnx-versal-cframe-reg.c b/hw/misc/xlnx-versal-cframe-reg.c
32
+ { gen_helper_sve_stbd_zsu,
325
new file mode 100644
33
+ gen_helper_sve_sthd_zsu,
326
index XXXXXXX..XXXXXXX
34
+ gen_helper_sve_stsd_zsu,
327
--- /dev/null
35
+ gen_helper_sve_stdd_zsu, },
328
+++ b/hw/misc/xlnx-versal-cframe-reg.c
36
+ { gen_helper_sve_stbd_zss,
329
@@ -XXX,XX +XXX,XX @@
37
+ gen_helper_sve_sthd_zss,
330
+/*
38
+ gen_helper_sve_stsd_zss,
331
+ * QEMU model of the Configuration Frame Control module
39
+ gen_helper_sve_stdd_zss, },
332
+ *
40
+ { gen_helper_sve_stbd_zd,
333
+ * Copyright (C) 2023, Advanced Micro Devices, Inc.
41
+ gen_helper_sve_sthd_zd,
334
+ *
42
+ gen_helper_sve_stsd_zd,
335
+ * Written by Francisco Iglesias <francisco.iglesias@amd.com>
43
+ gen_helper_sve_stdd_zd, },
336
+ *
337
+ * SPDX-License-Identifier: GPL-2.0-or-later
338
+ */
339
+
340
+#include "qemu/osdep.h"
341
+#include "hw/sysbus.h"
342
+#include "hw/register.h"
343
+#include "hw/registerfields.h"
344
+#include "qemu/bitops.h"
345
+#include "qemu/log.h"
346
+#include "qemu/units.h"
347
+#include "qapi/error.h"
348
+#include "hw/qdev-properties.h"
349
+#include "migration/vmstate.h"
350
+#include "hw/irq.h"
351
+#include "hw/misc/xlnx-versal-cframe-reg.h"
352
+
353
+#ifndef XLNX_VERSAL_CFRAME_REG_ERR_DEBUG
354
+#define XLNX_VERSAL_CFRAME_REG_ERR_DEBUG 0
355
+#endif
356
+
357
+#define KEYHOLE_STREAM_4K (4 * KiB)
358
+#define N_WORDS_128BIT 4
359
+
360
+#define MAX_BLOCKTYPE 6
361
+#define MAX_BLOCKTYPE_FRAMES 0xFFFFF
362
+
363
+enum {
364
+ CFRAME_CMD_WCFG = 1,
365
+ CFRAME_CMD_ROWON = 2,
366
+ CFRAME_CMD_ROWOFF = 3,
367
+ CFRAME_CMD_RCFG = 4,
368
+ CFRAME_CMD_DLPARK = 5,
44
+};
369
+};
45
+
370
+
46
static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
371
+static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
47
{
372
+{
48
- /* Indexed by [xs][msz]. */
373
+ guint ua = GPOINTER_TO_UINT(a);
49
- static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
374
+ guint ub = GPOINTER_TO_UINT(b);
50
- { gen_helper_sve_stbs_zsu,
375
+ return (ua > ub) - (ua < ub);
51
- gen_helper_sve_sths_zsu,
376
+}
52
- gen_helper_sve_stss_zsu, },
377
+
53
- { gen_helper_sve_stbs_zss,
378
+static void cfrm_imr_update_irq(XlnxVersalCFrameReg *s)
54
- gen_helper_sve_sths_zss,
379
+{
55
- gen_helper_sve_stss_zss, },
380
+ bool pending = s->regs[R_CFRM_ISR0] & ~s->regs[R_CFRM_IMR0];
56
- };
381
+ qemu_set_irq(s->irq_cfrm_imr, pending);
57
- /* Note that we overload xs=2 to indicate 64-bit offset. */
382
+}
58
- static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
383
+
59
- { gen_helper_sve_stbd_zsu,
384
+static void cfrm_isr_postw(RegisterInfo *reg, uint64_t val64)
60
- gen_helper_sve_sthd_zsu,
385
+{
61
- gen_helper_sve_stsd_zsu,
386
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
62
- gen_helper_sve_stdd_zsu, },
387
+ cfrm_imr_update_irq(s);
63
- { gen_helper_sve_stbd_zss,
388
+}
64
- gen_helper_sve_sthd_zss,
389
+
65
- gen_helper_sve_stsd_zss,
390
+static uint64_t cfrm_ier_prew(RegisterInfo *reg, uint64_t val64)
66
- gen_helper_sve_stdd_zss, },
391
+{
67
- { gen_helper_sve_stbd_zd,
392
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
68
- gen_helper_sve_sthd_zd,
393
+
69
- gen_helper_sve_stsd_zd,
394
+ s->regs[R_CFRM_IMR0] &= ~s->regs[R_CFRM_IER0];
70
- gen_helper_sve_stdd_zd, },
395
+ s->regs[R_CFRM_IER0] = 0;
71
- };
396
+ cfrm_imr_update_irq(s);
72
gen_helper_gvec_mem_scatter *fn;
397
+ return 0;
73
398
+}
74
if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
399
+
75
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
400
+static uint64_t cfrm_idr_prew(RegisterInfo *reg, uint64_t val64)
76
}
401
+{
77
switch (a->esz) {
402
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
78
case MO_32:
403
+
79
- fn = fn32[a->xs][a->msz];
404
+ s->regs[R_CFRM_IMR0] |= s->regs[R_CFRM_IDR0];
80
+ fn = scatter_store_fn32[a->xs][a->msz];
405
+ s->regs[R_CFRM_IDR0] = 0;
81
break;
406
+ cfrm_imr_update_irq(s);
82
case MO_64:
407
+ return 0;
83
- fn = fn64[a->xs][a->msz];
408
+}
84
+ fn = scatter_store_fn64[a->xs][a->msz];
409
+
85
break;
410
+static uint64_t cfrm_itr_prew(RegisterInfo *reg, uint64_t val64)
86
default:
411
+{
87
g_assert_not_reached();
412
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
88
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
413
+
89
return true;
414
+ s->regs[R_CFRM_ISR0] |= s->regs[R_CFRM_ITR0];
90
}
415
+ s->regs[R_CFRM_ITR0] = 0;
91
416
+ cfrm_imr_update_irq(s);
92
+static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
417
+ return 0;
93
+{
418
+}
94
+ gen_helper_gvec_mem_scatter *fn = NULL;
419
+
95
+ TCGv_i64 imm;
420
+static void cframe_incr_far(XlnxVersalCFrameReg *s)
96
+
421
+{
97
+ if (a->esz < a->msz) {
422
+ uint32_t faddr = ARRAY_FIELD_EX32(s->regs, FAR0, FRAME_ADDR);
98
+ return false;
423
+ uint32_t blktype = ARRAY_FIELD_EX32(s->regs, FAR0, BLOCKTYPE);
99
+ }
424
+
100
+ if (!sve_access_check(s)) {
425
+ assert(blktype <= MAX_BLOCKTYPE);
101
+ return true;
426
+
102
+ }
427
+ faddr++;
103
+
428
+ if (faddr > s->cfg.blktype_num_frames[blktype]) {
104
+ switch (a->esz) {
429
+ /* Restart from 0 and increment block type */
105
+ case MO_32:
430
+ faddr = 0;
106
+ fn = scatter_store_fn32[0][a->msz];
431
+ blktype++;
107
+ break;
432
+
108
+ case MO_64:
433
+ assert(blktype <= MAX_BLOCKTYPE);
109
+ fn = scatter_store_fn64[2][a->msz];
434
+
110
+ break;
435
+ ARRAY_FIELD_DP32(s->regs, FAR0, BLOCKTYPE, blktype);
111
+ }
436
+ }
112
+ assert(fn != NULL);
437
+
113
+
438
+ ARRAY_FIELD_DP32(s->regs, FAR0, FRAME_ADDR, faddr);
114
+ /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x])
439
+}
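The frame-address increment above rolls the frame number over into the next block type once it passes the configured last frame for the current type. A pure-Python sketch of that logic (parameter names illustrative; `blktype_last_frame` plays the role of `s->cfg.blktype_num_frames`):

```python
def incr_far(frame_addr, blktype, blktype_last_frame):
    """Increment a (frame_addr, blktype) pair with rollover, as in cframe_incr_far."""
    assert blktype < len(blktype_last_frame)
    frame_addr += 1
    if frame_addr > blktype_last_frame[blktype]:
        # Restart from frame 0 and move on to the next block type.
        frame_addr = 0
        blktype += 1
        assert blktype < len(blktype_last_frame)
    return frame_addr, blktype
```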
115
+ * by loading the immediate into the scalar parameter.
440
+
441
+static void cfrm_fdri_post_write(RegisterInfo *reg, uint64_t val)
442
+{
443
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
444
+
445
+ if (s->row_configured && s->rowon && s->wcfg) {
446
+
447
+ if (fifo32_num_free(&s->new_f_data) >= N_WORDS_128BIT) {
448
+ fifo32_push(&s->new_f_data, s->regs[R_FDRI0]);
449
+ fifo32_push(&s->new_f_data, s->regs[R_FDRI1]);
450
+ fifo32_push(&s->new_f_data, s->regs[R_FDRI2]);
451
+ fifo32_push(&s->new_f_data, s->regs[R_FDRI3]);
452
+ }
453
+
454
+ if (fifo32_is_full(&s->new_f_data)) {
455
+ uint32_t addr = extract32(s->regs[R_FAR0], 0, 23);
456
+ XlnxCFrame *f = g_new(XlnxCFrame, 1);
457
+
458
+ for (int i = 0; i < FRAME_NUM_WORDS; i++) {
459
+ f->data[i] = fifo32_pop(&s->new_f_data);
460
+ }
461
+
462
+ g_tree_replace(s->cframes, GUINT_TO_POINTER(addr), f);
463
+
464
+ cframe_incr_far(s);
465
+
466
+ fifo32_reset(&s->new_f_data);
467
+ }
468
+ }
469
+}
470
+
471
+static void cfrm_readout_frames(XlnxVersalCFrameReg *s, uint32_t start_addr,
472
+ uint32_t end_addr)
473
+{
474
+ /*
475
+ * NB: when our minimum glib version is at least 2.68 we can improve the
476
+ * performance of the cframe traversal by using g_tree_lookup_node and
477
+ * g_tree_node_next (instead of calling g_tree_lookup for finding each
478
+ * cframe).
116
+ */
479
+ */
117
+ imm = tcg_const_i64(a->imm << a->msz);
480
+ for (uint32_t addr = start_addr; addr < end_addr; addr++) {
118
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
481
+ XlnxCFrame *f = g_tree_lookup(s->cframes, GUINT_TO_POINTER(addr));
119
+ tcg_temp_free_i64(imm);
482
+
120
+ return true;
483
+ /* Transmit the data if a frame was found */
121
+}
484
+ if (f) {
122
+
485
+ for (int i = 0; i < FRAME_NUM_WORDS; i += 4) {
123
/*
486
+ XlnxCfiPacket pkt = {};
124
* Prefetches
487
+
125
*/
488
+ pkt.data[0] = f->data[i];
126
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
489
+ pkt.data[1] = f->data[i + 1];
490
+ pkt.data[2] = f->data[i + 2];
491
+ pkt.data[3] = f->data[i + 3];
492
+
493
+ if (s->cfg.cfu_fdro) {
494
+ xlnx_cfi_transfer_packet(s->cfg.cfu_fdro, &pkt);
495
+ }
496
+ }
497
+ }
498
+ }
499
+}
500
+
501
+static void cfrm_frcnt_post_write(RegisterInfo *reg, uint64_t val)
502
+{
503
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
504
+
505
+ if (s->row_configured && s->rowon && s->rcfg) {
506
+ uint32_t start_addr = extract32(s->regs[R_FAR0], 0, 23);
507
+ uint32_t end_addr = start_addr + s->regs[R_FRCNT0] / FRAME_NUM_QWORDS;
508
+
509
+ cfrm_readout_frames(s, start_addr, end_addr);
510
+ }
511
+}
512
+
513
+static void cfrm_cmd_post_write(RegisterInfo *reg, uint64_t val)
514
+{
515
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
516
+
517
+ if (s->row_configured) {
518
+ uint8_t cmd = ARRAY_FIELD_EX32(s->regs, CMD0, CMD);
519
+
520
+ switch (cmd) {
521
+ case CFRAME_CMD_WCFG:
522
+ s->wcfg = true;
523
+ break;
524
+ case CFRAME_CMD_ROWON:
525
+ s->rowon = true;
526
+ break;
527
+ case CFRAME_CMD_ROWOFF:
528
+ s->rowon = false;
529
+ break;
530
+ case CFRAME_CMD_RCFG:
531
+ s->rcfg = true;
532
+ break;
533
+ case CFRAME_CMD_DLPARK:
534
+ s->wcfg = false;
535
+ s->rcfg = false;
536
+ break;
537
+ default:
538
+ break;
539
+ };
540
+ }
541
+}
542
+
543
+static uint64_t cfrm_last_frame_bot_post_read(RegisterInfo *reg,
544
+ uint64_t val64)
545
+{
546
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
547
+ uint64_t val = 0;
548
+
549
+ switch (reg->access->addr) {
550
+ case A_LAST_FRAME_BOT0:
551
+ val = FIELD_DP32(val, LAST_FRAME_BOT0, BLOCKTYPE1_LAST_FRAME_LSB,
552
+ s->cfg.blktype_num_frames[1]);
553
+ val = FIELD_DP32(val, LAST_FRAME_BOT0, BLOCKTYPE0_LAST_FRAME,
554
+ s->cfg.blktype_num_frames[0]);
555
+ break;
556
+ case A_LAST_FRAME_BOT1:
557
+ val = FIELD_DP32(val, LAST_FRAME_BOT1, BLOCKTYPE3_LAST_FRAME_LSB,
558
+ s->cfg.blktype_num_frames[3]);
559
+ val = FIELD_DP32(val, LAST_FRAME_BOT1, BLOCKTYPE2_LAST_FRAME,
560
+ s->cfg.blktype_num_frames[2]);
561
+ val = FIELD_DP32(val, LAST_FRAME_BOT1, BLOCKTYPE1_LAST_FRAME_MSB,
562
+ (s->cfg.blktype_num_frames[1] >> 12));
563
+ break;
564
+ case A_LAST_FRAME_BOT2:
565
+ val = FIELD_DP32(val, LAST_FRAME_BOT2, BLOCKTYPE3_LAST_FRAME_MSB,
566
+ (s->cfg.blktype_num_frames[3] >> 4));
567
+ break;
568
+ case A_LAST_FRAME_BOT3:
569
+ default:
570
+ break;
571
+ }
572
+
573
+ return val;
574
+}
575
+
576
+static uint64_t cfrm_last_frame_top_post_read(RegisterInfo *reg,
577
+ uint64_t val64)
578
+{
579
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
580
+ uint64_t val = 0;
581
+
582
+ switch (reg->access->addr) {
583
+ case A_LAST_FRAME_TOP0:
584
+ val = FIELD_DP32(val, LAST_FRAME_TOP0, BLOCKTYPE5_LAST_FRAME_LSB,
585
+ s->cfg.blktype_num_frames[5]);
586
+ val = FIELD_DP32(val, LAST_FRAME_TOP0, BLOCKTYPE4_LAST_FRAME,
587
+ s->cfg.blktype_num_frames[4]);
588
+ break;
589
+ case A_LAST_FRAME_TOP1:
590
+ val = FIELD_DP32(val, LAST_FRAME_TOP1, BLOCKTYPE6_LAST_FRAME,
591
+ s->cfg.blktype_num_frames[6]);
592
+ val = FIELD_DP32(val, LAST_FRAME_TOP1, BLOCKTYPE5_LAST_FRAME_MSB,
593
+ (s->cfg.blktype_num_frames[5] >> 12));
594
+ break;
595
+ case A_LAST_FRAME_TOP2:
596
+ case A_LAST_FRAME_BOT3:
597
+ default:
598
+ break;
599
+ }
600
+
601
+ return val;
602
+}
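The two post-read hooks above reassemble per-blocktype frame counts that are wider than one register field, e.g. blocktype 1's count is split into a 12-bit LSB field in LAST_FRAME_BOT0 and an MSB field in LAST_FRAME_BOT1 (note the `>> 12`). A sketch of that split-and-reassemble pattern using re-implemented bitops helpers (field positions and the 12/8-bit widths here are illustrative assumptions):

```c
#include <assert.h>
#include <stdint.h>

/* Minimal extract32/deposit32 in the style of QEMU's bitops. */
static uint32_t extract32(uint32_t value, int start, int length)
{
    uint32_t mask = (length == 32) ? ~0u : ((1u << length) - 1u);
    return (value >> start) & mask;
}

static uint32_t deposit32(uint32_t dst, int start, int length, uint32_t val)
{
    uint32_t mask = ((length == 32) ? ~0u : ((1u << length) - 1u)) << start;
    return (dst & ~mask) | ((val << start) & mask);
}

/* Split a 20-bit frame count across a 12-bit LSB field (at bit 20 of one
 * register) and an 8-bit MSB field (at bit 0 of the next), then rebuild it
 * the way a guest reader would. */
static uint32_t split_roundtrip(uint32_t count)
{
    uint32_t bot = deposit32(0, 20, 12, count);       /* LSB field */
    uint32_t top = deposit32(0, 0, 8, count >> 12);   /* MSB field */
    return extract32(bot, 20, 12) | (extract32(top, 0, 8) << 12);
}
```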
603
+
604
+static void cfrm_far_sfr_post_write(RegisterInfo *reg, uint64_t val)
605
+{
606
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(reg->opaque);
607
+
608
+ if (s->row_configured && s->rowon && s->rcfg) {
609
+ uint32_t start_addr = extract32(s->regs[R_FAR_SFR0], 0, 23);
610
+
611
+ /* Readback 1 frame */
612
+ cfrm_readout_frames(s, start_addr, start_addr + 1);
613
+ }
614
+}
615
+
616
+static const RegisterAccessInfo cframe_reg_regs_info[] = {
617
+ { .name = "CRC0", .addr = A_CRC0,
618
+ .rsvd = 0x00000000,
619
+ },{ .name = "CRC1", .addr = A_CRC0,
620
+ .rsvd = 0xffffffff,
621
+ },{ .name = "CRC2", .addr = A_CRC0,
622
+ .rsvd = 0xffffffff,
623
+ },{ .name = "CRC3", .addr = A_CRC0,
624
+ .rsvd = 0xffffffff,
625
+ },{ .name = "FAR0", .addr = A_FAR0,
626
+ .rsvd = 0xfe000000,
627
+ },{ .name = "FAR1", .addr = A_FAR1,
628
+ .rsvd = 0xffffffff,
629
+ },{ .name = "FAR2", .addr = A_FAR2,
630
+ .rsvd = 0xffffffff,
631
+ },{ .name = "FAR3", .addr = A_FAR3,
632
+ .rsvd = 0xffffffff,
633
+ },{ .name = "FAR_SFR0", .addr = A_FAR_SFR0,
634
+ .rsvd = 0xff800000,
635
+ },{ .name = "FAR_SFR1", .addr = A_FAR_SFR1,
636
+ .rsvd = 0xffffffff,
637
+ },{ .name = "FAR_SFR2", .addr = A_FAR_SFR2,
638
+ .rsvd = 0xffffffff,
639
+ },{ .name = "FAR_SFR3", .addr = A_FAR_SFR3,
640
+ .rsvd = 0xffffffff,
641
+ .post_write = cfrm_far_sfr_post_write,
642
+ },{ .name = "FDRI0", .addr = A_FDRI0,
643
+ },{ .name = "FDRI1", .addr = A_FDRI1,
644
+ },{ .name = "FDRI2", .addr = A_FDRI2,
645
+ },{ .name = "FDRI3", .addr = A_FDRI3,
646
+ .post_write = cfrm_fdri_post_write,
647
+ },{ .name = "FRCNT0", .addr = A_FRCNT0,
648
+ .rsvd = 0x00000000,
649
+ },{ .name = "FRCNT1", .addr = A_FRCNT1,
650
+ .rsvd = 0xffffffff,
651
+ },{ .name = "FRCNT2", .addr = A_FRCNT2,
652
+ .rsvd = 0xffffffff,
653
+ },{ .name = "FRCNT3", .addr = A_FRCNT3,
654
+ .rsvd = 0xffffffff,
655
+ .post_write = cfrm_frcnt_post_write
656
+ },{ .name = "CMD0", .addr = A_CMD0,
657
+ .rsvd = 0xffffffe0,
658
+ },{ .name = "CMD1", .addr = A_CMD1,
659
+ .rsvd = 0xffffffff,
660
+ },{ .name = "CMD2", .addr = A_CMD2,
661
+ .rsvd = 0xffffffff,
662
+ },{ .name = "CMD3", .addr = A_CMD3,
663
+ .rsvd = 0xffffffff,
664
+ .post_write = cfrm_cmd_post_write
665
+ },{ .name = "CR_MASK0", .addr = A_CR_MASK0,
666
+ .rsvd = 0x00000000,
667
+ },{ .name = "CR_MASK1", .addr = A_CR_MASK1,
668
+ .rsvd = 0x00000000,
669
+ },{ .name = "CR_MASK2", .addr = A_CR_MASK2,
670
+ .rsvd = 0x00000000,
671
+ },{ .name = "CR_MASK3", .addr = A_CR_MASK3,
672
+ .rsvd = 0xffffffff,
673
+ },{ .name = "CTL0", .addr = A_CTL0,
674
+ .rsvd = 0xfffffff8,
675
+ },{ .name = "CTL1", .addr = A_CTL1,
676
+ .rsvd = 0xffffffff,
677
+ },{ .name = "CTL2", .addr = A_CTL2,
678
+ .rsvd = 0xffffffff,
679
+ },{ .name = "CTL3", .addr = A_CTL3,
680
+ .rsvd = 0xffffffff,
681
+ },{ .name = "CFRM_ISR0", .addr = A_CFRM_ISR0,
682
+ .rsvd = 0xffc04000,
683
+ .w1c = 0x3bfff,
684
+ },{ .name = "CFRM_ISR1", .addr = A_CFRM_ISR1,
685
+ .rsvd = 0xffffffff,
686
+ },{ .name = "CFRM_ISR2", .addr = A_CFRM_ISR2,
687
+ .rsvd = 0xffffffff,
688
+ },{ .name = "CFRM_ISR3", .addr = A_CFRM_ISR3,
689
+ .rsvd = 0xffffffff,
690
+ .post_write = cfrm_isr_postw,
691
+ },{ .name = "CFRM_IMR0", .addr = A_CFRM_IMR0,
692
+ .rsvd = 0xffc04000,
693
+ .ro = 0xfffff,
694
+ .reset = 0x3bfff,
695
+ },{ .name = "CFRM_IMR1", .addr = A_CFRM_IMR1,
696
+ .rsvd = 0xffffffff,
697
+ },{ .name = "CFRM_IMR2", .addr = A_CFRM_IMR2,
698
+ .rsvd = 0xffffffff,
699
+ },{ .name = "CFRM_IMR3", .addr = A_CFRM_IMR3,
700
+ .rsvd = 0xffffffff,
701
+ },{ .name = "CFRM_IER0", .addr = A_CFRM_IER0,
702
+ .rsvd = 0xffc04000,
703
+ },{ .name = "CFRM_IER1", .addr = A_CFRM_IER1,
704
+ .rsvd = 0xffffffff,
705
+ },{ .name = "CFRM_IER2", .addr = A_CFRM_IER2,
706
+ .rsvd = 0xffffffff,
707
+ },{ .name = "CFRM_IER3", .addr = A_CFRM_IER3,
708
+ .rsvd = 0xffffffff,
709
+ .pre_write = cfrm_ier_prew,
710
+ },{ .name = "CFRM_IDR0", .addr = A_CFRM_IDR0,
711
+ .rsvd = 0xffc04000,
712
+ },{ .name = "CFRM_IDR1", .addr = A_CFRM_IDR1,
713
+ .rsvd = 0xffffffff,
714
+ },{ .name = "CFRM_IDR2", .addr = A_CFRM_IDR2,
715
+ .rsvd = 0xffffffff,
716
+ },{ .name = "CFRM_IDR3", .addr = A_CFRM_IDR3,
717
+ .rsvd = 0xffffffff,
718
+ .pre_write = cfrm_idr_prew,
719
+ },{ .name = "CFRM_ITR0", .addr = A_CFRM_ITR0,
720
+ .rsvd = 0xffc04000,
721
+ },{ .name = "CFRM_ITR1", .addr = A_CFRM_ITR1,
722
+ .rsvd = 0xffffffff,
723
+ },{ .name = "CFRM_ITR2", .addr = A_CFRM_ITR2,
724
+ .rsvd = 0xffffffff,
725
+ },{ .name = "CFRM_ITR3", .addr = A_CFRM_ITR3,
726
+ .rsvd = 0xffffffff,
727
+ .pre_write = cfrm_itr_prew,
728
+ },{ .name = "SEU_SYNDRM00", .addr = A_SEU_SYNDRM00,
729
+ },{ .name = "SEU_SYNDRM01", .addr = A_SEU_SYNDRM01,
730
+ },{ .name = "SEU_SYNDRM02", .addr = A_SEU_SYNDRM02,
731
+ },{ .name = "SEU_SYNDRM03", .addr = A_SEU_SYNDRM03,
732
+ },{ .name = "SEU_SYNDRM10", .addr = A_SEU_SYNDRM10,
733
+ },{ .name = "SEU_SYNDRM11", .addr = A_SEU_SYNDRM11,
734
+ },{ .name = "SEU_SYNDRM12", .addr = A_SEU_SYNDRM12,
735
+ },{ .name = "SEU_SYNDRM13", .addr = A_SEU_SYNDRM13,
736
+ },{ .name = "SEU_SYNDRM20", .addr = A_SEU_SYNDRM20,
737
+ },{ .name = "SEU_SYNDRM21", .addr = A_SEU_SYNDRM21,
738
+ },{ .name = "SEU_SYNDRM22", .addr = A_SEU_SYNDRM22,
739
+ },{ .name = "SEU_SYNDRM23", .addr = A_SEU_SYNDRM23,
740
+ },{ .name = "SEU_SYNDRM30", .addr = A_SEU_SYNDRM30,
741
+ },{ .name = "SEU_SYNDRM31", .addr = A_SEU_SYNDRM31,
742
+ },{ .name = "SEU_SYNDRM32", .addr = A_SEU_SYNDRM32,
743
+ },{ .name = "SEU_SYNDRM33", .addr = A_SEU_SYNDRM33,
744
+ },{ .name = "SEU_VIRTUAL_SYNDRM0", .addr = A_SEU_VIRTUAL_SYNDRM0,
745
+ },{ .name = "SEU_VIRTUAL_SYNDRM1", .addr = A_SEU_VIRTUAL_SYNDRM1,
746
+ },{ .name = "SEU_VIRTUAL_SYNDRM2", .addr = A_SEU_VIRTUAL_SYNDRM2,
747
+ },{ .name = "SEU_VIRTUAL_SYNDRM3", .addr = A_SEU_VIRTUAL_SYNDRM3,
748
+ },{ .name = "SEU_CRC0", .addr = A_SEU_CRC0,
749
+ },{ .name = "SEU_CRC1", .addr = A_SEU_CRC1,
750
+ },{ .name = "SEU_CRC2", .addr = A_SEU_CRC2,
751
+ },{ .name = "SEU_CRC3", .addr = A_SEU_CRC3,
752
+ },{ .name = "CFRAME_FAR_BOT0", .addr = A_CFRAME_FAR_BOT0,
753
+ },{ .name = "CFRAME_FAR_BOT1", .addr = A_CFRAME_FAR_BOT1,
754
+ },{ .name = "CFRAME_FAR_BOT2", .addr = A_CFRAME_FAR_BOT2,
755
+ },{ .name = "CFRAME_FAR_BOT3", .addr = A_CFRAME_FAR_BOT3,
756
+ },{ .name = "CFRAME_FAR_TOP0", .addr = A_CFRAME_FAR_TOP0,
757
+ },{ .name = "CFRAME_FAR_TOP1", .addr = A_CFRAME_FAR_TOP1,
758
+ },{ .name = "CFRAME_FAR_TOP2", .addr = A_CFRAME_FAR_TOP2,
759
+ },{ .name = "CFRAME_FAR_TOP3", .addr = A_CFRAME_FAR_TOP3,
760
+ },{ .name = "LAST_FRAME_BOT0", .addr = A_LAST_FRAME_BOT0,
761
+ .ro = 0xffffffff,
762
+ .post_read = cfrm_last_frame_bot_post_read,
763
+ },{ .name = "LAST_FRAME_BOT1", .addr = A_LAST_FRAME_BOT1,
764
+ .ro = 0xffffffff,
765
+ .post_read = cfrm_last_frame_bot_post_read,
766
+ },{ .name = "LAST_FRAME_BOT2", .addr = A_LAST_FRAME_BOT2,
767
+ .ro = 0xffffffff,
768
+ .post_read = cfrm_last_frame_bot_post_read,
769
+ },{ .name = "LAST_FRAME_BOT3", .addr = A_LAST_FRAME_BOT3,
770
+ .ro = 0xffffffff,
771
+ .post_read = cfrm_last_frame_bot_post_read,
772
+ },{ .name = "LAST_FRAME_TOP0", .addr = A_LAST_FRAME_TOP0,
773
+ .ro = 0xffffffff,
774
+ .post_read = cfrm_last_frame_top_post_read,
775
+ },{ .name = "LAST_FRAME_TOP1", .addr = A_LAST_FRAME_TOP1,
776
+ .ro = 0xffffffff,
777
+ .post_read = cfrm_last_frame_top_post_read,
778
+ },{ .name = "LAST_FRAME_TOP2", .addr = A_LAST_FRAME_TOP2,
779
+ .ro = 0xffffffff,
780
+ .post_read = cfrm_last_frame_top_post_read,
781
+ },{ .name = "LAST_FRAME_TOP3", .addr = A_LAST_FRAME_TOP3,
782
+ .ro = 0xffffffff,
783
+ .post_read = cfrm_last_frame_top_post_read,
784
+ }
785
+};
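The `.rsvd`, `.ro` and `.w1c` masks in the table above are interpreted by QEMU's register API rather than by this file. A deliberately simplified standalone model of how such masks shape a guest write (the authoritative semantics, including reserved-bit logging, live in QEMU's hw/core/register.c; this sketch only captures the masking):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model: read-only, reserved and write-1-to-clear bits are not
 * directly overwritten by the guest value; w1c bits written as 1 clear. */
static uint32_t reg_write_model(uint32_t old, uint32_t val,
                                uint32_t ro, uint32_t rsvd, uint32_t w1c)
{
    uint32_t keep = ro | rsvd | w1c;           /* bits the write can't set */
    uint32_t new_val = (old & keep) | (val & ~keep);

    new_val &= ~(val & w1c);                   /* write-1-to-clear */
    return new_val;
}
```

For instance, writing 1 to a pending CFRM_ISR0-style bit clears that bit and leaves other pending bits alone.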
786
+
787
+static void cframe_reg_cfi_transfer_packet(XlnxCfiIf *cfi_if,
788
+ XlnxCfiPacket *pkt)
789
+{
790
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(cfi_if);
791
+ uint64_t we = MAKE_64BIT_MASK(0, 4 * 8);
792
+
793
+ if (!s->row_configured) {
794
+ return;
795
+ }
796
+
797
+ switch (pkt->reg_addr) {
798
+ case CFRAME_FAR:
799
+ s->regs[R_FAR0] = pkt->data[0];
800
+ break;
801
+ case CFRAME_SFR:
802
+ s->regs[R_FAR_SFR0] = pkt->data[0];
803
+ register_write(&s->regs_info[R_FAR_SFR3], 0,
804
+ we, object_get_typename(OBJECT(s)),
805
+ XLNX_VERSAL_CFRAME_REG_ERR_DEBUG);
806
+ break;
807
+ case CFRAME_FDRI:
808
+ s->regs[R_FDRI0] = pkt->data[0];
809
+ s->regs[R_FDRI1] = pkt->data[1];
810
+ s->regs[R_FDRI2] = pkt->data[2];
811
+ register_write(&s->regs_info[R_FDRI3], pkt->data[3],
812
+ we, object_get_typename(OBJECT(s)),
813
+ XLNX_VERSAL_CFRAME_REG_ERR_DEBUG);
814
+ break;
815
+ case CFRAME_CMD:
816
+ ARRAY_FIELD_DP32(s->regs, CMD0, CMD, pkt->data[0]);
817
+
818
+ register_write(&s->regs_info[R_CMD3], 0,
819
+ we, object_get_typename(OBJECT(s)),
820
+ XLNX_VERSAL_CFRAME_REG_ERR_DEBUG);
821
+ break;
822
+ default:
823
+ break;
824
+ }
825
+}
826
+
827
+static uint64_t cframe_reg_fdri_read(void *opaque, hwaddr addr, unsigned size)
828
+{
829
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Unsupported read from addr=%"
830
+ HWADDR_PRIx "\n", __func__, addr);
831
+ return 0;
832
+}
833
+
834
+static void cframe_reg_fdri_write(void *opaque, hwaddr addr, uint64_t value,
835
+ unsigned size)
836
+{
837
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(opaque);
838
+ uint32_t wfifo[WFIFO_SZ];
839
+
840
+ if (update_wfifo(addr, value, s->wfifo, wfifo)) {
841
+ uint64_t we = MAKE_64BIT_MASK(0, 4 * 8);
842
+
843
+ s->regs[R_FDRI0] = wfifo[0];
844
+ s->regs[R_FDRI1] = wfifo[1];
845
+ s->regs[R_FDRI2] = wfifo[2];
846
+ register_write(&s->regs_info[R_FDRI3], wfifo[3],
847
+ we, object_get_typename(OBJECT(s)),
848
+ XLNX_VERSAL_CFRAME_REG_ERR_DEBUG);
849
+ }
850
+}
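The FDRI keyhole write above leans on an `update_wfifo()` helper that is defined elsewhere in the series. A plausible standalone model of its behaviour, inferred from how the registers are documented (128 bits wide, written as four sequential 32-bit accesses with address bits [3:2] selecting the word, the full value acted on when the last word arrives) — treat the details as an assumption, not the actual helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WFIFO_SZ 4

/* Accumulate one 32-bit write into a 128-bit word FIFO. Returns true and
 * copies the completed 128-bit value to 'out' (clearing the accumulator)
 * when the fourth word lands; returns false otherwise. */
static bool update_wfifo_model(uint64_t addr, uint32_t value,
                               uint32_t *wfifo, uint32_t *out)
{
    unsigned idx = (addr >> 2) & 3;   /* address bits [3:2] pick the word */

    wfifo[idx] = value;
    if (idx == WFIFO_SZ - 1) {
        memcpy(out, wfifo, WFIFO_SZ * sizeof(uint32_t));
        memset(wfifo, 0, WFIFO_SZ * sizeof(uint32_t));
        return true;
    }
    return false;
}
```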
851
+
852
+static void cframe_reg_reset_enter(Object *obj, ResetType type)
853
+{
854
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(obj);
855
+ unsigned int i;
856
+
857
+ for (i = 0; i < ARRAY_SIZE(s->regs_info); ++i) {
858
+ register_reset(&s->regs_info[i]);
859
+ }
860
+ memset(s->wfifo, 0, WFIFO_SZ * sizeof(uint32_t));
861
+ fifo32_reset(&s->new_f_data);
862
+
863
+ if (g_tree_nnodes(s->cframes)) {
864
+ /*
865
+ * Take a reference so when g_tree_destroy() unrefs it we keep the
866
+ * GTree and only destroy its contents. NB: when our minimum
867
+ * glib version is at least 2.70 we could use g_tree_remove_all().
868
+ */
869
+ g_tree_ref(s->cframes);
870
+ g_tree_destroy(s->cframes);
871
+ }
872
+}
873
+
874
+static void cframe_reg_reset_hold(Object *obj)
875
+{
876
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(obj);
877
+
878
+ cfrm_imr_update_irq(s);
879
+}
880
+
881
+static const MemoryRegionOps cframe_reg_ops = {
882
+ .read = register_read_memory,
883
+ .write = register_write_memory,
884
+ .endianness = DEVICE_LITTLE_ENDIAN,
885
+ .valid = {
886
+ .min_access_size = 4,
887
+ .max_access_size = 4,
888
+ },
889
+};
890
+
891
+static const MemoryRegionOps cframe_reg_fdri_ops = {
892
+ .read = cframe_reg_fdri_read,
893
+ .write = cframe_reg_fdri_write,
894
+ .endianness = DEVICE_LITTLE_ENDIAN,
895
+ .valid = {
896
+ .min_access_size = 4,
897
+ .max_access_size = 4,
898
+ },
899
+};
900
+
901
+static void cframe_reg_realize(DeviceState *dev, Error **errp)
902
+{
903
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(dev);
904
+
905
+ for (int i = 0; i < ARRAY_SIZE(s->cfg.blktype_num_frames); i++) {
906
+ if (s->cfg.blktype_num_frames[i] > MAX_BLOCKTYPE_FRAMES) {
907
+ error_setg(errp,
908
+ "blktype-frames%d > 0xFFFFF (max frame per block)",
909
+ i);
910
+ return;
911
+ }
912
+ if (s->cfg.blktype_num_frames[i]) {
913
+ s->row_configured = true;
914
+ }
915
+ }
916
+}
917
+
918
+static void cframe_reg_init(Object *obj)
919
+{
920
+ XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(obj);
921
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
922
+ RegisterInfoArray *reg_array;
923
+
924
+ memory_region_init(&s->iomem, obj, TYPE_XLNX_VERSAL_CFRAME_REG,
925
+ CFRAME_REG_R_MAX * 4);
926
+ reg_array =
927
+ register_init_block32(DEVICE(obj), cframe_reg_regs_info,
928
+ ARRAY_SIZE(cframe_reg_regs_info),
929
+ s->regs_info, s->regs,
930
+ &cframe_reg_ops,
931
+ XLNX_VERSAL_CFRAME_REG_ERR_DEBUG,
932
+ CFRAME_REG_R_MAX * 4);
933
+ memory_region_add_subregion(&s->iomem,
934
+ 0x0,
935
+ &reg_array->mem);
936
+ sysbus_init_mmio(sbd, &s->iomem);
937
+ memory_region_init_io(&s->iomem_fdri, obj, &cframe_reg_fdri_ops, s,
938
+ TYPE_XLNX_VERSAL_CFRAME_REG "-fdri",
939
+ KEYHOLE_STREAM_4K);
940
+ sysbus_init_mmio(sbd, &s->iomem_fdri);
941
+ sysbus_init_irq(sbd, &s->irq_cfrm_imr);
942
+
943
+ s->cframes = g_tree_new_full((GCompareDataFunc)int_cmp, NULL,
944
+ NULL, (GDestroyNotify)g_free);
945
+ fifo32_create(&s->new_f_data, FRAME_NUM_WORDS);
946
+}
947
+
948
+static const VMStateDescription vmstate_cframe = {
949
+ .name = "cframe",
950
+ .version_id = 1,
951
+ .minimum_version_id = 1,
952
+ .fields = (VMStateField[]) {
953
+ VMSTATE_UINT32_ARRAY(data, XlnxCFrame, FRAME_NUM_WORDS),
954
+ VMSTATE_END_OF_LIST()
955
+ }
956
+};
957
+
958
+static const VMStateDescription vmstate_cframe_reg = {
959
+ .name = TYPE_XLNX_VERSAL_CFRAME_REG,
960
+ .version_id = 1,
961
+ .minimum_version_id = 1,
962
+ .fields = (VMStateField[]) {
963
+ VMSTATE_UINT32_ARRAY(wfifo, XlnxVersalCFrameReg, 4),
964
+ VMSTATE_UINT32_ARRAY(regs, XlnxVersalCFrameReg, CFRAME_REG_R_MAX),
965
+ VMSTATE_BOOL(rowon, XlnxVersalCFrameReg),
966
+ VMSTATE_BOOL(wcfg, XlnxVersalCFrameReg),
967
+ VMSTATE_BOOL(rcfg, XlnxVersalCFrameReg),
968
+ VMSTATE_GTREE_DIRECT_KEY_V(cframes, XlnxVersalCFrameReg, 1,
969
+ &vmstate_cframe, XlnxCFrame),
970
+ VMSTATE_FIFO32(new_f_data, XlnxVersalCFrameReg),
971
+ VMSTATE_END_OF_LIST(),
972
+ }
973
+};
974
+
975
+static Property cframe_regs_props[] = {
976
+ DEFINE_PROP_LINK("cfu-fdro", XlnxVersalCFrameReg, cfg.cfu_fdro,
977
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
978
+ DEFINE_PROP_UINT32("blktype0-frames", XlnxVersalCFrameReg,
979
+ cfg.blktype_num_frames[0], 0),
980
+ DEFINE_PROP_UINT32("blktype1-frames", XlnxVersalCFrameReg,
981
+ cfg.blktype_num_frames[1], 0),
982
+ DEFINE_PROP_UINT32("blktype2-frames", XlnxVersalCFrameReg,
983
+ cfg.blktype_num_frames[2], 0),
984
+ DEFINE_PROP_UINT32("blktype3-frames", XlnxVersalCFrameReg,
985
+ cfg.blktype_num_frames[3], 0),
986
+ DEFINE_PROP_UINT32("blktype4-frames", XlnxVersalCFrameReg,
987
+ cfg.blktype_num_frames[4], 0),
988
+ DEFINE_PROP_UINT32("blktype5-frames", XlnxVersalCFrameReg,
989
+ cfg.blktype_num_frames[5], 0),
990
+ DEFINE_PROP_UINT32("blktype6-frames", XlnxVersalCFrameReg,
991
+ cfg.blktype_num_frames[6], 0),
992
+ DEFINE_PROP_END_OF_LIST(),
993
+};
994
+
995
+static void cframe_reg_class_init(ObjectClass *klass, void *data)
996
+{
997
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
998
+ DeviceClass *dc = DEVICE_CLASS(klass);
999
+ XlnxCfiIfClass *xcic = XLNX_CFI_IF_CLASS(klass);
1000
+
1001
+ dc->vmsd = &vmstate_cframe_reg;
1002
+ dc->realize = cframe_reg_realize;
1003
+ rc->phases.enter = cframe_reg_reset_enter;
1004
+ rc->phases.hold = cframe_reg_reset_hold;
1005
+ device_class_set_props(dc, cframe_regs_props);
1006
+ xcic->cfi_transfer_packet = cframe_reg_cfi_transfer_packet;
1007
+}
1008
+
1009
+static const TypeInfo cframe_reg_info = {
1010
+ .name = TYPE_XLNX_VERSAL_CFRAME_REG,
1011
+ .parent = TYPE_SYS_BUS_DEVICE,
1012
+ .instance_size = sizeof(XlnxVersalCFrameReg),
1013
+ .class_init = cframe_reg_class_init,
1014
+ .instance_init = cframe_reg_init,
1015
+ .interfaces = (InterfaceInfo[]) {
1016
+ { TYPE_XLNX_CFI_IF },
1017
+ { }
1018
+ }
1019
+};
1020
+
1021
+static void cframe_reg_register_types(void)
1022
+{
1023
+ type_register_static(&cframe_reg_info);
1024
+}
1025
+
1026
+type_init(cframe_reg_register_types)
1027
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
127
index XXXXXXX..XXXXXXX 100644
1028
index XXXXXXX..XXXXXXX 100644
128
--- a/target/arm/sve.decode
1029
--- a/hw/misc/meson.build
129
+++ b/target/arm/sve.decode
1030
+++ b/hw/misc/meson.build
130
@@ -XXX,XX +XXX,XX @@
1031
@@ -XXX,XX +XXX,XX @@ system_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files(
131
&rprr_gather_load rd pg rn rm esz msz u ff xs scale
1032
'xlnx-versal-pmc-iou-slcr.c',
132
&rpri_gather_load rd pg rn imm esz msz u ff
1033
'xlnx-versal-cfu.c',
133
&rprr_scatter_store rd pg rn rm esz msz xs scale
1034
'xlnx-cfi-if.c',
134
+&rpri_scatter_store rd pg rn imm esz msz
1035
+ 'xlnx-versal-cframe-reg.c',
135
1036
))
136
###########################################################################
1037
system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: files('stm32f2xx_syscfg.c'))
137
# Named instruction formats. These are generally used to
1038
system_ss.add(when: 'CONFIG_STM32F4XX_SYSCFG', if_true: files('stm32f4xx_syscfg.c'))
138
@@ -XXX,XX +XXX,XX @@
139
&rprr_store nreg=0
140
@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
141
&rprr_scatter_store
142
+@rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \
143
+ &rpri_scatter_store
144
145
###########################################################################
146
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
147
@@ -XXX,XX +XXX,XX @@ ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \
148
ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \
149
@rprr_scatter_store xs=2 esz=3 scale=0
150
151
+# SVE 64-bit scatter store (vector plus immediate)
152
+ST1_zpiz 1110010 .. 10 ..... 101 ... ..... ..... \
153
+ @rpri_scatter_store esz=3
154
+
155
+# SVE 32-bit scatter store (vector plus immediate)
156
+ST1_zpiz 1110010 .. 11 ..... 101 ... ..... ..... \
157
+ @rpri_scatter_store esz=2
158
+
159
# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset)
160
# Require msz > 0
161
ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \
162
--
1039
--
163
2.17.1
1040
2.34.1
164
165

1
From: Eric Auger <eric.auger@redhat.com>
1
From: Francisco Iglesias <francisco.iglesias@amd.com>
2
2
3
When running dtc on the guest /proc/device-tree we get the
3
Introduce a model of Xilinx Versal's Configuration Frame broadcast
4
following warnings: "Warning (unit_address_vs_reg): Node <name>
4
controller (CFRAME_BCAST_REG).
5
has a reg or ranges property, but no unit name", with name:
6
/intc, /intc/its, /intc/v2m.
7
5
8
Nodes should have a name in the form <name>[@<unit-address>] where
6
Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
9
unit-address is the primary address used to access the device, listed
10
in the node's reg property. This fix seems to make dtc happy.
11
12
Signed-off-by: Eric Auger <eric.auger@redhat.com>
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 1530044492-24921-3-git-send-email-eric.auger@redhat.com
8
Message-id: 20230831165701.2016397-7-francisco.iglesias@amd.com
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
10
---
17
hw/arm/virt.c | 63 +++++++++++++++++++++++++++++++--------------------
11
include/hw/misc/xlnx-versal-cframe-reg.h | 17 +++
18
1 file changed, 39 insertions(+), 24 deletions(-)
12
hw/misc/xlnx-versal-cframe-reg.c | 161 +++++++++++++++++++++++
13
2 files changed, 178 insertions(+)
19
14
20
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
15
diff --git a/include/hw/misc/xlnx-versal-cframe-reg.h b/include/hw/misc/xlnx-versal-cframe-reg.h
21
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
22
--- a/hw/arm/virt.c
17
--- a/include/hw/misc/xlnx-versal-cframe-reg.h
23
+++ b/hw/arm/virt.c
18
+++ b/include/hw/misc/xlnx-versal-cframe-reg.h
24
@@ -XXX,XX +XXX,XX @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
19
@@ -XXX,XX +XXX,XX @@
25
20
#define TYPE_XLNX_VERSAL_CFRAME_REG "xlnx,cframe-reg"
26
static void fdt_add_its_gic_node(VirtMachineState *vms)
21
OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFrameReg, XLNX_VERSAL_CFRAME_REG)
22
23
+#define TYPE_XLNX_VERSAL_CFRAME_BCAST_REG "xlnx.cframe-bcast-reg"
24
+OBJECT_DECLARE_SIMPLE_TYPE(XlnxVersalCFrameBcastReg,
25
+ XLNX_VERSAL_CFRAME_BCAST_REG)
26
+
27
/*
28
* The registers in this module are 128 bits wide but it is ok to write
29
* and read them through 4 sequential 32 bit accesses (address[3:2] = 0,
30
@@ -XXX,XX +XXX,XX @@ struct XlnxVersalCFrameReg {
31
bool row_configured;
32
};
33
34
+struct XlnxVersalCFrameBcastReg {
35
+ SysBusDevice parent_obj;
36
+ MemoryRegion iomem_reg;
37
+ MemoryRegion iomem_fdri;
38
+
39
+ /* 128-bit wfifo. */
40
+ uint32_t wfifo[WFIFO_SZ];
41
+
42
+ struct {
43
+ XlnxCfiIf *cframe[15];
44
+ } cfg;
45
+};
46
+
47
#endif
48
diff --git a/hw/misc/xlnx-versal-cframe-reg.c b/hw/misc/xlnx-versal-cframe-reg.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/hw/misc/xlnx-versal-cframe-reg.c
51
+++ b/hw/misc/xlnx-versal-cframe-reg.c
52
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps cframe_reg_fdri_ops = {
53
},
54
};
55
56
+static uint64_t cframes_bcast_reg_read(void *opaque, hwaddr addr, unsigned size)
57
+{
58
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Unsupported read from addr=%"
59
+ HWADDR_PRIx "\n", __func__, addr);
60
+ return 0;
61
+}
62
+
63
+static void cframes_bcast_write(XlnxVersalCFrameBcastReg *s, uint8_t reg_addr,
64
+ uint32_t *wfifo)
65
+{
66
+ XlnxCfiPacket pkt = {
67
+ .reg_addr = reg_addr,
68
+ .data[0] = wfifo[0],
69
+ .data[1] = wfifo[1],
70
+ .data[2] = wfifo[2],
71
+ .data[3] = wfifo[3]
72
+ };
73
+
74
+ for (int i = 0; i < ARRAY_SIZE(s->cfg.cframe); i++) {
75
+ if (s->cfg.cframe[i]) {
76
+ xlnx_cfi_transfer_packet(s->cfg.cframe[i], &pkt);
77
+ }
78
+ }
79
+}
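The broadcast path above builds one CFI packet and fans it out to every populated downstream cframe link, skipping NULL slots. The same pattern as a standalone sketch (the `Packet`/`Sink` types here are stand-ins for XlnxCfiPacket and the XlnxCfiIf link array):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t reg_addr; uint32_t data[4]; } Packet;
typedef void (*TransferFn)(void *opaque, const Packet *pkt);
typedef struct { TransferFn fn; void *opaque; } Sink;

/* Mirrors cframes_bcast_write(): deliver one packet to every configured
 * (non-NULL) downstream interface; unpopulated slots are simply skipped. */
static void bcast_packet(Sink *sinks, size_t n, const Packet *pkt)
{
    for (size_t i = 0; i < n; i++) {
        if (sinks[i].fn) {
            sinks[i].fn(sinks[i].opaque, pkt);
        }
    }
}

/* Demo sink that just counts deliveries. */
static void count_sink(void *opaque, const Packet *pkt)
{
    (void)pkt;
    (*(int *)opaque)++;
}
```

Because absent slots are tolerated, a board can wire up any subset of the 15 cframe links without special-casing the broadcast device.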
80
+
81
+static void cframes_bcast_reg_write(void *opaque, hwaddr addr, uint64_t value,
82
+ unsigned size)
83
+{
84
+ XlnxVersalCFrameBcastReg *s = XLNX_VERSAL_CFRAME_BCAST_REG(opaque);
85
+ uint32_t wfifo[WFIFO_SZ];
86
+
87
+ if (update_wfifo(addr, value, s->wfifo, wfifo)) {
88
+ uint8_t reg_addr = extract32(addr, 4, 6);
89
+
90
+ cframes_bcast_write(s, reg_addr, wfifo);
91
+ }
92
+}
93
+
94
+static uint64_t cframes_bcast_fdri_read(void *opaque, hwaddr addr,
95
+ unsigned size)
96
+{
97
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Unsupported read from addr=%"
98
+ HWADDR_PRIx "\n", __func__, addr);
99
+ return 0;
100
+}
101
+
102
+static void cframes_bcast_fdri_write(void *opaque, hwaddr addr, uint64_t value,
103
+ unsigned size)
104
+{
105
+ XlnxVersalCFrameBcastReg *s = XLNX_VERSAL_CFRAME_BCAST_REG(opaque);
106
+ uint32_t wfifo[WFIFO_SZ];
107
+
108
+ if (update_wfifo(addr, value, s->wfifo, wfifo)) {
109
+ cframes_bcast_write(s, CFRAME_FDRI, wfifo);
110
+ }
111
+}
112
+
113
+static const MemoryRegionOps cframes_bcast_reg_reg_ops = {
114
+ .read = cframes_bcast_reg_read,
115
+ .write = cframes_bcast_reg_write,
116
+ .endianness = DEVICE_LITTLE_ENDIAN,
117
+ .valid = {
118
+ .min_access_size = 4,
119
+ .max_access_size = 4,
120
+ },
121
+};
122
+
123
+static const MemoryRegionOps cframes_bcast_reg_fdri_ops = {
124
+ .read = cframes_bcast_fdri_read,
125
+ .write = cframes_bcast_fdri_write,
126
+ .endianness = DEVICE_LITTLE_ENDIAN,
127
+ .valid = {
128
+ .min_access_size = 4,
129
+ .max_access_size = 4,
130
+ },
131
+};
132
+
133
static void cframe_reg_realize(DeviceState *dev, Error **errp)
27
{
134
{
28
+ char *nodename;
135
XlnxVersalCFrameReg *s = XLNX_VERSAL_CFRAME_REG(dev);
29
+
136
@@ -XXX,XX +XXX,XX @@ static Property cframe_regs_props[] = {
30
vms->msi_phandle = qemu_fdt_alloc_phandle(vms->fdt);
137
DEFINE_PROP_END_OF_LIST(),
31
- qemu_fdt_add_subnode(vms->fdt, "/intc/its");
138
};
32
- qemu_fdt_setprop_string(vms->fdt, "/intc/its", "compatible",
139
33
+ nodename = g_strdup_printf("/intc/its@%" PRIx64,
140
+static void cframe_bcast_reg_init(Object *obj)
34
+ vms->memmap[VIRT_GIC_ITS].base);
141
+{
35
+ qemu_fdt_add_subnode(vms->fdt, nodename);
142
+ XlnxVersalCFrameBcastReg *s = XLNX_VERSAL_CFRAME_BCAST_REG(obj);
36
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
143
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
37
"arm,gic-v3-its");
144
+
38
- qemu_fdt_setprop(vms->fdt, "/intc/its", "msi-controller", NULL, 0);
145
+ memory_region_init_io(&s->iomem_reg, obj, &cframes_bcast_reg_reg_ops, s,
39
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc/its", "reg",
146
+ TYPE_XLNX_VERSAL_CFRAME_BCAST_REG, KEYHOLE_STREAM_4K);
40
+ qemu_fdt_setprop(vms->fdt, nodename, "msi-controller", NULL, 0);
147
+ memory_region_init_io(&s->iomem_fdri, obj, &cframes_bcast_reg_fdri_ops, s,
41
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
148
+ TYPE_XLNX_VERSAL_CFRAME_BCAST_REG "-fdri",
42
2, vms->memmap[VIRT_GIC_ITS].base,
149
+ KEYHOLE_STREAM_4K);
43
2, vms->memmap[VIRT_GIC_ITS].size);
150
+ sysbus_init_mmio(sbd, &s->iomem_reg);
44
- qemu_fdt_setprop_cell(vms->fdt, "/intc/its", "phandle", vms->msi_phandle);
151
+ sysbus_init_mmio(sbd, &s->iomem_fdri);
45
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->msi_phandle);
152
+}
46
+ g_free(nodename);
153
+
154
+static void cframe_bcast_reg_reset_enter(Object *obj, ResetType type)
155
+{
156
+ XlnxVersalCFrameBcastReg *s = XLNX_VERSAL_CFRAME_BCAST_REG(obj);
157
+
158
+ memset(s->wfifo, 0, WFIFO_SZ * sizeof(uint32_t));
159
+}
160
+
161
+static const VMStateDescription vmstate_cframe_bcast_reg = {
162
+ .name = TYPE_XLNX_VERSAL_CFRAME_BCAST_REG,
163
+ .version_id = 1,
164
+ .minimum_version_id = 1,
165
+ .fields = (VMStateField[]) {
166
+ VMSTATE_UINT32_ARRAY(wfifo, XlnxVersalCFrameBcastReg, 4),
167
+ VMSTATE_END_OF_LIST(),
168
+ }
169
+};
170
+
171
+static Property cframe_bcast_regs_props[] = {
172
+ DEFINE_PROP_LINK("cframe0", XlnxVersalCFrameBcastReg, cfg.cframe[0],
173
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
174
+ DEFINE_PROP_LINK("cframe1", XlnxVersalCFrameBcastReg, cfg.cframe[1],
175
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
176
+ DEFINE_PROP_LINK("cframe2", XlnxVersalCFrameBcastReg, cfg.cframe[2],
177
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
178
+ DEFINE_PROP_LINK("cframe3", XlnxVersalCFrameBcastReg, cfg.cframe[3],
179
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
180
+ DEFINE_PROP_LINK("cframe4", XlnxVersalCFrameBcastReg, cfg.cframe[4],
181
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
182
+ DEFINE_PROP_LINK("cframe5", XlnxVersalCFrameBcastReg, cfg.cframe[5],
183
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
184
+ DEFINE_PROP_LINK("cframe6", XlnxVersalCFrameBcastReg, cfg.cframe[6],
185
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
186
+ DEFINE_PROP_LINK("cframe7", XlnxVersalCFrameBcastReg, cfg.cframe[7],
187
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
188
+ DEFINE_PROP_LINK("cframe8", XlnxVersalCFrameBcastReg, cfg.cframe[8],
189
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
190
+ DEFINE_PROP_LINK("cframe9", XlnxVersalCFrameBcastReg, cfg.cframe[9],
191
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
192
+ DEFINE_PROP_LINK("cframe10", XlnxVersalCFrameBcastReg, cfg.cframe[10],
193
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
194
+ DEFINE_PROP_LINK("cframe11", XlnxVersalCFrameBcastReg, cfg.cframe[11],
195
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
196
+ DEFINE_PROP_LINK("cframe12", XlnxVersalCFrameBcastReg, cfg.cframe[12],
197
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
198
+ DEFINE_PROP_LINK("cframe13", XlnxVersalCFrameBcastReg, cfg.cframe[13],
199
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
200
+ DEFINE_PROP_LINK("cframe14", XlnxVersalCFrameBcastReg, cfg.cframe[14],
201
+ TYPE_XLNX_CFI_IF, XlnxCfiIf *),
202
+ DEFINE_PROP_END_OF_LIST(),
203
+};
204
+
205
static void cframe_reg_class_init(ObjectClass *klass, void *data)
206
{
207
ResettableClass *rc = RESETTABLE_CLASS(klass);
208
@@ -XXX,XX +XXX,XX @@ static void cframe_reg_class_init(ObjectClass *klass, void *data)
209
xcic->cfi_transfer_packet = cframe_reg_cfi_transfer_packet;
47
}
210
}
48
211
49
static void fdt_add_v2m_gic_node(VirtMachineState *vms)
212
+static void cframe_bcast_reg_class_init(ObjectClass *klass, void *data)
213
+{
214
+ DeviceClass *dc = DEVICE_CLASS(klass);
215
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
216
+
217
+ dc->vmsd = &vmstate_cframe_bcast_reg;
218
+ device_class_set_props(dc, cframe_bcast_regs_props);
219
+ rc->phases.enter = cframe_bcast_reg_reset_enter;
220
+}
221
+
222
static const TypeInfo cframe_reg_info = {
223
.name = TYPE_XLNX_VERSAL_CFRAME_REG,
224
.parent = TYPE_SYS_BUS_DEVICE,
225
@@ -XXX,XX +XXX,XX @@ static const TypeInfo cframe_reg_info = {
226
}
227
};
228
229
+static const TypeInfo cframe_bcast_reg_info = {
230
+ .name = TYPE_XLNX_VERSAL_CFRAME_BCAST_REG,
231
+ .parent = TYPE_SYS_BUS_DEVICE,
232
+ .instance_size = sizeof(XlnxVersalCFrameBcastReg),
233
+ .class_init = cframe_bcast_reg_class_init,
234
+ .instance_init = cframe_bcast_reg_init,
235
+};
236
+
237
static void cframe_reg_register_types(void)
50
{
238
{
51
+ char *nodename;
239
type_register_static(&cframe_reg_info);
52
+
240
+ type_register_static(&cframe_bcast_reg_info);
53
+ nodename = g_strdup_printf("/intc/v2m@%" PRIx64,
54
+ vms->memmap[VIRT_GIC_V2M].base);
55
vms->msi_phandle = qemu_fdt_alloc_phandle(vms->fdt);
56
- qemu_fdt_add_subnode(vms->fdt, "/intc/v2m");
57
- qemu_fdt_setprop_string(vms->fdt, "/intc/v2m", "compatible",
58
+ qemu_fdt_add_subnode(vms->fdt, nodename);
59
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
60
"arm,gic-v2m-frame");
61
- qemu_fdt_setprop(vms->fdt, "/intc/v2m", "msi-controller", NULL, 0);
62
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc/v2m", "reg",
63
+ qemu_fdt_setprop(vms->fdt, nodename, "msi-controller", NULL, 0);
64
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
65
2, vms->memmap[VIRT_GIC_V2M].base,
66
2, vms->memmap[VIRT_GIC_V2M].size);
67
- qemu_fdt_setprop_cell(vms->fdt, "/intc/v2m", "phandle", vms->msi_phandle);
68
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->msi_phandle);
69
+ g_free(nodename);
70
}
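The fix in this patch is purely a naming convention: each node gains a `@<unit-address>` suffix where the unit-address is the base address from its `reg` property, printed in hex without leading zeroes, which is what dtc's `unit_address_vs_reg` check expects. A minimal sketch of the formatting (using plain `snprintf` instead of the glib `g_strdup_printf` used in the patch):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build a DT node name of the form <stem>@<unit-address>. */
static void make_node_name(char *buf, size_t len,
                           const char *stem, uint64_t base)
{
    snprintf(buf, len, "%s@%" PRIx64, stem, base);
}
```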
241
}
71
242
72
static void fdt_add_gic_node(VirtMachineState *vms)
243
type_init(cframe_reg_register_types)
73
{
74
+ char *nodename;
75
+
76
vms->gic_phandle = qemu_fdt_alloc_phandle(vms->fdt);
77
qemu_fdt_setprop_cell(vms->fdt, "/", "interrupt-parent", vms->gic_phandle);
78
79
- qemu_fdt_add_subnode(vms->fdt, "/intc");
80
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "#interrupt-cells", 3);
81
- qemu_fdt_setprop(vms->fdt, "/intc", "interrupt-controller", NULL, 0);
82
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "#address-cells", 0x2);
83
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "#size-cells", 0x2);
84
- qemu_fdt_setprop(vms->fdt, "/intc", "ranges", NULL, 0);
85
+ nodename = g_strdup_printf("/intc@%" PRIx64,
86
+ vms->memmap[VIRT_GIC_DIST].base);
87
+ qemu_fdt_add_subnode(vms->fdt, nodename);
88
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 3);
89
+ qemu_fdt_setprop(vms->fdt, nodename, "interrupt-controller", NULL, 0);
90
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "#address-cells", 0x2);
91
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "#size-cells", 0x2);
92
+ qemu_fdt_setprop(vms->fdt, nodename, "ranges", NULL, 0);
93
if (vms->gic_version == 3) {
94
int nb_redist_regions = virt_gicv3_redist_region_count(vms);
95
96
- qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
97
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
98
"arm,gic-v3");
99
100
- qemu_fdt_setprop_cell(vms->fdt, "/intc",
101
+ qemu_fdt_setprop_cell(vms->fdt, nodename,
102
"#redistributor-regions", nb_redist_regions);
103
104
if (nb_redist_regions == 1) {
105
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
106
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
107
2, vms->memmap[VIRT_GIC_DIST].base,
108
2, vms->memmap[VIRT_GIC_DIST].size,
109
2, vms->memmap[VIRT_GIC_REDIST].base,
110
2, vms->memmap[VIRT_GIC_REDIST].size);
111
} else {
112
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
113
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
114
2, vms->memmap[VIRT_GIC_DIST].base,
115
2, vms->memmap[VIRT_GIC_DIST].size,
116
2, vms->memmap[VIRT_GIC_REDIST].base,
117
@@ -XXX,XX +XXX,XX @@ static void fdt_add_gic_node(VirtMachineState *vms)
118
}
119
120
if (vms->virt) {
121
- qemu_fdt_setprop_cells(vms->fdt, "/intc", "interrupts",
122
+ qemu_fdt_setprop_cells(vms->fdt, nodename, "interrupts",
123
GIC_FDT_IRQ_TYPE_PPI, ARCH_GICV3_MAINT_IRQ,
124
GIC_FDT_IRQ_FLAGS_LEVEL_HI);
125
}
126
} else {
127
/* 'cortex-a15-gic' means 'GIC v2' */
128
- qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
129
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
130
"arm,cortex-a15-gic");
131
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
132
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
133
2, vms->memmap[VIRT_GIC_DIST].base,
134
2, vms->memmap[VIRT_GIC_DIST].size,
135
2, vms->memmap[VIRT_GIC_CPU].base,
136
2, vms->memmap[VIRT_GIC_CPU].size);
137
}
138
139
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "phandle", vms->gic_phandle);
140
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->gic_phandle);
141
+ g_free(nodename);
142
}
143
144
static void fdt_add_pmu_nodes(const VirtMachineState *vms)
145
--
244
--
146
2.17.1
245
2.34.1
147
148
diff view generated by jsdifflib
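The patch above renames fixed device-tree node paths like "/intc" to the node-name@unit-address form ("/intc@<base>") that dtc expects for nodes with a reg property. As an illustration of the naming convention only (plain snprintf stands in for glib's g_strdup_printf, and the 0x8000000 base is just an example value, not taken from the patch):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build a DT node path of the form "<name>@<unit-address>", where the
 * unit address is the node's base address printed in lowercase hex
 * with no leading zeros, matching the g_strdup_printf("%s@%" PRIx64)
 * pattern used in the patch. */
void dt_node_name(char *buf, size_t len, const char *name, uint64_t base)
{
    snprintf(buf, len, "%s@%llx", name, (unsigned long long)base);
}
```

With this rule, a GIC distributor at (for example) 0x8000000 gets the path "/intc@8000000", which is what silences the dtc unit_address_vs_reg warning.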
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-30-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 4 +
target/arm/sve_helper.c | 162 +++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 37 +++++++++
target/arm/sve.decode | 4 +
4 files changed, 207 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)

+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
} while (i != 0);
}

+/*
+ * FP Complex Multiply
+ */
+
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32);
+
+void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+ intptr_t j, i = simd_oprsz(desc);
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+ unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
+ bool flip = rot & 1;
+ float16 neg_imag, neg_real;
+ void *vd = &env->vfp.zregs[rd];
+ void *vn = &env->vfp.zregs[rn];
+ void *vm = &env->vfp.zregs[rm];
+ void *va = &env->vfp.zregs[ra];
+ uint64_t *g = vg;
+
+ neg_imag = float16_set_sign(0, (rot & 2) != 0);
+ neg_real = float16_set_sign(0, rot == 1 || rot == 2);
+
+ do {
+ uint64_t pg = g[(i - 1) >> 6];
+ do {
+ float16 e1, e2, e3, e4, nr, ni, mr, mi, d;
+
+ /* I holds the real index; J holds the imag index. */
+ j = i - sizeof(float16);
+ i -= 2 * sizeof(float16);
+
+ nr = *(float16 *)(vn + H1_2(i));
+ ni = *(float16 *)(vn + H1_2(j));
+ mr = *(float16 *)(vm + H1_2(i));
+ mi = *(float16 *)(vm + H1_2(j));
+
+ e2 = (flip ? ni : nr);
+ e1 = (flip ? mi : mr) ^ neg_real;
+ e4 = e2;
+ e3 = (flip ? mr : mi) ^ neg_imag;
+
+ if (likely((pg >> (i & 63)) & 1)) {
+ d = *(float16 *)(va + H1_2(i));
+ d = float16_muladd(e2, e1, d, 0, &env->vfp.fp_status_f16);
+ *(float16 *)(vd + H1_2(i)) = d;
+ }
+ if (likely((pg >> (j & 63)) & 1)) {
+ d = *(float16 *)(va + H1_2(j));
+ d = float16_muladd(e4, e3, d, 0, &env->vfp.fp_status_f16);
+ *(float16 *)(vd + H1_2(j)) = d;
+ }
+ } while (i & 63);
+ } while (i != 0);
+}
+
+void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
+{
+ intptr_t j, i = simd_oprsz(desc);
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+ unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
+ bool flip = rot & 1;
+ float32 neg_imag, neg_real;
+ void *vd = &env->vfp.zregs[rd];
+ void *vn = &env->vfp.zregs[rn];
+ void *vm = &env->vfp.zregs[rm];
+ void *va = &env->vfp.zregs[ra];
+ uint64_t *g = vg;
+
+ neg_imag = float32_set_sign(0, (rot & 2) != 0);
+ neg_real = float32_set_sign(0, rot == 1 || rot == 2);
+
+ do {
+ uint64_t pg = g[(i - 1) >> 6];
+ do {
+ float32 e1, e2, e3, e4, nr, ni, mr, mi, d;
+
+ /* I holds the real index; J holds the imag index. */
+ j = i - sizeof(float32);
+ i -= 2 * sizeof(float32);
+
+ nr = *(float32 *)(vn + H1_4(i));
+ ni = *(float32 *)(vn + H1_4(j));
+ mr = *(float32 *)(vm + H1_4(i));
+ mi = *(float32 *)(vm + H1_4(j));
+
+ e2 = (flip ? ni : nr);
+ e1 = (flip ? mi : mr) ^ neg_real;
+ e4 = e2;
+ e3 = (flip ? mr : mi) ^ neg_imag;
+
+ if (likely((pg >> (i & 63)) & 1)) {
+ d = *(float32 *)(va + H1_4(i));
+ d = float32_muladd(e2, e1, d, 0, &env->vfp.fp_status);
+ *(float32 *)(vd + H1_4(i)) = d;
+ }
+ if (likely((pg >> (j & 63)) & 1)) {
+ d = *(float32 *)(va + H1_4(j));
+ d = float32_muladd(e4, e3, d, 0, &env->vfp.fp_status);
+ *(float32 *)(vd + H1_4(j)) = d;
+ }
+ } while (i & 63);
+ } while (i != 0);
+}
+
+void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
+{
+ intptr_t j, i = simd_oprsz(desc);
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+ unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
+ bool flip = rot & 1;
+ float64 neg_imag, neg_real;
+ void *vd = &env->vfp.zregs[rd];
+ void *vn = &env->vfp.zregs[rn];
+ void *vm = &env->vfp.zregs[rm];
+ void *va = &env->vfp.zregs[ra];
+ uint64_t *g = vg;
+
+ neg_imag = float64_set_sign(0, (rot & 2) != 0);
+ neg_real = float64_set_sign(0, rot == 1 || rot == 2);
+
+ do {
+ uint64_t pg = g[(i - 1) >> 6];
+ do {
+ float64 e1, e2, e3, e4, nr, ni, mr, mi, d;
+
+ /* I holds the real index; J holds the imag index. */
+ j = i - sizeof(float64);
+ i -= 2 * sizeof(float64);
+
+ nr = *(float64 *)(vn + i);
+ ni = *(float64 *)(vn + j);
+ mr = *(float64 *)(vm + i);
+ mi = *(float64 *)(vm + j);
+
+ e2 = (flip ? ni : nr);
+ e1 = (flip ? mi : mr) ^ neg_real;
+ e4 = e2;
+ e3 = (flip ? mr : mi) ^ neg_imag;
+
+ if (likely((pg >> (i & 63)) & 1)) {
+ d = *(float64 *)(va + i);
+ d = float64_muladd(e2, e1, d, 0, &env->vfp.fp_status);
+ *(float64 *)(vd + i) = d;
+ }
+ if (likely((pg >> (j & 63)) & 1)) {
+ d = *(float64 *)(va + j);
+ d = float64_muladd(e4, e3, d, 0, &env->vfp.fp_status);
+ *(float64 *)(vd + j) = d;
+ }
+ } while (i & 63);
+ } while (i != 0);
+}
+
/*
* Load contiguous data, protected by a governing predicate.
*/
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz)

#undef DO_FMLA

+static bool trans_FCMLA_zpzzz(DisasContext *s,
+ arg_FCMLA_zpzzz *a, uint32_t insn)
+{
+ static gen_helper_sve_fmla * const fns[3] = {
+ gen_helper_sve_fcmla_zpzzz_h,
+ gen_helper_sve_fcmla_zpzzz_s,
+ gen_helper_sve_fcmla_zpzzz_d,
+ };
+
+ if (a->esz == 0) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ unsigned vsz = vec_full_reg_size(s);
+ unsigned desc;
+ TCGv_i32 t_desc;
+ TCGv_ptr pg = tcg_temp_new_ptr();
+
+ /* We would need 7 operands to pass these arguments "properly".
+ * So we encode all the register numbers into the descriptor.
+ */
+ desc = deposit32(a->rd, 5, 5, a->rn);
+ desc = deposit32(desc, 10, 5, a->rm);
+ desc = deposit32(desc, 15, 5, a->ra);
+ desc = deposit32(desc, 20, 2, a->rot);
+ desc = sextract32(desc, 0, 22);
+ desc = simd_desc(vsz, vsz, desc);
+
+ t_desc = tcg_const_i32(desc);
+ tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
+ fns[a->esz - 1](cpu_env, pg, t_desc);
+ tcg_temp_free_i32(t_desc);
+ tcg_temp_free_ptr(pg);
+ }
+ return true;
+}
+
/*
*** SVE Floating Point Unary Operations Predicated Group
*/
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
rn=%reg_movprfx

+# SVE floating-point complex multiply-add (predicated)
+FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \
+ ra=%reg_movprfx
+
### SVE FP Multiply-Add Indexed Group

# SVE floating-point multiply-add (indexed)
--
2.17.1

From: Francisco Iglesias <francisco.iglesias@amd.com>

Connect the Configuration Frame Unit (CFU_APB, CFU_FDRO and CFU_SFR) to
the Versal machine.

Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Acked-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230831165701.2016397-8-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
include/hw/arm/xlnx-versal.h | 16 ++++++++++++++
hw/arm/xlnx-versal.c | 42 ++++++++++++++++++++++++++++++++++++
2 files changed, 58 insertions(+)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@
#include "hw/misc/xlnx-versal-crl.h"
#include "hw/misc/xlnx-versal-pmc-iou-slcr.h"
#include "hw/net/xlnx-versal-canfd.h"
+#include "hw/misc/xlnx-versal-cfu.h"

#define TYPE_XLNX_VERSAL "xlnx-versal"
OBJECT_DECLARE_SIMPLE_TYPE(Versal, XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ struct Versal {
XlnxEFuse efuse;
XlnxVersalEFuseCtrl efuse_ctrl;
XlnxVersalEFuseCache efuse_cache;
+ XlnxVersalCFUAPB cfu_apb;
+ XlnxVersalCFUFDRO cfu_fdro;
+ XlnxVersalCFUSFR cfu_sfr;

OrIRQState apb_irq_orgate;
} pmc;
@@ -XXX,XX +XXX,XX @@ struct Versal {
#define VERSAL_GEM1_WAKE_IRQ_0 59
#define VERSAL_ADMA_IRQ_0 60
#define VERSAL_XRAM_IRQ_0 79
+#define VERSAL_CFU_IRQ_0 120
#define VERSAL_PMC_APB_IRQ 121
#define VERSAL_OSPI_IRQ 124
#define VERSAL_SD0_IRQ_0 126
@@ -XXX,XX +XXX,XX @@ struct Versal {
#define MM_PMC_EFUSE_CACHE 0xf1250000
#define MM_PMC_EFUSE_CACHE_SIZE 0x00C00

+#define MM_PMC_CFU_APB 0xf12b0000
+#define MM_PMC_CFU_APB_SIZE 0x10000
+#define MM_PMC_CFU_STREAM 0xf12c0000
+#define MM_PMC_CFU_STREAM_SIZE 0x1000
+#define MM_PMC_CFU_SFR 0xf12c1000
+#define MM_PMC_CFU_SFR_SIZE 0x1000
+#define MM_PMC_CFU_FDRO 0xf12c2000
+#define MM_PMC_CFU_FDRO_SIZE 0x1000
+#define MM_PMC_CFU_STREAM_2 0xf1f80000
+#define MM_PMC_CFU_STREAM_2_SIZE 0x40000
+
#define MM_PMC_CRP 0xf1260000U
#define MM_PMC_CRP_SIZE 0x10000
#define MM_PMC_RTC 0xf12a0000
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_ospi(Versal *s, qemu_irq *pic)
qdev_connect_gpio_out(orgate, 0, pic[VERSAL_OSPI_IRQ]);
}

+static void versal_create_cfu(Versal *s, qemu_irq *pic)
+{
+ SysBusDevice *sbd;
+
+ /* CFU FDRO */
+ object_initialize_child(OBJECT(s), "cfu-fdro", &s->pmc.cfu_fdro,
+ TYPE_XLNX_VERSAL_CFU_FDRO);
+ sbd = SYS_BUS_DEVICE(&s->pmc.cfu_fdro);
+
+ sysbus_realize(sbd, &error_fatal);
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_FDRO,
+ sysbus_mmio_get_region(sbd, 0));
+
+ /* CFU APB */
+ object_initialize_child(OBJECT(s), "cfu-apb", &s->pmc.cfu_apb,
+ TYPE_XLNX_VERSAL_CFU_APB);
+ sbd = SYS_BUS_DEVICE(&s->pmc.cfu_apb);
+
+ sysbus_realize(sbd, &error_fatal);
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_APB,
+ sysbus_mmio_get_region(sbd, 0));
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_STREAM,
+ sysbus_mmio_get_region(sbd, 1));
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_STREAM_2,
+ sysbus_mmio_get_region(sbd, 2));
+ sysbus_connect_irq(sbd, 0, pic[VERSAL_CFU_IRQ_0]);
+
+ /* CFU SFR */
+ object_initialize_child(OBJECT(s), "cfu-sfr", &s->pmc.cfu_sfr,
+ TYPE_XLNX_VERSAL_CFU_SFR);
+
+ sbd = SYS_BUS_DEVICE(&s->pmc.cfu_sfr);
+
+ object_property_set_link(OBJECT(&s->pmc.cfu_sfr),
+ "cfu", OBJECT(&s->pmc.cfu_apb), &error_abort);
+
+ sysbus_realize(sbd, &error_fatal);
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_SFR,
+ sysbus_mmio_get_region(sbd, 0));
+}
+
static void versal_create_crl(Versal *s, qemu_irq *pic)
{
SysBusDevice *sbd;
@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
versal_create_pmc_iou_slcr(s, pic);
versal_create_ospi(s, pic);
versal_create_crl(s, pic);
+ versal_create_cfu(s, pic);
versal_map_ddr(s);
versal_unimp(s);

--
2.34.1
diff view generated by jsdifflib
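The rot field decoded in the FCMLA helpers above selects which operand halves are multiplied and which get their sign flipped, so that the four rotations together implement complex multiply-accumulate. A scalar, plain-float sketch of that per-element-pair logic (signs expressed as multiplications instead of the helper's sign-bit XOR masks; the function name is illustrative, not QEMU's):

```c
#include <assert.h>

/* One real/imag element pair of an FCMLA-style operation: accumulate a
 * partial complex product into (*d_r, *d_i) according to rot (0..3,
 * i.e. 0/90/180/270 degrees), mirroring the flip/neg_real/neg_imag
 * selection in the SVE helpers. */
void fcmla_pair(float *d_r, float *d_i,
                float n_r, float n_i, float m_r, float m_i,
                unsigned rot)
{
    int flip = rot & 1;
    float neg_real = (rot == 1 || rot == 2) ? -1.0f : 1.0f;
    float neg_imag = (rot & 2) ? -1.0f : 1.0f;

    float e2 = flip ? n_i : n_r;              /* multiplicand from n */
    float e1 = (flip ? m_i : m_r) * neg_real; /* real-part partner from m */
    float e4 = e2;
    float e3 = (flip ? m_r : m_i) * neg_imag; /* imag-part partner from m */

    *d_r += e2 * e1;
    *d_i += e4 * e3;
}
```

Applying rot=0 then rot=1 to the same operands yields the full complex product n*m added into d, which is how the instruction is meant to be paired.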
From: Eric Auger <eric.auger@redhat.com>

This helper allows to retrieve the paths of nodes whose name
match node-name or node-name@unit-address patterns.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1530044492-24921-2-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
include/sysemu/device_tree.h | 16 +++++++++++
device_tree.c | 55 ++++++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+)

diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index XXXXXXX..XXXXXXX 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -XXX,XX +XXX,XX @@ void *load_device_tree_from_sysfs(void);
char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
Error **errp);

+/**
+ * qemu_fdt_node_unit_path: return the paths of nodes matching a given
+ * node-name, ie. node-name and node-name@unit-address
+ * @fdt: pointer to the dt blob
+ * @name: node name
+ * @errp: handle to an error object
+ *
+ * returns a newly allocated NULL-terminated array of node paths.
+ * Use g_strfreev() to free it. If one or more nodes were found, the
+ * array contains the path of each node and the last element equals to
+ * NULL. If there is no error but no matching node was found, the
+ * returned array contains a single element equal to NULL. If an error
+ * was encountered when parsing the blob, the function returns NULL
+ */
+char **qemu_fdt_node_unit_path(void *fdt, const char *name, Error **errp);
+
int qemu_fdt_setprop(void *fdt, const char *node_path,
const char *property, const void *val, int size);
int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
diff --git a/device_tree.c b/device_tree.c
index XXXXXXX..XXXXXXX 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -XXX,XX +XXX,XX @@ static int findnode_nofail(void *fdt, const char *node_path)
return offset;
}

+char **qemu_fdt_node_unit_path(void *fdt, const char *name, Error **errp)
+{
+ char *prefix = g_strdup_printf("%s@", name);
+ unsigned int path_len = 16, n = 0;
+ GSList *path_list = NULL, *iter;
+ const char *iter_name;
+ int offset, len, ret;
+ char **path_array;
+
+ offset = fdt_next_node(fdt, -1, NULL);
+
+ while (offset >= 0) {
+ iter_name = fdt_get_name(fdt, offset, &len);
+ if (!iter_name) {
+ offset = len;
+ break;
+ }
+ if (!strcmp(iter_name, name) || g_str_has_prefix(iter_name, prefix)) {
+ char *path;
+
+ path = g_malloc(path_len);
+ while ((ret = fdt_get_path(fdt, offset, path, path_len))
+ == -FDT_ERR_NOSPACE) {
+ path_len += 16;
+ path = g_realloc(path, path_len);
+ }
+ path_list = g_slist_prepend(path_list, path);
+ n++;
+ }
+ offset = fdt_next_node(fdt, offset, NULL);
+ }
+ g_free(prefix);
+
+ if (offset < 0 && offset != -FDT_ERR_NOTFOUND) {
+ error_setg(errp, "%s: abort parsing dt for %s node units: %s",
+ __func__, name, fdt_strerror(offset));
+ for (iter = path_list; iter; iter = iter->next) {
+ g_free(iter->data);
+ }
+ g_slist_free(path_list);
+ return NULL;
+ }
+
+ path_array = g_new(char *, n + 1);
+ path_array[n--] = NULL;
+
+ for (iter = path_list; iter; iter = iter->next) {
+ path_array[n--] = iter->data;
+ }
+
+ g_slist_free(path_list);
+
+ return path_array;
+}
+
char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
Error **errp)
{
--
2.17.1

From: Francisco Iglesias <francisco.iglesias@amd.com>

Connect the Configuration Frame controller (CFRAME_REG) and the
Configuration Frame broadcast controller (CFRAME_BCAST_REG) to the
Versal machine.

Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230831165701.2016397-9-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
include/hw/arm/xlnx-versal.h | 69 +++++++++++++++++++++
hw/arm/xlnx-versal.c | 113 ++++++++++++++++++++++++++++++++++-
2 files changed, 181 insertions(+), 1 deletion(-)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@
#include "hw/misc/xlnx-versal-pmc-iou-slcr.h"
#include "hw/net/xlnx-versal-canfd.h"
#include "hw/misc/xlnx-versal-cfu.h"
+#include "hw/misc/xlnx-versal-cframe-reg.h"

#define TYPE_XLNX_VERSAL "xlnx-versal"
OBJECT_DECLARE_SIMPLE_TYPE(Versal, XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(Versal, XLNX_VERSAL)
#define XLNX_VERSAL_NR_IRQS 192
#define XLNX_VERSAL_NR_CANFD 2
#define XLNX_VERSAL_CANFD_REF_CLK (24 * 1000 * 1000)
+#define XLNX_VERSAL_NR_CFRAME 15

struct Versal {
/*< private >*/
@@ -XXX,XX +XXX,XX @@ struct Versal {
XlnxVersalCFUAPB cfu_apb;
XlnxVersalCFUFDRO cfu_fdro;
XlnxVersalCFUSFR cfu_sfr;
+ XlnxVersalCFrameReg cframe[XLNX_VERSAL_NR_CFRAME];
+ XlnxVersalCFrameBcastReg cframe_bcast;

OrIRQState apb_irq_orgate;
} pmc;
@@ -XXX,XX +XXX,XX @@ struct Versal {
#define MM_PMC_CFU_STREAM_2 0xf1f80000
#define MM_PMC_CFU_STREAM_2_SIZE 0x40000

+#define MM_PMC_CFRAME0_REG 0xf12d0000
+#define MM_PMC_CFRAME0_REG_SIZE 0x1000
+#define MM_PMC_CFRAME0_FDRI 0xf12d1000
+#define MM_PMC_CFRAME0_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME1_REG 0xf12d2000
+#define MM_PMC_CFRAME1_REG_SIZE 0x1000
+#define MM_PMC_CFRAME1_FDRI 0xf12d3000
+#define MM_PMC_CFRAME1_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME2_REG 0xf12d4000
+#define MM_PMC_CFRAME2_REG_SIZE 0x1000
+#define MM_PMC_CFRAME2_FDRI 0xf12d5000
+#define MM_PMC_CFRAME2_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME3_REG 0xf12d6000
+#define MM_PMC_CFRAME3_REG_SIZE 0x1000
+#define MM_PMC_CFRAME3_FDRI 0xf12d7000
+#define MM_PMC_CFRAME3_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME4_REG 0xf12d8000
+#define MM_PMC_CFRAME4_REG_SIZE 0x1000
+#define MM_PMC_CFRAME4_FDRI 0xf12d9000
+#define MM_PMC_CFRAME4_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME5_REG 0xf12da000
+#define MM_PMC_CFRAME5_REG_SIZE 0x1000
+#define MM_PMC_CFRAME5_FDRI 0xf12db000
+#define MM_PMC_CFRAME5_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME6_REG 0xf12dc000
+#define MM_PMC_CFRAME6_REG_SIZE 0x1000
+#define MM_PMC_CFRAME6_FDRI 0xf12dd000
+#define MM_PMC_CFRAME6_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME7_REG 0xf12de000
+#define MM_PMC_CFRAME7_REG_SIZE 0x1000
+#define MM_PMC_CFRAME7_FDRI 0xf12df000
+#define MM_PMC_CFRAME7_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME8_REG 0xf12e0000
+#define MM_PMC_CFRAME8_REG_SIZE 0x1000
+#define MM_PMC_CFRAME8_FDRI 0xf12e1000
+#define MM_PMC_CFRAME8_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME9_REG 0xf12e2000
+#define MM_PMC_CFRAME9_REG_SIZE 0x1000
+#define MM_PMC_CFRAME9_FDRI 0xf12e3000
+#define MM_PMC_CFRAME9_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME10_REG 0xf12e4000
+#define MM_PMC_CFRAME10_REG_SIZE 0x1000
+#define MM_PMC_CFRAME10_FDRI 0xf12e5000
+#define MM_PMC_CFRAME10_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME11_REG 0xf12e6000
+#define MM_PMC_CFRAME11_REG_SIZE 0x1000
+#define MM_PMC_CFRAME11_FDRI 0xf12e7000
+#define MM_PMC_CFRAME11_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME12_REG 0xf12e8000
+#define MM_PMC_CFRAME12_REG_SIZE 0x1000
+#define MM_PMC_CFRAME12_FDRI 0xf12e9000
+#define MM_PMC_CFRAME12_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME13_REG 0xf12ea000
+#define MM_PMC_CFRAME13_REG_SIZE 0x1000
+#define MM_PMC_CFRAME13_FDRI 0xf12eb000
+#define MM_PMC_CFRAME13_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME14_REG 0xf12ec000
+#define MM_PMC_CFRAME14_REG_SIZE 0x1000
+#define MM_PMC_CFRAME14_FDRI 0xf12ed000
+#define MM_PMC_CFRAME14_FDRI_SIZE 0x1000
+#define MM_PMC_CFRAME_BCAST_REG 0xf12ee000
+#define MM_PMC_CFRAME_BCAST_REG_SIZE 0x1000
+#define MM_PMC_CFRAME_BCAST_FDRI 0xf12ef000
+#define MM_PMC_CFRAME_BCAST_FDRI_SIZE 0x1000
+
#define MM_PMC_CRP 0xf1260000U
#define MM_PMC_CRP_SIZE 0x10000
#define MM_PMC_RTC 0xf12a0000
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@
#define XLNX_VERSAL_RCPU_TYPE ARM_CPU_TYPE_NAME("cortex-r5f")
#define GEM_REVISION 0x40070106

-#define VERSAL_NUM_PMC_APB_IRQS 3
+#define VERSAL_NUM_PMC_APB_IRQS 18
#define NUM_OSPI_IRQ_LINES 3

static void versal_create_apu_cpus(Versal *s)
@@ -XXX,XX +XXX,XX @@ static void versal_create_pmc_apb_irq_orgate(Versal *s, qemu_irq *pic)
* - RTC
* - BBRAM
* - PMC SLCR
+ * - CFRAME regs (input 3 - 17 to the orgate)
*/
object_initialize_child(OBJECT(s), "pmc-apb-irq-orgate",
&s->pmc.apb_irq_orgate, TYPE_OR_IRQ);
@@ -XXX,XX +XXX,XX @@ static void versal_create_ospi(Versal *s, qemu_irq *pic)
static void versal_create_cfu(Versal *s, qemu_irq *pic)
{
SysBusDevice *sbd;
+ DeviceState *dev;
+ int i;
+ const struct {
+ uint64_t reg_base;
+ uint64_t fdri_base;
+ } cframe_addr[] = {
+ { MM_PMC_CFRAME0_REG, MM_PMC_CFRAME0_FDRI },
+ { MM_PMC_CFRAME1_REG, MM_PMC_CFRAME1_FDRI },
+ { MM_PMC_CFRAME2_REG, MM_PMC_CFRAME2_FDRI },
+ { MM_PMC_CFRAME3_REG, MM_PMC_CFRAME3_FDRI },
+ { MM_PMC_CFRAME4_REG, MM_PMC_CFRAME4_FDRI },
+ { MM_PMC_CFRAME5_REG, MM_PMC_CFRAME5_FDRI },
+ { MM_PMC_CFRAME6_REG, MM_PMC_CFRAME6_FDRI },
+ { MM_PMC_CFRAME7_REG, MM_PMC_CFRAME7_FDRI },
+ { MM_PMC_CFRAME8_REG, MM_PMC_CFRAME8_FDRI },
+ { MM_PMC_CFRAME9_REG, MM_PMC_CFRAME9_FDRI },
+ { MM_PMC_CFRAME10_REG, MM_PMC_CFRAME10_FDRI },
+ { MM_PMC_CFRAME11_REG, MM_PMC_CFRAME11_FDRI },
+ { MM_PMC_CFRAME12_REG, MM_PMC_CFRAME12_FDRI },
+ { MM_PMC_CFRAME13_REG, MM_PMC_CFRAME13_FDRI },
+ { MM_PMC_CFRAME14_REG, MM_PMC_CFRAME14_FDRI },
+ };
+ const struct {
+ uint32_t blktype0_frames;
+ uint32_t blktype1_frames;
+ uint32_t blktype2_frames;
+ uint32_t blktype3_frames;
+ uint32_t blktype4_frames;
+ uint32_t blktype5_frames;
+ uint32_t blktype6_frames;
+ } cframe_cfg[] = {
+ [0] = { 34111, 3528, 12800, 11, 5, 1, 1 },
+ [1] = { 38498, 3841, 15361, 13, 7, 3, 1 },
+ [2] = { 38498, 3841, 15361, 13, 7, 3, 1 },
+ [3] = { 38498, 3841, 15361, 13, 7, 3, 1 },
+ };

/* CFU FDRO */
object_initialize_child(OBJECT(s), "cfu-fdro", &s->pmc.cfu_fdro,
@@ -XXX,XX +XXX,XX @@ static void versal_create_cfu(Versal *s, qemu_irq *pic)
memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_FDRO,
sysbus_mmio_get_region(sbd, 0));

+ /* CFRAME REG */
+ for (i = 0; i < ARRAY_SIZE(s->pmc.cframe); i++) {
+ g_autofree char *name = g_strdup_printf("cframe%d", i);
+
+ object_initialize_child(OBJECT(s), name, &s->pmc.cframe[i],
+ TYPE_XLNX_VERSAL_CFRAME_REG);
+
+ sbd = SYS_BUS_DEVICE(&s->pmc.cframe[i]);
+ dev = DEVICE(&s->pmc.cframe[i]);
+
+ if (i < ARRAY_SIZE(cframe_cfg)) {
+ object_property_set_int(OBJECT(dev), "blktype0-frames",
+ cframe_cfg[i].blktype0_frames,
+ &error_abort);
+ object_property_set_int(OBJECT(dev), "blktype1-frames",
+ cframe_cfg[i].blktype1_frames,
+ &error_abort);
+ object_property_set_int(OBJECT(dev), "blktype2-frames",
+ cframe_cfg[i].blktype2_frames,
+ &error_abort);
+ object_property_set_int(OBJECT(dev), "blktype3-frames",
+ cframe_cfg[i].blktype3_frames,
+ &error_abort);
+ object_property_set_int(OBJECT(dev), "blktype4-frames",
+ cframe_cfg[i].blktype4_frames,
+ &error_abort);
+ object_property_set_int(OBJECT(dev), "blktype5-frames",
+ cframe_cfg[i].blktype5_frames,
+ &error_abort);
+ object_property_set_int(OBJECT(dev), "blktype6-frames",
+ cframe_cfg[i].blktype6_frames,
+ &error_abort);
+ }
+ object_property_set_link(OBJECT(dev), "cfu-fdro",
+ OBJECT(&s->pmc.cfu_fdro), &error_fatal);
+
+ sysbus_realize(SYS_BUS_DEVICE(dev), &error_fatal);
+
+ memory_region_add_subregion(&s->mr_ps, cframe_addr[i].reg_base,
+ sysbus_mmio_get_region(sbd, 0));
+ memory_region_add_subregion(&s->mr_ps, cframe_addr[i].fdri_base,
+ sysbus_mmio_get_region(sbd, 1));
+ sysbus_connect_irq(sbd, 0,
+ qdev_get_gpio_in(DEVICE(&s->pmc.apb_irq_orgate),
+ 3 + i));
+ }
+
+ /* CFRAME BCAST */
+ object_initialize_child(OBJECT(s), "cframe_bcast", &s->pmc.cframe_bcast,
+ TYPE_XLNX_VERSAL_CFRAME_BCAST_REG);
+
+ sbd = SYS_BUS_DEVICE(&s->pmc.cframe_bcast);
+ dev = DEVICE(&s->pmc.cframe_bcast);
+
+ for (i = 0; i < ARRAY_SIZE(s->pmc.cframe); i++) {
+ g_autofree char *propname = g_strdup_printf("cframe%d", i);
+ object_property_set_link(OBJECT(dev), propname,
+ OBJECT(&s->pmc.cframe[i]), &error_fatal);
+ }
+
+ sysbus_realize(sbd, &error_fatal);
+
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFRAME_BCAST_REG,
+ sysbus_mmio_get_region(sbd, 0));
+ memory_region_add_subregion(&s->mr_ps, MM_PMC_CFRAME_BCAST_FDRI,
+ sysbus_mmio_get_region(sbd, 1));
+
/* CFU APB */
object_initialize_child(OBJECT(s), "cfu-apb", &s->pmc.cfu_apb,
TYPE_XLNX_VERSAL_CFU_APB);
sbd = SYS_BUS_DEVICE(&s->pmc.cfu_apb);
+ dev = DEVICE(&s->pmc.cfu_apb);
+
+ for (i = 0; i < ARRAY_SIZE(s->pmc.cframe); i++) {
+ g_autofree char *propname = g_strdup_printf("cframe%d", i);
+ object_property_set_link(OBJECT(dev), propname,
+ OBJECT(&s->pmc.cframe[i]), &error_fatal);
+ }

sysbus_realize(sbd, &error_fatal);
memory_region_add_subregion(&s->mr_ps, MM_PMC_CFU_APB,
--
2.34.1
diff view generated by jsdifflib
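The qemu_fdt_node_unit_path() helper added above treats a node as a match when its name is either exactly the requested node-name or that name followed by "@unit-address". The matching rule itself can be sketched standalone like this (a plain strncmp-based equivalent of the helper's strcmp/g_str_has_prefix pair; the function name is illustrative, not from QEMU):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* A node name matches if it is exactly "name", or "name" immediately
 * followed by '@' (i.e. the node-name@unit-address form). */
bool node_name_matches(const char *node, const char *name)
{
    size_t len = strlen(name);

    if (strncmp(node, name, len) != 0) {
        return false;
    }
    return node[len] == '\0' || node[len] == '@';
}
```

This is what lets arm_load_dtb (in the following patch) nop out both a bare "/memory" node and any "/memory@addr" nodes in a user-provided dtb before creating its own.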
Deleted patch
From: Eric Auger <eric.auger@redhat.com>

When running dtc on the guest /proc/device-tree we get the
following warning: Warning (unit_address_vs_reg): "Node /memory
has a reg or ranges property, but no unit name".

Let's fix that by adding the unit address to the node name. We also
don't create the /memory node anymore in create_fdt(). We directly
create it in load_dtb. /chosen still needs to be created in create_fdt
as the uart needs it. In case the user provided their own dtb, we nop
all memory nodes found in root and create new one(s).

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1530044492-24921-4-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/boot.c | 41 +++++++++++++++++++++++------------------
hw/arm/virt.c | 7 +------
2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
hwaddr addr_limit, AddressSpace *as)
{
void *fdt = NULL;
- int size, rc;
+ int size, rc, n = 0;
uint32_t acells, scells;
char *nodename;
unsigned int i;
hwaddr mem_base, mem_len;
+ char **node_path;
+ Error *err = NULL;

if (binfo->dtb_filename) {
char *filename;
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
goto fail;
}

+ /* nop all root nodes matching /memory or /memory@unit-address */
+ node_path = qemu_fdt_node_unit_path(fdt, "memory", &err);
+ if (err) {
+ error_report_err(err);
+ goto fail;
+ }
+ while (node_path[n]) {
+ if (g_str_has_prefix(node_path[n], "/memory")) {
+ qemu_fdt_nop_node(fdt, node_path[n]);
+ }
+ n++;
+ }
+ g_strfreev(node_path);
+
if (nb_numa_nodes > 0) {
- /*
- * Turn the /memory node created before into a NOP node, then create
- * /memory@addr nodes for all numa nodes respectively.
- */
- qemu_fdt_nop_node(fdt, "/memory");
mem_base = binfo->loader_start;
for (i = 0; i < nb_numa_nodes; i++) {
mem_len = numa_info[i].node_mem;
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
g_free(nodename);
}
} else {
- Error *err = NULL;
+ nodename = g_strdup_printf("/memory@%" PRIx64, binfo->loader_start);
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");

- rc = fdt_path_offset(fdt, "/memory");
- if (rc < 0) {
- qemu_fdt_add_subnode(fdt, "/memory");
- }
-
- if (!qemu_fdt_getprop(fdt, "/memory", "device_type", NULL, &err)) {
- qemu_fdt_setprop_string(fdt, "/memory", "device_type", "memory");
- }
-
- rc = qemu_fdt_setprop_sized_cells(fdt, "/memory", "reg",
+ rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
acells, binfo->loader_start,
scells, binfo->ram_size);
if (rc < 0) {
- fprintf(stderr, "couldn't set /memory/reg\n");
+ fprintf(stderr, "couldn't set %s reg\n", nodename);
goto fail;
}
+ g_free(nodename);
}

rc = fdt_path_offset(fdt, "/chosen");
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void create_fdt(VirtMachineState *vms)
qemu_fdt_setprop_cell(fdt, "/", "#address-cells", 0x2);
qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);

- /*
- * /chosen and /memory nodes must exist for load_dtb
- * to fill in necessary properties later
- */
+ /* /chosen must exist for load_dtb to fill in necessary properties later */
qemu_fdt_add_subnode(fdt, "/chosen");
- qemu_fdt_add_subnode(fdt, "/memory");
- qemu_fdt_setprop_string(fdt, "/memory", "device_type", "memory");

/* Clock node, for the benefit of the UART. The kernel device tree
* binding documentation claims the PL011 node clock properties are
--
2.17.1
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 35 +++++++++
target/arm/sve_helper.c | 153 +++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 121 +++++++++++++++++++++++++++++
target/arm/sve.decode | 34 +++++++++
4 files changed, 343 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)

return predtest_ones(d, oprsz, esz_mask);
}
+
+/*
+ * Load contiguous data, protected by a governing predicate.
+ */
+#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \
+static void do_##NAME(CPUARMState *env, void *vd, void *vg, \
+ target_ulong addr, intptr_t oprsz, \
+ uintptr_t ra) \
+{ \
+ intptr_t i = 0; \
+ do { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ TYPEM m = 0; \
+ if (pg & 1) { \
+ m = FN(env, addr, ra); \
+ } \
+ *(TYPEE *)(vd + H(i)) = m; \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += sizeof(TYPEM); \
+ } while (i & 15); \
+ } while (i < oprsz); \
+} \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg, \
+ addr, simd_oprsz(desc), GETPC()); \
+}
+
+#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *d1 = &env->vfp.zregs[rd]; \
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ TYPEM m1 = 0, m2 = 0; \
+ if (pg & 1) { \
+ m1 = FN(env, addr, ra); \
+ m2 = FN(env, addr + sizeof(TYPEM), ra); \
+ } \
+ *(TYPEE *)(d1 + H(i)) = m1; \
+ *(TYPEE *)(d2 + H(i)) = m2; \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += 2 * sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *d1 = &env->vfp.zregs[rd]; \
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ TYPEM m1 = 0, m2 = 0, m3 = 0; \
+ if (pg & 1) { \
+ m1 = FN(env, addr, ra); \
+ m2 = FN(env, addr + sizeof(TYPEM), ra); \
+ m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
+ } \
+ *(TYPEE *)(d1 + H(i)) = m1; \
+ *(TYPEE *)(d2 + H(i)) = m2; \
+ *(TYPEE *)(d3 + H(i)) = m3; \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += 3 * sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *d1 = &env->vfp.zregs[rd]; \
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
+ void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ TYPEM m1 = 0, m2 = 0, m3 = 0, m4 = 0; \
+ if (pg & 1) { \
+ m1 = FN(env, addr, ra); \
+ m2 = FN(env, addr + sizeof(TYPEM), ra); \
+ m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
+ m4 = FN(env, addr + 3 * sizeof(TYPEM), ra); \
+ } \
+ *(TYPEE *)(d1 + H(i)) = m1; \
+ *(TYPEE *)(d2 + H(i)) = m2; \
+ *(TYPEE *)(d3 + H(i)) = m3; \
+ *(TYPEE *)(d4 + H(i)) = m4; \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += 4 * sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2)
+DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2)
+DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4)
+DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4)
+DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
+DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
+
+DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
+DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4)
+DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
+DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
+
+DO_LD1(sve_ld1sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, )
+DO_LD1(sve_ld1sds_r, cpu_ldl_data_ra, uint64_t, int32_t, )
+
+DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
+DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
+DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
+DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
+
+DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
+DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
+DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
+DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
+
+DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
+DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
+DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
+DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
+
+DO_LD1(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+
+#undef DO_LD1
+#undef DO_LD2
+#undef DO_LD3
+#undef DO_LD4
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr,
typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
TCGv_ptr, TCGv_ptr, TCGv_i32);

+typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32);
+
/*
* Helpers for extracting complex instruction fields.
*/
@@ -XXX,XX +XXX,XX @@ static inline int expand_imm_sh8u(int x)
return (uint8_t)x << (x & 0x100 ? 8 : 0);
}

+/* Convert a 2-bit memory size (msz) to a 4-bit data type (dtype)
+ * with unsigned data. C.f. SVE Memory Contiguous Load Group.
+ */
+static inline int msz_dtype(int msz)
+{
+ static const uint8_t dtype[4] = { 0, 5, 10, 15 };
+ return dtype[msz];
+}
+
/*
* Include the generated decoder.
*/
@@ -XXX,XX +XXX,XX @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
}
return true;
}
+
+/*
+ *** SVE Memory - Contiguous Load Group
+ */
+
+/* The memory mode of the dtype. */
+static const TCGMemOp dtype_mop[16] = {
+ MO_UB, MO_UB, MO_UB, MO_UB,
+ MO_SL, MO_UW, MO_UW, MO_UW,
+ MO_SW, MO_SW, MO_UL, MO_UL,
+ MO_SB, MO_SB, MO_SB, MO_Q
+};
+
+#define dtype_msz(x) (dtype_mop[x] & MO_SIZE)
+
+/* The vector element size of dtype. */
+static const uint8_t dtype_esz[16] = {
+ 0, 1, 2, 3,
+ 3, 1, 2, 3,
+ 3, 2, 2, 3,
+ 3, 2, 1, 3
+};
+
+static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
+ gen_helper_gvec_mem *fn)
+{
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr t_pg;
+ TCGv_i32 desc;
+
+ /* For e.g. LD4, there are not enough arguments to pass all 4
+ * registers as pointers, so encode the regno into the data field.
+ * For consistency, do this even for LD1.
+ */
+ desc = tcg_const_i32(simd_desc(vsz, vsz, zt));
+ t_pg = tcg_temp_new_ptr();
+
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+ fn(cpu_env, t_pg, addr, desc);
+
+ tcg_temp_free_ptr(t_pg);
+ tcg_temp_free_i32(desc);
+}
+
+static void do_ld_zpa(DisasContext *s, int zt, int pg,
+ TCGv_i64 addr, int dtype, int nreg)
+{
+ static gen_helper_gvec_mem * const fns[16][4] = {
+ { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
+ gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
+ { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1sds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r,
+ gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r },
+ { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1hds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r,
+ gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r },
+ { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r,
+ gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r },
+ };
+ gen_helper_gvec_mem *fn = fns[dtype][nreg];
+
+ /* While there are holes in the table, they are not
+ * accessible via the instruction encoding.
+ */
+ assert(fn != NULL);
+ do_mem_zpa(s, zt, pg, addr, fn);
+}
+
+static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
+{
+ if (a->rm == 31) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ TCGv_i64 addr = new_tmp_a64(s);
+ tcg_gen_muli_i64(addr, cpu_reg(s, a->rm),
+ (a->nreg + 1) << dtype_msz(a->dtype));
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+ do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg);
+ }
+ return true;
+}
+
+static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
+{
+ if (sve_access_check(s)) {
+ int vsz = vec_full_reg_size(s);
+ int elements = vsz >> dtype_esz[a->dtype];
+ TCGv_i64 addr = new_tmp_a64(s);
+
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn),
+ (a->imm * elements * (a->nreg + 1))
+ << dtype_msz(a->dtype));
+ do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg);
+ }
+ return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
# Unsigned 8-bit immediate, optionally shifted left by 8.
%sh8_i8u 5:9 !function=expand_imm_sh8u

+# Unsigned load of msz into esz=2, represented as a dtype.
+%msz_dtype 23:2 !function=msz_dtype
+
# Either a copy of rd (at bit 0), or a different source
# as propagated via the MOVPRFX instruction.
%reg_movprfx 0:5
@@ -XXX,XX +XXX,XX @@
&incdec2_cnt rd rn pat esz imm d u
&incdec_pred rd pg esz d u
&incdec2_pred rd rn pg esz d u
+&rprr_load rd pg rn rm dtype nreg
+&rpri_load rd pg rn imm dtype nreg

###########################################################################
# Named instruction formats. These are generally used to
@@ -XXX,XX +XXX,XX @@
@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \
&incdec2_pred rn=%reg_movprfx

+# Loads; user must fill in NREG.
+@rprr_load_dt ....... dtype:4 rm:5 ... pg:3 rn:5 rd:5 &rprr_load
+@rpri_load_dt ....... dtype:4 . imm:s4 ... pg:3 rn:5 rd:5 &rpri_load
+
+@rprr_load_msz ....... .... rm:5 ... pg:3 rn:5 rd:5 \
+ &rprr_load dtype=%msz_dtype
+@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
+ &rpri_load dtype=%msz_dtype
+
###########################################################################
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.

@@ -XXX,XX +XXX,XX @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9

# SVE load vector register
LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
+
+### SVE Memory Contiguous Load Group
+
+# SVE contiguous load (scalar plus scalar)
+LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0
+
+# SVE contiguous load (scalar plus immediate)
+LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0
+
+# SVE contiguous non-temporal load (scalar plus scalar)
+# LDNT1B, LDNT1H, LDNT1W, LDNT1D
+# SVE load multiple structures (scalar plus scalar)
+# LD2B, LD2H, LD2W, LD2D; etc.
+LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz
+
+# SVE contiguous non-temporal load (scalar plus immediate)
+# LDNT1B, LDNT1H, LDNT1W, LDNT1D
+# SVE load multiple structures (scalar plus immediate)
+# LD2B, LD2H, LD2W, LD2D; etc.
+LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz
--
2.17.1
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180627043328.11531-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 40 ++++++++++
target/arm/sve_helper.c | 157 +++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 69 ++++++++++++++++
target/arm/sve.decode | 6 ++
4 files changed, 272 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
#undef DO_LD2
#undef DO_LD3
#undef DO_LD4
+
+/*
+ * Load contiguous data, first-fault and no-fault.
+ */
+
+#ifdef CONFIG_USER_ONLY
+
+/* Fault on byte I. All bits in FFR from I are cleared. The vector
+ * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE
+ * option, which leaves subsequent data unchanged.
+ */
+static void record_fault(CPUARMState *env, uintptr_t i, uintptr_t oprsz)
+{
+ uint64_t *ffr = env->vfp.pregs[FFR_PRED_NUM].p;
+
+ if (i & 63) {
+ ffr[i / 64] &= MAKE_64BIT_MASK(0, i & 63);
+ i = ROUND_UP(i, 64);
+ }
+ for (; i < oprsz; i += 64) {
+ ffr[i / 64] = 0;
+ }
+}
+
+/* Hold the mmap lock during the operation so that there is no race
+ * between page_check_range and the load operation. We expect the
+ * usual case to have no faults at all, so we check the whole range
+ * first and if successful defer to the normal load operation.
+ *
+ * TODO: Change mmap_lock to a rwlock so that multiple readers
+ * can run simultaneously. This will probably help other uses
+ * within QEMU as well.
+ */
+#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
+static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg, \
+ target_ulong addr, intptr_t oprsz, \
+ bool first, uintptr_t ra) \
+{ \
+ intptr_t i = 0; \
+ do { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ TYPEM m = 0; \
+ if (pg & 1) { \
+ if (!first && \
+ unlikely(page_check_range(addr, sizeof(TYPEM), \
+ PAGE_READ))) { \
+ record_fault(env, i, oprsz); \
+ return; \
+ } \
+ m = FN(env, addr, ra); \
+ first = false; \
+ } \
+ *(TYPEE *)(vd + H(i)) = m; \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += sizeof(TYPEM); \
+ } while (i & 15); \
+ } while (i < oprsz); \
+} \
+void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t oprsz = simd_oprsz(desc); \
+ unsigned rd = simd_data(desc); \
+ void *vd = &env->vfp.zregs[rd]; \
+ mmap_lock(); \
+ if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \
+ do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \
+ } else { \
+ do_sve_ldff1##PART(env, vd, vg, addr, oprsz, true, GETPC()); \
+ } \
+ mmap_unlock(); \
+}
+
+/* No-fault loads are like first-fault loads without the
+ * first faulting special case.
+ */
+#define DO_LDNF1(PART) \
+void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t oprsz = simd_oprsz(desc); \
+ unsigned rd = simd_data(desc); \
+ void *vd = &env->vfp.zregs[rd]; \
+ mmap_lock(); \
+ if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \
+ do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \
+ } else { \
+ do_sve_ldff1##PART(env, vd, vg, addr, oprsz, false, GETPC()); \
+ } \
+ mmap_unlock(); \
+}
+
+#else
+
+/* TODO: System mode is not yet supported.
+ * This would probably use tlb_vaddr_to_host.
+ */
+#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
+void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ g_assert_not_reached(); \
+}
+
+#define DO_LDNF1(PART) \
+void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ g_assert_not_reached(); \
+}
+
+#endif
+
+DO_LDFF1(bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
+DO_LDFF1(bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2)
+DO_LDFF1(bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2)
+DO_LDFF1(bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4)
+DO_LDFF1(bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4)
+DO_LDFF1(bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
+DO_LDFF1(bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
+
+DO_LDFF1(hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
+DO_LDFF1(hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
+DO_LDFF1(hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4)
+DO_LDFF1(hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
+DO_LDFF1(hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
+
+DO_LDFF1(ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
+DO_LDFF1(sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, )
+DO_LDFF1(sds_r, cpu_ldl_data_ra, uint64_t, int32_t, )
+
+DO_LDFF1(dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
+
+#undef DO_LDFF1
+
+DO_LDNF1(bb_r)
+DO_LDNF1(bhu_r)
+DO_LDNF1(bhs_r)
+DO_LDNF1(bsu_r)
+DO_LDNF1(bss_r)
+DO_LDNF1(bdu_r)
+DO_LDNF1(bds_r)
+
+DO_LDNF1(hh_r)
+DO_LDNF1(hsu_r)
+DO_LDNF1(hss_r)
+DO_LDNF1(hdu_r)
+DO_LDNF1(hds_r)
+
+DO_LDNF1(ss_r)
+DO_LDNF1(sdu_r)
+DO_LDNF1(sds_r)
+
+DO_LDNF1(dd_r)
+
+#undef DO_LDNF1
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
}
return true;
}
+
+static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
+{
+ static gen_helper_gvec_mem * const fns[16] = {
+ gen_helper_sve_ldff1bb_r,
+ gen_helper_sve_ldff1bhu_r,
+ gen_helper_sve_ldff1bsu_r,
+ gen_helper_sve_ldff1bdu_r,
+
+ gen_helper_sve_ldff1sds_r,
+ gen_helper_sve_ldff1hh_r,
+ gen_helper_sve_ldff1hsu_r,
+ gen_helper_sve_ldff1hdu_r,
+
+ gen_helper_sve_ldff1hds_r,
+ gen_helper_sve_ldff1hss_r,
+ gen_helper_sve_ldff1ss_r,
+ gen_helper_sve_ldff1sdu_r,
+
+ gen_helper_sve_ldff1bds_r,
+ gen_helper_sve_ldff1bss_r,
+ gen_helper_sve_ldff1bhs_r,
+ gen_helper_sve_ldff1dd_r,
+ };
+
+ if (sve_access_check(s)) {
+ TCGv_i64 addr = new_tmp_a64(s);
+ tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+ do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]);
+ }
+ return true;
+}
+
+static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
+{
+ static gen_helper_gvec_mem * const fns[16] = {
+ gen_helper_sve_ldnf1bb_r,
+ gen_helper_sve_ldnf1bhu_r,
+ gen_helper_sve_ldnf1bsu_r,
+ gen_helper_sve_ldnf1bdu_r,
+
+ gen_helper_sve_ldnf1sds_r,
+ gen_helper_sve_ldnf1hh_r,
+ gen_helper_sve_ldnf1hsu_r,
+ gen_helper_sve_ldnf1hdu_r,
+
+ gen_helper_sve_ldnf1hds_r,
+ gen_helper_sve_ldnf1hss_r,
+ gen_helper_sve_ldnf1ss_r,
+ gen_helper_sve_ldnf1sdu_r,
+
+ gen_helper_sve_ldnf1bds_r,
+ gen_helper_sve_ldnf1bss_r,
+ gen_helper_sve_ldnf1bhs_r,
+ gen_helper_sve_ldnf1dd_r,
+ };
+
+ if (sve_access_check(s)) {
+ int vsz = vec_full_reg_size(s);
+ int elements = vsz >> dtype_esz[a->dtype];
+ int off = (a->imm * elements) << dtype_msz(a->dtype);
+ TCGv_i64 addr = new_tmp_a64(s);
+
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off);
+ do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]);
+ }
+ return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
# SVE contiguous load (scalar plus scalar)
LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0

+# SVE contiguous first-fault load (scalar plus scalar)
+LDFF1_zprr 1010010 .... ..... 011 ... ..... ..... @rprr_load_dt nreg=0
+
# SVE contiguous load (scalar plus immediate)
LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0

+# SVE contiguous non-fault load (scalar plus immediate)
+LDNF1_zpri 1010010 .... 1.... 101 ... ..... ..... @rpri_load_dt nreg=0
+
# SVE contiguous non-temporal load (scalar plus scalar)
# LDNT1B, LDNT1H, LDNT1W, LDNT1D
# SVE load multiple structures (scalar plus scalar)
--
2.17.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-4-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 29 +++++
9
target/arm/sve_helper.c | 211 +++++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 65 ++++++++++++
11
target/arm/sve.decode | 38 +++++++
12
4 files changed, 343 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_LDNF1(sds_r)
DO_LDNF1(dd_r)

#undef DO_LDNF1
+
+/*
+ * Store contiguous data, protected by a governing predicate.
+ */
+#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *vd = &env->vfp.zregs[rd]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ if (pg & 1) { \
+ TYPEM m = *(TYPEE *)(vd + H(i)); \
+ FN(env, addr, m, ra); \
+ } \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+#define DO_ST1_D(NAME, FN, TYPEM) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc) / 8; \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ uint64_t *d = &env->vfp.zregs[rd].d[0]; \
+ uint8_t *pg = vg; \
+ for (i = 0; i < oprsz; i += 1) { \
+ if (pg[H1(i)] & 1) { \
+ FN(env, addr, d[i], ra); \
+ } \
+ addr += sizeof(TYPEM); \
+ } \
+}
+
+#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *d1 = &env->vfp.zregs[rd]; \
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ if (pg & 1) { \
+ TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
+ TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
+ FN(env, addr, m1, ra); \
+ FN(env, addr + sizeof(TYPEM), m2, ra); \
+ } \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += 2 * sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *d1 = &env->vfp.zregs[rd]; \
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ if (pg & 1) { \
+ TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
+ TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
+ TYPEM m3 = *(TYPEE *)(d3 + H(i)); \
+ FN(env, addr, m1, ra); \
+ FN(env, addr + sizeof(TYPEM), m2, ra); \
+ FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \
+ } \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += 3 * sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \
+void HELPER(NAME)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ intptr_t ra = GETPC(); \
+ unsigned rd = simd_data(desc); \
+ void *d1 = &env->vfp.zregs[rd]; \
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
+ void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \
+ for (i = 0; i < oprsz; ) { \
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+ do { \
+ if (pg & 1) { \
+ TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
+ TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
+ TYPEM m3 = *(TYPEE *)(d3 + H(i)); \
+ TYPEM m4 = *(TYPEE *)(d4 + H(i)); \
+ FN(env, addr, m1, ra); \
+ FN(env, addr + sizeof(TYPEM), m2, ra); \
+ FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \
+ FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \
+ } \
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
+ addr += 4 * sizeof(TYPEM); \
+ } while (i & 15); \
+ } \
+}
+
+DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2)
+DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4)
+DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t)
+
+DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4)
+DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t)
+
+DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t)
+
+DO_ST1(sve_st1bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
+DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
+DO_ST3(sve_st3bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
+DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
+
+DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
+DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
+DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
+DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
+
+DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
+DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
+DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
+DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
+
+DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t)
+
+void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg,
+ target_ulong addr, uint32_t desc)
+{
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
+ intptr_t ra = GETPC();
+ unsigned rd = simd_data(desc);
+ uint64_t *d1 = &env->vfp.zregs[rd].d[0];
+ uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
+ uint8_t *pg = vg;
+
+ for (i = 0; i < oprsz; i += 1) {
+ if (pg[H1(i)] & 1) {
+ cpu_stq_data_ra(env, addr, d1[i], ra);
+ cpu_stq_data_ra(env, addr + 8, d2[i], ra);
+ }
+ addr += 2 * 8;
+ }
+}
+
+void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg,
+ target_ulong addr, uint32_t desc)
+{
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
+ intptr_t ra = GETPC();
+ unsigned rd = simd_data(desc);
+ uint64_t *d1 = &env->vfp.zregs[rd].d[0];
+ uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
+ uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0];
+ uint8_t *pg = vg;
+
+ for (i = 0; i < oprsz; i += 1) {
+ if (pg[H1(i)] & 1) {
+ cpu_stq_data_ra(env, addr, d1[i], ra);
+ cpu_stq_data_ra(env, addr + 8, d2[i], ra);
+ cpu_stq_data_ra(env, addr + 16, d3[i], ra);
+ }
+ addr += 3 * 8;
+ }
+}
+
+void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
+ target_ulong addr, uint32_t desc)
+{
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
+ intptr_t ra = GETPC();
+ unsigned rd = simd_data(desc);
+ uint64_t *d1 = &env->vfp.zregs[rd].d[0];
+ uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
+ uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0];
+ uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0];
+ uint8_t *pg = vg;
+
+ for (i = 0; i < oprsz; i += 1) {
+ if (pg[H1(i)] & 1) {
+ cpu_stq_data_ra(env, addr, d1[i], ra);
+ cpu_stq_data_ra(env, addr + 8, d2[i], ra);
+ cpu_stq_data_ra(env, addr + 16, d3[i], ra);
+ cpu_stq_data_ra(env, addr + 24, d4[i], ra);
+ }
+ addr += 4 * 8;
+ }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
}
return true;
}
+
+static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
+ int msz, int esz, int nreg)
+{
+ static gen_helper_gvec_mem * const fn_single[4][4] = {
+ { gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r,
+ gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r },
+ { NULL, gen_helper_sve_st1hh_r,
+ gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r },
+ { NULL, NULL,
+ gen_helper_sve_st1ss_r, gen_helper_sve_st1sd_r },
+ { NULL, NULL, NULL, gen_helper_sve_st1dd_r },
+ };
+ static gen_helper_gvec_mem * const fn_multiple[3][4] = {
+ { gen_helper_sve_st2bb_r, gen_helper_sve_st2hh_r,
+ gen_helper_sve_st2ss_r, gen_helper_sve_st2dd_r },
+ { gen_helper_sve_st3bb_r, gen_helper_sve_st3hh_r,
+ gen_helper_sve_st3ss_r, gen_helper_sve_st3dd_r },
+ { gen_helper_sve_st4bb_r, gen_helper_sve_st4hh_r,
+ gen_helper_sve_st4ss_r, gen_helper_sve_st4dd_r },
+ };
+ gen_helper_gvec_mem *fn;
+
+ if (nreg == 0) {
+ /* ST1 */
+ fn = fn_single[msz][esz];
+ } else {
+ /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
+ assert(msz == esz);
+ fn = fn_multiple[nreg - 1][msz];
+ }
+ assert(fn != NULL);
+ do_mem_zpa(s, zt, pg, addr, fn);
+}
+
+static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn)
+{
+ if (a->rm == 31 || a->msz > a->esz) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ TCGv_i64 addr = new_tmp_a64(s);
+ tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz);
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+ do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
+ }
+ return true;
+}
+
+static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
+{
+ if (a->msz > a->esz) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ int vsz = vec_full_reg_size(s);
+ int elements = vsz >> a->esz;
+ TCGv_i64 addr = new_tmp_a64(s);
+
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn),
+ (a->imm * elements * (a->nreg + 1)) << a->msz);
+ do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
+ }
+ return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
%imm7_22_16 22:2 16:5
%imm8_16_10 16:5 10:3
%imm9_16_10 16:s6 10:3
+%size_23 23:2

# A combination of tsz:imm3 -- extract esize.
%tszimm_esz 22:2 5:5 !function=tszimm_esz
@@ -XXX,XX +XXX,XX @@
&incdec2_pred rd rn pg esz d u
&rprr_load rd pg rn rm dtype nreg
&rpri_load rd pg rn imm dtype nreg
+&rprr_store rd pg rn rm msz esz nreg
+&rpri_store rd pg rn imm msz esz nreg

###########################################################################
# Named instruction formats. These are generally used to
@@ -XXX,XX +XXX,XX @@
@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
 &rpri_load dtype=%msz_dtype

+# Stores; user must fill in ESZ, MSZ, NREG as needed.
+@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store
+@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
+@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \
+ &rprr_store nreg=0
+
###########################################################################
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.

@@ -XXX,XX +XXX,XX @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz
# SVE load multiple structures (scalar plus immediate)
# LD2B, LD2H, LD2W, LD2D; etc.
LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz
+
+### SVE Memory Store Group
+
+# SVE contiguous store (scalar plus immediate)
+# ST1B, ST1H, ST1W, ST1D; require msz <= esz
+ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \
+ @rpri_store_msz nreg=0
+
+# SVE contiguous store (scalar plus scalar)
+# ST1B, ST1H, ST1W, ST1D; require msz <= esz
+# Enumerate msz lest we conflict with STR_zri.
+ST_zprr 1110010 00 .. ..... 010 ... ..... ..... \
+ @rprr_store_esz_n0 msz=0
+ST_zprr 1110010 01 .. ..... 010 ... ..... ..... \
+ @rprr_store_esz_n0 msz=1
+ST_zprr 1110010 10 .. ..... 010 ... ..... ..... \
+ @rprr_store_esz_n0 msz=2
+ST_zprr 1110010 11 11 ..... 010 ... ..... ..... \
+ @rprr_store msz=3 esz=3 nreg=0
+
+# SVE contiguous non-temporal store (scalar plus immediate) (nreg == 0)
+# SVE store multiple structures (scalar plus immediate) (nreg != 0)
+ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \
+ @rpri_store_msz esz=%size_23
+
+# SVE contiguous non-temporal store (scalar plus scalar) (nreg == 0)
+# SVE store multiple structures (scalar plus scalar) (nreg != 0)
+ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \
+ @rprr_store esz=%size_23
--
2.17.1

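[Illustrative aside, not part of the series: the DO_ST1 macro in the patch above walks the vector one element at a time, consuming one predicate bit per byte of element size and storing only the active elements, while the address advances for every element. A hedged, self-contained C sketch of that pattern for 4-byte elements; `st1_sketch` and the flat `mem` buffer standing in for the guest store are inventions of this sketch:]

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Sketch only: predicated contiguous store of 32-bit elements.  SVE
 * predicates carry one bit per byte; for a 4-byte element only the bit
 * for its first byte (bit i%8 of predicate byte i/8, with i the byte
 * offset) governs it.  Inactive elements are skipped but the address
 * still advances, as in DO_ST1. */
static void st1_sketch(uint8_t *mem, const uint32_t *zreg,
                       const uint8_t *pred, size_t oprsz_bytes)
{
    for (size_t i = 0; i < oprsz_bytes; i += sizeof(uint32_t)) {
        if ((pred[i / 8] >> (i % 8)) & 1) {
            memcpy(mem + i, &zreg[i / 4], 4);   /* stand-in for the CPU store */
        }
    }
}
```

The ST2/ST3/ST4 variants follow the same loop but interleave two to four registers at each active element.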
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: fixed typo]
Message-id: 20180627043328.11531-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    | 30 +++++++++++++
 target/arm/sve_helper.c    | 38 ++++++++++++++++
 target/arm/translate-sve.c | 90 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 22 ++++++++++
 4 files changed, 180 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
 void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
return predtest_ones(d, oprsz, esz_mask);
}

+/* Fully general two-operand expander, controlled by a predicate,
+ * With the extra float_status parameter.
+ */
+#define DO_ZPZ_FP(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
+{ \
+ intptr_t i = simd_oprsz(desc); \
+ uint64_t *g = vg; \
+ do { \
+ uint64_t pg = g[(i - 1) >> 6]; \
+ do { \
+ i -= sizeof(TYPE); \
+ if (likely((pg >> (i & 63)) & 1)) { \
+ TYPE nn = *(TYPE *)(vn + H(i)); \
+ *(TYPE *)(vd + H(i)) = OP(nn, status); \
+ } \
+ } while (i & 63); \
+ } while (i != 0); \
+}
+
+DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
+DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
+DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
+DO_ZPZ_FP(sve_scvt_sd, uint64_t, , int32_to_float64)
+DO_ZPZ_FP(sve_scvt_dh, uint64_t, , int64_to_float16)
+DO_ZPZ_FP(sve_scvt_ds, uint64_t, , int64_to_float32)
+DO_ZPZ_FP(sve_scvt_dd, uint64_t, , int64_to_float64)
+
+DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16)
+DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16)
+DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32)
+DO_ZPZ_FP(sve_ucvt_sd, uint64_t, , uint32_to_float64)
+DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16)
+DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32)
+DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64)
+
+#undef DO_ZPZ_FP
+
/*
 * Load contiguous data, protected by a governing predicate.
 */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_FP3(FRSQRTS, rsqrts)

#undef DO_FP3

+
+/*
+ *** SVE Floating Point Unary Operations Predicated Group
+ */
+
+static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg,
+ bool is_fp16, gen_helper_gvec_3_ptr *fn)
+{
+ if (sve_access_check(s)) {
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr status = get_fpstatus_ptr(is_fp16);
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
+ vec_full_reg_offset(s, rn),
+ pred_full_reg_offset(s, pg),
+ status, vsz, vsz, 0, fn);
+ tcg_temp_free_ptr(status);
+ }
+ return true;
+}
+
+static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
+}
+
+static bool trans_SCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_sh);
+}
+
+static bool trans_SCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_dh);
+}
+
+static bool trans_SCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ss);
+}
+
+static bool trans_SCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ds);
+}
+
+static bool trans_SCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_sd);
+}
+
+static bool trans_SCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_dd);
+}
+
+static bool trans_UCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_hh);
+}
+
+static bool trans_UCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_sh);
+}
+
+static bool trans_UCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_dh);
+}
+
+static bool trans_UCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ss);
+}
+
+static bool trans_UCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ds);
+}
+
+static bool trans_UCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_sd);
+}
+
+static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_dd);
+}
+
/*
 *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
 */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz

+# One register operand, with governing predicate, no vector element size
+@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0
+
# Two register operands with a 6-bit signed immediate.
@rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri

@@ -XXX,XX +XXX,XX @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm
FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm
FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm

+### SVE FP Unary Operations Predicated Group
+
+# SVE integer convert to floating-point
+SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
+SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
+SCVTF_dh 01100101 01 010 11 0 101 ... ..... ..... @rd_pg_rn_e0
+SCVTF_ss 01100101 10 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
+SCVTF_sd 01100101 11 010 00 0 101 ... ..... ..... @rd_pg_rn_e0
+SCVTF_ds 01100101 11 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
+SCVTF_dd 01100101 11 010 11 0 101 ... ..... ..... @rd_pg_rn_e0
+
+UCVTF_hh 01100101 01 010 01 1 101 ... ..... ..... @rd_pg_rn_e0
+UCVTF_sh 01100101 01 010 10 1 101 ... ..... ..... @rd_pg_rn_e0
+UCVTF_dh 01100101 01 010 11 1 101 ... ..... ..... @rd_pg_rn_e0
+UCVTF_ss 01100101 10 010 10 1 101 ... ..... ..... @rd_pg_rn_e0
+UCVTF_sd 01100101 11 010 00 1 101 ... ..... ..... @rd_pg_rn_e0
+UCVTF_ds 01100101 11 010 10 1 101 ... ..... ..... @rd_pg_rn_e0
+UCVTF_dd 01100101 11 010 11 1 101 ... ..... ..... @rd_pg_rn_e0
+
### SVE Memory - 32-bit Gather and Unsized Contiguous Group

# SVE load predicate register
--
2.17.1

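[Illustrative aside, not part of the series: the DO_ZPZ_FP expander in the patch above applies a unary FP op to each element whose predicate bit is set, leaving inactive destination elements unchanged. A hedged C sketch of that behaviour, with `(float)` standing in for SCVTF and `scvtf_sketch` an invented name:]

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch only: predicated int32 -> float32 convert.  SVE predicates
 * carry one bit per byte, so a 4-byte element at byte offset i is
 * governed by bit i%8 of predicate byte i/8.  Elements whose bit is
 * clear keep their previous destination value. */
static void scvtf_sketch(float *zd, const int32_t *zn,
                         const uint8_t *pred, size_t oprsz_bytes)
{
    for (size_t i = 0; i < oprsz_bytes; i += sizeof(int32_t)) {
        if ((pred[i / 8] >> (i % 8)) & 1) {
            zd[i / 4] = (float)zn[i / 4];
        }
    }
}
```

The real helper iterates 64 predicate bits at a time from the top of the vector downward, but the per-element effect is the same.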
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    | 77 +++++++++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 89 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 46 ++++++++++++++++++++
 target/arm/sve.decode      | 17 ++++++++
 4 files changed, 229 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
 void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
 void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
return predtest_ones(d, oprsz, esz_mask);
}

+/* Fully general three-operand expander, controlled by a predicate,
+ * With the extra float_status parameter.
+ */
+#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
+ void *status, uint32_t desc) \
+{ \
+ intptr_t i = simd_oprsz(desc); \
+ uint64_t *g = vg; \
+ do { \
+ uint64_t pg = g[(i - 1) >> 6]; \
+ do { \
+ i -= sizeof(TYPE); \
+ if (likely((pg >> (i & 63)) & 1)) { \
+ TYPE nn = *(TYPE *)(vn + H(i)); \
+ TYPE mm = *(TYPE *)(vm + H(i)); \
+ *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \
+ } \
+ } while (i & 63); \
+ } while (i != 0); \
+}
+
+DO_ZPZZ_FP(sve_fadd_h, uint16_t, H1_2, float16_add)
+DO_ZPZZ_FP(sve_fadd_s, uint32_t, H1_4, float32_add)
+DO_ZPZZ_FP(sve_fadd_d, uint64_t, , float64_add)
+
+DO_ZPZZ_FP(sve_fsub_h, uint16_t, H1_2, float16_sub)
+DO_ZPZZ_FP(sve_fsub_s, uint32_t, H1_4, float32_sub)
+DO_ZPZZ_FP(sve_fsub_d, uint64_t, , float64_sub)
+
+DO_ZPZZ_FP(sve_fmul_h, uint16_t, H1_2, float16_mul)
+DO_ZPZZ_FP(sve_fmul_s, uint32_t, H1_4, float32_mul)
+DO_ZPZZ_FP(sve_fmul_d, uint64_t, , float64_mul)
+
+DO_ZPZZ_FP(sve_fdiv_h, uint16_t, H1_2, float16_div)
+DO_ZPZZ_FP(sve_fdiv_s, uint32_t, H1_4, float32_div)
+DO_ZPZZ_FP(sve_fdiv_d, uint64_t, , float64_div)
+
+DO_ZPZZ_FP(sve_fmin_h, uint16_t, H1_2, float16_min)
+DO_ZPZZ_FP(sve_fmin_s, uint32_t, H1_4, float32_min)
+DO_ZPZZ_FP(sve_fmin_d, uint64_t, , float64_min)
+
+DO_ZPZZ_FP(sve_fmax_h, uint16_t, H1_2, float16_max)
+DO_ZPZZ_FP(sve_fmax_s, uint32_t, H1_4, float32_max)
+DO_ZPZZ_FP(sve_fmax_d, uint64_t, , float64_max)
+
+DO_ZPZZ_FP(sve_fminnum_h, uint16_t, H1_2, float16_minnum)
+DO_ZPZZ_FP(sve_fminnum_s, uint32_t, H1_4, float32_minnum)
+DO_ZPZZ_FP(sve_fminnum_d, uint64_t, , float64_minnum)
+
+DO_ZPZZ_FP(sve_fmaxnum_h, uint16_t, H1_2, float16_maxnum)
+DO_ZPZZ_FP(sve_fmaxnum_s, uint32_t, H1_4, float32_maxnum)
+DO_ZPZZ_FP(sve_fmaxnum_d, uint64_t, , float64_maxnum)
+
+static inline float16 abd_h(float16 a, float16 b, float_status *s)
+{
+ return float16_abs(float16_sub(a, b, s));
+}
+
+static inline float32 abd_s(float32 a, float32 b, float_status *s)
+{
+ return float32_abs(float32_sub(a, b, s));
+}
+
+static inline float64 abd_d(float64 a, float64 b, float_status *s)
+{
+ return float64_abs(float64_sub(a, b, s));
+}
+
+DO_ZPZZ_FP(sve_fabd_h, uint16_t, H1_2, abd_h)
+DO_ZPZZ_FP(sve_fabd_s, uint32_t, H1_4, abd_s)
+DO_ZPZZ_FP(sve_fabd_d, uint64_t, , abd_d)
+
+static inline float64 scalbn_d(float64 a, int64_t b, float_status *s)
+{
+ int b_int = MIN(MAX(b, INT_MIN), INT_MAX);
+ return float64_scalbn(a, b_int, s);
+}
+
+DO_ZPZZ_FP(sve_fscalbn_h, int16_t, H1_2, float16_scalbn)
+DO_ZPZZ_FP(sve_fscalbn_s, int32_t, H1_4, float32_scalbn)
+DO_ZPZZ_FP(sve_fscalbn_d, int64_t, , scalbn_d)
+
+DO_ZPZZ_FP(sve_fmulx_h, uint16_t, H1_2, helper_advsimd_mulxh)
+DO_ZPZZ_FP(sve_fmulx_s, uint32_t, H1_4, helper_vfp_mulxs)
+DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd)
+
+#undef DO_ZPZZ_FP
+
/* Fully general two-operand expander, controlled by a predicate,
 * With the extra float_status parameter.
 */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
203
index XXXXXXX..XXXXXXX 100644
204
--- a/target/arm/translate-sve.c
205
+++ b/target/arm/translate-sve.c
206
@@ -XXX,XX +XXX,XX @@ DO_FP3(FRSQRTS, rsqrts)
207
208
#undef DO_FP3
209
210
+/*
211
+ *** SVE Floating Point Arithmetic - Predicated Group
212
+ */
213
+
214
+static bool do_zpzz_fp(DisasContext *s, arg_rprr_esz *a,
215
+ gen_helper_gvec_4_ptr *fn)
216
+{
217
+ if (fn == NULL) {
218
+ return false;
219
+ }
220
+ if (sve_access_check(s)) {
221
+ unsigned vsz = vec_full_reg_size(s);
222
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
223
+ tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
224
+ vec_full_reg_offset(s, a->rn),
225
+ vec_full_reg_offset(s, a->rm),
226
+ pred_full_reg_offset(s, a->pg),
227
+ status, vsz, vsz, 0, fn);
228
+ tcg_temp_free_ptr(status);
229
+ }
230
+ return true;
231
+}
232
+
233
+#define DO_FP3(NAME, name) \
234
+static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \
235
+{ \
236
+ static gen_helper_gvec_4_ptr * const fns[4] = { \
237
+ NULL, gen_helper_sve_##name##_h, \
238
+ gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
239
+ }; \
240
+ return do_zpzz_fp(s, a, fns[a->esz]); \
241
+}
242
+
243
+DO_FP3(FADD_zpzz, fadd)
244
+DO_FP3(FSUB_zpzz, fsub)
245
+DO_FP3(FMUL_zpzz, fmul)
246
+DO_FP3(FMIN_zpzz, fmin)
247
+DO_FP3(FMAX_zpzz, fmax)
248
+DO_FP3(FMINNM_zpzz, fminnum)
249
+DO_FP3(FMAXNM_zpzz, fmaxnum)
250
+DO_FP3(FABD, fabd)
251
+DO_FP3(FSCALE, fscalbn)
252
+DO_FP3(FDIV, fdiv)
253
+DO_FP3(FMULX, fmulx)
254
+
255
+#undef DO_FP3
256
257
/*
258
*** SVE Floating Point Unary Operations Predicated Group
259
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
260
index XXXXXXX..XXXXXXX 100644
261
--- a/target/arm/sve.decode
262
+++ b/target/arm/sve.decode
263
@@ -XXX,XX +XXX,XX @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm
264
FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm
265
FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm
266
267
+### SVE FP Arithmetic Predicated Group
268
+
269
+# SVE floating-point arithmetic (predicated)
270
+FADD_zpzz 01100101 .. 00 0000 100 ... ..... ..... @rdn_pg_rm
271
+FSUB_zpzz 01100101 .. 00 0001 100 ... ..... ..... @rdn_pg_rm
272
+FMUL_zpzz 01100101 .. 00 0010 100 ... ..... ..... @rdn_pg_rm
273
+FSUB_zpzz 01100101 .. 00 0011 100 ... ..... ..... @rdm_pg_rn # FSUBR
274
+FMAXNM_zpzz 01100101 .. 00 0100 100 ... ..... ..... @rdn_pg_rm
275
+FMINNM_zpzz 01100101 .. 00 0101 100 ... ..... ..... @rdn_pg_rm
276
+FMAX_zpzz 01100101 .. 00 0110 100 ... ..... ..... @rdn_pg_rm
277
+FMIN_zpzz 01100101 .. 00 0111 100 ... ..... ..... @rdn_pg_rm
278
+FABD 01100101 .. 00 1000 100 ... ..... ..... @rdn_pg_rm
279
+FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm
280
+FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
281
+FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR
282
+FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm
283
+
284
### SVE FP Unary Operations Predicated Group
285
286
# SVE integer convert to floating-point
287
--
288
2.17.1
diff view generated by jsdifflib
From: Richard Henderson <richard.henderson@linaro.org>

Enhance the existing helpers to support SVE, which takes the
index from each 128-bit segment.  The change has no effect
for AdvSIMD, since there is only one such segment.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180627043328.11531-32-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-sve.c | 23 ++++++++++++++++++
 target/arm/vec_helper.c    | 50 +++++++++++++++++++++++---------------
 target/arm/sve.decode      |  6 +++++
 3 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_FCMLA_zpzzz(DisasContext *s,
     return true;
 }
 
+static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[2] = {
+        gen_helper_gvec_fcmlah_idx,
+        gen_helper_gvec_fcmlas_idx,
+    };
+
+    tcg_debug_assert(a->esz == 1 || a->esz == 2);
+    tcg_debug_assert(a->rd == a->ra);
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+        tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           status, vsz, vsz,
+                           a->index * 4 + a->rot,
+                           fns[a->esz - 1]);
+        tcg_temp_free_ptr(status);
+    }
+    return true;
+}
+
 /*
 *** SVE Floating Point Unary Operations Predicated Group
 */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
     uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
     uint32_t neg_real = flip ^ neg_imag;
-    uintptr_t i;
-    float16 e1 = m[H2(2 * index + flip)];
-    float16 e3 = m[H2(2 * index + 1 - flip)];
+    intptr_t elements = opr_sz / sizeof(float16);
+    intptr_t eltspersegment = 16 / sizeof(float16);
+    intptr_t i, j;
 
     /* Shift boolean to the sign bit so we can xor to negate.  */
     neg_real <<= 15;
     neg_imag <<= 15;
-    e1 ^= neg_real;
-    e3 ^= neg_imag;
 
-    for (i = 0; i < opr_sz / 2; i += 2) {
-        float16 e2 = n[H2(i + flip)];
-        float16 e4 = e2;
+    for (i = 0; i < elements; i += eltspersegment) {
+        float16 mr = m[H2(i + 2 * index + 0)];
+        float16 mi = m[H2(i + 2 * index + 1)];
+        float16 e1 = neg_real ^ (flip ? mi : mr);
+        float16 e3 = neg_imag ^ (flip ? mr : mi);
 
-        d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst);
-        d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst);
+        for (j = i; j < i + eltspersegment; j += 2) {
+            float16 e2 = n[H2(j + flip)];
+            float16 e4 = e2;
+
+            d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst);
+            d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst);
+        }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
     uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
     uint32_t neg_real = flip ^ neg_imag;
-    uintptr_t i;
-    float32 e1 = m[H4(2 * index + flip)];
-    float32 e3 = m[H4(2 * index + 1 - flip)];
+    intptr_t elements = opr_sz / sizeof(float32);
+    intptr_t eltspersegment = 16 / sizeof(float32);
+    intptr_t i, j;
 
     /* Shift boolean to the sign bit so we can xor to negate.  */
     neg_real <<= 31;
     neg_imag <<= 31;
-    e1 ^= neg_real;
-    e3 ^= neg_imag;
 
-    for (i = 0; i < opr_sz / 4; i += 2) {
-        float32 e2 = n[H4(i + flip)];
-        float32 e4 = e2;
+    for (i = 0; i < elements; i += eltspersegment) {
+        float32 mr = m[H4(i + 2 * index + 0)];
+        float32 mi = m[H4(i + 2 * index + 1)];
+        float32 e1 = neg_real ^ (flip ? mi : mr);
+        float32 e3 = neg_imag ^ (flip ? mr : mi);
 
-        d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst);
-        d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst);
+        for (j = i; j < i + eltspersegment; j += 2) {
+            float32 e2 = n[H4(j + flip)];
+            float32 e4 = e2;
+
+            d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst);
+            d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst);
+        }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ FCADD           01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
 FCMLA_zpzzz     01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \
                 ra=%reg_movprfx
 
+# SVE floating-point complex multiply-add (indexed)
+FCMLA_zzxz      01100100 10 1 index:2 rm:3 0001 rot:2 rn:5 rd:5 \
+                ra=%reg_movprfx esz=1
+FCMLA_zzxz      01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \
+                ra=%reg_movprfx esz=2
+
 ### SVE FP Multiply-Add Indexed Group
 
 # SVE floating-point multiply-add (indexed)
--
2.17.1

From: Richard Henderson <richard.henderson@linaro.org>

STGP writes to tag memory, but it does not check it.
This happened to work because we wrote tag memory first,
so that the check always succeeded.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230901203103.136408-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c | 41 +++++++++++++---------------------
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
         tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
     }
 
-    if (!s->ata) {
-        /*
-         * TODO: We could rely on the stores below, at least for
-         * system mode, if we arrange to add MO_ALIGN_16.
-         */
-        gen_helper_stg_stub(cpu_env, dirty_addr);
-    } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
-        gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
-    } else {
-        gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
-    }
-
-    mop = finalize_memop(s, MO_64);
-    clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << MO_64, mop);
-
+    clean_addr = clean_data_tbi(s, dirty_addr);
     tcg_rt = cpu_reg(s, a->rt);
     tcg_rt2 = cpu_reg(s, a->rt2);
 
     /*
-     * STGP is defined as two 8-byte memory operations and one tag operation.
-     * We implement it as one single 16-byte memory operation for convenience.
-     * Rebuild mop as for STP.
-     * TODO: The atomicity with LSE2 is stronger than required.
-     *       Need a form of MO_ATOM_WITHIN16_PAIR that never requires
-     *       16-byte atomicity.
+     * STGP is defined as two 8-byte memory operations, aligned to TAG_GRANULE,
+     * and one tag operation.  We implement it as one single aligned 16-byte
+     * memory operation for convenience.  Note that the alignment ensures
+     * MO_ATOM_IFALIGN_PAIR produces 8-byte atomicity for the memory store.
      */
-    mop = MO_128;
-    if (s->align_mem) {
-        mop |= MO_ALIGN_8;
-    }
-    mop = finalize_memop_pair(s, mop);
+    mop = finalize_memop_atom(s, MO_128 | MO_ALIGN, MO_ATOM_IFALIGN_PAIR);
 
     tmp = tcg_temp_new_i128();
     if (s->be_data == MO_LE) {
@@ -XXX,XX +XXX,XX @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
     }
     tcg_gen_qemu_st_i128(tmp, clean_addr, get_mem_index(s), mop);
 
+    /* Perform the tag store, if tag access enabled. */
+    if (s->ata) {
+        if (tb_cflags(s->base.tb) & CF_PARALLEL) {
+            gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
+        } else {
+            gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
+        }
+    }
+
     op_addr_ldstpair_post(s, a, dirty_addr, offset);
     return true;
 }
--
2.34.1
152
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Colton Lewis <coltonlewis@google.com>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Due to recent KVM changes, QEMU is setting a ptimer offset resulting
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
in unintended trap and emulate access and a consequent performance
5
Message-id: 20180627043328.11531-15-richard.henderson@linaro.org
5
hit. Filter out the PTIMER_CNT register to restore trapless ptimer
6
access.
7
8
Quoting Andrew Jones:
9
10
Simply reading the CNT register and writing back the same value is
11
enough to set an offset, since the timer will have certainly moved
12
past whatever value was read by the time it's written. QEMU
13
frequently saves and restores all registers in the get-reg-list array,
14
unless they've been explicitly filtered out (with Linux commit
15
680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array). So, to
16
restore trapless ptimer accesses, we need a QEMU patch to filter out
17
the register.
18
19
See
20
https://lore.kernel.org/kvmarm/gsntttsonus5.fsf@coltonlewis-kvm.c.googlers.com/T/#m0770023762a821db2a3f0dd0a7dc6aa54e0d0da9
21
for additional context.
22
23
Cc: qemu-stable@nongnu.org
24
Signed-off-by: Andrew Jones <andrew.jones@linux.dev>
25
Signed-off-by: Colton Lewis <coltonlewis@google.com>
26
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
27
Tested-by: Colton Lewis <coltonlewis@google.com>
28
Message-id: 20230831190052.129045-1-coltonlewis@google.com
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
29
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
30
---
8
target/arm/helper-sve.h | 67 +++++++++++++++++++++++++++++
31
target/arm/kvm64.c | 1 +
9
target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++
32
1 file changed, 1 insertion(+)
10
target/arm/translate-sve.c | 40 ++++++++++++++++-
11
3 files changed, 193 insertions(+), 2 deletions(-)
12
33
13
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
34
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
14
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-sve.h
36
--- a/target/arm/kvm64.c
16
+++ b/target/arm/helper-sve.h
37
+++ b/target/arm/kvm64.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
38
@@ -XXX,XX +XXX,XX @@ typedef struct CPRegStateLevel {
18
DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
39
*/
19
void, env, ptr, ptr, ptr, tl, i32)
40
static const CPRegStateLevel non_runtime_cpregs[] = {
20
41
{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
21
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
42
+ { KVM_REG_ARM_PTIMER_CNT, KVM_PUT_FULL_STATE },
22
+ void, env, ptr, ptr, ptr, tl, i32)
23
+DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG,
24
+ void, env, ptr, ptr, ptr, tl, i32)
25
+DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG,
26
+ void, env, ptr, ptr, ptr, tl, i32)
27
+DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG,
28
+ void, env, ptr, ptr, ptr, tl, i32)
29
+DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG,
30
+ void, env, ptr, ptr, ptr, tl, i32)
31
+
32
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG,
33
+ void, env, ptr, ptr, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG,
35
+ void, env, ptr, ptr, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG,
37
+ void, env, ptr, ptr, ptr, tl, i32)
38
+DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG,
39
+ void, env, ptr, ptr, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG,
41
+ void, env, ptr, ptr, ptr, tl, i32)
42
+
43
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG,
44
+ void, env, ptr, ptr, ptr, tl, i32)
45
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG,
46
+ void, env, ptr, ptr, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG,
48
+ void, env, ptr, ptr, ptr, tl, i32)
49
+DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG,
50
+ void, env, ptr, ptr, ptr, tl, i32)
51
+DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG,
52
+ void, env, ptr, ptr, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG,
54
+ void, env, ptr, ptr, ptr, tl, i32)
55
+DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG,
56
+ void, env, ptr, ptr, ptr, tl, i32)
57
+
58
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG,
59
+ void, env, ptr, ptr, ptr, tl, i32)
60
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG,
61
+ void, env, ptr, ptr, ptr, tl, i32)
62
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG,
63
+ void, env, ptr, ptr, ptr, tl, i32)
64
+DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG,
65
+ void, env, ptr, ptr, ptr, tl, i32)
66
+DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG,
67
+ void, env, ptr, ptr, ptr, tl, i32)
68
+DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG,
69
+ void, env, ptr, ptr, ptr, tl, i32)
70
+DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG,
71
+ void, env, ptr, ptr, ptr, tl, i32)
72
+
73
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG,
74
+ void, env, ptr, ptr, ptr, tl, i32)
75
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG,
76
+ void, env, ptr, ptr, ptr, tl, i32)
77
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG,
78
+ void, env, ptr, ptr, ptr, tl, i32)
79
+DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG,
80
+ void, env, ptr, ptr, ptr, tl, i32)
81
+DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG,
82
+ void, env, ptr, ptr, ptr, tl, i32)
83
+DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG,
84
+ void, env, ptr, ptr, ptr, tl, i32)
85
+DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG,
86
+ void, env, ptr, ptr, ptr, tl, i32)
87
+
88
DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
89
void, env, ptr, ptr, ptr, tl, i32)
90
DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
91
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
92
index XXXXXXX..XXXXXXX 100644
93
--- a/target/arm/sve_helper.c
94
+++ b/target/arm/sve_helper.c
95
@@ -XXX,XX +XXX,XX @@ DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
96
DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
97
DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
98
99
+/* First fault loads with a vector index. */
100
+
101
+#ifdef CONFIG_USER_ONLY
102
+
103
+#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \
104
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
105
+ target_ulong base, uint32_t desc) \
106
+{ \
107
+ intptr_t i, oprsz = simd_oprsz(desc); \
108
+ unsigned scale = simd_data(desc); \
109
+ uintptr_t ra = GETPC(); \
110
+ bool first = true; \
111
+ mmap_lock(); \
112
+ for (i = 0; i < oprsz; i++) { \
113
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
114
+ do { \
115
+ TYPEM m = 0; \
116
+ if (pg & 1) { \
117
+ target_ulong off = *(TYPEI *)(vm + H(i)); \
118
+ target_ulong addr = base + (off << scale); \
119
+ if (!first && \
120
+ page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \
121
+ record_fault(env, i, oprsz); \
122
+ goto exit; \
123
+ } \
124
+ m = FN(env, addr, ra); \
125
+ first = false; \
126
+ } \
127
+ *(TYPEE *)(vd + H(i)) = m; \
128
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
129
+ } while (i & 15); \
130
+ } \
131
+ exit: \
132
+ mmap_unlock(); \
133
+}
134
+
135
+#else
136
+
137
+#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \
138
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
139
+ target_ulong base, uint32_t desc) \
140
+{ \
141
+ g_assert_not_reached(); \
142
+}
143
+
144
+#endif
145
+
146
+#define DO_LDFF1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
147
+ DO_LDFF1_ZPZ(NAME, uint32_t, TYPEI, TYPEM, FN, H1_4)
148
+#define DO_LDFF1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
149
+ DO_LDFF1_ZPZ(NAME, uint64_t, TYPEI, TYPEM, FN, )
150
+
151
+DO_LDFF1_ZPZ_S(sve_ldffbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
152
+DO_LDFF1_ZPZ_S(sve_ldffhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
153
+DO_LDFF1_ZPZ_S(sve_ldffssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
154
+DO_LDFF1_ZPZ_S(sve_ldffbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
155
+DO_LDFF1_ZPZ_S(sve_ldffhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
156
+
157
+DO_LDFF1_ZPZ_S(sve_ldffbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
158
+DO_LDFF1_ZPZ_S(sve_ldffhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
159
+DO_LDFF1_ZPZ_S(sve_ldffssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
160
+DO_LDFF1_ZPZ_S(sve_ldffbss_zss, int32_t, int8_t, cpu_ldub_data_ra)
161
+DO_LDFF1_ZPZ_S(sve_ldffhss_zss, int32_t, int16_t, cpu_lduw_data_ra)
162
+
163
+DO_LDFF1_ZPZ_D(sve_ldffbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
164
+DO_LDFF1_ZPZ_D(sve_ldffhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
165
+DO_LDFF1_ZPZ_D(sve_ldffsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
166
+DO_LDFF1_ZPZ_D(sve_ldffddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
167
+DO_LDFF1_ZPZ_D(sve_ldffbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
168
+DO_LDFF1_ZPZ_D(sve_ldffhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
169
+DO_LDFF1_ZPZ_D(sve_ldffsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra)
170
+
171
+DO_LDFF1_ZPZ_D(sve_ldffbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
172
+DO_LDFF1_ZPZ_D(sve_ldffhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
173
+DO_LDFF1_ZPZ_D(sve_ldffsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
174
+DO_LDFF1_ZPZ_D(sve_ldffddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
175
+DO_LDFF1_ZPZ_D(sve_ldffbds_zss, int32_t, int8_t, cpu_ldub_data_ra)
176
+DO_LDFF1_ZPZ_D(sve_ldffhds_zss, int32_t, int16_t, cpu_lduw_data_ra)
177
+DO_LDFF1_ZPZ_D(sve_ldffsds_zss, int32_t, int32_t, cpu_ldl_data_ra)
178
+
179
+DO_LDFF1_ZPZ_D(sve_ldffbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra)
180
+DO_LDFF1_ZPZ_D(sve_ldffhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
181
+DO_LDFF1_ZPZ_D(sve_ldffsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
182
+DO_LDFF1_ZPZ_D(sve_ldffddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
183
+DO_LDFF1_ZPZ_D(sve_ldffbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
184
+DO_LDFF1_ZPZ_D(sve_ldffhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
185
+DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
186
+
187
/* Stores with a vector index. */
188
189
#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
190
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
191
index XXXXXXX..XXXXXXX 100644
192
--- a/target/arm/translate-sve.c
193
+++ b/target/arm/translate-sve.c
194
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = {
195
{ gen_helper_sve_ldbsu_zss,
196
gen_helper_sve_ldhsu_zss,
197
gen_helper_sve_ldssu_zss, } } },
198
- /* TODO fill in first-fault handlers */
199
+
200
+ { { { gen_helper_sve_ldffbss_zsu,
201
+ gen_helper_sve_ldffhss_zsu,
202
+ NULL, },
203
+ { gen_helper_sve_ldffbsu_zsu,
204
+ gen_helper_sve_ldffhsu_zsu,
205
+ gen_helper_sve_ldffssu_zsu, } },
206
+ { { gen_helper_sve_ldffbss_zss,
207
+ gen_helper_sve_ldffhss_zss,
208
+ NULL, },
209
+ { gen_helper_sve_ldffbsu_zss,
210
+ gen_helper_sve_ldffhsu_zss,
211
+ gen_helper_sve_ldffssu_zss, } } }
212
};
43
};
213
44
214
/* Note that we overload xs=2 to indicate 64-bit offset. */
45
int kvm_arm_cpreg_level(uint64_t regidx)
215
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = {
216
gen_helper_sve_ldhdu_zd,
217
gen_helper_sve_ldsdu_zd,
218
gen_helper_sve_ldddu_zd, } } },
219
- /* TODO fill in first-fault handlers */
220
+
221
+ { { { gen_helper_sve_ldffbds_zsu,
222
+ gen_helper_sve_ldffhds_zsu,
223
+ gen_helper_sve_ldffsds_zsu,
224
+ NULL, },
225
+ { gen_helper_sve_ldffbdu_zsu,
226
+ gen_helper_sve_ldffhdu_zsu,
227
+ gen_helper_sve_ldffsdu_zsu,
228
+ gen_helper_sve_ldffddu_zsu, } },
229
+ { { gen_helper_sve_ldffbds_zss,
230
+ gen_helper_sve_ldffhds_zss,
231
+ gen_helper_sve_ldffsds_zss,
232
+ NULL, },
233
+ { gen_helper_sve_ldffbdu_zss,
234
+ gen_helper_sve_ldffhdu_zss,
235
+ gen_helper_sve_ldffsdu_zss,
236
+ gen_helper_sve_ldffddu_zss, } },
237
+ { { gen_helper_sve_ldffbds_zd,
238
+ gen_helper_sve_ldffhds_zd,
239
+ gen_helper_sve_ldffsds_zd,
240
+ NULL, },
241
+ { gen_helper_sve_ldffbdu_zd,
242
+ gen_helper_sve_ldffhdu_zd,
243
+ gen_helper_sve_ldffsdu_zd,
244
+ gen_helper_sve_ldffddu_zd, } } }
245
};
246
247
static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
248
--
46
--
249
2.17.1
47
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Leave ARM_CP_SVE, removing ARM_CP_FPU; the sve_access_check
produced by the flag already includes fp_access_check.  If
we also check ARM_CP_FPU, the duplicated fp_access_check asserts.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180629001538.11415-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c        | 8 ++++----
 target/arm/translate-a64.c | 5 ++---
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static const ARMCPRegInfo zcr_el1_reginfo = {
     .name = "ZCR_EL1", .state = ARM_CP_STATE_AA64,
     .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL1_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
+    .access = PL1_RW, .type = ARM_CP_SVE,
     .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[1]),
     .writefn = zcr_write, .raw_writefn = raw_write
 };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo zcr_el1_reginfo = {
 static const ARMCPRegInfo zcr_el2_reginfo = {
     .name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
     .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL2_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
+    .access = PL2_RW, .type = ARM_CP_SVE,
     .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[2]),
     .writefn = zcr_write, .raw_writefn = raw_write
 };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo zcr_el2_reginfo = {
 static const ARMCPRegInfo zcr_no_el2_reginfo = {
     .name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
     .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL2_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
+    .access = PL2_RW, .type = ARM_CP_SVE,
     .readfn = arm_cp_read_zero, .writefn = arm_cp_write_ignore
 };
 
 static const ARMCPRegInfo zcr_el3_reginfo = {
     .name = "ZCR_EL3", .state = ARM_CP_STATE_AA64,
     .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL3_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
+    .access = PL3_RW, .type = ARM_CP_SVE,
     .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[3]),
     .writefn = zcr_write, .raw_writefn = raw_write
 };
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
     default:
         break;
     }
-    if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
-        return;
-    }
     if ((ri->type & ARM_CP_FPU) && !fp_access_check(s)) {
         return;
+    } else if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
+        return;
     }
 
     if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
--
2.17.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a stub implementation, as a write is a "request".

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230831232441.66020-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 64 +++++++++++++++++++++++++++++----------------
 1 file changed, 41 insertions(+), 23 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         };
         modify_arm_cp_regs(v8_idregs, v8_user_idregs);
 #endif
-        /* RVBAR_EL1 is only implemented if EL1 is the highest EL */
+        /*
+         * RVBAR_EL1 and RMR_EL1 only implemented if EL1 is the highest EL.
+         * TODO: For RMR, a write with bit 1 set should do something with
+         * cpu_reset().  In the meantime, "the bit is strictly a request",
+         * so we are in spec just ignoring writes.
+         */
         if (!arm_feature(env, ARM_FEATURE_EL3) &&
             !arm_feature(env, ARM_FEATURE_EL2)) {
-            ARMCPRegInfo rvbar = {
-                .name = "RVBAR_EL1", .state = ARM_CP_STATE_BOTH,
-                .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 1,
-                .access = PL1_R,
-                .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
+            ARMCPRegInfo el1_reset_regs[] = {
+                { .name = "RVBAR_EL1", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 1,
+                  .access = PL1_R,
+                  .fieldoffset = offsetof(CPUARMState, cp15.rvbar) },
+                { .name = "RMR_EL1", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 2,
+                  .access = PL1_RW, .type = ARM_CP_CONST,
+                  .resetvalue = arm_feature(env, ARM_FEATURE_AARCH64) }
             };
-            define_one_arm_cp_reg(cpu, &rvbar);
+            define_arm_cp_regs(cpu, el1_reset_regs);
         }
         define_arm_cp_regs(cpu, v8_idregs);
         define_arm_cp_regs(cpu, v8_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         if (cpu_isar_feature(aa64_sel2, cpu)) {
             define_arm_cp_regs(cpu, el2_sec_cp_reginfo);
         }
-        /* RVBAR_EL2 is only implemented if EL2 is the highest EL */
+        /*
+         * RVBAR_EL2 and RMR_EL2 only implemented if EL2 is the highest EL.
+         * See commentary near RMR_EL1.
+         */
         if (!arm_feature(env, ARM_FEATURE_EL3)) {
-            ARMCPRegInfo rvbar[] = {
-                {
-                    .name = "RVBAR_EL2", .state = ARM_CP_STATE_AA64,
-                    .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 0, .opc2 = 1,
-                    .access = PL2_R,
-                    .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
-                },
-                { .name = "RVBAR", .type = ARM_CP_ALIAS,
-                  .cp = 15, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 1,
-                  .access = PL2_R,
-                  .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
-                },
+            static const ARMCPRegInfo el2_reset_regs[] = {
+                { .name = "RVBAR_EL2", .state = ARM_CP_STATE_AA64,
+                  .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 0, .opc2 = 1,
+                  .access = PL2_R,
+                  .fieldoffset = offsetof(CPUARMState, cp15.rvbar) },
+                { .name = "RVBAR", .type = ARM_CP_ALIAS,
+                  .cp = 15, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 1,
+                  .access = PL2_R,
+                  .fieldoffset = offsetof(CPUARMState, cp15.rvbar) },
+                { .name = "RMR_EL2", .state = ARM_CP_STATE_AA64,
+                  .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 0, .opc2 = 2,
+                  .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 1 },
             };
-            define_arm_cp_regs(cpu, rvbar);
+            define_arm_cp_regs(cpu, el2_reset_regs);
         }
     }
 
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         { .name = "RVBAR_EL3", .state = ARM_CP_STATE_AA64,
           .opc0 = 3, .opc1 = 6, .crn = 12, .crm = 0, .opc2 = 1,
           .access = PL3_R,
-          .fieldoffset = offsetof(CPUARMState, cp15.rvbar),
-        },
+          .fieldoffset = offsetof(CPUARMState, cp15.rvbar), },
+        { .name = "RMR_EL3", .state = ARM_CP_STATE_AA64,
+          .opc0 = 3, .opc1 = 6, .crn = 12, .crm = 0, .opc2 = 2,
+          .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 1 },
+        { .name = "RMR", .state = ARM_CP_STATE_AA32,
+          .cp = 15, .opc1 = 0, .crn = 12, .crm = 0, .opc2 = 2,
+          .access = PL3_RW, .type = ARM_CP_CONST,
+          .resetvalue = arm_feature(env, ARM_FEATURE_AARCH64) },
         { .name = "SCTLR_EL3", .state = ARM_CP_STATE_AA64,
           .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 0, .opc2 = 0,
           .access = PL3_RW,
--
2.34.1
78
79
From: Richard Henderson <richard.henderson@linaro.org>

The cortex-a710 is a first generation ARMv9.0-A processor.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230831232441.66020-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/virt.rst |   1 +
 hw/arm/virt.c            |   1 +
 target/arm/tcg/cpu64.c   | 212 +++++++++++++++++++++++++++++++++++++
 3 files changed, 214 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -XXX,XX +XXX,XX @@ Supported guest CPU types:
 - ``cortex-a57`` (64-bit)
 - ``cortex-a72`` (64-bit)
 - ``cortex-a76`` (64-bit)
+- ``cortex-a710`` (64-bit)
 - ``a64fx`` (64-bit)
 - ``host`` (with KVM only)
 - ``neoverse-n1`` (64-bit)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a55"),
     ARM_CPU_TYPE_NAME("cortex-a72"),
     ARM_CPU_TYPE_NAME("cortex-a76"),
+    ARM_CPU_TYPE_NAME("cortex-a710"),
     ARM_CPU_TYPE_NAME("a64fx"),
     ARM_CPU_TYPE_NAME("neoverse-n1"),
     ARM_CPU_TYPE_NAME("neoverse-v1"),
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_neoverse_v1_initfn(Object *obj)
     aarch64_add_sve_properties(obj);
 }

+static const ARMCPRegInfo cortex_a710_cp_reginfo[] = {
+    { .name = "CPUACTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUACTLR2_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUACTLR3_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 2,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUACTLR4_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 3,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUECTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 4,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUECTLR2_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 5,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUPPMCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 2, .opc2 = 4,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPWRCTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 2, .opc2 = 7,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "ATCR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 7, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUACTLR5_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 8, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUACTLR6_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 8, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "CPUACTLR7_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 8, .opc2 = 2,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0,
+      .accessfn = access_actlr_w },
+    { .name = "ATCR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 15, .crm = 7, .opc2 = 0,
+      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "AVTCR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 15, .crm = 7, .opc2 = 1,
+      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 0,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR2_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 1,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR4_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 4,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR5_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 5,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR6_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 6,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUACTLR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 4, .opc2 = 0,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "ATCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 7, .opc2 = 0,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPSELR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 0,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 1,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPOR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 2,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPMR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 3,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPOR2_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 4,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPMR2_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 5,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPFR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 8, .opc2 = 6,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+
+    /*
+     * Stub RAMINDEX, as we don't actually implement caches, BTB,
+     * or anything else with cpu internal memory.
+     * "Read" zeros into the IDATA* and DDATA* output registers.
+     */
+    { .name = "RAMINDEX_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL3_W, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IDATA0_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL3_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IDATA1_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 0, .opc2 = 1,
+      .access = PL3_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IDATA2_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 0, .opc2 = 2,
+      .access = PL3_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "DDATA0_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 1, .opc2 = 0,
+      .access = PL3_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "DDATA1_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 1, .opc2 = 1,
+      .access = PL3_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "DDATA2_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 1, .opc2 = 2,
+      .access = PL3_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+};
+
+static void aarch64_a710_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,cortex-a710";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by Section B.4: AArch64 registers */
+    cpu->midr = 0x412FD471; /* r2p1 */
+    cpu->revidr = 0;
+    cpu->isar.id_pfr0 = 0x21110131;
+    cpu->isar.id_pfr1 = 0x00010000; /* GIC filled in later */
+    cpu->isar.id_dfr0 = 0x16011099;
+    cpu->id_afr0 = 0;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00010142;
+    cpu->isar.id_isar5 = 0x11011121; /* with Crypto */
+    cpu->isar.id_mmfr4 = 0x21021110;
+    cpu->isar.id_isar6 = 0x01111111;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+    cpu->isar.id_pfr2 = 0x00000011;
+    cpu->isar.id_aa64pfr0 = 0x1201111120111112ull; /* GIC filled in later */
+    cpu->isar.id_aa64pfr1 = 0x0000000000000221ull;
+    cpu->isar.id_aa64zfr0 = 0x0000110100110021ull; /* with Crypto */
+    cpu->isar.id_aa64dfr0 = 0x000011f010305611ull;
+    cpu->isar.id_aa64dfr1 = 0;
+    cpu->id_aa64afr0 = 0;
+    cpu->id_aa64afr1 = 0;
+    cpu->isar.id_aa64isar0 = 0x0221111110212120ull; /* with Crypto */
+    cpu->isar.id_aa64isar1 = 0x0010111101211032ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000022200101122ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x1221011110101011ull;
+    cpu->clidr = 0x0000001482000023ull;
+    cpu->gm_blocksize = 4;
+    cpu->ctr = 0x000000049444c004ull;
+    cpu->dcz_blocksize = 4;
+    /* TODO FEAT_MPAM: mpamidr_el1 = 0x0000_0001_0006_003f */
+
+    /* Section B.5.2: PMCR_EL0 */
+    cpu->isar.reset_pmcr_el0 = 0xa000; /* with 20 counters */
+
+    /* Section B.6.7: ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+    cpu->gic_pribits = 5;
+
+    /* Section 14: Scalable Vector Extensions support */
+    cpu->sve_vq.supported = 1 << 0; /* 128bit */
+
+    /*
+     * The cortex-a710 TRM does not list CCSIDR values. The layout of
+     * the caches are in text in Table 7-1, Table 8-1, and Table 9-1.
+     *
+     * L1: 4-way set associative 64-byte line size, total either 32K or 64K.
+     * L2: 8-way set associative 64 byte line size, total either 256K or 512K.
+     */
+    cpu->ccsidr[0] = make_ccsidr64(4, 64, 64 * KiB); /* L1 dcache */
+    cpu->ccsidr[1] = cpu->ccsidr[0]; /* L1 icache */
+    cpu->ccsidr[2] = make_ccsidr64(8, 64, 512 * KiB); /* L2 cache */
+
+    /* FIXME: Not documented -- copied from neoverse-v1 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    define_arm_cp_regs(cpu, cortex_a710_cp_reginfo);
+
+    aarch64_add_pauth_properties(obj);
+    aarch64_add_sve_properties(obj);
+}
+
 /*
  * -cpu max: a CPU with as many features enabled as our emulation supports.
  * The version of '-cpu max' for qemu-system-arm is defined in cpu32.c;
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo aarch64_cpus[] = {
     { .name = "cortex-a55", .initfn = aarch64_a55_initfn },
     { .name = "cortex-a72", .initfn = aarch64_a72_initfn },
     { .name = "cortex-a76", .initfn = aarch64_a76_initfn },
+    { .name = "cortex-a710", .initfn = aarch64_a710_initfn },
     { .name = "a64fx", .initfn = aarch64_a64fx_initfn },
     { .name = "neoverse-n1", .initfn = aarch64_neoverse_n1_initfn },
     { .name = "neoverse-v1", .initfn = aarch64_neoverse_v1_initfn },
--
2.34.1

1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Perform the check for EL2 enabled in the security space and the
4
TIDCP bit in an out-of-line helper.
5
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20230831232441.66020-4-richard.henderson@linaro.org
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Message-id: 20180627043328.11531-34-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
10
---
9
target/arm/helper.h | 5 ++
11
target/arm/helper.h | 1 +
10
target/arm/translate-sve.c | 18 ++++++
12
target/arm/tcg/op_helper.c | 13 +++++++++++++
11
target/arm/vec_helper.c | 124 +++++++++++++++++++++++++++++++++++++
13
target/arm/tcg/translate-a64.c | 16 ++++++++++++++--
12
target/arm/sve.decode | 6 ++
14
target/arm/tcg/translate.c | 27 +++++++++++++++++++++++++++
13
4 files changed, 153 insertions(+)
15
4 files changed, 55 insertions(+), 2 deletions(-)
14
16
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.h
19
--- a/target/arm/helper.h
18
+++ b/target/arm/helper.h
20
+++ b/target/arm/helper.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(check_bxj_trap, TCG_CALL_NO_WG, void, env, i32)
20
DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
21
DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
DEF_HELPER_4(access_check_cp_reg, cptr, env, i32, i32, i32)
22
24
DEF_HELPER_FLAGS_2(lookup_cp_reg, TCG_CALL_NO_RWG_SE, cptr, env, i32)
23
+DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_2(tidcp_el1, TCG_CALL_NO_WG, void, env, i32)
24
+DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
DEF_HELPER_3(set_cp_reg, void, env, cptr, i32)
25
+DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
DEF_HELPER_2(get_cp_reg, i32, env, cptr)
26
+DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
DEF_HELPER_3(set_cp_reg64, void, env, cptr, i64)
29
diff --git a/target/arm/tcg/op_helper.c b/target/arm/tcg/op_helper.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/tcg/op_helper.c
32
+++ b/target/arm/tcg/op_helper.c
33
@@ -XXX,XX +XXX,XX @@ const void *HELPER(lookup_cp_reg)(CPUARMState *env, uint32_t key)
34
return ri;
35
}
36
37
+/*
38
+ * Test for HCR_EL2.TIDCP at EL1.
39
+ * Since implementation defined registers are rare, and within QEMU
40
+ * most of them are no-op, do not waste HFLAGS space for this and
41
+ * always use a helper.
42
+ */
43
+void HELPER(tidcp_el1)(CPUARMState *env, uint32_t syndrome)
44
+{
45
+ if (arm_hcr_el2_eff(env) & HCR_TIDCP) {
46
+ raise_exception_ra(env, EXCP_UDEF, syndrome, 2, GETPC());
47
+ }
48
+}
27
+
49
+
28
DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
50
void HELPER(set_cp_reg)(CPUARMState *env, const void *rip, uint32_t value)
29
void, ptr, ptr, ptr, ptr, i32)
51
{
30
DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
52
const ARMCPRegInfo *ri = rip;
31
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
53
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
32
index XXXXXXX..XXXXXXX 100644
54
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/translate-sve.c
55
--- a/target/arm/tcg/translate-a64.c
34
+++ b/target/arm/translate-sve.c
56
+++ b/target/arm/tcg/translate-a64.c
35
@@ -XXX,XX +XXX,XX @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn)
57
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, bool isread,
36
return true;
58
bool need_exit_tb = false;
59
TCGv_ptr tcg_ri = NULL;
60
TCGv_i64 tcg_rt;
61
+ uint32_t syndrome;
62
+
63
+ if (crn == 11 || crn == 15) {
64
+ /*
65
+ * Check for TIDCP trap, which must take precedence over
66
+ * the UNDEF for "no such register" etc.
67
+ */
68
+ syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
69
+ switch (s->current_el) {
70
+ case 1:
71
+ gen_helper_tidcp_el1(cpu_env, tcg_constant_i32(syndrome));
72
+ break;
73
+ }
74
+ }
75
76
if (!ri) {
77
/* Unknown register; this might be a guest error or a QEMU
78
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, bool isread,
79
/* Emit code to perform further access permissions checks at
80
* runtime; this may result in an exception.
81
*/
82
- uint32_t syndrome;
83
-
84
syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
85
gen_a64_update_pc(s, 0);
86
tcg_ri = tcg_temp_new_ptr();
87
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
88
index XXXXXXX..XXXXXXX 100644
89
--- a/target/arm/tcg/translate.c
90
+++ b/target/arm/tcg/translate.c
91
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
92
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
37
}
93
}
38
94
39
+static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a, uint32_t insn)
95
+static bool aa32_cpreg_encoding_in_impdef_space(uint8_t crn, uint8_t crm)
40
+{
96
+{
41
+ static gen_helper_gvec_3 * const fns[2][2] = {
97
+ static const uint16_t mask[3] = {
42
+ { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h },
98
+ 0b0000000111100111, /* crn == 9, crm == {c0-c2, c5-c8} */
43
+ { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h }
99
+ 0b0000000100010011, /* crn == 10, crm == {c0, c1, c4, c8} */
100
+ 0b1000000111111111, /* crn == 11, crm == {c0-c8, c15} */
44
+ };
101
+ };
45
+
102
+
46
+ if (sve_access_check(s)) {
103
+ if (crn >= 9 && crn <= 11) {
47
+ unsigned vsz = vec_full_reg_size(s);
104
+ return (mask[crn - 9] >> crm) & 1;
48
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
49
+ vec_full_reg_offset(s, a->rn),
50
+ vec_full_reg_offset(s, a->rm),
51
+ vsz, vsz, a->index, fns[a->u][a->sz]);
52
+ }
105
+ }
53
+ return true;
106
+ return false;
54
+}
107
+}
55
+
108
+
56
+
109
static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
57
/*
110
int opc1, int crn, int crm, int opc2,
58
*** SVE Floating Point Multiply-Add Indexed Group
111
bool isread, int rt, int rt2)
59
*/
112
@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
60
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
113
}
61
index XXXXXXX..XXXXXXX 100644
114
}
62
--- a/target/arm/vec_helper.c
115
63
+++ b/target/arm/vec_helper.c
116
+ if (cpnum == 15 && aa32_cpreg_encoding_in_impdef_space(crn, crm)) {
64
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc)
117
+ /*
65
clear_tail(d, opr_sz, simd_maxsz(desc));
118
+ * Check for TIDCP trap, which must take precedence over the UNDEF
66
}
119
+ * for "no such register" etc. It shares precedence with HSTR,
67
120
+ * but raises the same exception, so order doesn't matter.
68
+void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
121
+ */
69
+{
122
+ switch (s->current_el) {
70
+ intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
123
+ case 1:
71
+ intptr_t index = simd_data(desc);
124
+ gen_helper_tidcp_el1(cpu_env, tcg_constant_i32(syndrome));
72
+ uint32_t *d = vd;
125
+ break;
73
+ int8_t *n = vn;
126
+ }
74
+ int8_t *m_indexed = (int8_t *)vm + index * 4;
75
+
76
+ /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
77
+ * Otherwise opr_sz is a multiple of 16.
78
+ */
79
+ segend = MIN(4, opr_sz_4);
80
+ i = 0;
81
+ do {
82
+ int8_t m0 = m_indexed[i * 4 + 0];
83
+ int8_t m1 = m_indexed[i * 4 + 1];
84
+ int8_t m2 = m_indexed[i * 4 + 2];
85
+ int8_t m3 = m_indexed[i * 4 + 3];
86
+
87
+ do {
88
+ d[i] += n[i * 4 + 0] * m0
89
+ + n[i * 4 + 1] * m1
90
+ + n[i * 4 + 2] * m2
91
+ + n[i * 4 + 3] * m3;
92
+ } while (++i < segend);
93
+ segend = i + 4;
94
+ } while (i < opr_sz_4);
95
+
96
+ clear_tail(d, opr_sz, simd_maxsz(desc));
97
+}
98
+
99
+void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
100
+{
101
+ intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
102
+ intptr_t index = simd_data(desc);
103
+ uint32_t *d = vd;
104
+ uint8_t *n = vn;
105
+ uint8_t *m_indexed = (uint8_t *)vm + index * 4;
106
+
107
+ /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
108
+ * Otherwise opr_sz is a multiple of 16.
109
+ */
110
+ segend = MIN(4, opr_sz_4);
111
+ i = 0;
112
+ do {
113
+ uint8_t m0 = m_indexed[i * 4 + 0];
114
+ uint8_t m1 = m_indexed[i * 4 + 1];
115
+ uint8_t m2 = m_indexed[i * 4 + 2];
116
+ uint8_t m3 = m_indexed[i * 4 + 3];
117
+
118
+ do {
119
+ d[i] += n[i * 4 + 0] * m0
120
+ + n[i * 4 + 1] * m1
121
+ + n[i * 4 + 2] * m2
122
+ + n[i * 4 + 3] * m3;
123
+ } while (++i < segend);
124
+ segend = i + 4;
125
+ } while (i < opr_sz_4);
126
+
127
+ clear_tail(d, opr_sz, simd_maxsz(desc));
128
+}
129
+
130
+void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc)
131
+{
132
+ intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8;
133
+ intptr_t index = simd_data(desc);
134
+ uint64_t *d = vd;
135
+ int16_t *n = vn;
136
+ int16_t *m_indexed = (int16_t *)vm + index * 4;
137
+
138
+ /* This is supported by SVE only, so opr_sz is always a multiple of 16.
139
+ * Process the entire segment all at once, writing back the results
140
+ * only after we've consumed all of the inputs.
141
+ */
142
+ for (i = 0; i < opr_sz_8 ; i += 2) {
143
+ uint64_t d0, d1;
144
+
145
+ d0 = n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0];
146
+ d0 += n[i * 4 + 1] * (int64_t)m_indexed[i * 4 + 1];
147
+ d0 += n[i * 4 + 2] * (int64_t)m_indexed[i * 4 + 2];
148
+ d0 += n[i * 4 + 3] * (int64_t)m_indexed[i * 4 + 3];
149
+ d1 = n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0];
150
+ d1 += n[i * 4 + 5] * (int64_t)m_indexed[i * 4 + 1];
151
+ d1 += n[i * 4 + 6] * (int64_t)m_indexed[i * 4 + 2];
152
+ d1 += n[i * 4 + 7] * (int64_t)m_indexed[i * 4 + 3];
153
+
154
+ d[i + 0] += d0;
155
+ d[i + 1] += d1;
156
+ }
127
+ }
157
+
128
+
158
+ clear_tail(d, opr_sz, simd_maxsz(desc));
129
if (!ri) {
159
+}
130
/*
160
+
131
* Unknown register; this might be a guest error or a QEMU
161
+void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc)
162
+{
163
+ intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8;
164
+ intptr_t index = simd_data(desc);
165
+ uint64_t *d = vd;
166
+ uint16_t *n = vn;
167
+ uint16_t *m_indexed = (uint16_t *)vm + index * 4;
168
+
169
+ /* This is supported by SVE only, so opr_sz is always a multiple of 16.
170
+ * Process the entire segment all at once, writing back the results
171
+ * only after we've consumed all of the inputs.
172
+ */
173
+ for (i = 0; i < opr_sz_8 ; i += 2) {
174
+ uint64_t d0, d1;
175
+
176
+ d0 = n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0];
177
+ d0 += n[i * 4 + 1] * (uint64_t)m_indexed[i * 4 + 1];
178
+ d0 += n[i * 4 + 2] * (uint64_t)m_indexed[i * 4 + 2];
179
+ d0 += n[i * 4 + 3] * (uint64_t)m_indexed[i * 4 + 3];
180
+ d1 = n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0];
181
+ d1 += n[i * 4 + 5] * (uint64_t)m_indexed[i * 4 + 1];
182
+ d1 += n[i * 4 + 6] * (uint64_t)m_indexed[i * 4 + 2];
183
+ d1 += n[i * 4 + 7] * (uint64_t)m_indexed[i * 4 + 3];
184
+
185
+ d[i + 0] += d0;
186
+ d[i + 1] += d1;
187
+ }
188
+
189
+ clear_tail(d, opr_sz, simd_maxsz(desc));
190
+}
191
+
192
void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
193
void *vfpst, uint32_t desc)
194
{
195
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
196
index XXXXXXX..XXXXXXX 100644
197
--- a/target/arm/sve.decode
198
+++ b/target/arm/sve.decode
199
@@ -XXX,XX +XXX,XX @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
200
# SVE integer dot product (unpredicated)
201
DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx
202
203
+# SVE integer dot product (indexed)
204
+DOT_zzx 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \
205
+ sz=0 ra=%reg_movprfx
206
+DOT_zzx 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \
207
+ sz=1 ra=%reg_movprfx
208
+
209
# SVE floating-point complex add (predicated)
210
FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
211
rn=%reg_movprfx
212
--
132
--
213
2.17.1
133
2.34.1
214
215
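The per-segment arithmetic of gvec_udot_idx_h above can be modelled in isolation. This is a hypothetical standalone sketch of the UDOT (indexed, .H) semantics for one 128-bit segment, not QEMU code; the function name and fixed array sizes are illustrative. Each 64-bit destination lane accumulates four uint16 products, and the multiplier group of four elements is selected by 'index' from the same segment of m:

```c
#include <stdint.h>

/* One 128-bit segment: d has 2 x 64-bit lanes, n and m have 8 x 16-bit
 * elements. The indexed group of m is reused for both lanes, mirroring
 * 'm_indexed' in the helper above. */
static void udot_idx_h_segment(uint64_t d[2], const uint16_t n[8],
                               const uint16_t m[8], unsigned index)
{
    const uint16_t *mi = m + index * 4;   /* indexed group of 4 elements */
    for (int lane = 0; lane < 2; ++lane) {
        uint64_t sum = 0;
        for (int e = 0; e < 4; ++e) {
            sum += n[lane * 4 + e] * (uint64_t)mi[e];
        }
        d[lane] += sum;                   /* accumulate into destination */
    }
}
```

As in the helper, results are only written back after all inputs are consumed, so the sketch is safe even if d aliases n or m.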
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20230831232441.66020-5-richard.henderson@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-33-richard.henderson@linaro.org
6
[PMM: moved 'ra=%reg_movprfx' here from following patch]
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
7
---
9
target/arm/helper.h | 5 +++
8
docs/system/arm/emulation.rst | 1 +
10
target/arm/translate-sve.c | 17 ++++++++++
9
target/arm/cpu.h | 5 +++++
11
target/arm/vec_helper.c | 67 ++++++++++++++++++++++++++++++++++++++
10
target/arm/helper.h | 1 +
12
target/arm/sve.decode | 3 ++
11
target/arm/tcg/cpu64.c | 1 +
13
4 files changed, 92 insertions(+)
12
target/arm/tcg/op_helper.c | 20 ++++++++++++++++++++
13
target/arm/tcg/translate-a64.c | 5 +++++
14
target/arm/tcg/translate.c | 6 ++++++
15
7 files changed, 39 insertions(+)
14
16
17
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
18
index XXXXXXX..XXXXXXX 100644
19
--- a/docs/system/arm/emulation.rst
20
+++ b/docs/system/arm/emulation.rst
21
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
22
- FEAT_SME_I16I64 (16-bit to 64-bit integer widening outer product instructions)
23
- FEAT_SPECRES (Speculation restriction instructions)
24
- FEAT_SSBS (Speculative Store Bypass Safe)
25
+- FEAT_TIDCP1 (EL0 use of IMPLEMENTATION DEFINED functionality)
26
- FEAT_TLBIOS (TLB invalidate instructions in Outer Shareable domain)
27
- FEAT_TLBIRANGE (TLB invalidate range instructions)
28
- FEAT_TTCNP (Translation table Common not private translations)
29
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/cpu.h
32
+++ b/target/arm/cpu.h
33
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_hcx(const ARMISARegisters *id)
34
return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, HCX) != 0;
35
}
36
37
+static inline bool isar_feature_aa64_tidcp1(const ARMISARegisters *id)
38
+{
39
+ return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, TIDCP1) != 0;
40
+}
41
+
42
static inline bool isar_feature_aa64_uao(const ARMISARegisters *id)
43
{
44
return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, UAO) != 0;
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
45
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.h
47
--- a/target/arm/helper.h
18
+++ b/target/arm/helper.h
48
+++ b/target/arm/helper.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG,
49
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(check_bxj_trap, TCG_CALL_NO_WG, void, env, i32)
20
DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG,
50
21
void, ptr, ptr, ptr, ptr, i32)
51
DEF_HELPER_4(access_check_cp_reg, cptr, env, i32, i32, i32)
22
52
DEF_HELPER_FLAGS_2(lookup_cp_reg, TCG_CALL_NO_RWG_SE, cptr, env, i32)
23
+DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
53
+DEF_HELPER_FLAGS_2(tidcp_el0, TCG_CALL_NO_WG, void, env, i32)
24
+DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
54
DEF_HELPER_FLAGS_2(tidcp_el1, TCG_CALL_NO_WG, void, env, i32)
25
+DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
55
DEF_HELPER_3(set_cp_reg, void, env, cptr, i32)
26
+DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
56
DEF_HELPER_2(get_cp_reg, i32, env, cptr)
57
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/tcg/cpu64.c
60
+++ b/target/arm/tcg/cpu64.c
61
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
62
t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* FEAT_XNX */
63
t = FIELD_DP64(t, ID_AA64MMFR1, ETS, 1); /* FEAT_ETS */
64
t = FIELD_DP64(t, ID_AA64MMFR1, HCX, 1); /* FEAT_HCX */
65
+ t = FIELD_DP64(t, ID_AA64MMFR1, TIDCP1, 1); /* FEAT_TIDCP1 */
66
cpu->isar.id_aa64mmfr1 = t;
67
68
t = cpu->isar.id_aa64mmfr2;
69
diff --git a/target/arm/tcg/op_helper.c b/target/arm/tcg/op_helper.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/tcg/op_helper.c
72
+++ b/target/arm/tcg/op_helper.c
73
@@ -XXX,XX +XXX,XX @@ void HELPER(tidcp_el1)(CPUARMState *env, uint32_t syndrome)
74
}
75
}
76
77
+/*
78
+ * Similarly, for FEAT_TIDCP1 at EL0.
79
+ * We have already checked for the presence of the feature.
80
+ */
81
+void HELPER(tidcp_el0)(CPUARMState *env, uint32_t syndrome)
82
+{
83
+ /* See arm_sctlr(), but we also need the sctlr el. */
84
+ ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, 0);
85
+ int target_el = mmu_idx == ARMMMUIdx_E20_0 ? 2 : 1;
27
+
86
+
28
DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
87
+ /*
29
void, ptr, ptr, ptr, ptr, i32)
88
+ * The bit is not valid unless the target EL is AArch64, but since the
30
DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
89
+ * bit test is simpler, perform that first and check validity afterward.
31
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
90
+ */
32
index XXXXXXX..XXXXXXX 100644
91
+ if ((env->cp15.sctlr_el[target_el] & SCTLR_TIDCP)
33
--- a/target/arm/translate-sve.c
92
+ && arm_el_is_aa64(env, target_el)) {
34
+++ b/target/arm/translate-sve.c
93
+ raise_exception_ra(env, EXCP_UDEF, syndrome, target_el, GETPC());
35
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)
36
37
#undef DO_ZZI
38
39
+static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn)
40
+{
41
+ static gen_helper_gvec_3 * const fns[2][2] = {
42
+ { gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h },
43
+ { gen_helper_gvec_udot_b, gen_helper_gvec_udot_h }
44
+ };
45
+
46
+ if (sve_access_check(s)) {
47
+ unsigned vsz = vec_full_reg_size(s);
48
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
49
+ vec_full_reg_offset(s, a->rn),
50
+ vec_full_reg_offset(s, a->rm),
51
+ vsz, vsz, 0, fns[a->u][a->sz]);
52
+ }
94
+ }
53
+ return true;
54
+}
95
+}
55
+
96
+
56
/*
97
void HELPER(set_cp_reg)(CPUARMState *env, const void *rip, uint32_t value)
57
*** SVE Floating Point Multiply-Add Indexed Group
98
{
58
*/
99
const ARMCPRegInfo *ri = rip;
59
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
100
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
60
index XXXXXXX..XXXXXXX 100644
101
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/vec_helper.c
102
--- a/target/arm/tcg/translate-a64.c
62
+++ b/target/arm/vec_helper.c
103
+++ b/target/arm/tcg/translate-a64.c
63
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
104
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, bool isread,
64
clear_tail(d, opr_sz, simd_maxsz(desc));
105
*/
65
}
106
syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
66
107
switch (s->current_el) {
67
+/* Integer 8 and 16-bit dot-product.
108
+ case 0:
68
+ *
109
+ if (dc_isar_feature(aa64_tidcp1, s)) {
69
+ * Note that for the loops herein, host endianness does not matter
110
+ gen_helper_tidcp_el0(cpu_env, tcg_constant_i32(syndrome));
70
+ * with respect to the ordering of data within the 64-bit lanes.
111
+ }
71
+ * All elements are treated equally, no matter where they are.
112
+ break;
72
+ */
113
case 1:
73
+
114
gen_helper_tidcp_el1(cpu_env, tcg_constant_i32(syndrome));
74
+void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc)
115
break;
75
+{
116
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
76
+ intptr_t i, opr_sz = simd_oprsz(desc);
77
+ uint32_t *d = vd;
78
+ int8_t *n = vn, *m = vm;
79
+
80
+ for (i = 0; i < opr_sz / 4; ++i) {
81
+ d[i] += n[i * 4 + 0] * m[i * 4 + 0]
82
+ + n[i * 4 + 1] * m[i * 4 + 1]
83
+ + n[i * 4 + 2] * m[i * 4 + 2]
84
+ + n[i * 4 + 3] * m[i * 4 + 3];
85
+ }
86
+ clear_tail(d, opr_sz, simd_maxsz(desc));
87
+}
88
+
89
+void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc)
90
+{
91
+ intptr_t i, opr_sz = simd_oprsz(desc);
92
+ uint32_t *d = vd;
93
+ uint8_t *n = vn, *m = vm;
94
+
95
+ for (i = 0; i < opr_sz / 4; ++i) {
96
+ d[i] += n[i * 4 + 0] * m[i * 4 + 0]
97
+ + n[i * 4 + 1] * m[i * 4 + 1]
98
+ + n[i * 4 + 2] * m[i * 4 + 2]
99
+ + n[i * 4 + 3] * m[i * 4 + 3];
100
+ }
101
+ clear_tail(d, opr_sz, simd_maxsz(desc));
102
+}
103
+
104
+void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc)
105
+{
106
+ intptr_t i, opr_sz = simd_oprsz(desc);
107
+ uint64_t *d = vd;
108
+ int16_t *n = vn, *m = vm;
109
+
110
+ for (i = 0; i < opr_sz / 8; ++i) {
111
+ d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0]
112
+ + (int64_t)n[i * 4 + 1] * m[i * 4 + 1]
113
+ + (int64_t)n[i * 4 + 2] * m[i * 4 + 2]
114
+ + (int64_t)n[i * 4 + 3] * m[i * 4 + 3];
115
+ }
116
+ clear_tail(d, opr_sz, simd_maxsz(desc));
117
+}
118
+
119
+void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc)
120
+{
121
+ intptr_t i, opr_sz = simd_oprsz(desc);
122
+ uint64_t *d = vd;
123
+ uint16_t *n = vn, *m = vm;
124
+
125
+ for (i = 0; i < opr_sz / 8; ++i) {
126
+ d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0]
127
+ + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1]
128
+ + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2]
129
+ + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3];
130
+ }
131
+ clear_tail(d, opr_sz, simd_maxsz(desc));
132
+}
133
+
134
void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
135
void *vfpst, uint32_t desc)
136
{
137
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
138
index XXXXXXX..XXXXXXX 100644
117
index XXXXXXX..XXXXXXX 100644
139
--- a/target/arm/sve.decode
118
--- a/target/arm/tcg/translate.c
140
+++ b/target/arm/sve.decode
119
+++ b/target/arm/tcg/translate.c
141
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
120
@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
142
# SVE integer multiply immediate (unpredicated)
121
* but raises the same exception, so order doesn't matter.
143
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
122
*/
144
123
switch (s->current_el) {
145
+# SVE integer dot product (unpredicated)
124
+ case 0:
146
+DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx
125
+ if (arm_dc_feature(s, ARM_FEATURE_AARCH64)
147
+
126
+ && dc_isar_feature(aa64_tidcp1, s)) {
148
# SVE floating-point complex add (predicated)
127
+ gen_helper_tidcp_el0(cpu_env, tcg_constant_i32(syndrome));
149
FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
128
+ }
150
rn=%reg_movprfx
129
+ break;
130
case 1:
131
gen_helper_tidcp_el1(cpu_env, tcg_constant_i32(syndrome));
132
break;
151
--
133
--
152
2.17.1
134
2.34.1
153
154
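The target-EL selection in tidcp_el0 above can be sketched standalone: an EL0 access traps to EL2 only in the ARMMMUIdx_E20_0 regime (i.e. running under a host OS at EL2 with HCR_EL2.{E2H,TGE} == {1,1}), otherwise to EL1. A hypothetical model; the boolean parameters stand in for QEMU's mmu-idx computation and the function name is illustrative:

```c
#include <stdbool.h>

/* ARMMMUIdx_E20_0 corresponds to EL0 under HCR_EL2.E2H=1, TGE=1;
 * in that regime the trap goes to EL2, otherwise to EL1. */
static int tidcp_el0_target_el(bool e2h, bool tge)
{
    return (e2h && tge) ? 2 : 1;
}
```

The SCTLR_TIDCP bit is then read from sctlr_el[target_el], which is why the helper needs the EL number rather than just arm_sctlr().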
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
There is no need to re-set these 3 features already
3
The Linux kernel detects and enables this bit. Once trapped,
4
implied by the call to aarch64_a15_initfn.
4
EC_SYSTEMREGISTERTRAP is treated like EC_UNCATEGORIZED, so
5
no changes required within linux-user/aarch64/cpu_loop.c.
5
6
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Message-id: 20230831232441.66020-6-richard.henderson@linaro.org
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20180629001538.11415-5-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
11
---
12
target/arm/cpu.c | 3 ---
12
target/arm/cpu.c | 4 ++++
13
1 file changed, 3 deletions(-)
13
1 file changed, 4 insertions(+)
14
14
15
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
15
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
16
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.c
17
--- a/target/arm/cpu.c
18
+++ b/target/arm/cpu.c
18
+++ b/target/arm/cpu.c
19
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
19
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset_hold(Object *obj)
20
* since we don't correctly set the ID registers to advertise them,
20
SCTLR_EnDA | SCTLR_EnDB);
21
*/
21
/* Trap on btype=3 for PACIxSP. */
22
set_feature(&cpu->env, ARM_FEATURE_V8);
22
env->cp15.sctlr_el[1] |= SCTLR_BT0;
23
- set_feature(&cpu->env, ARM_FEATURE_VFP4);
23
+ /* Trap on implementation defined registers. */
24
- set_feature(&cpu->env, ARM_FEATURE_NEON);
24
+ if (cpu_isar_feature(aa64_tidcp1, cpu)) {
25
- set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
25
+ env->cp15.sctlr_el[1] |= SCTLR_TIDCP;
26
set_feature(&cpu->env, ARM_FEATURE_V8_AES);
26
+ }
27
set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
27
/* and to the FP/Neon instructions */
28
set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
28
env->cp15.cpacr_el1 = FIELD_DP64(env->cp15.cpacr_el1,
29
CPACR_EL1, FPEN, 3);
29
--
30
--
30
2.17.1
31
2.34.1
31
32
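The cpu_isar_feature(aa64_tidcp1, cpu) test used above is a FIELD_EX64-style extraction of ID_AA64MMFR1.TIDCP1, which the Arm ARM places at bits [55:52]. A minimal standalone model of that check, assuming those bit positions; this is not QEMU's generated FIELD_EX64 macro, just an illustration:

```c
#include <stdint.h>

/* FEAT_TIDCP1 is advertised when ID_AA64MMFR1.TIDCP1 (bits [55:52],
 * per the Arm ARM) is non-zero. */
static int feat_tidcp1(uint64_t id_aa64mmfr1)
{
    return ((id_aa64mmfr1 >> 52) & 0xf) != 0;
}
```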
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Now that we have Eager Page Split support added for ARM in the kernel,
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
enable it in Qemu. This adds,
5
Message-id: 20180627043328.11531-9-richard.henderson@linaro.org
5
-eager-split-size to -accel sub-options to set the eager page split chunk size.
6
-enable KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE.
7
8
The chunk size specifies how many pages to break at a time, using a
9
single allocation. Bigger the chunk size, more pages need to be
10
allocated ahead of time.
11
12
Reviewed-by: Gavin Shan <gshan@redhat.com>
13
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
14
Message-id: 20230905091246.1931-1-shameerali.kolothum.thodi@huawei.com
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
16
---
8
target/arm/helper-sve.h | 7 +++++
17
include/sysemu/kvm_int.h | 1 +
9
target/arm/sve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++
18
accel/kvm/kvm-all.c | 1 +
10
target/arm/translate-sve.c | 45 ++++++++++++++++++++++++++++++
19
target/arm/kvm.c | 61 ++++++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 5 ++++
20
qemu-options.hx | 15 ++++++++++
12
4 files changed, 113 insertions(+)
21
4 files changed, 78 insertions(+)
13
22
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
23
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
15
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
25
--- a/include/sysemu/kvm_int.h
17
+++ b/target/arm/helper-sve.h
26
+++ b/include/sysemu/kvm_int.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
27
@@ -XXX,XX +XXX,XX @@ struct KVMState
19
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
28
uint64_t kvm_dirty_ring_bytes; /* Size of the per-vcpu dirty ring */
20
void, ptr, ptr, ptr, ptr, i32)
29
uint32_t kvm_dirty_ring_size; /* Number of dirty GFNs per ring */
21
30
bool kvm_dirty_ring_with_bitmap;
22
+DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
31
+ uint64_t kvm_eager_split_size; /* Eager Page Splitting chunk size */
23
+ i64, i64, ptr, ptr, ptr, i32)
32
struct KVMDirtyRingReaper reaper;
24
+DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
33
NotifyVmexitOption notify_vmexit;
25
+ i64, i64, ptr, ptr, ptr, i32)
34
uint32_t notify_window;
26
+DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
35
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
27
+ i64, i64, ptr, ptr, ptr, i32)
36
index XXXXXXX..XXXXXXX 100644
37
--- a/accel/kvm/kvm-all.c
38
+++ b/accel/kvm/kvm-all.c
39
@@ -XXX,XX +XXX,XX @@ static void kvm_accel_instance_init(Object *obj)
40
/* KVM dirty ring is by default off */
41
s->kvm_dirty_ring_size = 0;
42
s->kvm_dirty_ring_with_bitmap = false;
43
+ s->kvm_eager_split_size = 0;
44
s->notify_vmexit = NOTIFY_VMEXIT_OPTION_RUN;
45
s->notify_window = 0;
46
s->xen_version = 0;
47
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/kvm.c
50
+++ b/target/arm/kvm.c
51
@@ -XXX,XX +XXX,XX @@
52
#include "exec/address-spaces.h"
53
#include "hw/boards.h"
54
#include "hw/irq.h"
55
+#include "qapi/visitor.h"
56
#include "qemu/log.h"
57
58
const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
59
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
60
}
61
}
62
63
+ if (s->kvm_eager_split_size) {
64
+ uint32_t sizes;
28
+
65
+
29
DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
66
+ sizes = kvm_vm_check_extension(s, KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES);
30
void, ptr, ptr, ptr, ptr, ptr, i32)
67
+ if (!sizes) {
31
DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
68
+ s->kvm_eager_split_size = 0;
32
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
69
+ warn_report("Eager Page Split support not available");
33
index XXXXXXX..XXXXXXX 100644
70
+ } else if (!(s->kvm_eager_split_size & sizes)) {
34
--- a/target/arm/sve_helper.c
71
+ error_report("Eager Page Split requested chunk size not valid");
35
+++ b/target/arm/sve_helper.c
72
+ ret = -EINVAL;
36
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
73
+ } else {
37
return predtest_ones(d, oprsz, esz_mask);
74
+ ret = kvm_vm_enable_cap(s, KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE, 0,
38
}
75
+ s->kvm_eager_split_size);
39
76
+ if (ret < 0) {
40
+uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
77
+ error_report("Enabling of Eager Page Split failed: %s",
41
+ void *status, uint32_t desc)
78
+ strerror(-ret));
42
+{
43
+ intptr_t i = 0, opr_sz = simd_oprsz(desc);
44
+ float16 result = nn;
45
+
46
+ do {
47
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
48
+ do {
49
+ if (pg & 1) {
50
+ float16 mm = *(float16 *)(vm + H1_2(i));
51
+ result = float16_add(result, mm, status);
52
+ }
79
+ }
53
+ i += sizeof(float16), pg >>= sizeof(float16);
54
+ } while (i & 15);
55
+ } while (i < opr_sz);
56
+
57
+ return result;
58
+}
59
+
60
+uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg,
61
+ void *status, uint32_t desc)
62
+{
63
+ intptr_t i = 0, opr_sz = simd_oprsz(desc);
64
+ float32 result = nn;
65
+
66
+ do {
67
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
68
+ do {
69
+ if (pg & 1) {
70
+ float32 mm = *(float32 *)(vm + H1_2(i));
71
+ result = float32_add(result, mm, status);
72
+ }
73
+ i += sizeof(float32), pg >>= sizeof(float32);
74
+ } while (i & 15);
75
+ } while (i < opr_sz);
76
+
77
+ return result;
78
+}
79
+
80
+uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg,
81
+ void *status, uint32_t desc)
82
+{
83
+ intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8;
84
+ uint64_t *m = vm;
85
+ uint8_t *pg = vg;
86
+
87
+ for (i = 0; i < opr_sz; i++) {
88
+ if (pg[H1(i)] & 1) {
89
+ nn = float64_add(nn, m[i], status);
90
+ }
80
+ }
91
+ }
81
+ }
92
+
82
+
93
+ return nn;
83
kvm_arm_init_debug(s);
84
85
return ret;
86
@@ -XXX,XX +XXX,XX @@ bool kvm_arch_cpu_check_are_resettable(void)
87
return true;
88
}
89
90
+static void kvm_arch_get_eager_split_size(Object *obj, Visitor *v,
91
+ const char *name, void *opaque,
92
+ Error **errp)
93
+{
94
+ KVMState *s = KVM_STATE(obj);
95
+ uint64_t value = s->kvm_eager_split_size;
96
+
97
+ visit_type_size(v, name, &value, errp);
94
+}
98
+}
95
+
99
+
96
/* Fully general three-operand expander, controlled by a predicate,
100
+static void kvm_arch_set_eager_split_size(Object *obj, Visitor *v,
97
* With the extra float_status parameter.
101
+ const char *name, void *opaque,
98
*/
102
+ Error **errp)
99
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
103
+{
100
index XXXXXXX..XXXXXXX 100644
104
+ KVMState *s = KVM_STATE(obj);
101
--- a/target/arm/translate-sve.c
105
+ uint64_t value;
102
+++ b/target/arm/translate-sve.c
103
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)
104
105
#undef DO_ZZI
106
107
+/*
108
+ *** SVE Floating Point Accumulating Reduction Group
109
+ */
110
+
106
+
111
+static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
107
+ if (s->fd != -1) {
112
+{
108
+ error_setg(errp, "Unable to set eager-split-size after KVM has been initialized");
113
+ typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr,
109
+ return;
114
+ TCGv_ptr, TCGv_ptr, TCGv_i32);
115
+ static fadda_fn * const fns[3] = {
116
+ gen_helper_sve_fadda_h,
117
+ gen_helper_sve_fadda_s,
118
+ gen_helper_sve_fadda_d,
119
+ };
120
+ unsigned vsz = vec_full_reg_size(s);
121
+ TCGv_ptr t_rm, t_pg, t_fpst;
122
+ TCGv_i64 t_val;
123
+ TCGv_i32 t_desc;
124
+
125
+ if (a->esz == 0) {
126
+ return false;
127
+ }
128
+ if (!sve_access_check(s)) {
129
+ return true;
130
+ }
110
+ }
131
+
111
+
132
+ t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz);
112
+ if (!visit_type_size(v, name, &value, errp)) {
133
+ t_rm = tcg_temp_new_ptr();
113
+ return;
134
+ t_pg = tcg_temp_new_ptr();
114
+ }
135
+ tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm));
136
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
137
+ t_fpst = get_fpstatus_ptr(a->esz == MO_16);
138
+ t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
139
+
115
+
140
+ fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc);
116
+ if (value && !is_power_of_2(value)) {
117
+ error_setg(errp, "eager-split-size must be a power of two");
118
+ return;
119
+ }
141
+
120
+
142
+ tcg_temp_free_i32(t_desc);
121
+ s->kvm_eager_split_size = value;
143
+ tcg_temp_free_ptr(t_fpst);
144
+ tcg_temp_free_ptr(t_pg);
145
+ tcg_temp_free_ptr(t_rm);
146
+
147
+ write_fp_dreg(s, a->rd, t_val);
148
+ tcg_temp_free_i64(t_val);
149
+ return true;
150
+}
122
+}
151
+
123
+
152
/*
124
void kvm_arch_accel_class_init(ObjectClass *oc)
153
*** SVE Floating Point Arithmetic - Unpredicated Group
125
{
154
*/
126
+ object_class_property_add(oc, "eager-split-size", "size",
155
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
127
+ kvm_arch_get_eager_split_size,
128
+ kvm_arch_set_eager_split_size, NULL, NULL);
129
+
130
+ object_class_property_set_description(oc, "eager-split-size",
131
+ "Eager Page Split chunk size for hugepages. (default: 0, disabled)");
132
}
133
diff --git a/qemu-options.hx b/qemu-options.hx
156
index XXXXXXX..XXXXXXX 100644
134
index XXXXXXX..XXXXXXX 100644
157
--- a/target/arm/sve.decode
135
--- a/qemu-options.hx
158
+++ b/target/arm/sve.decode
136
+++ b/qemu-options.hx
159
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
137
@@ -XXX,XX +XXX,XX @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
160
# SVE integer multiply immediate (unpredicated)
138
" split-wx=on|off (enable TCG split w^x mapping)\n"
161
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
139
" tb-size=n (TCG translation block cache size)\n"
162
140
" dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
163
+### SVE FP Accumulating Reduction Group
141
+ " eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
142
" notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
143
" thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
144
SRST
145
@@ -XXX,XX +XXX,XX @@ SRST
146
is disabled (dirty-ring-size=0). When enabled, KVM will instead
147
record dirty pages in a bitmap.
148
149
+ ``eager-split-size=n``
150
+ KVM implements dirty page logging at the PAGE_SIZE granularity and
151
+ enabling dirty-logging on a huge-page requires breaking it into
152
+ PAGE_SIZE pages in the first place. KVM on ARM does this splitting
153
+ lazily by default. There are performance benefits in doing huge-page
154
+ split eagerly, especially in situations where TLBI costs associated
155
+ with break-before-make sequences are considerable and also if guest
156
+ workloads are read-intensive. The size here specifies how many pages
157
+ to break at a time and needs to be a valid block size which is
158
+ 1GB/2MB/4KB, 32MB/16KB and 512MB/64KB for 4KB/16KB/64KB PAGE_SIZE
159
+ respectively. Be wary of specifying a larger size, as it will have an
160
+ impact on memory consumption. By default, this feature is disabled
161
+ (eager-split-size=0).
164
+
162
+
165
+# SVE floating-point serial reduction (predicated)
163
``notify-vmexit=run|internal-error|disable,notify-window=n``
166
+FADDA 01100101 .. 011 000 001 ... ..... ..... @rdn_pg_rm
164
Enables or disables notify VM exit support on x86 host and specify
167
+
165
the corresponding notify window to trigger the VM exit if enabled.
168
### SVE Floating Point Arithmetic - Unpredicated Group
169
170
# SVE floating-point arithmetic (unpredicated)
171
--
166
--
172
2.17.1
167
2.34.1
173
174
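The two validations applied to eager-split-size above can be sketched as one standalone predicate: the value must be a non-zero power of two, and it must appear in the bitmask of supported block sizes returned by KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES (since all supported sizes are powers of two, a simple AND against the mask suffices, as in kvm_arch_init). The function name is hypothetical and this is an illustrative model, not QEMU code:

```c
#include <stdbool.h>
#include <stdint.h>

static bool eager_split_size_valid(uint64_t size, uint64_t supported)
{
    if (size == 0 || (size & (size - 1)) != 0) {
        return false;               /* must be a non-zero power of two */
    }
    return (size & supported) != 0; /* must be an advertised block size */
}
```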
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-12-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
8
target/arm/helper-sve.h | 41 +++++++++++++++++++++
9
target/arm/sve_helper.c | 61 +++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 39 ++++++++++++++++++++
12
4 files changed, 216 insertions(+)
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
+++ b/target/arm/helper-sve.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
19
DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
20
21
DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
22
+
23
+DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
24
+ void, env, ptr, ptr, ptr, tl, i32)
25
+DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
26
+ void, env, ptr, ptr, ptr, tl, i32)
27
+DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG,
28
+ void, env, ptr, ptr, ptr, tl, i32)
29
+
30
+DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG,
31
+ void, env, ptr, ptr, ptr, tl, i32)
32
+DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG,
33
+ void, env, ptr, ptr, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG,
35
+ void, env, ptr, ptr, ptr, tl, i32)
36
+
37
+DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG,
38
+ void, env, ptr, ptr, ptr, tl, i32)
39
+DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG,
40
+ void, env, ptr, ptr, ptr, tl, i32)
41
+DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG,
42
+ void, env, ptr, ptr, ptr, tl, i32)
43
+DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG,
44
+ void, env, ptr, ptr, ptr, tl, i32)
45
+
46
+DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG,
47
+ void, env, ptr, ptr, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG,
49
+ void, env, ptr, ptr, ptr, tl, i32)
50
+DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG,
51
+ void, env, ptr, ptr, ptr, tl, i32)
52
+DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG,
53
+ void, env, ptr, ptr, ptr, tl, i32)
54
+
55
+DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG,
56
+ void, env, ptr, ptr, ptr, tl, i32)
57
+DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG,
58
+ void, env, ptr, ptr, ptr, tl, i32)
59
+DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG,
60
+ void, env, ptr, ptr, ptr, tl, i32)
61
+DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG,
62
+ void, env, ptr, ptr, ptr, tl, i32)
63
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/sve_helper.c
66
+++ b/target/arm/sve_helper.c
67
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
68
addr += 4 * 8;
69
}
70
}
71
+
72
+/* Stores with a vector index. */
73
+
74
+#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
75
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
76
+ target_ulong base, uint32_t desc) \
77
+{ \
78
+ intptr_t i, oprsz = simd_oprsz(desc); \
79
+ unsigned scale = simd_data(desc); \
80
+ uintptr_t ra = GETPC(); \
81
+ for (i = 0; i < oprsz; ) { \
82
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
83
+ do { \
84
+ if (likely(pg & 1)) { \
85
+ target_ulong off = *(TYPEI *)(vm + H1_4(i)); \
86
+ uint32_t d = *(uint32_t *)(vd + H1_4(i)); \
87
+ FN(env, base + (off << scale), d, ra); \
88
+ } \
89
+ i += sizeof(uint32_t), pg >>= sizeof(uint32_t); \
90
+ } while (i & 15); \
91
+ } \
92
+}
93
+
94
+#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \
95
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
96
+ target_ulong base, uint32_t desc) \
97
+{ \
98
+ intptr_t i, oprsz = simd_oprsz(desc) / 8; \
99
+ unsigned scale = simd_data(desc); \
100
+ uintptr_t ra = GETPC(); \
101
+ uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
102
+ for (i = 0; i < oprsz; i++) { \
103
+ if (likely(pg[H1(i)] & 1)) { \
104
+ target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \
105
+ FN(env, base + off, d[i], ra); \
106
+ } \
107
+ } \
108
+}
109
+
110
+DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra)
111
+DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra)
112
+DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra)
113
+
114
+DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra)
115
+DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra)
116
+DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra)
117
+
118
+DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra)
119
+DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra)
120
+DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra)
121
+DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra)
122
+
123
+DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra)
124
+DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra)
125
+DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra)
126
+DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra)
127
+
128
+DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, cpu_stb_data_ra)
129
+DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra)
130
+DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra)
131
+DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra)
132
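The address computation in the DO_ST1_ZPZ_D stores above is base + (offset << scale) for each active 64-bit element, with one predicate byte per element whose low bit gates the store. A standalone sketch that collects the would-be store addresses instead of writing guest memory (function name illustrative, not QEMU code):

```c
#include <stddef.h>
#include <stdint.h>

/* For each element i with pg[i] bit 0 set, record the scatter-store
 * address base + (zm[i] << scale). Returns the number of active
 * elements. */
static size_t scatter_addrs_d(uint64_t *out, const uint8_t *pg,
                              const uint64_t *zm, uint64_t base,
                              unsigned scale, size_t nelem)
{
    size_t n = 0;
    for (size_t i = 0; i < nelem; ++i) {
        if (pg[i] & 1) {
            out[n++] = base + (zm[i] << scale);
        }
    }
    return n;
}
```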
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
133
index XXXXXXX..XXXXXXX 100644
134
--- a/target/arm/translate-sve.c
135
+++ b/target/arm/translate-sve.c
136
@@ -XXX,XX +XXX,XX @@ typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
137
TCGv_ptr, TCGv_ptr, TCGv_i32);
138
139
typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32);
140
+typedef void gen_helper_gvec_mem_scatter(TCGv_env, TCGv_ptr, TCGv_ptr,
141
+ TCGv_ptr, TCGv_i64, TCGv_i32);
142
143
/*
144
* Helpers for extracting complex instruction fields.
145
@@ -XXX,XX +XXX,XX @@ static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
146
}
147
return true;
148
}
149
+
150
+/*
151
+ *** SVE gather loads / scatter stores
152
+ */
153
+
154
+static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
155
+ TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn)
156
+{
157
+ unsigned vsz = vec_full_reg_size(s);
158
+ TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, scale));
159
+ TCGv_ptr t_zm = tcg_temp_new_ptr();
160
+ TCGv_ptr t_pg = tcg_temp_new_ptr();
161
+ TCGv_ptr t_zt = tcg_temp_new_ptr();
162
+
163
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
164
+ tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm));
165
+ tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt));
166
+ fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc);
167
+
168
+ tcg_temp_free_ptr(t_zt);
169
+ tcg_temp_free_ptr(t_zm);
170
+ tcg_temp_free_ptr(t_pg);
171
+ tcg_temp_free_i32(desc);
172
+}
173
+
174
+static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
175
+{
176
+ /* Indexed by [xs][msz]. */
177
+ static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
178
+ { gen_helper_sve_stbs_zsu,
179
+ gen_helper_sve_sths_zsu,
180
+ gen_helper_sve_stss_zsu, },
181
+ { gen_helper_sve_stbs_zss,
182
+ gen_helper_sve_sths_zss,
183
+ gen_helper_sve_stss_zss, },
184
+ };
185
+ /* Note that we overload xs=2 to indicate 64-bit offset. */
186
+ static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
187
+ { gen_helper_sve_stbd_zsu,
188
+ gen_helper_sve_sthd_zsu,
189
+ gen_helper_sve_stsd_zsu,
190
+ gen_helper_sve_stdd_zsu, },
191
+ { gen_helper_sve_stbd_zss,
192
+ gen_helper_sve_sthd_zss,
193
+ gen_helper_sve_stsd_zss,
194
+ gen_helper_sve_stdd_zss, },
195
+ { gen_helper_sve_stbd_zd,
196
+ gen_helper_sve_sthd_zd,
197
+ gen_helper_sve_stsd_zd,
198
+ gen_helper_sve_stdd_zd, },
199
+ };
200
+ gen_helper_gvec_mem_scatter *fn;
201
+
202
+ if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
203
+ return false;
204
+ }
205
+ if (!sve_access_check(s)) {
206
+ return true;
207
+ }
208
+ switch (a->esz) {
209
+ case MO_32:
210
+ fn = fn32[a->xs][a->msz];
211
+ break;
212
+ case MO_64:
213
+ fn = fn64[a->xs][a->msz];
214
+ break;
215
+ default:
216
+ g_assert_not_reached();
217
+ }
218
+ do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
219
+ cpu_reg_sp(s, a->rn), fn);
220
+ return true;
221
+}
222
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
223
index XXXXXXX..XXXXXXX 100644
224
--- a/target/arm/sve.decode
225
+++ b/target/arm/sve.decode
226
@@ -XXX,XX +XXX,XX @@
227
&rpri_load rd pg rn imm dtype nreg
228
&rprr_store rd pg rn rm msz esz nreg
229
&rpri_store rd pg rn imm msz esz nreg
230
+&rprr_scatter_store rd pg rn rm esz msz xs scale
231
232
###########################################################################
233
# Named instruction formats. These are generally used to
234
@@ -XXX,XX +XXX,XX @@
235
@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
236
@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \
237
&rprr_store nreg=0
238
+@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
239
+ &rprr_scatter_store
240
241
###########################################################################
242
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
243
@@ -XXX,XX +XXX,XX @@ ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \
244
# SVE store multiple structures (scalar plus scalar) (nreg != 0)
245
ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \
246
@rprr_store esz=%size_23
247
+
248
+# SVE 32-bit scatter store (scalar plus 32-bit scaled offsets)
249
+# Require msz > 0 && msz <= esz.
250
+ST1_zprz 1110010 .. 11 ..... 100 ... ..... ..... \
251
+ @rprr_scatter_store xs=0 esz=2 scale=1
252
+ST1_zprz 1110010 .. 11 ..... 110 ... ..... ..... \
253
+ @rprr_scatter_store xs=1 esz=2 scale=1
254
+
255
+# SVE 32-bit scatter store (scalar plus 32-bit unscaled offsets)
256
+# Require msz <= esz.
257
+ST1_zprz 1110010 .. 10 ..... 100 ... ..... ..... \
258
+ @rprr_scatter_store xs=0 esz=2 scale=0
259
+ST1_zprz 1110010 .. 10 ..... 110 ... ..... ..... \
260
+ @rprr_scatter_store xs=1 esz=2 scale=0
261
+
262
+# SVE 64-bit scatter store (scalar plus 64-bit scaled offset)
263
+# Require msz > 0
264
+ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \
265
+ @rprr_scatter_store xs=2 esz=3 scale=1
266
+
267
+# SVE 64-bit scatter store (scalar plus 64-bit unscaled offset)
268
+ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \
269
+ @rprr_scatter_store xs=2 esz=3 scale=0
270
+
271
+# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset)
272
+# Require msz > 0
273
+ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \
274
+ @rprr_scatter_store xs=0 esz=3 scale=1
275
+ST1_zprz 1110010 .. 01 ..... 110 ... ..... ..... \
276
+ @rprr_scatter_store xs=1 esz=3 scale=1
277
+
278
+# SVE 64-bit scatter store (scalar plus unpacked 32-bit unscaled offset)
279
+ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \
280
+ @rprr_scatter_store xs=0 esz=3 scale=0
281
+ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \
282
+ @rprr_scatter_store xs=1 esz=3 scale=0
283
--
284
2.17.1
285
286
diff view generated by jsdifflib
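For reference, the per-element behaviour that the DO_ST1_ZPZ_D instantiations above expand to can be modelled on the host like this. This is a standalone sketch with hypothetical names (scatter_store_d and its parameters are illustrative, not QEMU's helper API): each element whose predicate bit is set stores data[i] at base + (off[i] << scale), and masked elements perform no memory access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Host-side model of a predicated 64-bit-element scatter store.
 * Mirrors the shape of the DO_ST1_ZPZ_D expansion; names are
 * illustrative only.
 */
static void scatter_store_d(uint8_t *mem, uint64_t base,
                            const uint64_t *data, const uint64_t *off,
                            const uint8_t *pg, unsigned scale, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        if (pg[i] & 1) {            /* predicate bit gates the store */
            memcpy(mem + base + (off[i] << scale), &data[i], 8);
        }
    }
}
```

Scaling by msz (here scale = 3 for doubleword elements) matches the `a->scale * a->msz` computation in trans_ST1_zprz above.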
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate-sve.c | 21 +++++++++++++++++++++
target/arm/sve.decode | 23 +++++++++++++++++++++++
2 files changed, 44 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
               cpu_reg_sp(s, a->rn), fn);
    return true;
}
+
+/*
+ * Prefetches
+ */
+
+static bool trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn)
+{
+    /* Prefetch is a nop within QEMU. */
+    sve_access_check(s);
+    return true;
+}
+
+static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
+{
+    if (a->rm == 31) {
+        return false;
+    }
+    /* Prefetch is a nop within QEMU. */
+    sve_access_check(s);
+    return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \
LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
           @rpri_load_msz nreg=0

+# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
+PRF 1000010 00 -1 ----- 0-- --- ----- 0 ----
+
+# SVE 32-bit gather prefetch (vector plus immediate)
+PRF 1000010 -- 00 ----- 111 --- ----- 0 ----
+
+# SVE contiguous prefetch (scalar plus immediate)
+PRF 1000010 11 1- ----- 0-- --- ----- 0 ----
+
+# SVE contiguous prefetch (scalar plus scalar)
+PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ----
+
+### SVE Memory 64-bit Gather Group
+
+# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
+PRF 1100010 00 11 ----- 1-- --- ----- 0 ----
+
+# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets)
+PRF 1100010 00 -1 ----- 0-- --- ----- 0 ----
+
+# SVE 64-bit gather prefetch (vector plus immediate)
+PRF 1100010 -- 00 ----- 111 --- ----- 0 ----
+
### SVE Memory Store Group

# SVE store predicate register
--
2.17.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180627043328.11531-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 67 +++++++++++++++++++++++++
target/arm/sve_helper.c | 77 ++++++++++++++++++++++++++++
target/arm/translate-sve.c | 100 +++++++++++++++++++++++++++++++++++++
target/arm/sve.decode | 57 +++++++++++++++++++++
4 files changed, 301 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

+DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
                   void, env, ptr, ptr, ptr, tl, i32)
DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
    }
}

+/* Loads with a vector index. */
+
+#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
+                  target_ulong base, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc); \
+    unsigned scale = simd_data(desc); \
+    uintptr_t ra = GETPC(); \
+    for (i = 0; i < oprsz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            TYPEM m = 0; \
+            if (pg & 1) { \
+                target_ulong off = *(TYPEI *)(vm + H1_4(i)); \
+                m = FN(env, base + (off << scale), ra); \
+            } \
+            *(uint32_t *)(vd + H1_4(i)) = m; \
+            i += 4, pg >>= 4; \
+        } while (i & 15); \
+    } \
+}
+
+#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
+                  target_ulong base, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
+    unsigned scale = simd_data(desc); \
+    uintptr_t ra = GETPC(); \
+    uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
+    for (i = 0; i < oprsz; i++) { \
+        TYPEM mm = 0; \
+        if (pg[H1(i)] & 1) { \
+            target_ulong off = (TYPEI)m[i]; \
+            mm = FN(env, base + (off << scale), ra); \
+        } \
+        d[i] = mm; \
+    } \
+}
+
+DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
+
+DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra)
+
+DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
+DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra)
+
+DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
+DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra)
+
+DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
+DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
+
/* Stores with a vector index. */

#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
    tcg_temp_free_i32(desc);
}

+/* Indexed by [ff][xs][u][msz]. */
+static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = {
+    { { { gen_helper_sve_ldbss_zsu,
+          gen_helper_sve_ldhss_zsu,
+          NULL, },
+        { gen_helper_sve_ldbsu_zsu,
+          gen_helper_sve_ldhsu_zsu,
+          gen_helper_sve_ldssu_zsu, } },
+      { { gen_helper_sve_ldbss_zss,
+          gen_helper_sve_ldhss_zss,
+          NULL, },
+        { gen_helper_sve_ldbsu_zss,
+          gen_helper_sve_ldhsu_zss,
+          gen_helper_sve_ldssu_zss, } } },
+    /* TODO fill in first-fault handlers */
+};
+
+/* Note that we overload xs=2 to indicate 64-bit offset. */
+static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = {
+    { { { gen_helper_sve_ldbds_zsu,
+          gen_helper_sve_ldhds_zsu,
+          gen_helper_sve_ldsds_zsu,
+          NULL, },
+        { gen_helper_sve_ldbdu_zsu,
+          gen_helper_sve_ldhdu_zsu,
+          gen_helper_sve_ldsdu_zsu,
+          gen_helper_sve_ldddu_zsu, } },
+      { { gen_helper_sve_ldbds_zss,
+          gen_helper_sve_ldhds_zss,
+          gen_helper_sve_ldsds_zss,
+          NULL, },
+        { gen_helper_sve_ldbdu_zss,
+          gen_helper_sve_ldhdu_zss,
+          gen_helper_sve_ldsdu_zss,
+          gen_helper_sve_ldddu_zss, } },
+      { { gen_helper_sve_ldbds_zd,
+          gen_helper_sve_ldhds_zd,
+          gen_helper_sve_ldsds_zd,
+          NULL, },
+        { gen_helper_sve_ldbdu_zd,
+          gen_helper_sve_ldhdu_zd,
+          gen_helper_sve_ldsdu_zd,
+          gen_helper_sve_ldddu_zd, } } },
+    /* TODO fill in first-fault handlers */
+};
+
+static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
+{
+    gen_helper_gvec_mem_scatter *fn = NULL;
+
+    if (!sve_access_check(s)) {
+        return true;
+    }
+
+    switch (a->esz) {
+    case MO_32:
+        fn = gather_load_fn32[a->ff][a->xs][a->u][a->msz];
+        break;
+    case MO_64:
+        fn = gather_load_fn64[a->ff][a->xs][a->u][a->msz];
+        break;
+    }
+    assert(fn != NULL);
+
+    do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
+               cpu_reg_sp(s, a->rn), fn);
+    return true;
+}
+
+static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
+{
+    gen_helper_gvec_mem_scatter *fn = NULL;
+    TCGv_i64 imm;
+
+    if (a->esz < a->msz || (a->esz == a->msz && !a->u)) {
+        return false;
+    }
+    if (!sve_access_check(s)) {
+        return true;
+    }
+
+    switch (a->esz) {
+    case MO_32:
+        fn = gather_load_fn32[a->ff][0][a->u][a->msz];
+        break;
+    case MO_64:
+        fn = gather_load_fn64[a->ff][2][a->u][a->msz];
+        break;
+    }
+    assert(fn != NULL);
+
+    /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x])
+     * by loading the immediate into the scalar parameter.
+     */
+    imm = tcg_const_i64(a->imm << a->msz);
+    do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
+    tcg_temp_free_i64(imm);
+    return true;
+}
+
static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
{
    /* Indexed by [xs][msz]. */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
&rpri_load rd pg rn imm dtype nreg
&rprr_store rd pg rn rm msz esz nreg
&rpri_store rd pg rn imm msz esz nreg
+&rprr_gather_load rd pg rn rm esz msz u ff xs scale
+&rpri_gather_load rd pg rn imm esz msz u ff
&rprr_scatter_store rd pg rn rm esz msz xs scale

###########################################################################
@@ -XXX,XX +XXX,XX @@
@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
               &rpri_load dtype=%msz_dtype

+# Gather Loads.
+@rprr_g_load_u ....... .. . . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+               &rprr_gather_load xs=2
+@rprr_g_load_xs_u ....... .. xs:1 . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+                  &rprr_gather_load
+@rprr_g_load_xs_u_sc ....... .. xs:1 scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+                     &rprr_gather_load
+@rprr_g_load_xs_sc ....... .. xs:1 scale:1 rm:5 . . ff:1 pg:3 rn:5 rd:5 \
+                   &rprr_gather_load
+@rprr_g_load_u_sc ....... .. . scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+                  &rprr_gather_load xs=2
+@rprr_g_load_sc ....... .. . scale:1 rm:5 . . ff:1 pg:3 rn:5 rd:5 \
+                &rprr_gather_load xs=2
+@rpri_g_load ....... msz:2 .. imm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+             &rpri_gather_load
+
# Stores; user must fill in ESZ, MSZ, NREG as needed.
@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store
@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
@@ -XXX,XX +XXX,XX @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \
          &rpri_load dtype=%dtype_23_13 nreg=0

+# SVE 32-bit gather load (scalar plus 32-bit unscaled offsets)
+# SVE 32-bit gather load (scalar plus 32-bit scaled offsets)
+LD1_zprz 1000010 00 .0 ..... 0.. ... ..... ..... \
+         @rprr_g_load_xs_u esz=2 msz=0 scale=0
+LD1_zprz 1000010 01 .. ..... 0.. ... ..... ..... \
+         @rprr_g_load_xs_u_sc esz=2 msz=1
+LD1_zprz 1000010 10 .. ..... 01. ... ..... ..... \
+         @rprr_g_load_xs_sc esz=2 msz=2 u=1
+
+# SVE 32-bit gather load (vector plus immediate)
+LD1_zpiz 1000010 .. 01 ..... 1.. ... ..... ..... \
+         @rpri_g_load esz=2
+
### SVE Memory Contiguous Load Group

# SVE contiguous load (scalar plus scalar)
@@ -XXX,XX +XXX,XX @@ PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ----

### SVE Memory 64-bit Gather Group

+# SVE 64-bit gather load (scalar plus 32-bit unpacked unscaled offsets)
+# SVE 64-bit gather load (scalar plus 32-bit unpacked scaled offsets)
+LD1_zprz 1100010 00 .0 ..... 0.. ... ..... ..... \
+         @rprr_g_load_xs_u esz=3 msz=0 scale=0
+LD1_zprz 1100010 01 .. ..... 0.. ... ..... ..... \
+         @rprr_g_load_xs_u_sc esz=3 msz=1
+LD1_zprz 1100010 10 .. ..... 0.. ... ..... ..... \
+         @rprr_g_load_xs_u_sc esz=3 msz=2
+LD1_zprz 1100010 11 .. ..... 01. ... ..... ..... \
+         @rprr_g_load_xs_sc esz=3 msz=3 u=1
+
+# SVE 64-bit gather load (scalar plus 64-bit unscaled offsets)
+# SVE 64-bit gather load (scalar plus 64-bit scaled offsets)
+LD1_zprz 1100010 00 10 ..... 1.. ... ..... ..... \
+         @rprr_g_load_u esz=3 msz=0 scale=0
+LD1_zprz 1100010 01 1. ..... 1.. ... ..... ..... \
+         @rprr_g_load_u_sc esz=3 msz=1
+LD1_zprz 1100010 10 1. ..... 1.. ... ..... ..... \
+         @rprr_g_load_u_sc esz=3 msz=2
+LD1_zprz 1100010 11 1. ..... 11. ... ..... ..... \
+         @rprr_g_load_sc esz=3 msz=3 u=1
+
+# SVE 64-bit gather load (vector plus immediate)
+LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \
+         @rpri_g_load esz=3
+
# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
PRF 1100010 00 11 ----- 1-- --- ----- 0 ----

--
2.17.1

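The DO_LD1_ZPZ_D expansion above has one subtlety worth noting: inactive elements are not left unchanged, they are written as zero (`TYPEM mm = 0; ... d[i] = mm;`). A standalone host-side sketch of that per-element behaviour (names here are illustrative, not QEMU's helper API):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Host-side model of a predicated 64-bit-element gather load:
 * each active element loads from base + (off[i] << scale);
 * inactive elements are zeroed in the destination.
 */
static void gather_load_d(const uint8_t *mem, uint64_t base,
                          uint64_t *dst, const uint64_t *off,
                          const uint8_t *pg, unsigned scale, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        uint64_t v = 0;             /* inactive lanes become zero */
        if (pg[i] & 1) {
            memcpy(&v, mem + base + (off[i] << scale), 8);
        }
        dst[i] = v;
    }
}
```

This zeroing of masked lanes is why trans_LD1_zprz needs no separate merge step after calling the helper.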
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 49 ++++++++++++++++++++++++++++
target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 40 ++++++++++++++++++++++++
target/arm/sve.decode | 11 +++++++
4 files changed, 162 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
                   void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
    do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN);
}

+/* Two operand floating-point comparison controlled by a predicate.
+ * Unlike the integer version, we are not allowed to optimistically
+ * compare operands, since the comparison may have side effects wrt
+ * the FPSR.
+ */
+#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
+                  void *status, uint32_t desc) \
+{ \
+    intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \
+    uint64_t *d = vd, *g = vg; \
+    do { \
+        uint64_t out = 0, pg = g[j]; \
+        do { \
+            i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+            if (likely((pg >> (i & 63)) & 1)) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                TYPE mm = *(TYPE *)(vm + H(i)); \
+                out |= OP(TYPE, nn, mm, status); \
+            } \
+        } while (i & 63); \
+        d[j--] = out; \
+    } while (i > 0); \
+}
+
+#define DO_FPCMP_PPZZ_H(NAME, OP) \
+    DO_FPCMP_PPZZ(NAME##_h, float16, H1_2, OP)
+#define DO_FPCMP_PPZZ_S(NAME, OP) \
+    DO_FPCMP_PPZZ(NAME##_s, float32, H1_4, OP)
+#define DO_FPCMP_PPZZ_D(NAME, OP) \
+    DO_FPCMP_PPZZ(NAME##_d, float64,     , OP)
+
+#define DO_FPCMP_PPZZ_ALL(NAME, OP) \
+    DO_FPCMP_PPZZ_H(NAME, OP)   \
+    DO_FPCMP_PPZZ_S(NAME, OP)   \
+    DO_FPCMP_PPZZ_D(NAME, OP)
+
+#define DO_FCMGE(TYPE, X, Y, ST)  TYPE##_compare(Y, X, ST) <= 0
+#define DO_FCMGT(TYPE, X, Y, ST)  TYPE##_compare(Y, X, ST) < 0
+#define DO_FCMEQ(TYPE, X, Y, ST)  TYPE##_compare_quiet(X, Y, ST) == 0
+#define DO_FCMNE(TYPE, X, Y, ST)  TYPE##_compare_quiet(X, Y, ST) != 0
+#define DO_FCMUO(TYPE, X, Y, ST)  \
+    TYPE##_compare_quiet(X, Y, ST) == float_relation_unordered
+#define DO_FACGE(TYPE, X, Y, ST)  \
+    TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) <= 0
+#define DO_FACGT(TYPE, X, Y, ST)  \
+    TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) < 0
+
+DO_FPCMP_PPZZ_ALL(sve_fcmge, DO_FCMGE)
+DO_FPCMP_PPZZ_ALL(sve_fcmgt, DO_FCMGT)
+DO_FPCMP_PPZZ_ALL(sve_fcmeq, DO_FCMEQ)
+DO_FPCMP_PPZZ_ALL(sve_fcmne, DO_FCMNE)
+DO_FPCMP_PPZZ_ALL(sve_fcmuo, DO_FCMUO)
+DO_FPCMP_PPZZ_ALL(sve_facge, DO_FACGE)
+DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
+
+#undef DO_FPCMP_PPZZ_ALL
+#undef DO_FPCMP_PPZZ_D
+#undef DO_FPCMP_PPZZ_S
+#undef DO_FPCMP_PPZZ_H
+#undef DO_FPCMP_PPZZ
+
/*
 * Load contiguous data, protected by a governing predicate.
 */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_FP3(FMULX, fmulx)

#undef DO_FP3

+static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a,
+                      gen_helper_gvec_4_ptr *fn)
+{
+    if (fn == NULL) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+        tcg_gen_gvec_4_ptr(pred_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           pred_full_reg_offset(s, a->pg),
+                           status, vsz, vsz, 0, fn);
+        tcg_temp_free_ptr(status);
+    }
+    return true;
+}
+
+#define DO_FPCMP(NAME, name) \
+static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \
+                                uint32_t insn) \
+{ \
+    static gen_helper_gvec_4_ptr * const fns[4] = { \
+        NULL, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
+    }; \
+    return do_fp_cmp(s, a, fns[a->esz]); \
+}
+
+DO_FPCMP(FCMGE, fcmge)
+DO_FPCMP(FCMGT, fcmgt)
+DO_FPCMP(FCMEQ, fcmeq)
+DO_FPCMP(FCMNE, fcmne)
+DO_FPCMP(FCMUO, fcmuo)
+DO_FPCMP(FACGE, facge)
+DO_FPCMP(FACGT, facgt)
+
+#undef DO_FPCMP
+
typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);

static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn
SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn
UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn

+### SVE Floating Point Compare - Vectors Group
+
+# SVE floating-point compare vectors
+FCMGE_ppzz 01100101 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm
+FCMGT_ppzz 01100101 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm
+FCMEQ_ppzz 01100101 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm
+FCMNE_ppzz 01100101 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm
+FCMUO_ppzz 01100101 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm
+FACGE_ppzz 01100101 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm
+FACGT_ppzz 01100101 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm
+
### SVE Integer Multiply-Add Group

# SVE integer multiply-add writing addend (predicated)
--
2.17.1

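The DO_FCMGE/DO_FCMGT definitions above express "X >= Y" as compare(Y, X) <= 0. That works because softfloat's three-way compare returns a fourth value, "unordered", that is greater than zero when either operand is a NaN, so unordered inputs correctly compare false. A standalone model of the trick, with a hypothetical fcompare mimicking the float_relation encoding (-1 less, 0 equal, 1 greater, 2 unordered) and ignoring the exception-flag side effects the real signaling compare has:

```c
#include <assert.h>
#include <math.h>

/* Model of softfloat's float_relation encoding. */
static int fcompare(double x, double y)
{
    if (isnan(x) || isnan(y)) {
        return 2;                  /* float_relation_unordered */
    }
    return x < y ? -1 : (x == y ? 0 : 1);
}

/* a >= b, expressed as in DO_FCMGE: compare(b, a) <= 0.
 * Because "unordered" is 2, any NaN operand yields false. */
static int fcmge(double a, double b)
{
    return fcompare(b, a) <= 0;
}
```

The same reasoning explains DO_FACGE/DO_FACGT, which apply the identical comparison to the absolute values of the operands.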
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 56 ++++++++++++++++++++++++++++
target/arm/sve_helper.c | 69 +++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
target/arm/sve.decode | 14 +++++++
4 files changed, 214 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, i64, ptr, i32)
+
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd)

#undef DO_ZPZZ_FP

+/* Three-operand expander, with one scalar operand, controlled by
+ * a predicate, with the extra float_status parameter.
+ */
+#define DO_ZPZS_FP(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \
+ void *status, uint32_t desc) \
+{ \
+ intptr_t i = simd_oprsz(desc); \
+ uint64_t *g = vg; \
+ TYPE mm = scalar; \
+ do { \
+ uint64_t pg = g[(i - 1) >> 6]; \
+ do { \
+ i -= sizeof(TYPE); \
+ if (likely((pg >> (i & 63)) & 1)) { \
+ TYPE nn = *(TYPE *)(vn + H(i)); \
+ *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \
+ } \
+ } while (i & 63); \
+ } while (i != 0); \
+}
+
+DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add)
+DO_ZPZS_FP(sve_fadds_s, float32, H1_4, float32_add)
+DO_ZPZS_FP(sve_fadds_d, float64, , float64_add)
+
+DO_ZPZS_FP(sve_fsubs_h, float16, H1_2, float16_sub)
+DO_ZPZS_FP(sve_fsubs_s, float32, H1_4, float32_sub)
+DO_ZPZS_FP(sve_fsubs_d, float64, , float64_sub)
+
+DO_ZPZS_FP(sve_fmuls_h, float16, H1_2, float16_mul)
+DO_ZPZS_FP(sve_fmuls_s, float32, H1_4, float32_mul)
+DO_ZPZS_FP(sve_fmuls_d, float64, , float64_mul)
+
+static inline float16 subr_h(float16 a, float16 b, float_status *s)
+{
+ return float16_sub(b, a, s);
+}
+
+static inline float32 subr_s(float32 a, float32 b, float_status *s)
+{
+ return float32_sub(b, a, s);
+}
+
+static inline float64 subr_d(float64 a, float64 b, float_status *s)
+{
+ return float64_sub(b, a, s);
+}
+
+DO_ZPZS_FP(sve_fsubrs_h, float16, H1_2, subr_h)
+DO_ZPZS_FP(sve_fsubrs_s, float32, H1_4, subr_s)
+DO_ZPZS_FP(sve_fsubrs_d, float64, , subr_d)
+
+DO_ZPZS_FP(sve_fmaxnms_h, float16, H1_2, float16_maxnum)
+DO_ZPZS_FP(sve_fmaxnms_s, float32, H1_4, float32_maxnum)
+DO_ZPZS_FP(sve_fmaxnms_d, float64, , float64_maxnum)
+
+DO_ZPZS_FP(sve_fminnms_h, float16, H1_2, float16_minnum)
+DO_ZPZS_FP(sve_fminnms_s, float32, H1_4, float32_minnum)
+DO_ZPZS_FP(sve_fminnms_d, float64, , float64_minnum)
+
+DO_ZPZS_FP(sve_fmaxs_h, float16, H1_2, float16_max)
+DO_ZPZS_FP(sve_fmaxs_s, float32, H1_4, float32_max)
+DO_ZPZS_FP(sve_fmaxs_d, float64, , float64_max)
+
+DO_ZPZS_FP(sve_fmins_h, float16, H1_2, float16_min)
+DO_ZPZS_FP(sve_fmins_s, float32, H1_4, float32_min)
+DO_ZPZS_FP(sve_fmins_d, float64, , float64_min)
+
/* Fully general two-operand expander, controlled by a predicate,
* With the extra float_status parameter.
*/
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@
#include "exec/log.h"
#include "trace-tcg.h"
#include "translate-a64.h"
+#include "fpu/softfloat.h"


typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t,
@@ -XXX,XX +XXX,XX @@ DO_FP3(FMULX, fmulx)

#undef DO_FP3

+typedef void gen_helper_sve_fp2scalar(TCGv_ptr, TCGv_ptr, TCGv_ptr,
+ TCGv_i64, TCGv_ptr, TCGv_i32);
+
+static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16,
+ TCGv_i64 scalar, gen_helper_sve_fp2scalar *fn)
+{
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr t_zd, t_zn, t_pg, status;
+ TCGv_i32 desc;
+
+ t_zd = tcg_temp_new_ptr();
+ t_zn = tcg_temp_new_ptr();
+ t_pg = tcg_temp_new_ptr();
+ tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, zd));
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, zn));
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+
+ status = get_fpstatus_ptr(is_fp16);
+ desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+ fn(t_zd, t_zn, t_pg, scalar, status, desc);
+
+ tcg_temp_free_i32(desc);
+ tcg_temp_free_ptr(status);
+ tcg_temp_free_ptr(t_pg);
+ tcg_temp_free_ptr(t_zn);
+ tcg_temp_free_ptr(t_zd);
+}
+
+static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm,
+ gen_helper_sve_fp2scalar *fn)
+{
+ TCGv_i64 temp = tcg_const_i64(imm);
+ do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16, temp, fn);
+ tcg_temp_free_i64(temp);
+}
+
+#define DO_FP_IMM(NAME, name, const0, const1) \
+static bool trans_##NAME##_zpzi(DisasContext *s, arg_rpri_esz *a, \
+ uint32_t insn) \
+{ \
+ static gen_helper_sve_fp2scalar * const fns[3] = { \
+ gen_helper_sve_##name##_h, \
+ gen_helper_sve_##name##_s, \
+ gen_helper_sve_##name##_d \
+ }; \
+ static uint64_t const val[3][2] = { \
+ { float16_##const0, float16_##const1 }, \
+ { float32_##const0, float32_##const1 }, \
+ { float64_##const0, float64_##const1 }, \
+ }; \
+ if (a->esz == 0) { \
+ return false; \
+ } \
+ if (sve_access_check(s)) { \
+ do_fp_imm(s, a, val[a->esz - 1][a->imm], fns[a->esz - 1]); \
+ } \
+ return true; \
+}
+
+#define float16_two make_float16(0x4000)
+#define float32_two make_float32(0x40000000)
+#define float64_two make_float64(0x4000000000000000ULL)
+
+DO_FP_IMM(FADD, fadds, half, one)
+DO_FP_IMM(FSUB, fsubs, half, one)
+DO_FP_IMM(FMUL, fmuls, half, two)
+DO_FP_IMM(FSUBR, fsubrs, half, one)
+DO_FP_IMM(FMAXNM, fmaxnms, zero, one)
+DO_FP_IMM(FMINNM, fminnms, zero, one)
+DO_FP_IMM(FMAX, fmaxs, zero, one)
+DO_FP_IMM(FMIN, fmins, zero, one)
+
+#undef DO_FP_IMM
+
static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a,
gen_helper_gvec_4_ptr *fn)
{
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
@rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \
&rpri_esz rn=%reg_movprfx

+# Two register operand, one one-bit floating-point operand.
+@rdn_i1 ........ esz:2 ......... pg:3 .... imm:1 rd:5 \
+ &rpri_esz rn=%reg_movprfx
+
# Two register operand, one encoded bitmask.
@rdn_dbm ........ .. .... dbm:13 rd:5 \
&rr_dbm rn=%reg_movprfx
@@ -XXX,XX +XXX,XX @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR
FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm

+# SVE floating-point arithmetic with immediate (predicated)
+FADD_zpzi 01100101 .. 011 000 100 ... 0000 . ..... @rdn_i1
+FSUB_zpzi 01100101 .. 011 001 100 ... 0000 . ..... @rdn_i1
+FMUL_zpzi 01100101 .. 011 010 100 ... 0000 . ..... @rdn_i1
+FSUBR_zpzi 01100101 .. 011 011 100 ... 0000 . ..... @rdn_i1
+FMAXNM_zpzi 01100101 .. 011 100 100 ... 0000 . ..... @rdn_i1
+FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1
+FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1
+FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1
+
### SVE FP Multiply-Add Group

# SVE floating-point multiply-accumulate writing addend
--
2.17.1

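The DO_ZPZS_FP expander in the patch above applies one scalar operand across a whole vector under predicate control, and inactive lanes are simply not written, so they keep whatever the destination held before. A rough Python model of that merging behaviour (illustrative only; all names are invented):

```python
# Toy model of a predicated vector-by-scalar FP op: active lanes get
# op(n[i], scalar); inactive lanes keep the existing destination value
# (in the real helpers the destination often aliases the source).
def zpzs(op, d, n, scalar, pg):
    return [op(a, scalar) if g else old
            for a, old, g in zip(n, d, pg)]

out = zpzs(lambda a, b: a + b,
           [9.0, 9.0, 9.0],      # previous destination contents
           [1.0, 2.0, 3.0],      # source vector
           0.5,                  # the scalar/immediate operand
           [True, False, True])  # governing predicate
# out == [1.5, 9.0, 3.5]: lane 1 is inactive and keeps 9.0
```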
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.h | 14 +++++++++++
target/arm/translate-sve.c | 50 ++++++++++++++++++++++++++++++++++++++
target/arm/vec_helper.c | 48 ++++++++++++++++++++++++++++++++++++
target/arm/sve.decode | 19 +++++++++++++++
4 files changed, 131 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, i32)
+
#ifdef TARGET_AARCH64
#include "helper-a64.h"
#include "helper-sve.h"
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)

#undef DO_ZZI

+/*
+ *** SVE Floating Point Multiply-Add Indexed Group
+ */
+
+static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t insn)
+{
+ static gen_helper_gvec_4_ptr * const fns[3] = {
+ gen_helper_gvec_fmla_idx_h,
+ gen_helper_gvec_fmla_idx_s,
+ gen_helper_gvec_fmla_idx_d,
+ };
+
+ if (sve_access_check(s)) {
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+ tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
+ vec_full_reg_offset(s, a->rn),
+ vec_full_reg_offset(s, a->rm),
+ vec_full_reg_offset(s, a->ra),
+ status, vsz, vsz, (a->index << 1) | a->sub,
+ fns[a->esz - 1]);
+ tcg_temp_free_ptr(status);
+ }
+ return true;
+}
+
+/*
+ *** SVE Floating Point Multiply Indexed Group
+ */
+
+static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn)
+{
+ static gen_helper_gvec_3_ptr * const fns[3] = {
+ gen_helper_gvec_fmul_idx_h,
+ gen_helper_gvec_fmul_idx_s,
+ gen_helper_gvec_fmul_idx_d,
+ };
+
+ if (sve_access_check(s)) {
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+ vec_full_reg_offset(s, a->rn),
+ vec_full_reg_offset(s, a->rm),
+ status, vsz, vsz, a->index, fns[a->esz - 1]);
+ tcg_temp_free_ptr(status);
+ }
+ return true;
+}
+
/*
*** SVE Floating Point Accumulating Reduction Group
*/
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)

#endif
#undef DO_3OP
+
+/* For the indexed ops, SVE applies the index per 128-bit vector segment.
+ * For AdvSIMD, there is of course only one such vector segment.
+ */
+
+#define DO_MUL_IDX(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+{ \
+ intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+ intptr_t idx = simd_data(desc); \
+ TYPE *d = vd, *n = vn, *m = vm; \
+ for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
+ TYPE mm = m[H(i + idx)]; \
+ for (j = 0; j < segment; j++) { \
+ d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
+ } \
+ } \
+}
+
+DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
+DO_MUL_IDX(gvec_fmul_idx_s, float32, H4)
+DO_MUL_IDX(gvec_fmul_idx_d, float64, )
+
+#undef DO_MUL_IDX
+
+#define DO_FMLA_IDX(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
+ void *stat, uint32_t desc) \
+{ \
+ intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+ TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \
+ intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \
+ TYPE *d = vd, *n = vn, *m = vm, *a = va; \
+ op1_neg <<= (8 * sizeof(TYPE) - 1); \
+ for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
+ TYPE mm = m[H(i + idx)]; \
+ for (j = 0; j < segment; j++) { \
+ d[i + j] = TYPE##_muladd(n[i + j] ^ op1_neg, \
+ mm, a[i + j], 0, stat); \
+ } \
+ } \
+}
+
+DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
+DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4)
+DO_FMLA_IDX(gvec_fmla_idx_d, float64, )
+
+#undef DO_FMLA_IDX
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
%imm9_16_10 16:s6 10:3
%size_23 23:2
%dtype_23_13 23:2 13:2
+%index3_22_19 22:1 19:2

# A combination of tsz:imm3 -- extract esize.
%tszimm_esz 22:2 5:5 !function=tszimm_esz
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
# SVE integer multiply immediate (unpredicated)
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s

+### SVE FP Multiply-Add Indexed Group
+
+# SVE floating-point multiply-add (indexed)
+FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \
+ ra=%reg_movprfx index=%index3_22_19 esz=1
+FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \
+ ra=%reg_movprfx esz=2
+FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \
+ ra=%reg_movprfx esz=3
+
+### SVE FP Multiply Indexed Group
+
+# SVE floating-point multiply (indexed)
+FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \
+ index=%index3_22_19 esz=1
+FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2
+FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3
+
### SVE FP Accumulating Reduction Group

# SVE floating-point serial reduction (predicated)
--
2.17.1

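The comment in the DO_MUL_IDX hunk above is the key point of this patch: for the indexed forms, SVE takes the indexed element separately within each 128-bit segment of the vector. A small Python sketch of that semantics (illustrative only, not QEMU code; integer elements stand in for the FP types):

```python
# Toy model of the per-128-bit-segment indexed multiply: within each
# segment, one element m[base + idx] is broadcast and multiplied into
# every lane of that same segment.
def fmul_idx(n, m, idx, esize_bytes):
    seg = 16 // esize_bytes          # elements per 128-bit segment
    out = []
    for base in range(0, len(n), seg):
        mm = m[base + idx]           # indexed element, per segment
        out.extend(x * mm for x in n[base:base + seg])
    return out

# 32-bit elements, 256-bit vector => two segments of four elements:
# segment 0 multiplies by m[1] == 20, segment 1 by m[5] == 60.
r = fmul_idx([1, 2, 3, 4, 5, 6, 7, 8],
             [10, 20, 30, 40, 50, 60, 70, 80], idx=1, esize_bytes=4)
# r == [20, 40, 60, 80, 300, 360, 420, 480]
```

For AdvSIMD, as the comment notes, the vector is exactly one segment, so the same loop degenerates to a plain broadcast multiply.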
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.h | 8 +++++++
target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++
target/arm/vec_helper.c | 20 ++++++++++++++++
target/arm/sve.decode | 5 ++++
4 files changed, 80 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_VPZ(FMAXNMV, fmaxnmv)
DO_VPZ(FMINV, fminv)
DO_VPZ(FMAXV, fmaxv)

+/*
+ *** SVE Floating Point Unary Operations - Unpredicated Group
+ */
+
+static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn)
+{
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+ tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd),
+ vec_full_reg_offset(s, a->rn),
+ status, vsz, vsz, 0, fn);
+ tcg_temp_free_ptr(status);
+}
+
+static bool trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+ static gen_helper_gvec_2_ptr * const fns[3] = {
+ gen_helper_gvec_frecpe_h,
+ gen_helper_gvec_frecpe_s,
+ gen_helper_gvec_frecpe_d,
+ };
+ if (a->esz == 0) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ do_zz_fp(s, a, fns[a->esz - 1]);
+ }
+ return true;
+}
+
+static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+ static gen_helper_gvec_2_ptr * const fns[3] = {
+ gen_helper_gvec_frsqrte_h,
+ gen_helper_gvec_frsqrte_s,
+ gen_helper_gvec_frsqrte_d,
+ };
+ if (a->esz == 0) {
+ return false;
+ }
+ if (sve_access_check(s)) {
+ do_zz_fp(s, a, fns[a->esz - 1]);
+ }
+ return true;
+}
+
/*
*** SVE Floating Point Accumulating Reduction Group
*/
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
clear_tail(d, opr_sz, simd_maxsz(desc));
}

+#define DO_2OP(NAME, FUNC, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
+{ \
+ intptr_t i, oprsz = simd_oprsz(desc); \
+ TYPE *d = vd, *n = vn; \
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
+ d[i] = FUNC(n[i], stat); \
+ } \
+}
+
+DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16)
+DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32)
+DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64)
+
+DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
+DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
+DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
+
+#undef DO_2OP
+
/* Floating-point trigonometric starting value.
* See the ARM ARM pseudocode function FPTrigSMul.
*/
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn
FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn
FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn

+## SVE Floating Point Unary Operations - Unpredicated Group
+
+FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn
+FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn
+
### SVE FP Accumulating Reduction Group

# SVE floating-point serial reduction (predicated)
--
2.17.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 42 +++++++++++++++++++++++++++++++++++
target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 43 ++++++++++++++++++++++++++++++++++++
target/arm/sve.decode | 10 +++++++++
4 files changed, 138 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
i64, i64, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \

#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0
#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0
+#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0
+#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0
#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0
#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0
#define DO_FCMUO(TYPE, X, Y, ST) \
@@ -XXX,XX +XXX,XX @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
#undef DO_FPCMP_PPZZ_H
#undef DO_FPCMP_PPZZ

+/* One operand floating-point comparison against zero, controlled
+ * by a predicate.
+ */
+#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, \
+ void *status, uint32_t desc) \
+{ \
+ intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \
+ uint64_t *d = vd, *g = vg; \
+ do { \
+ uint64_t out = 0, pg = g[j]; \
+ do { \
+ i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+ if ((pg >> (i & 63)) & 1) { \
+ TYPE nn = *(TYPE *)(vn + H(i)); \
+ out |= OP(TYPE, nn, 0, status); \
+ } \
+ } while (i & 63); \
+ d[j--] = out; \
+ } while (i > 0); \
+}
+
+#define DO_FPCMP_PPZ0_H(NAME, OP) \
+ DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP)
+#define DO_FPCMP_PPZ0_S(NAME, OP) \
+ DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP)
+#define DO_FPCMP_PPZ0_D(NAME, OP) \
+ DO_FPCMP_PPZ0(NAME##_d, float64, , OP)
+
+#define DO_FPCMP_PPZ0_ALL(NAME, OP) \
+ DO_FPCMP_PPZ0_H(NAME, OP) \
+ DO_FPCMP_PPZ0_S(NAME, OP) \
+ DO_FPCMP_PPZ0_D(NAME, OP)
+
+DO_FPCMP_PPZ0_ALL(sve_fcmge0, DO_FCMGE)
+DO_FPCMP_PPZ0_ALL(sve_fcmgt0, DO_FCMGT)
+DO_FPCMP_PPZ0_ALL(sve_fcmle0, DO_FCMLE)
+DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT)
+DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ)
+DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)
+
/*
* Load contiguous data, protected by a governing predicate.
*/
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
return true;
}

+/*
+ *** SVE Floating Point Compare with Zero Group
+ */
+
+static void do_ppz_fp(DisasContext *s, arg_rpr_esz *a,
+ gen_helper_gvec_3_ptr *fn)
+{
+ unsigned vsz = vec_full_reg_size(s);
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+ tcg_gen_gvec_3_ptr(pred_full_reg_offset(s, a->rd),
+ vec_full_reg_offset(s, a->rn),
+ pred_full_reg_offset(s, a->pg),
+ status, vsz, vsz, 0, fn);
+ tcg_temp_free_ptr(status);
+}
+
+#define DO_PPZ(NAME, name) \
+static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{ \
+ static gen_helper_gvec_3_ptr * const fns[3] = { \
+ gen_helper_sve_##name##_h, \
+ gen_helper_sve_##name##_s, \
+ gen_helper_sve_##name##_d, \
+ }; \
+ if (a->esz == 0) { \
+ return false; \
+ } \
+ if (sve_access_check(s)) { \
+ do_ppz_fp(s, a, fns[a->esz - 1]); \
+ } \
+ return true; \
+}
+
+DO_PPZ(FCMGE_ppz0, fcmge0)
+DO_PPZ(FCMGT_ppz0, fcmgt0)
+DO_PPZ(FCMLE_ppz0, fcmle0)
+DO_PPZ(FCMLT_ppz0, fcmlt0)
+DO_PPZ(FCMEQ_ppz0, fcmeq0)
+DO_PPZ(FCMNE_ppz0, fcmne0)
+
+#undef DO_PPZ
+
/*
*** SVE Floating Point Accumulating Reduction Group
*/
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@
# One register operand, with governing predicate, vector element size
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz
+@pd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 . rd:4 &rpr_esz

# One register operand, with governing predicate, no vector element size
@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0
@@ -XXX,XX +XXX,XX @@ FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn
FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn
FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn

+### SVE FP Compare with Zero Group
+
+FCMGE_ppz0 01100101 .. 0100 00 001 ... ..... 0 .... @pd_pg_rn
+FCMGT_ppz0 01100101 .. 0100 00 001 ... ..... 1 .... @pd_pg_rn
+FCMLT_ppz0 01100101 .. 0100 01 001 ... ..... 0 .... @pd_pg_rn
+FCMLE_ppz0 01100101 .. 0100 01 001 ... ..... 1 .... @pd_pg_rn
+FCMEQ_ppz0 01100101 .. 0100 10 001 ... ..... 0 .... @pd_pg_rn
+FCMNE_ppz0 01100101 .. 0100 11 001 ... ..... 0 .... @pd_pg_rn
+
### SVE FP Accumulating Reduction Group

# SVE floating-point serial reduction (predicated)
--
2.17.1

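One detail of the DO_FPCMP_PPZ0 loop above worth calling out: SVE predicate registers hold one bit per byte of vector, so an element of N bytes contributes its compare result in the lowest of its N predicate bit positions (that is what the `out <<= sizeof(TYPE)` step builds, 64 predicate bits at a time, walking elements from high to low). A small Python sketch of just the packing (illustrative only; names are invented):

```python
# Toy model of packing per-element compare results into an SVE-style
# predicate: one predicate bit per vector *byte*, so element i of
# esize_bytes bytes sets (at most) predicate bit i * esize_bytes.
def pack_pred(results, esize_bytes):
    out = 0
    for i, r in enumerate(results):
        if r:
            out |= 1 << (i * esize_bytes)
    return out

p = pack_pred([True, False, True], 4)   # three 32-bit elements
# p == 0x101: bits 0 and 8 set, bit 4 clear
```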
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 4 +++
target/arm/sve_helper.c | 70 ++++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 27 +++++++++++++++
target/arm/sve.decode | 3 ++
4 files changed, 104 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)

+DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT)
DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ)
DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)

+/* FP Trig Multiply-Add. */
+
+void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
+{
+ static const float16 coeff[16] = {
+ 0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
+ 0x3c00, 0xb800, 0x293a, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
+ };
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16);
+ intptr_t x = simd_data(desc);
+ float16 *d = vd, *n = vn, *m = vm;
+ for (i = 0; i < opr_sz; i++) {
+ float16 mm = m[i];
+ intptr_t xx = x;
+ if (float16_is_neg(mm)) {
+ mm = float16_abs(mm);
+ xx += 8;
+ }
+ d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs);
+ }
+}
+
+void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
+{
+ static const float32 coeff[16] = {
+ 0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9,
+ 0x36369d6d, 0x00000000, 0x00000000, 0x00000000,
+ 0x3f800000, 0xbf000000, 0x3d2aaaa6, 0xbab60705,
+ 0x37cd37cc, 0x00000000, 0x00000000, 0x00000000,
+ };
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32);
+ intptr_t x = simd_data(desc);
+ float32 *d = vd, *n = vn, *m = vm;
+ for (i = 0; i < opr_sz; i++) {
+ float32 mm = m[i];
+ intptr_t xx = x;
+ if (float32_is_neg(mm)) {
+ mm = float32_abs(mm);
+ xx += 8;
+ }
+ d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs);
+ }
+}
+
+void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
+{
+ static const float64 coeff[16] = {
+ 0x3ff0000000000000ull, 0xbfc5555555555543ull,
+ 0x3f8111111110f30cull, 0xbf2a01a019b92fc6ull,
+ 0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull,
+ 0x3de5d8408868552full, 0x0000000000000000ull,
+ 0x3ff0000000000000ull, 0xbfe0000000000000ull,
+ 0x3fa5555555555536ull, 0xbf56c16c16c13a0bull,
+ 0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull,
+ 0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull,
+ };
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64);
+ intptr_t x = simd_data(desc);
+ float64 *d = vd, *n = vn, *m = vm;
+ for (i = 0; i < opr_sz; i++) {
+ float64 mm = m[i];
+ intptr_t xx = x;
+ if (float64_is_neg(mm)) {
+ mm = float64_abs(mm);
+ xx += 8;
+ }
+ d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs);
+ }
+}
+
/*
* Load contiguous data, protected by a governing predicate.
*/
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ DO_PPZ(FCMNE_ppz0, fcmne0)

#undef DO_PPZ

+/*
+ *** SVE floating-point trig multiply-add coefficient
+ */
+
+static bool trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn)
+{
+ static gen_helper_gvec_3_ptr * const fns[3] = {
+ gen_helper_sve_ftmad_h,
126
+ gen_helper_sve_ftmad_s,
127
+ gen_helper_sve_ftmad_d,
128
+ };
129
+
130
+ if (a->esz == 0) {
131
+ return false;
132
+ }
133
+ if (sve_access_check(s)) {
134
+ unsigned vsz = vec_full_reg_size(s);
135
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
136
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
137
+ vec_full_reg_offset(s, a->rn),
138
+ vec_full_reg_offset(s, a->rm),
139
+ status, vsz, vsz, a->imm, fns[a->esz - 1]);
140
+ tcg_temp_free_ptr(status);
141
+ }
142
+ return true;
143
+}
144
+
145
/*
146
*** SVE Floating Point Accumulating Reduction Group
147
*/
148
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
149
index XXXXXXX..XXXXXXX 100644
150
--- a/target/arm/sve.decode
151
+++ b/target/arm/sve.decode
152
@@ -XXX,XX +XXX,XX @@ FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1
153
FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1
154
FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1
155
156
+# SVE floating-point trig multiply-add coefficient
157
+FTMAD 01100101 esz:2 010 imm:3 100000 rm:5 rd:5 rn=%reg_movprfx
158
+
159
### SVE FP Multiply-Add Group
160
161
# SVE floating-point multiply-accumulate writing addend
162
--
163
2.17.1
164
165
diff view generated by jsdifflib
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-27-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 14 ++++++++++++++
target/arm/sve_helper.c | 8 ++++++++
target/arm/translate-sve.c | 26 ++++++++++++++++++++++++++
target/arm/sve.decode | 4 ++++
4 files changed, 52 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int)

+DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16)
+DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32)
+DO_ZPZ_FP(sve_frecpx_d, uint64_t, , helper_frecpx_f64)
+
+DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt)
+DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt)
+DO_ZPZ_FP(sve_fsqrt_d, uint64_t, , float64_sqrt)
+
DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
return do_frint_mode(s, a, float_round_ties_away);
}

+static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ static gen_helper_gvec_3_ptr * const fns[3] = {
+ gen_helper_sve_frecpx_h,
+ gen_helper_sve_frecpx_s,
+ gen_helper_sve_frecpx_d
+ };
+ if (a->esz == 0) {
+ return false;
+ }
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+}
+
+static bool trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+ static gen_helper_gvec_3_ptr * const fns[3] = {
+ gen_helper_sve_fsqrt_h,
+ gen_helper_sve_fsqrt_s,
+ gen_helper_sve_fsqrt_d
+ };
+ if (a->esz == 0) {
+ return false;
+ }
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+}
+
static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
{
return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -XXX,XX +XXX,XX @@ FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn
FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn
FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn

+# SVE floating-point unary operations
+FRECPX 01100101 .. 001 100 101 ... ..... ..... @rd_pg_rn
+FSQRT 01100101 .. 001 101 101 ... ..... ..... @rd_pg_rn
+
# SVE integer convert to floating-point
SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
--
2.17.1
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Enable ARM_FEATURE_SVE for the generic "max" cpu.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180627043328.11531-35-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
linux-user/elfload.c | 1 +
target/arm/cpu.c | 7 +++++++
target/arm/cpu64.c | 1 +
3 files changed, 9 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
+ GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
#undef GET_FEATURE

return hwcaps;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
/* and to the FP/Neon instructions */
env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
+ /* and to the SVE instructions */
+ env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
+ env->cp15.cptr_el[3] |= CPTR_EZ;
+ /* with maximum vector length */
+ env->vfp.zcr_el[1] = ARM_MAX_VQ - 1;
+ env->vfp.zcr_el[2] = ARM_MAX_VQ - 1;
+ env->vfp.zcr_el[3] = ARM_MAX_VQ - 1;
#else
/* Reset into the highest available EL */
if (arm_feature(env, ARM_FEATURE_EL3)) {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+ set_feature(&cpu->env, ARM_FEATURE_SVE);
/* For usermode -cpu max we can use a larger and more efficient DCZ
* blocksize since we don't have to follow what the hardware does.
*/
--
2.17.1
Deleted patch
From: Jean-Christophe Dubois <jcd@tribudubois.net>

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/mcimx7d-sabre.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/hw/arm/mcimx7d-sabre.c b/hw/arm/mcimx7d-sabre.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mcimx7d-sabre.c
+++ b/hw/arm/mcimx7d-sabre.c
@@ -XXX,XX +XXX,XX @@
#include "hw/arm/fsl-imx7.h"
#include "hw/boards.h"
#include "sysemu/sysemu.h"
-#include "sysemu/device_tree.h"
#include "qemu/error-report.h"
#include "sysemu/qtest.h"
-#include "net/net.h"

typedef struct {
FslIMX7State soc;
--
2.17.1
Deleted patch
From: Jean-Christophe Dubois <jcd@tribudubois.net>

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/fsl-imx7.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
/*
* SRC
*/
- create_unimplemented_device("sdma", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);
+ create_unimplemented_device("src", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);

/*
* Watchdog
--
2.17.1
Deleted patch
From: Jean-Christophe Dubois <jcd@tribudubois.net>

The qdev_get_gpio_in() function accepts an int as second parameter.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/fsl-imx7.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
FSL_IMX7_ECSPI4_ADDR,
};

- static const hwaddr FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
+ static const int FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
FSL_IMX7_ECSPI1_IRQ,
FSL_IMX7_ECSPI2_IRQ,
FSL_IMX7_ECSPI3_IRQ,
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
FSL_IMX7_I2C4_ADDR,
};

- static const hwaddr FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
+ static const int FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
FSL_IMX7_I2C1_IRQ,
FSL_IMX7_I2C2_IRQ,
FSL_IMX7_I2C3_IRQ,
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
FSL_IMX7_USB3_ADDR,
};

- static const hwaddr FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
+ static const int FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
FSL_IMX7_USB1_IRQ,
FSL_IMX7_USB2_IRQ,
FSL_IMX7_USB3_IRQ,
--
2.17.1
Deleted patch
From: Aaron Lindsay <alindsay@codeaurora.org>

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Message-id: 1529699547-17044-5-git-send-email-alindsay@codeaurora.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/cpu.h | 1 +
target/arm/cpu.c | 21 ++++++++++++++-------
target/arm/kvm32.c | 8 ++++----
3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ enum arm_features {
ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling. */
ARM_FEATURE_THUMB2EE,
ARM_FEATURE_V7MP, /* v7 Multiprocessing Extensions */
+ ARM_FEATURE_V7VE, /* v7 Virtualization Extensions (non-EL2 parts) */
ARM_FEATURE_V4T,
ARM_FEATURE_V5,
ARM_FEATURE_STRONGARM,
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)

/* Some features automatically imply others: */
if (arm_feature(env, ARM_FEATURE_V8)) {
- set_feature(env, ARM_FEATURE_V7);
+ set_feature(env, ARM_FEATURE_V7VE);
+ }
+ if (arm_feature(env, ARM_FEATURE_V7VE)) {
+ /* v7 Virtualization Extensions. In real hardware this implies
+ * EL2 and also the presence of the Security Extensions.
+ * For QEMU, for backwards-compatibility we implement some
+ * CPUs or CPU configs which have no actual EL2 or EL3 but do
+ * include the various other features that V7VE implies.
+ * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
+ * Security Extensions is ARM_FEATURE_EL3.
+ */
set_feature(env, ARM_FEATURE_ARM_DIV);
set_feature(env, ARM_FEATURE_LPAE);
+ set_feature(env, ARM_FEATURE_V7);
}
if (arm_feature(env, ARM_FEATURE_V7)) {
set_feature(env, ARM_FEATURE_VAPA);
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
ARMCPU *cpu = ARM_CPU(obj);

cpu->dtb_compatible = "arm,cortex-a7";
- set_feature(&cpu->env, ARM_FEATURE_V7);
+ set_feature(&cpu->env, ARM_FEATURE_V7VE);
set_feature(&cpu->env, ARM_FEATURE_VFP4);
set_feature(&cpu->env, ARM_FEATURE_NEON);
set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
- set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
- set_feature(&cpu->env, ARM_FEATURE_LPAE);
set_feature(&cpu->env, ARM_FEATURE_EL3);
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7;
cpu->midr = 0x410fc075;
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
ARMCPU *cpu = ARM_CPU(obj);

cpu->dtb_compatible = "arm,cortex-a15";
- set_feature(&cpu->env, ARM_FEATURE_V7);
+ set_feature(&cpu->env, ARM_FEATURE_V7VE);
set_feature(&cpu->env, ARM_FEATURE_VFP4);
set_feature(&cpu->env, ARM_FEATURE_NEON);
set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
- set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
- set_feature(&cpu->env, ARM_FEATURE_LPAE);
set_feature(&cpu->env, ARM_FEATURE_EL3);
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A15;
cpu->midr = 0x412fc0f1;
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
/* Now we've retrieved all the register information we can
* set the feature bits based on the ID register fields.
* We can assume any KVM supporting CPU is at least a v7
- * with VFPv3, LPAE and the generic timers; this in turn implies
- * most of the other feature bits, but a few must be tested.
+ * with VFPv3, virtualization extensions, and the generic
+ * timers; this in turn implies most of the other feature
+ * bits, but a few must be tested.
*/
- set_feature(&features, ARM_FEATURE_V7);
+ set_feature(&features, ARM_FEATURE_V7VE);
set_feature(&features, ARM_FEATURE_VFP3);
- set_feature(&features, ARM_FEATURE_LPAE);
set_feature(&features, ARM_FEATURE_GENERIC_TIMER);

switch (extract32(id_isar0, 24, 4)) {
--
2.17.1
Deleted patch
From: Aaron Lindsay <alindsay@codeaurora.org>

KVM implies V7VE, which implies ARM_DIV and THUMB_DIV. The conditional
detection here is therefore unnecessary. Because V7VE is already
unconditionally specified for all KVM hosts, ARM_DIV and THUMB_DIV are
already indirectly specified and do not need to be included here at all.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Message-id: 1529699547-17044-6-git-send-email-alindsay@codeaurora.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/kvm32.c | 19 +------------------
1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
* and then query that CPU for the relevant ID registers.
*/
int i, ret, fdarray[3];
- uint32_t midr, id_pfr0, id_isar0, mvfr1;
+ uint32_t midr, id_pfr0, mvfr1;
uint64_t features = 0;
/* Old kernels may not know about the PREFERRED_TARGET ioctl: however
* we know these will only support creating one kind of guest CPU,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
| ENCODE_CP_REG(15, 0, 0, 0, 1, 0, 0),
.addr = (uintptr_t)&id_pfr0,
},
- {
- .id = KVM_REG_ARM | KVM_REG_SIZE_U32
- | ENCODE_CP_REG(15, 0, 0, 0, 2, 0, 0),
- .addr = (uintptr_t)&id_isar0,
- },
{
.id = KVM_REG_ARM | KVM_REG_SIZE_U32
| KVM_REG_ARM_VFP | KVM_REG_ARM_VFP_MVFR1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
set_feature(&features, ARM_FEATURE_VFP3);
set_feature(&features, ARM_FEATURE_GENERIC_TIMER);

- switch (extract32(id_isar0, 24, 4)) {
- case 1:
- set_feature(&features, ARM_FEATURE_THUMB_DIV);
- break;
- case 2:
- set_feature(&features, ARM_FEATURE_ARM_DIV);
- set_feature(&features, ARM_FEATURE_THUMB_DIV);
- break;
- default:
- break;
- }
-
if (extract32(id_pfr0, 12, 4) == 1) {
set_feature(&features, ARM_FEATURE_THUMB2EE);
}
--
2.17.1
Deleted patch
From: Aaron Lindsay <alindsay@codeaurora.org>

This makes it match its AArch64 equivalent, PMINTENSET_EL1

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Message-id: 1529699547-17044-13-git-send-email-alindsay@codeaurora.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
.writefn = pmuserenr_write, .raw_writefn = raw_write },
{ .name = "PMINTENSET", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 1,
.access = PL1_RW, .accessfn = access_tpm,
- .type = ARM_CP_ALIAS,
+ .type = ARM_CP_ALIAS | ARM_CP_IO,
.fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pminten),
.resetvalue = 0,
.writefn = pmintenset_write, .raw_writefn = raw_write },
--
2.17.1
Deleted patch
We don't actually implement SD command CRC checking, because
for almost all of our SD controllers the CRC generation is
done in hardware, and so modelling CRC generation and checking
would be a bit pointless. (The exception is that milkymist-memcard
makes the guest software compute the CRC.)

As a result almost all of our SD controller models don't bother
to set the SDRequest crc field, and the SD card model doesn't
check it. So the tracing of it in sdbus_do_command() provokes
Coverity warnings about use of uninitialized data.

Drop the CRC field from the trace; we can always add it back
if and when we do anything useful with the CRC.

Fixes Coverity issues 1386072, 1386074, 1386076, 1390571.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180626180324.5537-1-peter.maydell@linaro.org
---
hw/sd/core.c | 2 +-
hw/sd/trace-events | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/sd/core.c b/hw/sd/core.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/core.c
+++ b/hw/sd/core.c
@@ -XXX,XX +XXX,XX @@ int sdbus_do_command(SDBus *sdbus, SDRequest *req, uint8_t *response)
{
SDState *card = get_card(sdbus);

- trace_sdbus_command(sdbus_name(sdbus), req->cmd, req->arg, req->crc);
+ trace_sdbus_command(sdbus_name(sdbus), req->cmd, req->arg);
if (card) {
SDCardClass *sc = SD_CARD_GET_CLASS(card);

diff --git a/hw/sd/trace-events b/hw/sd/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/trace-events
+++ b/hw/sd/trace-events
@@ -XXX,XX +XXX,XX @@ bcm2835_sdhost_edm_change(const char *why, uint32_t edm) "(%s) EDM now 0x%x"
bcm2835_sdhost_update_irq(uint32_t irq) "IRQ bits 0x%x\n"

# hw/sd/core.c
-sdbus_command(const char *bus_name, uint8_t cmd, uint32_t arg, uint8_t crc) "@%s CMD%02d arg 0x%08x crc 0x%02x"
+sdbus_command(const char *bus_name, uint8_t cmd, uint32_t arg) "@%s CMD%02d arg 0x%08x"
sdbus_read(const char *bus_name, uint8_t value) "@%s value 0x%02x"
sdbus_write(const char *bus_name, uint8_t value) "@%s value 0x%02x"
sdbus_set_voltage(const char *bus_name, uint16_t millivolts) "@%s %u (mV)"
--
2.17.1
Deleted patch
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

The load/store API will ease further code movement.

Per the Physical Layer Simplified Spec. "3.6 Bus Protocol":

"In the CMD line the Most Significant Bit (MSB) is transmitted
first, the Least Significant Bit (LSB) is the last."

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/sd/bcm2835_sdhost.c | 13 +++++--------
hw/sd/milkymist-memcard.c | 3 +--
hw/sd/omap_mmc.c | 6 ++----
hw/sd/pl181.c | 11 ++++-------
hw/sd/sdhci.c | 15 +++++----------
hw/sd/ssi-sd.c | 6 ++----
6 files changed, 19 insertions(+), 35 deletions(-)

diff --git a/hw/sd/bcm2835_sdhost.c b/hw/sd/bcm2835_sdhost.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/bcm2835_sdhost.c
+++ b/hw/sd/bcm2835_sdhost.c
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_send_command(BCM2835SDHostState *s)
goto error;
}
if (!(s->cmd & SDCMD_NO_RESPONSE)) {
-#define RWORD(n) (((uint32_t)rsp[n] << 24) | (rsp[n + 1] << 16) \
- | (rsp[n + 2] << 8) | rsp[n + 3])
if (rlen == 0 || (rlen == 4 && (s->cmd & SDCMD_LONG_RESPONSE))) {
goto error;
}
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_send_command(BCM2835SDHostState *s)
goto error;
}
if (rlen == 4) {
- s->rsp[0] = RWORD(0);
+ s->rsp[0] = ldl_be_p(&rsp[0]);
s->rsp[1] = s->rsp[2] = s->rsp[3] = 0;
} else {
- s->rsp[0] = RWORD(12);
- s->rsp[1] = RWORD(8);
- s->rsp[2] = RWORD(4);
- s->rsp[3] = RWORD(0);
+ s->rsp[0] = ldl_be_p(&rsp[12]);
+ s->rsp[1] = ldl_be_p(&rsp[8]);
+ s->rsp[2] = ldl_be_p(&rsp[4]);
+ s->rsp[3] = ldl_be_p(&rsp[0]);
}
-#undef RWORD
}
/* We never really delay commands, so if this was a 'busywait' command
* then we've completed it now and can raise the interrupt.
diff --git a/hw/sd/milkymist-memcard.c b/hw/sd/milkymist-memcard.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/milkymist-memcard.c
+++ b/hw/sd/milkymist-memcard.c
@@ -XXX,XX +XXX,XX @@ static void memcard_sd_command(MilkymistMemcardState *s)
SDRequest req;

req.cmd = s->command[0] & 0x3f;
- req.arg = (s->command[1] << 24) | (s->command[2] << 16)
- | (s->command[3] << 8) | s->command[4];
+ req.arg = ldl_be_p(s->command + 1);
req.crc = s->command[5];

s->response[0] = req.cmd;
diff --git a/hw/sd/omap_mmc.c b/hw/sd/omap_mmc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/omap_mmc.c
+++ b/hw/sd/omap_mmc.c
@@ -XXX,XX +XXX,XX @@ static void omap_mmc_command(struct omap_mmc_s *host, int cmd, int dir,
CID_CSD_OVERWRITE;
if (host->sdio & (1 << 13))
mask |= AKE_SEQ_ERROR;
- rspstatus = (response[0] << 24) | (response[1] << 16) |
- (response[2] << 8) | (response[3] << 0);
+ rspstatus = ldl_be_p(response);
break;

case sd_r2:
@@ -XXX,XX +XXX,XX @@ static void omap_mmc_command(struct omap_mmc_s *host, int cmd, int dir,
}
rsplen = 4;

- rspstatus = (response[0] << 24) | (response[1] << 16) |
- (response[2] << 8) | (response[3] << 0);
+ rspstatus = ldl_be_p(response);
if (rspstatus & 0x80000000)
host->status &= 0xe000;
else
diff --git a/hw/sd/pl181.c b/hw/sd/pl181.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/pl181.c
+++ b/hw/sd/pl181.c
@@ -XXX,XX +XXX,XX @@ static void pl181_send_command(PL181State *s)
if (rlen < 0)
goto error;
if (s->cmd & PL181_CMD_RESPONSE) {
-#define RWORD(n) (((uint32_t)response[n] << 24) | (response[n + 1] << 16) \
- | (response[n + 2] << 8) | response[n + 3])
if (rlen == 0 || (rlen == 4 && (s->cmd & PL181_CMD_LONGRESP)))
goto error;
if (rlen != 4 && rlen != 16)
goto error;
- s->response[0] = RWORD(0);
+ s->response[0] = ldl_be_p(&response[0]);
if (rlen == 4) {
s->response[1] = s->response[2] = s->response[3] = 0;
} else {
- s->response[1] = RWORD(4);
- s->response[2] = RWORD(8);
- s->response[3] = RWORD(12) & ~1;
+ s->response[1] = ldl_be_p(&response[4]);
+ s->response[2] = ldl_be_p(&response[8]);
+ s->response[3] = ldl_be_p(&response[12]) & ~1;
}
DPRINTF("Response received\n");
s->status |= PL181_STATUS_CMDRESPEND;
-#undef RWORD
} else {
DPRINTF("Command sent\n");
s->status |= PL181_STATUS_CMDSENT;
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -XXX,XX +XXX,XX @@ static void sdhci_send_command(SDHCIState *s)

if (s->cmdreg & SDHC_CMD_RESPONSE) {
if (rlen == 4) {
- s->rspreg[0] = (response[0] << 24) | (response[1] << 16) |
- (response[2] << 8) | response[3];
+ s->rspreg[0] = ldl_be_p(response);
s->rspreg[1] = s->rspreg[2] = s->rspreg[3] = 0;
trace_sdhci_response4(s->rspreg[0]);
} else if (rlen == 16) {
- s->rspreg[0] = (response[11] << 24) | (response[12] << 16) |
- (response[13] << 8) | response[14];
- s->rspreg[1] = (response[7] << 24) | (response[8] << 16) |
- (response[9] << 8) | response[10];
- s->rspreg[2] = (response[3] << 24) | (response[4] << 16) |
- (response[5] << 8) | response[6];
+ s->rspreg[0] = ldl_be_p(&response[11]);
+ s->rspreg[1] = ldl_be_p(&response[7]);
+ s->rspreg[2] = ldl_be_p(&response[3]);
s->rspreg[3] = (response[0] << 16) | (response[1] << 8) |
response[2];
trace_sdhci_response16(s->rspreg[3], s->rspreg[2],
@@ -XXX,XX +XXX,XX @@ static void sdhci_end_transfer(SDHCIState *s)
trace_sdhci_end_transfer(request.cmd, request.arg);
sdbus_do_command(&s->sdbus, &request, response);
/* Auto CMD12 response goes to the upper Response register */
- s->rspreg[3] = (response[0] << 24) | (response[1] << 16) |
- (response[2] << 8) | response[3];
+ s->rspreg[3] = ldl_be_p(response);
}

s->prnsts &= ~(SDHC_DOING_READ | SDHC_DOING_WRITE |
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/ssi-sd.c
+++ b/hw/sd/ssi-sd.c
@@ -XXX,XX +XXX,XX @@ static uint32_t ssi_sd_transfer(SSISlave *dev, uint32_t val)
uint8_t longresp[16];
/* FIXME: Check CRC. */
request.cmd = s->cmd;
- request.arg = (s->cmdarg[0] << 24) | (s->cmdarg[1] << 16)
- | (s->cmdarg[2] << 8) | s->cmdarg[3];
+ request.arg = ldl_be_p(s->cmdarg);
DPRINTF("CMD%d arg 0x%08x\n", s->cmd, request.arg);
s->arglen = sdbus_do_command(&s->sdbus, &request, longresp);
if (s->arglen <= 0) {
@@ -XXX,XX +XXX,XX @@ static uint32_t ssi_sd_transfer(SSISlave *dev, uint32_t val)
/* CMD13 returns a 2-byte statuse work. Other commands
only return the first byte. */
s->arglen = (s->cmd == 13) ? 2 : 1;
- cardstatus = (longresp[0] << 24) | (longresp[1] << 16)
- | (longresp[2] << 8) | longresp[3];
+ cardstatus = ldl_be_p(longresp);
status = 0;
if (((cardstatus >> 9) & 0xf) < 4)
status |= SSI_SDR_IDLE;
--
2.17.1
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

We already check for the same condition within the normal integer
sdiv and sdiv64 helpers. Use a slightly different formulation that
does not require deducing the expression type.

Fixes: f97cfd596ed
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180629001538.11415-2-richard.henderson@linaro.org
[PMM: reworded a comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/sve_helper.c | 20 +++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
#define DO_MIN(N, M) ((N) >= (M) ? (M) : (N))
#define DO_ABD(N, M) ((N) >= (M) ? (N) - (M) : (M) - (N))
#define DO_MUL(N, M) (N * M)
-#define DO_DIV(N, M) (M ? N / M : 0)
+
+
+/*
+ * We must avoid the C undefined behaviour cases: division by
+ * zero and signed division of INT_MIN by -1. Both of these
+ * have architecturally defined required results for Arm.
+ * We special case all signed divisions by -1 to avoid having
+ * to deduce the minimum integer for the type involved.
+ */
+#define DO_SDIV(N, M) (unlikely(M == 0) ? 0 : unlikely(M == -1) ? -N : N / M)
+#define DO_UDIV(N, M) (unlikely(M == 0) ? 0 : N / M)

DO_ZPZZ(sve_and_zpzz_b, uint8_t, H1, DO_AND)
DO_ZPZZ(sve_and_zpzz_h, uint16_t, H1_2, DO_AND)
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ(sve_umulh_zpzz_h, uint16_t, H1_2, do_mulh_h)
DO_ZPZZ(sve_umulh_zpzz_s, uint32_t, H1_4, do_mulh_s)
DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d)

-DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV)
-DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
+DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_SDIV)
+DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_SDIV)

-DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
-DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
+DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_UDIV)
+DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_UDIV)

/* Note that all bits of the shift are significant
and not modulo the element size. */
--
2.17.1