Hopefully last target-arm queue before softfreeze;
this one's largest part is the remainder of the SVE patches,
but there is a selection of other minor things too.

thanks
-- PMM

The following changes since commit 109b25045b3651f9c5d02c3766c0b3ff63e6d193:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2018-06-29 12:30:29 +0100)

are available in the Git repository at:

  git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180629

for you to fetch changes up to 802abf4024d23e48d45373ac3f2b580124b54b47:

  target/arm: Add ID_ISAR6 (2018-06-29 15:30:54 +0100)

----------------------------------------------------------------
target-arm queue:
 * last of the SVE patches; SVE is now enabled for aarch64 linux-user
 * sd: Don't trace SDRequest crc field (coverity bugfix)
 * target/arm: Mark PMINTENSET accesses as possibly doing IO
 * clean up v7VE feature bit handling
 * i.mx7d: minor cleanups
 * target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space
 * target/arm: Implement ARMv8.2-DotProd
 * virt: add addresses to dt node names (which stops dtc from
   complaining that they're not correctly named)
 * cleanups: replace error_setg(&error_fatal) by error_report() + exit()

----------------------------------------------------------------
Aaron Lindsay (3):
      target/arm: Add ARM_FEATURE_V7VE for v7 Virtualization Extensions
      target/arm: Remove redundant DIV detection for KVM
      target/arm: Mark PMINTENSET accesses as possibly doing IO

Alex Bennée (1):
      target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space

Eric Auger (3):
      device_tree: Add qemu_fdt_node_unit_path
      hw/arm/virt: Silence dtc /intc warnings
      hw/arm/virt: Silence dtc /memory warning

Jean-Christophe Dubois (3):
      i.mx7d: Remove unused header files
      i.mx7d: Change SRC unimplemented device name from sdma to src
      i.mx7d: Change IRQ number type from hwaddr to int

Peter Maydell (1):
      sd: Don't trace SDRequest crc field

Philippe Mathieu-Daudé (4):
      hw/block/fdc: Replace error_setg(&error_abort) by assert()
      hw/arm/sysbus-fdt: Replace error_setg(&error_fatal) by error_report() + exit()
      device_tree: Replace error_setg(&error_fatal) by error_report() + exit()
      sdcard: Use the ldst API

Richard Henderson (40):
      target/arm: Implement SVE Memory Contiguous Load Group
      target/arm: Implement SVE Contiguous Load, first-fault and no-fault
      target/arm: Implement SVE Memory Contiguous Store Group
      target/arm: Implement SVE load and broadcast quadword
      target/arm: Implement SVE integer convert to floating-point
      target/arm: Implement SVE floating-point arithmetic (predicated)
      target/arm: Implement SVE FP Multiply-Add Group
      target/arm: Implement SVE Floating Point Accumulating Reduction Group
      target/arm: Implement SVE load and broadcast element
      target/arm: Implement SVE store vector/predicate register
      target/arm: Implement SVE scatter stores
      target/arm: Implement SVE prefetches
      target/arm: Implement SVE gather loads
      target/arm: Implement SVE first-fault gather loads
      target/arm: Implement SVE scatter store vector immediate
      target/arm: Implement SVE floating-point compare vectors
      target/arm: Implement SVE floating-point arithmetic with immediate
      target/arm: Implement SVE Floating Point Multiply Indexed Group
      target/arm: Implement SVE FP Fast Reduction Group
      target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group
      target/arm: Implement SVE FP Compare with Zero Group
      target/arm: Implement SVE floating-point trig multiply-add coefficient
      target/arm: Implement SVE floating-point convert precision
      target/arm: Implement SVE floating-point convert to integer
      target/arm: Implement SVE floating-point round to integral value
      target/arm: Implement SVE floating-point unary operations
      target/arm: Implement SVE MOVPRFX
      target/arm: Implement SVE floating-point complex add
      target/arm: Implement SVE fp complex multiply add
      target/arm: Pass index to AdvSIMD FCMLA (indexed)
      target/arm: Implement SVE fp complex multiply add (indexed)
      target/arm: Implement SVE dot product (vectors)
      target/arm: Implement SVE dot product (indexed)
      target/arm: Enable SVE for aarch64-linux-user
      target/arm: Implement ARMv8.2-DotProd
      target/arm: Fix SVE signed division vs x86 overflow exception
      target/arm: Fix SVE system register access checks
      target/arm: Prune a57 features from max
      target/arm: Prune a15 features from max
      target/arm: Add ID_ISAR6

 include/sysemu/device_tree.h | 16 +
 target/arm/cpu.h | 3 +
 target/arm/helper-sve.h | 682 +++++++++++++++
 target/arm/helper.h | 44 +-
 device_tree.c | 78 +-
 hw/arm/boot.c | 41 +-
 hw/arm/fsl-imx7.c | 8 +-
 hw/arm/mcimx7d-sabre.c | 2 -
 hw/arm/sysbus-fdt.c | 53 +-
 hw/arm/virt.c | 70 +-
 hw/block/fdc.c | 9 +-
 hw/sd/bcm2835_sdhost.c | 13 +-
 hw/sd/core.c | 2 +-
 hw/sd/milkymist-memcard.c | 3 +-
 hw/sd/omap_mmc.c | 6 +-
 hw/sd/pl181.c | 11 +-
 hw/sd/sdhci.c | 15 +-
 hw/sd/ssi-sd.c | 6 +-
 linux-user/elfload.c | 2 +
 target/arm/cpu.c | 36 +-
 target/arm/cpu64.c | 13 +-
 target/arm/helper.c | 44 +-
 target/arm/kvm32.c | 27 +-
 target/arm/sve_helper.c | 1875 +++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-a64.c | 62 +-
 target/arm/translate-sve.c | 1688 ++++++++++++++++++++++++++++++++++++-
 target/arm/translate.c | 102 ++-
 target/arm/vec_helper.c | 311 ++++++-
 hw/sd/trace-events | 2 +-
 target/arm/sve.decode | 427 ++++++++++
 30 files changed, 5394 insertions(+), 257 deletions(-)

First arm pullreq of the cycle; this is mostly my softfloat NaN
handling series. (Lots more in my to-review queue, but I don't
like pullreqs growing too close to a hundred patches at a time :-))

thanks
-- PMM

The following changes since commit 97f2796a3736ed37a1b85dc1c76a6c45b829dd17:

  Open 10.0 development tree (2024-12-10 17:41:17 +0000)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20241211

for you to fetch changes up to 1abe28d519239eea5cf9620bb13149423e5665f8:

  MAINTAINERS: Add correct email address for Vikram Garhwal (2024-12-11 15:31:09 +0000)

----------------------------------------------------------------
target-arm queue:
 * hw/net/lan9118: Extract PHY model, reuse with imx_fec, fix bugs
 * fpu: Make muladd NaN handling runtime-selected, not compile-time
 * fpu: Make default NaN pattern runtime-selected, not compile-time
 * fpu: Minor NaN-related cleanups
 * MAINTAINERS: email address updates

----------------------------------------------------------------
Bernhard Beschow (5):
      hw/net/lan9118: Extract lan9118_phy
      hw/net/lan9118_phy: Reuse in imx_fec and consolidate implementations
      hw/net/lan9118_phy: Fix off-by-one error in MII_ANLPAR register
      hw/net/lan9118_phy: Reuse MII constants
      hw/net/lan9118_phy: Add missing 100 mbps full duplex advertisement

Leif Lindholm (1):
      MAINTAINERS: update email address for Leif Lindholm

Peter Maydell (54):
      fpu: handle raising Invalid for infzero in pick_nan_muladd
      fpu: Check for default_nan_mode before calling pickNaNMulAdd
      softfloat: Allow runtime choice of inf * 0 + NaN result
      tests/fp: Explicitly set inf-zero-nan rule
      target/arm: Set FloatInfZeroNaNRule explicitly
      target/s390: Set FloatInfZeroNaNRule explicitly
      target/ppc: Set FloatInfZeroNaNRule explicitly
      target/mips: Set FloatInfZeroNaNRule explicitly
      target/sparc: Set FloatInfZeroNaNRule explicitly
      target/xtensa: Set FloatInfZeroNaNRule explicitly
      target/x86: Set FloatInfZeroNaNRule explicitly
      target/loongarch: Set FloatInfZeroNaNRule explicitly
      target/hppa: Set FloatInfZeroNaNRule explicitly
      softfloat: Pass have_snan to pickNaNMulAdd
      softfloat: Allow runtime choice of NaN propagation for muladd
      tests/fp: Explicitly set 3-NaN propagation rule
      target/arm: Set Float3NaNPropRule explicitly
      target/loongarch: Set Float3NaNPropRule explicitly
      target/ppc: Set Float3NaNPropRule explicitly
      target/s390x: Set Float3NaNPropRule explicitly
      target/sparc: Set Float3NaNPropRule explicitly
      target/mips: Set Float3NaNPropRule explicitly
      target/xtensa: Set Float3NaNPropRule explicitly
      target/i386: Set Float3NaNPropRule explicitly
      target/hppa: Set Float3NaNPropRule explicitly
      fpu: Remove use_first_nan field from float_status
      target/m68k: Don't pass NULL float_status to floatx80_default_nan()
      softfloat: Create floatx80 default NaN from parts64_default_nan
      target/loongarch: Use normal float_status in fclass_s and fclass_d helpers
      target/m68k: In frem helper, initialize local float_status from env->fp_status
      target/m68k: Init local float_status from env fp_status in gdb get/set reg
      target/sparc: Initialize local scratch float_status from env->fp_status
      target/ppc: Use env->fp_status in helper_compute_fprf functions
      fpu: Allow runtime choice of default NaN value
      tests/fp: Set default NaN pattern explicitly
      target/microblaze: Set default NaN pattern explicitly
      target/i386: Set default NaN pattern explicitly
      target/hppa: Set default NaN pattern explicitly
      target/alpha: Set default NaN pattern explicitly
      target/arm: Set default NaN pattern explicitly
      target/loongarch: Set default NaN pattern explicitly
      target/m68k: Set default NaN pattern explicitly
      target/mips: Set default NaN pattern explicitly
      target/openrisc: Set default NaN pattern explicitly
      target/ppc: Set default NaN pattern explicitly
      target/sh4: Set default NaN pattern explicitly
      target/rx: Set default NaN pattern explicitly
      target/s390x: Set default NaN pattern explicitly
      target/sparc: Set default NaN pattern explicitly
      target/xtensa: Set default NaN pattern explicitly
      target/hexagon: Set default NaN pattern explicitly
      target/riscv: Set default NaN pattern explicitly
      target/tricore: Set default NaN pattern explicitly
      fpu: Remove default handling for dnan_pattern

Richard Henderson (11):
      target/arm: Copy entire float_status in is_ebf
      softfloat: Inline pickNaNMulAdd
      softfloat: Use goto for default nan case in pick_nan_muladd
      softfloat: Remove which from parts_pick_nan_muladd
      softfloat: Pad array size in pick_nan_muladd
      softfloat: Move propagateFloatx80NaN to softfloat.c
      softfloat: Use parts_pick_nan in propagateFloatx80NaN
      softfloat: Inline pickNaN
      softfloat: Share code between parts_pick_nan cases
      softfloat: Sink frac_cmp in parts_pick_nan until needed
      softfloat: Replace WHICH with RET in parts_pick_nan

Vikram Garhwal (1):
      MAINTAINERS: Add correct email address for Vikram Garhwal

 MAINTAINERS | 4 +-
 include/fpu/softfloat-helpers.h | 38 +++-
 include/fpu/softfloat-types.h | 89 +++++++-
 include/hw/net/imx_fec.h | 9 +-
 include/hw/net/lan9118_phy.h | 37 ++++
 include/hw/net/mii.h | 6 +
 target/mips/fpu_helper.h | 20 ++
 target/sparc/helper.h | 4 +-
 fpu/softfloat.c | 19 ++
 hw/net/imx_fec.c | 146 ++------------
 hw/net/lan9118.c | 137 ++-----------
 hw/net/lan9118_phy.c | 222 ++++++++++++++++++++
 linux-user/arm/nwfpe/fpa11.c | 5 +
 target/alpha/cpu.c | 2 +
 target/arm/cpu.c | 10 +
 target/arm/tcg/vec_helper.c | 20 +-
 target/hexagon/cpu.c | 2 +
 target/hppa/fpu_helper.c | 12 ++
 target/i386/tcg/fpu_helper.c | 12 ++
 target/loongarch/tcg/fpu_helper.c | 14 +-
 target/m68k/cpu.c | 14 +-
 target/m68k/fpu_helper.c | 6 +-
 target/m68k/helper.c | 6 +-
 target/microblaze/cpu.c | 2 +
 target/mips/msa.c | 10 +
 target/openrisc/cpu.c | 2 +
 target/ppc/cpu_init.c | 19 ++
 target/ppc/fpu_helper.c | 3 +-
 target/riscv/cpu.c | 2 +
 target/rx/cpu.c | 2 +
 target/s390x/cpu.c | 5 +
 target/sh4/cpu.c | 2 +
 target/sparc/cpu.c | 6 +
 target/sparc/fop_helper.c | 8 +-
 target/sparc/translate.c | 4 +-
 target/tricore/helper.c | 2 +
 target/xtensa/cpu.c | 4 +
 target/xtensa/fpu_helper.c | 3 +-
 tests/fp/fp-bench.c | 7 +
 tests/fp/fp-test-log2.c | 1 +
 tests/fp/fp-test.c | 7 +
 fpu/softfloat-parts.c.inc | 152 +++++++++++---
 fpu/softfloat-specialize.c.inc | 412 ++------------------------------------
 .mailmap | 5 +-
 hw/net/Kconfig | 5 +
 hw/net/meson.build | 1 +
 hw/net/trace-events | 10 +-
 47 files changed, 778 insertions(+), 730 deletions(-)
 create mode 100644 include/hw/net/lan9118_phy.h
 create mode 100644 hw/net/lan9118_phy.c
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Bernhard Beschow <shentey@gmail.com>
2
2
3
A very similar implementation of the same device exists in imx_fec. Prepare for
4
a common implementation by extracting a device model into its own files.
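
For illustration only, one way to picture the split: the PHY model owns the
interrupt source and mask registers and drives a single output line, while
the MAC only reacts to the level of that line. The sketch below models this
in plain C. ToyPhy and the toy_phy_* helpers are hypothetical stand-ins (not
QEMU APIs), and irq_level stands in for the qemu_irq output of the real
device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PHY state; only the interrupt fields. */
typedef struct ToyPhy {
    uint16_t ints;      /* latched interrupt sources (PHY register 29) */
    uint16_t int_mask;  /* interrupt mask (PHY register 30) */
    bool irq_level;     /* stands in for the qemu_irq output line */
} ToyPhy;

/* The line is high while any unmasked source is pending. */
static void toy_phy_update_irq(ToyPhy *s)
{
    s->irq_level = (s->ints & s->int_mask) != 0;
}

/* Latch one or more interrupt sources, e.g. on a link change. */
static void toy_phy_raise(ToyPhy *s, uint16_t bits)
{
    assert(s != NULL);
    s->ints |= bits;
    toy_phy_update_irq(s);
}

/* Register 30 write: only the low 8 bits are kept as the mask. */
static void toy_phy_write_mask(ToyPhy *s, uint16_t val)
{
    assert(s != NULL);
    s->int_mask = val & 0xff;
    toy_phy_update_irq(s);
}

/* Register 29 read: return the pending sources and clear them. */
static uint16_t toy_phy_read_ints(ToyPhy *s)
{
    uint16_t val;

    assert(s != NULL);
    val = s->ints;
    s->ints = 0;
    toy_phy_update_irq(s);
    return val;
}
```

In this model a source raised while the mask is clear leaves the line low;
writing a matching mask raises it, and reading the source register lowers it
again. The MAC side then needs nothing more than a callback on the line level.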
5
6
Some migration state has been moved into the new device model which breaks
7
migration compatibility for the following machines:
8
* smdkc210
9
* realview-*
10
* vexpress-*
11
* kzm
12
* mps2-*
13
14
While breaking migration ABI, fix the size of the MII registers to be 16 bit,
15
as defined by IEEE 802.3u.
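
A small sketch of what the 16-bit width means in practice, under the
assumption that the MAC keeps its MII data in a 32-bit register: only the low
16 bits reach the PHY, and the register index is a 5-bit field. The helper
names are hypothetical; the shift and mask mirror the `(val >> 6) & 0x1f`
expression used in lan9118.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the 5-bit PHY register index from a
 * MAC MII access word, as `(val >> 6) & 0x1f` does in lan9118.c. */
static int toy_mii_reg_index(uint32_t mii_acc)
{
    int idx = (int)((mii_acc >> 6) & 0x1f);

    assert(idx >= 0 && idx <= 31);
    return idx;
}

/* Hypothetical helper: narrow the MAC's 32-bit MII data register to the
 * 16-bit value an MII register can hold; the upper half is dropped. */
static uint16_t toy_mii_narrow(uint32_t mac_mii_data)
{
    return (uint16_t)(mac_mii_data & 0xffff);
}
```

Passing a uint32_t to a function taking uint16_t performs the same narrowing
implicitly; spelling it out here just makes the dropped bits visible.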
16
17
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
18
Tested-by: Guenter Roeck <linux@roeck-us.net>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
19
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
20
Message-id: 20241102125724.532843-2-shentey@gmail.com
5
Message-id: 20180627043328.11531-10-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
22
---
8
target/arm/helper-sve.h | 5 +++
23
include/hw/net/lan9118_phy.h | 37 ++++++++
9
target/arm/sve_helper.c | 41 +++++++++++++++++++++++++
24
hw/net/lan9118.c | 137 +++++-----------------------
10
target/arm/translate-sve.c | 62 ++++++++++++++++++++++++++++++++++++++
25
hw/net/lan9118_phy.c | 169 +++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 5 +++
26
hw/net/Kconfig | 4 +
12
4 files changed, 113 insertions(+)
27
hw/net/meson.build | 1 +
28
5 files changed, 233 insertions(+), 115 deletions(-)
29
create mode 100644 include/hw/net/lan9118_phy.h
30
create mode 100644 hw/net/lan9118_phy.c
13
31
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
32
diff --git a/include/hw/net/lan9118_phy.h b/include/hw/net/lan9118_phy.h
33
new file mode 100644
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/include/hw/net/lan9118_phy.h
37
@@ -XXX,XX +XXX,XX @@
38
+/*
39
+ * SMSC LAN9118 PHY emulation
40
+ *
41
+ * Copyright (c) 2009 CodeSourcery, LLC.
42
+ * Written by Paul Brook
43
+ *
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
45
+ * See the COPYING file in the top-level directory.
46
+ */
47
+
48
+#ifndef HW_NET_LAN9118_PHY_H
49
+#define HW_NET_LAN9118_PHY_H
50
+
51
+#include "qom/object.h"
52
+#include "hw/sysbus.h"
53
+
54
+#define TYPE_LAN9118_PHY "lan9118-phy"
55
+OBJECT_DECLARE_SIMPLE_TYPE(Lan9118PhyState, LAN9118_PHY)
56
+
57
+typedef struct Lan9118PhyState {
58
+ SysBusDevice parent_obj;
59
+
60
+ uint16_t status;
61
+ uint16_t control;
62
+ uint16_t advertise;
63
+ uint16_t ints;
64
+ uint16_t int_mask;
65
+ qemu_irq irq;
66
+ bool link_down;
67
+} Lan9118PhyState;
68
+
69
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down);
70
+void lan9118_phy_reset(Lan9118PhyState *s);
71
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg);
72
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val);
73
+
74
+#endif
75
diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
15
index XXXXXXX..XXXXXXX 100644
76
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
77
--- a/hw/net/lan9118.c
17
+++ b/target/arm/helper-sve.h
78
+++ b/hw/net/lan9118.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
79
@@ -XXX,XX +XXX,XX @@
19
DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
80
#include "net/net.h"
20
DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
81
#include "net/eth.h"
21
82
#include "hw/irq.h"
22
+DEF_HELPER_FLAGS_4(sve_movz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
83
+#include "hw/net/lan9118_phy.h"
23
+DEF_HELPER_FLAGS_4(sve_movz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
84
#include "hw/net/lan9118.h"
24
+DEF_HELPER_FLAGS_4(sve_movz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
85
#include "hw/ptimer.h"
25
+DEF_HELPER_FLAGS_4(sve_movz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
86
#include "hw/qdev-properties.h"
26
+
87
@@ -XXX,XX +XXX,XX @@ do { printf("lan9118: " fmt , ## __VA_ARGS__); } while (0)
27
DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
88
#define MAC_CR_RXEN 0x00000004
28
DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
89
#define MAC_CR_RESERVED 0x7f404213
29
DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
90
30
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
91
-#define PHY_INT_ENERGYON 0x80
31
index XXXXXXX..XXXXXXX 100644
92
-#define PHY_INT_AUTONEG_COMPLETE 0x40
32
--- a/target/arm/sve_helper.c
93
-#define PHY_INT_FAULT 0x20
33
+++ b/target/arm/sve_helper.c
94
-#define PHY_INT_DOWN 0x10
34
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc)
95
-#define PHY_INT_AUTONEG_LP 0x08
96
-#define PHY_INT_PARFAULT 0x04
97
-#define PHY_INT_AUTONEG_PAGE 0x02
98
-
99
#define GPT_TIMER_EN 0x20000000
100
101
/*
102
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
103
uint32_t mac_mii_data;
104
uint32_t mac_flow;
105
106
- uint32_t phy_status;
107
- uint32_t phy_control;
108
- uint32_t phy_advertise;
109
- uint32_t phy_int;
110
- uint32_t phy_int_mask;
111
+ Lan9118PhyState mii;
112
+ IRQState mii_irq;
113
114
int32_t eeprom_writable;
115
uint8_t eeprom[128];
116
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
117
118
static const VMStateDescription vmstate_lan9118 = {
119
.name = "lan9118",
120
- .version_id = 2,
121
- .minimum_version_id = 1,
122
+ .version_id = 3,
123
+ .minimum_version_id = 3,
124
.fields = (const VMStateField[]) {
125
VMSTATE_PTIMER(timer, lan9118_state),
126
VMSTATE_UINT32(irq_cfg, lan9118_state),
127
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118 = {
128
VMSTATE_UINT32(mac_mii_acc, lan9118_state),
129
VMSTATE_UINT32(mac_mii_data, lan9118_state),
130
VMSTATE_UINT32(mac_flow, lan9118_state),
131
- VMSTATE_UINT32(phy_status, lan9118_state),
132
- VMSTATE_UINT32(phy_control, lan9118_state),
133
- VMSTATE_UINT32(phy_advertise, lan9118_state),
134
- VMSTATE_UINT32(phy_int, lan9118_state),
135
- VMSTATE_UINT32(phy_int_mask, lan9118_state),
136
VMSTATE_INT32(eeprom_writable, lan9118_state),
137
VMSTATE_UINT8_ARRAY(eeprom, lan9118_state, 128),
138
VMSTATE_INT32(tx_fifo_size, lan9118_state),
139
@@ -XXX,XX +XXX,XX @@ static void lan9118_reload_eeprom(lan9118_state *s)
140
lan9118_mac_changed(s);
141
}
142
143
-static void phy_update_irq(lan9118_state *s)
144
+static void lan9118_update_irq(void *opaque, int n, int level)
145
{
146
- if (s->phy_int & s->phy_int_mask) {
147
+ lan9118_state *s = opaque;
148
+
149
+ if (level) {
150
s->int_sts |= PHY_INT;
151
} else {
152
s->int_sts &= ~PHY_INT;
153
@@ -XXX,XX +XXX,XX @@ static void phy_update_irq(lan9118_state *s)
154
lan9118_update(s);
155
}
156
157
-static void phy_update_link(lan9118_state *s)
158
-{
159
- /* Autonegotiation status mirrors link status. */
160
- if (qemu_get_queue(s->nic)->link_down) {
161
- s->phy_status &= ~0x0024;
162
- s->phy_int |= PHY_INT_DOWN;
163
- } else {
164
- s->phy_status |= 0x0024;
165
- s->phy_int |= PHY_INT_ENERGYON;
166
- s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
167
- }
168
- phy_update_irq(s);
169
-}
170
-
171
static void lan9118_set_link(NetClientState *nc)
172
{
173
- phy_update_link(qemu_get_nic_opaque(nc));
174
-}
175
-
176
-static void phy_reset(lan9118_state *s)
177
-{
178
- s->phy_status = 0x7809;
179
- s->phy_control = 0x3000;
180
- s->phy_advertise = 0x01e1;
181
- s->phy_int_mask = 0;
182
- s->phy_int = 0;
183
- phy_update_link(s);
184
+ lan9118_phy_update_link(&LAN9118(qemu_get_nic_opaque(nc))->mii,
185
+ nc->link_down);
186
}
187
188
static void lan9118_reset(DeviceState *d)
189
@@ -XXX,XX +XXX,XX @@ static void lan9118_reset(DeviceState *d)
190
s->read_word_n = 0;
191
s->write_word_n = 0;
192
193
- phy_reset(s);
194
-
195
s->eeprom_writable = 0;
196
lan9118_reload_eeprom(s);
197
}
198
@@ -XXX,XX +XXX,XX @@ static void do_tx_packet(lan9118_state *s)
199
uint32_t status;
200
201
/* FIXME: Honor TX disable, and allow queueing of packets. */
202
- if (s->phy_control & 0x4000) {
203
+ if (s->mii.control & 0x4000) {
204
/* This assumes the receive routine doesn't touch the VLANClient. */
205
qemu_receive_packet(qemu_get_queue(s->nic), s->txp->data, s->txp->len);
206
} else {
207
@@ -XXX,XX +XXX,XX @@ static void tx_fifo_push(lan9118_state *s, uint32_t val)
35
}
208
}
36
}
209
}
37
210
38
+/* Copy Zn into Zd, and store zero into inactive elements. */
211
-static uint32_t do_phy_read(lan9118_state *s, int reg)
39
+void HELPER(sve_movz_b)(void *vd, void *vn, void *vg, uint32_t desc)
212
-{
40
+{
213
- uint32_t val;
41
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
214
-
42
+ uint64_t *d = vd, *n = vn;
215
- switch (reg) {
43
+ uint8_t *pg = vg;
216
- case 0: /* Basic Control */
44
+ for (i = 0; i < opr_sz; i += 1) {
217
- return s->phy_control;
45
+ d[i] = n[i] & expand_pred_b(pg[H1(i)]);
218
- case 1: /* Basic Status */
219
- return s->phy_status;
220
- case 2: /* ID1 */
221
- return 0x0007;
222
- case 3: /* ID2 */
223
- return 0xc0d1;
224
- case 4: /* Auto-neg advertisement */
225
- return s->phy_advertise;
226
- case 5: /* Auto-neg Link Partner Ability */
227
- return 0x0f71;
228
- case 6: /* Auto-neg Expansion */
229
- return 1;
230
- /* TODO 17, 18, 27, 29, 30, 31 */
231
- case 29: /* Interrupt source. */
232
- val = s->phy_int;
233
- s->phy_int = 0;
234
- phy_update_irq(s);
235
- return val;
236
- case 30: /* Interrupt mask */
237
- return s->phy_int_mask;
238
- default:
239
- qemu_log_mask(LOG_GUEST_ERROR,
240
- "do_phy_read: PHY read reg %d\n", reg);
241
- return 0;
242
- }
243
-}
244
-
245
-static void do_phy_write(lan9118_state *s, int reg, uint32_t val)
246
-{
247
- switch (reg) {
248
- case 0: /* Basic Control */
249
- if (val & 0x8000) {
250
- phy_reset(s);
251
- break;
252
- }
253
- s->phy_control = val & 0x7980;
254
- /* Complete autonegotiation immediately. */
255
- if (val & 0x1000) {
256
- s->phy_status |= 0x0020;
257
- }
258
- break;
259
- case 4: /* Auto-neg advertisement */
260
- s->phy_advertise = (val & 0x2d7f) | 0x80;
261
- break;
262
- /* TODO 17, 18, 27, 31 */
263
- case 30: /* Interrupt mask */
264
- s->phy_int_mask = val & 0xff;
265
- phy_update_irq(s);
266
- break;
267
- default:
268
- qemu_log_mask(LOG_GUEST_ERROR,
269
- "do_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
270
- }
271
-}
272
-
273
static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
274
{
275
switch (reg) {
276
@@ -XXX,XX +XXX,XX @@ static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
277
if (val & 2) {
278
DPRINTF("PHY write %d = 0x%04x\n",
279
(val >> 6) & 0x1f, s->mac_mii_data);
280
- do_phy_write(s, (val >> 6) & 0x1f, s->mac_mii_data);
281
+ lan9118_phy_write(&s->mii, (val >> 6) & 0x1f, s->mac_mii_data);
282
} else {
283
- s->mac_mii_data = do_phy_read(s, (val >> 6) & 0x1f);
284
+ s->mac_mii_data = lan9118_phy_read(&s->mii, (val >> 6) & 0x1f);
285
DPRINTF("PHY read %d = 0x%04x\n",
286
(val >> 6) & 0x1f, s->mac_mii_data);
287
}
288
@@ -XXX,XX +XXX,XX @@ static void lan9118_writel(void *opaque, hwaddr offset,
289
break;
290
case CSR_PMT_CTRL:
291
if (val & 0x400) {
292
- phy_reset(s);
293
+ lan9118_phy_reset(&s->mii);
294
}
295
s->pmt_ctrl &= ~0x34e;
296
s->pmt_ctrl |= (val & 0x34e);
297
@@ -XXX,XX +XXX,XX @@ static void lan9118_realize(DeviceState *dev, Error **errp)
298
const MemoryRegionOps *mem_ops =
299
s->mode_16bit ? &lan9118_16bit_mem_ops : &lan9118_mem_ops;
300
301
+ qemu_init_irq(&s->mii_irq, lan9118_update_irq, s, 0);
302
+ object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
303
+ if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
304
+ return;
46
+ }
305
+ }
47
+}
306
+ qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
48
+
307
+
49
+void HELPER(sve_movz_h)(void *vd, void *vn, void *vg, uint32_t desc)
308
memory_region_init_io(&s->mmio, OBJECT(dev), mem_ops, s,
50
+{
309
"lan9118-mmio", 0x100);
51
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
310
sysbus_init_mmio(sbd, &s->mmio);
52
+ uint64_t *d = vd, *n = vn;
311
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
53
+ uint8_t *pg = vg;
312
new file mode 100644
54
+ for (i = 0; i < opr_sz; i += 1) {
313
index XXXXXXX..XXXXXXX
55
+ d[i] = n[i] & expand_pred_h(pg[H1(i)]);
314
--- /dev/null
315
+++ b/hw/net/lan9118_phy.c
316
@@ -XXX,XX +XXX,XX @@
317
+/*
318
+ * SMSC LAN9118 PHY emulation
319
+ *
320
+ * Copyright (c) 2009 CodeSourcery, LLC.
321
+ * Written by Paul Brook
322
+ *
323
+ * This code is licensed under the GNU GPL v2
324
+ *
325
+ * Contributions after 2012-01-13 are licensed under the terms of the
326
+ * GNU GPL, version 2 or (at your option) any later version.
327
+ */
328
+
329
+#include "qemu/osdep.h"
330
+#include "hw/net/lan9118_phy.h"
331
+#include "hw/irq.h"
332
+#include "hw/resettable.h"
333
+#include "migration/vmstate.h"
334
+#include "qemu/log.h"
335
+
336
+#define PHY_INT_ENERGYON (1 << 7)
337
+#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
338
+#define PHY_INT_FAULT (1 << 5)
339
+#define PHY_INT_DOWN (1 << 4)
340
+#define PHY_INT_AUTONEG_LP (1 << 3)
341
+#define PHY_INT_PARFAULT (1 << 2)
342
+#define PHY_INT_AUTONEG_PAGE (1 << 1)
343
+
344
+static void lan9118_phy_update_irq(Lan9118PhyState *s)
345
+{
346
+ qemu_set_irq(s->irq, !!(s->ints & s->int_mask));
347
+}
348
+
349
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
350
+{
351
+ uint16_t val;
352
+
353
+ switch (reg) {
354
+ case 0: /* Basic Control */
355
+ return s->control;
356
+ case 1: /* Basic Status */
357
+ return s->status;
358
+ case 2: /* ID1 */
359
+ return 0x0007;
360
+ case 3: /* ID2 */
361
+ return 0xc0d1;
362
+ case 4: /* Auto-neg advertisement */
363
+ return s->advertise;
364
+ case 5: /* Auto-neg Link Partner Ability */
365
+ return 0x0f71;
366
+ case 6: /* Auto-neg Expansion */
367
+ return 1;
368
+ /* TODO 17, 18, 27, 31 */
369
+ case 29: /* Interrupt source. */
370
+ val = s->ints;
371
+ s->ints = 0;
372
+ lan9118_phy_update_irq(s);
373
+ return val;
374
+ case 30: /* Interrupt mask */
375
+ return s->int_mask;
376
+ default:
377
+ qemu_log_mask(LOG_GUEST_ERROR,
378
+ "lan9118_phy_read: PHY read reg %d\n", reg);
379
+ return 0;
56
+ }
380
+ }
57
+}
381
+}
58
+
382
+
59
+void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc)
383
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
60
+{
384
+{
61
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
385
+ switch (reg) {
62
+ uint64_t *d = vd, *n = vn;
386
+ case 0: /* Basic Control */
63
+ uint8_t *pg = vg;
387
+ if (val & 0x8000) {
64
+ for (i = 0; i < opr_sz; i += 1) {
388
+ lan9118_phy_reset(s);
65
+ d[i] = n[i] & expand_pred_s(pg[H1(i)]);
389
+ break;
390
+ }
391
+ s->control = val & 0x7980;
392
+ /* Complete autonegotiation immediately. */
393
+ if (val & 0x1000) {
394
+ s->status |= 0x0020;
395
+ }
396
+ break;
397
+ case 4: /* Auto-neg advertisement */
398
+ s->advertise = (val & 0x2d7f) | 0x80;
399
+ break;
400
+ /* TODO 17, 18, 27, 31 */
401
+ case 30: /* Interrupt mask */
402
+ s->int_mask = val & 0xff;
403
+ lan9118_phy_update_irq(s);
404
+ break;
405
+ default:
406
+ qemu_log_mask(LOG_GUEST_ERROR,
407
+ "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
66
+ }
408
+ }
67
+}
409
+}
68
+
410
+
69
+void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc)
411
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
70
+{
412
+{
71
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
413
+ s->link_down = link_down;
72
+ uint64_t *d = vd, *n = vn;
414
+
73
+ uint8_t *pg = vg;
415
+ /* Autonegotiation status mirrors link status. */
74
+ for (i = 0; i < opr_sz; i += 1) {
416
+ if (link_down) {
75
+ d[i] = n[i] & -(uint64_t)(pg[H1(i)] & 1);
417
+ s->status &= ~0x0024;
418
+ s->ints |= PHY_INT_DOWN;
419
+ } else {
420
+ s->status |= 0x0024;
421
+ s->ints |= PHY_INT_ENERGYON;
422
+ s->ints |= PHY_INT_AUTONEG_COMPLETE;
76
+ }
423
+ }
77
+}
424
+ lan9118_phy_update_irq(s);
78
+
425
+}
79
/* Three-operand expander, immediate operand, controlled by a predicate.
426
+
80
*/
427
+void lan9118_phy_reset(Lan9118PhyState *s)
81
#define DO_ZPZI(NAME, TYPE, H, OP) \
428
+{
82
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
429
+ s->control = 0x3000;
430
+ s->status = 0x7809;
431
+ s->advertise = 0x01e1;
432
+ s->int_mask = 0;
433
+ s->ints = 0;
434
+ lan9118_phy_update_link(s, s->link_down);
435
+}
436
+
437
+static void lan9118_phy_reset_hold(Object *obj, ResetType type)
438
+{
439
+ Lan9118PhyState *s = LAN9118_PHY(obj);
440
+
441
+ lan9118_phy_reset(s);
442
+}
443
+
444
+static void lan9118_phy_init(Object *obj)
445
+{
446
+ Lan9118PhyState *s = LAN9118_PHY(obj);
447
+
448
+ qdev_init_gpio_out(DEVICE(s), &s->irq, 1);
449
+}
450
+
451
+static const VMStateDescription vmstate_lan9118_phy = {
452
+ .name = "lan9118-phy",
453
+ .version_id = 1,
454
+ .minimum_version_id = 1,
455
+ .fields = (const VMStateField[]) {
456
+ VMSTATE_UINT16(control, Lan9118PhyState),
457
+ VMSTATE_UINT16(status, Lan9118PhyState),
458
+ VMSTATE_UINT16(advertise, Lan9118PhyState),
459
+ VMSTATE_UINT16(ints, Lan9118PhyState),
460
+ VMSTATE_UINT16(int_mask, Lan9118PhyState),
461
+ VMSTATE_BOOL(link_down, Lan9118PhyState),
462
+ VMSTATE_END_OF_LIST()
463
+ }
464
+};
465
+
466
+static void lan9118_phy_class_init(ObjectClass *klass, void *data)
467
+{
468
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
469
+ DeviceClass *dc = DEVICE_CLASS(klass);
470
+
471
+ rc->phases.hold = lan9118_phy_reset_hold;
472
+ dc->vmsd = &vmstate_lan9118_phy;
473
+}
474
+
475
+static const TypeInfo types[] = {
476
+ {
477
+ .name = TYPE_LAN9118_PHY,
478
+ .parent = TYPE_SYS_BUS_DEVICE,
479
+ .instance_size = sizeof(Lan9118PhyState),
480
+ .instance_init = lan9118_phy_init,
481
+ .class_init = lan9118_phy_class_init,
482
+ }
483
+};
484
+
485
+DEFINE_TYPES(types)
486
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
83
index XXXXXXX..XXXXXXX 100644
487
index XXXXXXX..XXXXXXX 100644
84
--- a/target/arm/translate-sve.c
488
--- a/hw/net/Kconfig
85
+++ b/target/arm/translate-sve.c
489
+++ b/hw/net/Kconfig
86
@@ -XXX,XX +XXX,XX @@ static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz)
490
@@ -XXX,XX +XXX,XX @@ config VMXNET3_PCI
87
return true;
491
config SMC91C111
88
}
492
bool
89
493
90
+/* Copy Zn into Zd, storing zeros into inactive elements. */
494
+config LAN9118_PHY
91
+static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz)
495
+ bool
92
+{
496
+
93
+ static gen_helper_gvec_3 * const fns[4] = {
497
config LAN9118
94
+ gen_helper_sve_movz_b, gen_helper_sve_movz_h,
498
bool
95
+ gen_helper_sve_movz_s, gen_helper_sve_movz_d,
499
+ select LAN9118_PHY
96
+ };
500
select PTIMER
97
+ unsigned vsz = vec_full_reg_size(s);
501
98
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
502
config NE2000_ISA
99
+ vec_full_reg_offset(s, rn),
503
diff --git a/hw/net/meson.build b/hw/net/meson.build
100
+ pred_full_reg_offset(s, pg),
101
+ vsz, vsz, 0, fns[esz]);
102
+}
103
+
104
static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
105
gen_helper_gvec_3 *fn)
106
{
107
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
108
return true;
109
}
110
111
+/* Load and broadcast element. */
112
+static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
113
+{
114
+ if (!sve_access_check(s)) {
115
+ return true;
116
+ }
117
+
118
+ unsigned vsz = vec_full_reg_size(s);
119
+ unsigned psz = pred_full_reg_size(s);
120
+ unsigned esz = dtype_esz[a->dtype];
121
+ TCGLabel *over = gen_new_label();
122
+ TCGv_i64 temp;
123
+
124
+ /* If the guarding predicate has no bits set, no load occurs. */
125
+ if (psz <= 8) {
126
+ /* Reduce the pred_esz_masks value simply to reduce the
127
+ * size of the code generated here.
128
+ */
129
+ uint64_t psz_mask = MAKE_64BIT_MASK(0, psz * 8);
130
+ temp = tcg_temp_new_i64();
131
+ tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg));
132
+ tcg_gen_andi_i64(temp, temp, pred_esz_masks[esz] & psz_mask);
133
+ tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over);
134
+ tcg_temp_free_i64(temp);
135
+ } else {
136
+ TCGv_i32 t32 = tcg_temp_new_i32();
137
+ find_last_active(s, t32, esz, a->pg);
138
+ tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over);
139
+ tcg_temp_free_i32(t32);
140
+ }
141
+
142
+ /* Load the data. */
143
+ temp = tcg_temp_new_i64();
144
+ tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << esz);
145
+ tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s),
146
+ s->be_data | dtype_mop[a->dtype]);
147
+
148
+ /* Broadcast to *all* elements. */
149
+ tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd),
150
+ vsz, vsz, temp);
151
+ tcg_temp_free_i64(temp);
152
+
153
+ /* Zero the inactive elements. */
154
+ gen_set_label(over);
155
+ do_movz_zpz(s, a->rd, a->rd, a->pg, esz);
156
+ return true;
157
+}
158
+
159
static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
160
int msz, int esz, int nreg)
161
{
162
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
163
index XXXXXXX..XXXXXXX 100644
504
index XXXXXXX..XXXXXXX 100644
164
--- a/target/arm/sve.decode
505
--- a/hw/net/meson.build
165
+++ b/target/arm/sve.decode
506
+++ b/hw/net/meson.build
166
@@ -XXX,XX +XXX,XX @@
507
@@ -XXX,XX +XXX,XX @@ system_ss.add(when: 'CONFIG_VMXNET3_PCI', if_true: files('vmxnet3.c'))
167
%imm8_16_10 16:5 10:3
508
168
%imm9_16_10 16:s6 10:3
509
system_ss.add(when: 'CONFIG_SMC91C111', if_true: files('smc91c111.c'))
169
%size_23 23:2
510
system_ss.add(when: 'CONFIG_LAN9118', if_true: files('lan9118.c'))
170
+%dtype_23_13 23:2 13:2
511
+system_ss.add(when: 'CONFIG_LAN9118_PHY', if_true: files('lan9118_phy.c'))
171
512
system_ss.add(when: 'CONFIG_NE2000_ISA', if_true: files('ne2000-isa.c'))
172
# A combination of tsz:imm3 -- extract esize.
513
system_ss.add(when: 'CONFIG_OPENCORES_ETH', if_true: files('opencores_eth.c'))
173
%tszimm_esz 22:2 5:5 !function=tszimm_esz
514
system_ss.add(when: 'CONFIG_XGMAC', if_true: files('xgmac.c'))
174
@@ -XXX,XX +XXX,XX @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9
175
# SVE load vector register
176
LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
177
178
+# SVE load and broadcast element
179
+LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \
180
+ &rpri_load dtype=%dtype_23_13 nreg=0
181
+
182
### SVE Memory Contiguous Load Group
183
184
# SVE contiguous load (scalar plus scalar)
185
--
515
--
186
2.17.1
516
2.34.1
187
188
1
From: Aaron Lindsay <alindsay@codeaurora.org>
1
From: Bernhard Beschow <shentey@gmail.com>
2
2
3
KVM implies V7VE, which implies ARM_DIV and THUMB_DIV. The conditional
3
imx_fec models the same PHY as lan9118_phy. The code is almost the same with
4
detection here is therefore unnecessary. Because V7VE is already
4
imx_fec having more logging and tracing. Merge these improvements into
5
unconditionally specified for all KVM hosts, ARM_DIV and THUMB_DIV are
5
lan9118_phy and reuse in imx_fec to fix the code duplication.
6
already indirectly specified and do not need to be included here at all.
7
6
8
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
7
Some migration state now resides in the new device model, which breaks migration
9
Message-id: 1529699547-17044-6-git-send-email-alindsay@codeaurora.org
8
compatibility for the following machines:
9
* imx25-pdk
10
* sabrelite
11
* mcimx7d-sabre
12
* mcimx6ul-evk
13
14
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
15
Tested-by: Guenter Roeck <linux@roeck-us.net>
16
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
17
Message-id: 20241102125724.532843-3-shentey@gmail.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
19
---
12
target/arm/kvm32.c | 19 +------------------
20
include/hw/net/imx_fec.h | 9 ++-
13
1 file changed, 1 insertion(+), 18 deletions(-)
21
hw/net/imx_fec.c | 146 ++++-----------------------------------
22
hw/net/lan9118_phy.c | 82 ++++++++++++++++------
23
hw/net/Kconfig | 1 +
24
hw/net/trace-events | 10 +--
25
5 files changed, 85 insertions(+), 163 deletions(-)
14
26
15
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
27
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
16
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/kvm32.c
29
--- a/include/hw/net/imx_fec.h
18
+++ b/target/arm/kvm32.c
30
+++ b/include/hw/net/imx_fec.h
19
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
31
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(IMXFECState, IMX_FEC)
20
* and then query that CPU for the relevant ID registers.
32
#define TYPE_IMX_ENET "imx.enet"
21
*/
33
22
int i, ret, fdarray[3];
34
#include "hw/sysbus.h"
23
- uint32_t midr, id_pfr0, id_isar0, mvfr1;
35
+#include "hw/net/lan9118_phy.h"
24
+ uint32_t midr, id_pfr0, mvfr1;
36
+#include "hw/irq.h"
25
uint64_t features = 0;
37
#include "net/net.h"
26
/* Old kernels may not know about the PREFERRED_TARGET ioctl: however
38
27
* we know these will only support creating one kind of guest CPU,
39
#define ENET_EIR 1
28
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
40
@@ -XXX,XX +XXX,XX @@ struct IMXFECState {
29
| ENCODE_CP_REG(15, 0, 0, 0, 1, 0, 0),
41
uint32_t tx_descriptor[ENET_TX_RING_NUM];
30
.addr = (uintptr_t)&id_pfr0,
42
uint32_t tx_ring_num;
31
},
43
32
- {
44
- uint32_t phy_status;
33
- .id = KVM_REG_ARM | KVM_REG_SIZE_U32
45
- uint32_t phy_control;
34
- | ENCODE_CP_REG(15, 0, 0, 0, 2, 0, 0),
46
- uint32_t phy_advertise;
35
- .addr = (uintptr_t)&id_isar0,
47
- uint32_t phy_int;
36
- },
48
- uint32_t phy_int_mask;
37
{
49
+ Lan9118PhyState mii;
38
.id = KVM_REG_ARM | KVM_REG_SIZE_U32
50
+ IRQState mii_irq;
39
| KVM_REG_ARM_VFP | KVM_REG_ARM_VFP_MVFR1,
51
uint32_t phy_num;
40
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
52
bool phy_connected;
41
set_feature(&features, ARM_FEATURE_VFP3);
53
struct IMXFECState *phy_consumer;
42
set_feature(&features, ARM_FEATURE_GENERIC_TIMER);
54
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
43
55
index XXXXXXX..XXXXXXX 100644
44
- switch (extract32(id_isar0, 24, 4)) {
56
--- a/hw/net/imx_fec.c
45
- case 1:
57
+++ b/hw/net/imx_fec.c
46
- set_feature(&features, ARM_FEATURE_THUMB_DIV);
58
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth_txdescs = {
47
- break;
59
48
- case 2:
60
static const VMStateDescription vmstate_imx_eth = {
49
- set_feature(&features, ARM_FEATURE_ARM_DIV);
61
.name = TYPE_IMX_FEC,
50
- set_feature(&features, ARM_FEATURE_THUMB_DIV);
62
- .version_id = 2,
63
- .minimum_version_id = 2,
64
+ .version_id = 3,
65
+ .minimum_version_id = 3,
66
.fields = (const VMStateField[]) {
67
VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
68
VMSTATE_UINT32(rx_descriptor, IMXFECState),
69
VMSTATE_UINT32(tx_descriptor[0], IMXFECState),
70
- VMSTATE_UINT32(phy_status, IMXFECState),
71
- VMSTATE_UINT32(phy_control, IMXFECState),
72
- VMSTATE_UINT32(phy_advertise, IMXFECState),
73
- VMSTATE_UINT32(phy_int, IMXFECState),
74
- VMSTATE_UINT32(phy_int_mask, IMXFECState),
75
VMSTATE_END_OF_LIST()
76
},
77
.subsections = (const VMStateDescription * const []) {
78
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth = {
79
},
80
};
81
82
-#define PHY_INT_ENERGYON (1 << 7)
83
-#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
84
-#define PHY_INT_FAULT (1 << 5)
85
-#define PHY_INT_DOWN (1 << 4)
86
-#define PHY_INT_AUTONEG_LP (1 << 3)
87
-#define PHY_INT_PARFAULT (1 << 2)
88
-#define PHY_INT_AUTONEG_PAGE (1 << 1)
89
-
90
static void imx_eth_update(IMXFECState *s);
91
92
/*
93
@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
94
* For now we don't handle any GPIO/interrupt line, so the OS will
95
* have to poll for the PHY status.
96
*/
97
-static void imx_phy_update_irq(IMXFECState *s)
98
+static void imx_phy_update_irq(void *opaque, int n, int level)
99
{
100
- imx_eth_update(s);
101
-}
102
-
103
-static void imx_phy_update_link(IMXFECState *s)
104
-{
105
- /* Autonegotiation status mirrors link status. */
106
- if (qemu_get_queue(s->nic)->link_down) {
107
- trace_imx_phy_update_link("down");
108
- s->phy_status &= ~0x0024;
109
- s->phy_int |= PHY_INT_DOWN;
110
- } else {
111
- trace_imx_phy_update_link("up");
112
- s->phy_status |= 0x0024;
113
- s->phy_int |= PHY_INT_ENERGYON;
114
- s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
115
- }
116
- imx_phy_update_irq(s);
117
+ imx_eth_update(opaque);
118
}
119
120
static void imx_eth_set_link(NetClientState *nc)
121
{
122
- imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
123
-}
124
-
125
-static void imx_phy_reset(IMXFECState *s)
126
-{
127
- trace_imx_phy_reset();
128
-
129
- s->phy_status = 0x7809;
130
- s->phy_control = 0x3000;
131
- s->phy_advertise = 0x01e1;
132
- s->phy_int_mask = 0;
133
- s->phy_int = 0;
134
- imx_phy_update_link(s);
135
+ lan9118_phy_update_link(&IMX_FEC(qemu_get_nic_opaque(nc))->mii,
136
+ nc->link_down);
137
}
138
139
static uint32_t imx_phy_read(IMXFECState *s, int reg)
140
{
141
- uint32_t val;
142
uint32_t phy = reg / 32;
143
144
if (!s->phy_connected) {
145
@@ -XXX,XX +XXX,XX @@ static uint32_t imx_phy_read(IMXFECState *s, int reg)
146
147
reg %= 32;
148
149
- switch (reg) {
150
- case 0: /* Basic Control */
151
- val = s->phy_control;
152
- break;
153
- case 1: /* Basic Status */
154
- val = s->phy_status;
155
- break;
156
- case 2: /* ID1 */
157
- val = 0x0007;
158
- break;
159
- case 3: /* ID2 */
160
- val = 0xc0d1;
161
- break;
162
- case 4: /* Auto-neg advertisement */
163
- val = s->phy_advertise;
164
- break;
165
- case 5: /* Auto-neg Link Partner Ability */
166
- val = 0x0f71;
167
- break;
168
- case 6: /* Auto-neg Expansion */
169
- val = 1;
170
- break;
171
- case 29: /* Interrupt source. */
172
- val = s->phy_int;
173
- s->phy_int = 0;
174
- imx_phy_update_irq(s);
175
- break;
176
- case 30: /* Interrupt mask */
177
- val = s->phy_int_mask;
178
- break;
179
- case 17:
180
- case 18:
181
- case 27:
182
- case 31:
183
- qemu_log_mask(LOG_UNIMP, "[%s.phy]%s: reg %d not implemented\n",
184
- TYPE_IMX_FEC, __func__, reg);
185
- val = 0;
51
- break;
186
- break;
52
- default:
187
- default:
188
- qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
189
- TYPE_IMX_FEC, __func__, reg);
190
- val = 0;
53
- break;
191
- break;
54
- }
192
- }
55
-
193
-
56
if (extract32(id_pfr0, 12, 4) == 1) {
194
- trace_imx_phy_read(val, phy, reg);
57
set_feature(&features, ARM_FEATURE_THUMB2EE);
195
-
196
- return val;
197
+ return lan9118_phy_read(&s->mii, reg);
198
}
199
200
static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
201
@@ -XXX,XX +XXX,XX @@ static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
202
203
reg %= 32;
204
205
- trace_imx_phy_write(val, phy, reg);
206
-
207
- switch (reg) {
208
- case 0: /* Basic Control */
209
- if (val & 0x8000) {
210
- imx_phy_reset(s);
211
- } else {
212
- s->phy_control = val & 0x7980;
213
- /* Complete autonegotiation immediately. */
214
- if (val & 0x1000) {
215
- s->phy_status |= 0x0020;
216
- }
217
- }
218
- break;
219
- case 4: /* Auto-neg advertisement */
220
- s->phy_advertise = (val & 0x2d7f) | 0x80;
221
- break;
222
- case 30: /* Interrupt mask */
223
- s->phy_int_mask = val & 0xff;
224
- imx_phy_update_irq(s);
225
- break;
226
- case 17:
227
- case 18:
228
- case 27:
229
- case 31:
230
- qemu_log_mask(LOG_UNIMP, "[%s.phy)%s: reg %d not implemented\n",
231
- TYPE_IMX_FEC, __func__, reg);
232
- break;
233
- default:
234
- qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
235
- TYPE_IMX_FEC, __func__, reg);
236
- break;
237
- }
238
+ lan9118_phy_write(&s->mii, reg, val);
239
}
240
241
static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
242
@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
243
244
s->rx_descriptor = 0;
245
memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
246
-
247
- /* We also reset the PHY */
248
- imx_phy_reset(s);
249
}
250
251
static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
252
@@ -XXX,XX +XXX,XX @@ static void imx_eth_realize(DeviceState *dev, Error **errp)
253
sysbus_init_irq(sbd, &s->irq[0]);
254
sysbus_init_irq(sbd, &s->irq[1]);
255
256
+ qemu_init_irq(&s->mii_irq, imx_phy_update_irq, s, 0);
257
+ object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
258
+ if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
259
+ return;
260
+ }
261
+ qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
262
+
263
qemu_macaddr_default_if_unset(&s->conf.macaddr);
264
265
s->nic = qemu_new_nic(&imx_eth_net_info, &s->conf,
266
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
267
index XXXXXXX..XXXXXXX 100644
268
--- a/hw/net/lan9118_phy.c
269
+++ b/hw/net/lan9118_phy.c
270
@@ -XXX,XX +XXX,XX @@
271
* Copyright (c) 2009 CodeSourcery, LLC.
272
* Written by Paul Brook
273
*
274
+ * Copyright (c) 2013 Jean-Christophe Dubois. <jcd@tribudubois.net>
275
+ *
276
* This code is licensed under the GNU GPL v2
277
*
278
* Contributions after 2012-01-13 are licensed under the terms of the
279
@@ -XXX,XX +XXX,XX @@
280
#include "hw/resettable.h"
281
#include "migration/vmstate.h"
282
#include "qemu/log.h"
283
+#include "trace.h"
284
285
#define PHY_INT_ENERGYON (1 << 7)
286
#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
287
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
288
289
switch (reg) {
290
case 0: /* Basic Control */
291
- return s->control;
292
+ val = s->control;
293
+ break;
294
case 1: /* Basic Status */
295
- return s->status;
296
+ val = s->status;
297
+ break;
298
case 2: /* ID1 */
299
- return 0x0007;
300
+ val = 0x0007;
301
+ break;
302
case 3: /* ID2 */
303
- return 0xc0d1;
304
+ val = 0xc0d1;
305
+ break;
306
case 4: /* Auto-neg advertisement */
307
- return s->advertise;
308
+ val = s->advertise;
309
+ break;
310
case 5: /* Auto-neg Link Partner Ability */
311
- return 0x0f71;
312
+ val = 0x0f71;
313
+ break;
314
case 6: /* Auto-neg Expansion */
315
- return 1;
316
- /* TODO 17, 18, 27, 29, 30, 31 */
317
+ val = 1;
318
+ break;
319
case 29: /* Interrupt source. */
320
val = s->ints;
321
s->ints = 0;
322
lan9118_phy_update_irq(s);
323
- return val;
324
+ break;
325
case 30: /* Interrupt mask */
326
- return s->int_mask;
327
+ val = s->int_mask;
328
+ break;
329
+ case 17:
330
+ case 18:
331
+ case 27:
332
+ case 31:
333
+ qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
334
+ __func__, reg);
335
+ val = 0;
336
+ break;
337
default:
338
- qemu_log_mask(LOG_GUEST_ERROR,
339
- "lan9118_phy_read: PHY read reg %d\n", reg);
340
- return 0;
341
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
342
+ __func__, reg);
343
+ val = 0;
344
+ break;
58
}
345
}
346
+
347
+ trace_lan9118_phy_read(val, reg);
348
+
349
+ return val;
350
}
351
352
void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
353
{
354
+ trace_lan9118_phy_write(val, reg);
355
+
356
switch (reg) {
357
case 0: /* Basic Control */
358
if (val & 0x8000) {
359
lan9118_phy_reset(s);
360
- break;
361
- }
362
- s->control = val & 0x7980;
363
- /* Complete autonegotiation immediately. */
364
- if (val & 0x1000) {
365
- s->status |= 0x0020;
366
+ } else {
367
+ s->control = val & 0x7980;
368
+ /* Complete autonegotiation immediately. */
369
+ if (val & 0x1000) {
370
+ s->status |= 0x0020;
371
+ }
372
}
373
break;
374
case 4: /* Auto-neg advertisement */
375
s->advertise = (val & 0x2d7f) | 0x80;
376
break;
377
- /* TODO 17, 18, 27, 31 */
378
case 30: /* Interrupt mask */
379
s->int_mask = val & 0xff;
380
lan9118_phy_update_irq(s);
381
break;
382
+ case 17:
383
+ case 18:
384
+ case 27:
385
+ case 31:
386
+ qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
387
+ __func__, reg);
388
+ break;
389
default:
390
- qemu_log_mask(LOG_GUEST_ERROR,
391
- "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
392
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
393
+ __func__, reg);
394
+ break;
395
}
396
}
397
398
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
399
400
/* Autonegotiation status mirrors link status. */
401
if (link_down) {
402
+ trace_lan9118_phy_update_link("down");
403
s->status &= ~0x0024;
404
s->ints |= PHY_INT_DOWN;
405
} else {
406
+ trace_lan9118_phy_update_link("up");
407
s->status |= 0x0024;
408
s->ints |= PHY_INT_ENERGYON;
409
s->ints |= PHY_INT_AUTONEG_COMPLETE;
410
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
411
412
void lan9118_phy_reset(Lan9118PhyState *s)
413
{
414
+ trace_lan9118_phy_reset();
415
+
416
s->control = 0x3000;
417
s->status = 0x7809;
418
s->advertise = 0x01e1;
419
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_phy = {
420
.version_id = 1,
421
.minimum_version_id = 1,
422
.fields = (const VMStateField[]) {
423
- VMSTATE_UINT16(control, Lan9118PhyState),
424
VMSTATE_UINT16(status, Lan9118PhyState),
425
+ VMSTATE_UINT16(control, Lan9118PhyState),
426
VMSTATE_UINT16(advertise, Lan9118PhyState),
427
VMSTATE_UINT16(ints, Lan9118PhyState),
428
VMSTATE_UINT16(int_mask, Lan9118PhyState),
429
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
430
index XXXXXXX..XXXXXXX 100644
431
--- a/hw/net/Kconfig
432
+++ b/hw/net/Kconfig
433
@@ -XXX,XX +XXX,XX @@ config ALLWINNER_SUN8I_EMAC
434
435
config IMX_FEC
436
bool
437
+ select LAN9118_PHY
438
439
config CADENCE
440
bool
441
diff --git a/hw/net/trace-events b/hw/net/trace-events
442
index XXXXXXX..XXXXXXX 100644
443
--- a/hw/net/trace-events
444
+++ b/hw/net/trace-events
445
@@ -XXX,XX +XXX,XX @@ allwinner_sun8i_emac_set_link(bool active) "Set link: active=%u"
446
allwinner_sun8i_emac_read(uint64_t offset, uint64_t val) "MMIO read: offset=0x%" PRIx64 " value=0x%" PRIx64
447
allwinner_sun8i_emac_write(uint64_t offset, uint64_t val) "MMIO write: offset=0x%" PRIx64 " value=0x%" PRIx64
448
449
+# lan9118_phy.c
450
+lan9118_phy_read(uint16_t val, int reg) "[0x%02x] -> 0x%04" PRIx16
451
+lan9118_phy_write(uint16_t val, int reg) "[0x%02x] <- 0x%04" PRIx16
452
+lan9118_phy_update_link(const char *s) "%s"
453
+lan9118_phy_reset(void) ""
454
+
455
# lance.c
456
lance_mem_readw(uint64_t addr, uint32_t ret) "addr=0x%"PRIx64"val=0x%04x"
457
lance_mem_writew(uint64_t addr, uint32_t val) "addr=0x%"PRIx64"val=0x%04x"
458
@@ -XXX,XX +XXX,XX @@ i82596_set_multicast(uint16_t count) "Added %d multicast entries"
459
i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
460
461
# imx_fec.c
462
-imx_phy_read(uint32_t val, int phy, int reg) "0x%04"PRIx32" <= phy[%d].reg[%d]"
463
imx_phy_read_num(int phy, int configured) "read request from unconfigured phy %d (configured %d)"
464
-imx_phy_write(uint32_t val, int phy, int reg) "0x%04"PRIx32" => phy[%d].reg[%d]"
465
imx_phy_write_num(int phy, int configured) "write request to unconfigured phy %d (configured %d)"
466
-imx_phy_update_link(const char *s) "%s"
467
-imx_phy_reset(void) ""
468
imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
469
imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
470
imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
59
--
471
--
60
2.17.1
472
2.34.1
61
62
1
From: Aaron Lindsay <alindsay@codeaurora.org>
1
From: Bernhard Beschow <shentey@gmail.com>
2
2
3
This makes it match its AArch64 equivalent, PMINTENSET_EL1
3
Turns 0x70 into 0xe0 (== 0x70 << 1) which adds the missing MII_ANLPAR_TX and
4
fixes the MSB of selector field to be zero, as specified in the datasheet.
4
5
5
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
6
Fixes: 2a424990170b "LAN9118 emulation"
6
Message-id: 1529699547-17044-13-git-send-email-alindsay@codeaurora.org
7
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
8
Tested-by: Guenter Roeck <linux@roeck-us.net>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Message-id: 20241102125724.532843-4-shentey@gmail.com
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
12
---
9
target/arm/helper.c | 2 +-
13
hw/net/lan9118_phy.c | 2 +-
10
1 file changed, 1 insertion(+), 1 deletion(-)
14
1 file changed, 1 insertion(+), 1 deletion(-)
11
15
12
diff --git a/target/arm/helper.c b/target/arm/helper.c
16
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
13
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/helper.c
18
--- a/hw/net/lan9118_phy.c
15
+++ b/target/arm/helper.c
19
+++ b/hw/net/lan9118_phy.c
16
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
20
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
17
.writefn = pmuserenr_write, .raw_writefn = raw_write },
21
val = s->advertise;
18
{ .name = "PMINTENSET", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 1,
22
break;
19
.access = PL1_RW, .accessfn = access_tpm,
23
case 5: /* Auto-neg Link Partner Ability */
20
- .type = ARM_CP_ALIAS,
24
- val = 0x0f71;
21
+ .type = ARM_CP_ALIAS | ARM_CP_IO,
25
+ val = 0x0fe1;
22
.fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pminten),
26
break;
23
.resetvalue = 0,
27
case 6: /* Auto-neg Expansion */
24
.writefn = pmintenset_write, .raw_writefn = raw_write },
28
val = 1;
25
--
29
--
26
2.17.1
30
2.34.1
27
28
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Bernhard Beschow <shentey@gmail.com>
2
3
Prefer named constants over magic values for better readability.
2
4
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
5
Message-id: 20180627043328.11531-33-richard.henderson@linaro.org
7
Tested-by: Guenter Roeck <linux@roeck-us.net>
6
[PMM: moved 'ra=%reg_movprfx' here from following patch]
8
Message-id: 20241102125724.532843-5-shentey@gmail.com
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
10
---
9
target/arm/helper.h | 5 +++
11
include/hw/net/mii.h | 6 +++++
10
target/arm/translate-sve.c | 17 ++++++++++
12
hw/net/lan9118_phy.c | 63 ++++++++++++++++++++++++++++----------------
11
target/arm/vec_helper.c | 67 ++++++++++++++++++++++++++++++++++++++
13
2 files changed, 46 insertions(+), 23 deletions(-)
12
target/arm/sve.decode | 3 ++
13
4 files changed, 92 insertions(+)
14
14
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
16
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.h
17
--- a/include/hw/net/mii.h
18
+++ b/target/arm/helper.h
18
+++ b/include/hw/net/mii.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG,
19
@@ -XXX,XX +XXX,XX @@
20
DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG,
20
#define MII_BMSR_JABBER (1 << 1) /* Jabber detected */
21
void, ptr, ptr, ptr, ptr, i32)
21
#define MII_BMSR_EXTCAP (1 << 0) /* Ext-reg capability */
22
22
23
+DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+#define MII_ANAR_RFAULT (1 << 13) /* Say we can detect faults */
24
+DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
#define MII_ANAR_PAUSE_ASYM (1 << 11) /* Try for asymmetric pause */
25
+DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
#define MII_ANAR_PAUSE (1 << 10) /* Try for pause */
26
+DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
#define MII_ANAR_TXFD (1 << 8)
27
@@ -XXX,XX +XXX,XX @@
28
#define MII_ANAR_10FD (1 << 6)
29
#define MII_ANAR_10 (1 << 5)
30
#define MII_ANAR_CSMACD (1 << 0)
31
+#define MII_ANAR_SELECT (0x001f) /* Selector bits */
32
33
#define MII_ANLPAR_ACK (1 << 14)
34
#define MII_ANLPAR_PAUSEASY (1 << 11) /* can pause asymmetrically */
35
@@ -XXX,XX +XXX,XX @@
36
#define RTL8201CP_PHYID1 0x0000
37
#define RTL8201CP_PHYID2 0x8201
38
39
+/* SMSC LAN9118 */
40
+#define SMSCLAN9118_PHYID1 0x0007
41
+#define SMSCLAN9118_PHYID2 0xc0d1
27
+
42
+
28
DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
43
/* RealTek 8211E */
29
void, ptr, ptr, ptr, ptr, i32)
44
#define RTL8211E_PHYID1 0x001c
30
DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
45
#define RTL8211E_PHYID2 0xc915
31
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
46
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
32
index XXXXXXX..XXXXXXX 100644
47
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/translate-sve.c
48
--- a/hw/net/lan9118_phy.c
34
+++ b/target/arm/translate-sve.c
49
+++ b/hw/net/lan9118_phy.c
35
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)
50
@@ -XXX,XX +XXX,XX @@
36
51
37
#undef DO_ZZI
52
#include "qemu/osdep.h"
38
53
#include "hw/net/lan9118_phy.h"
39
+static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn)
54
+#include "hw/net/mii.h"
40
+{
55
#include "hw/irq.h"
41
+ static gen_helper_gvec_3 * const fns[2][2] = {
56
#include "hw/resettable.h"
42
+ { gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h },
57
#include "migration/vmstate.h"
43
+ { gen_helper_gvec_udot_b, gen_helper_gvec_udot_h }
58
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
44
+ };
59
uint16_t val;
45
+
60
46
+ if (sve_access_check(s)) {
61
switch (reg) {
47
+ unsigned vsz = vec_full_reg_size(s);
62
- case 0: /* Basic Control */
48
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
63
+ case MII_BMCR:
49
+ vec_full_reg_offset(s, a->rn),
64
val = s->control;
50
+ vec_full_reg_offset(s, a->rm),
65
break;
51
+ vsz, vsz, 0, fns[a->u][a->sz]);
66
- case 1: /* Basic Status */
52
+ }
67
+ case MII_BMSR:
53
+ return true;
68
val = s->status;
54
+}
69
break;
55
+
70
- case 2: /* ID1 */
56
/*
71
- val = 0x0007;
57
*** SVE Floating Point Multiply-Add Indexed Group
72
+ case MII_PHYID1:
58
*/
73
+ val = SMSCLAN9118_PHYID1;
59
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
74
break;
60
index XXXXXXX..XXXXXXX 100644
75
- case 3: /* ID2 */
61
--- a/target/arm/vec_helper.c
76
- val = 0xc0d1;
62
+++ b/target/arm/vec_helper.c
77
+ case MII_PHYID2:
63
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
78
+ val = SMSCLAN9118_PHYID2;
64
clear_tail(d, opr_sz, simd_maxsz(desc));
79
break;
65
}
80
- case 4: /* Auto-neg advertisement */
66
81
+ case MII_ANAR:
67
+/* Integer 8 and 16-bit dot-product.
82
val = s->advertise;
68
+ *
83
break;
69
+ * Note that for the loops herein, host endianness does not matter
84
- case 5: /* Auto-neg Link Partner Ability */
70
+ * with respect to the ordering of data within the 64-bit lanes.
85
- val = 0x0fe1;
71
+ * All elements are treated equally, no matter where they are.
86
+ case MII_ANLPAR:
72
+ */
87
+ val = MII_ANLPAR_PAUSEASY | MII_ANLPAR_PAUSE | MII_ANLPAR_T4 |
73
+
88
+ MII_ANLPAR_TXFD | MII_ANLPAR_TX | MII_ANLPAR_10FD |
74
+void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc)
89
+ MII_ANLPAR_10 | MII_ANLPAR_CSMACD;
75
+{
90
break;
76
+ intptr_t i, opr_sz = simd_oprsz(desc);
91
- case 6: /* Auto-neg Expansion */
77
+ uint32_t *d = vd;
92
- val = 1;
78
+ int8_t *n = vn, *m = vm;
93
+ case MII_ANER:
79
+
94
+ val = MII_ANER_NWAY;
80
+ for (i = 0; i < opr_sz / 4; ++i) {
95
break;
81
+ d[i] += n[i * 4 + 0] * m[i * 4 + 0]
96
case 29: /* Interrupt source. */
82
+ + n[i * 4 + 1] * m[i * 4 + 1]
97
val = s->ints;
83
+ + n[i * 4 + 2] * m[i * 4 + 2]
98
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
84
+ + n[i * 4 + 3] * m[i * 4 + 3];
99
trace_lan9118_phy_write(val, reg);
85
+ }
100
86
+ clear_tail(d, opr_sz, simd_maxsz(desc));
101
switch (reg) {
87
+}
102
- case 0: /* Basic Control */
88
+
103
- if (val & 0x8000) {
89
+void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc)
104
+ case MII_BMCR:
90
+{
105
+ if (val & MII_BMCR_RESET) {
91
+ intptr_t i, opr_sz = simd_oprsz(desc);
106
lan9118_phy_reset(s);
92
+ uint32_t *d = vd;
107
} else {
93
+ uint8_t *n = vn, *m = vm;
108
- s->control = val & 0x7980;
94
+
109
+ s->control = val & (MII_BMCR_LOOPBACK | MII_BMCR_SPEED100 |
95
+ for (i = 0; i < opr_sz / 4; ++i) {
110
+ MII_BMCR_AUTOEN | MII_BMCR_PDOWN | MII_BMCR_FD |
96
+ d[i] += n[i * 4 + 0] * m[i * 4 + 0]
111
+ MII_BMCR_CTST);
97
+ + n[i * 4 + 1] * m[i * 4 + 1]
112
/* Complete autonegotiation immediately. */
98
+ + n[i * 4 + 2] * m[i * 4 + 2]
113
- if (val & 0x1000) {
99
+ + n[i * 4 + 3] * m[i * 4 + 3];
114
- s->status |= 0x0020;
100
+ }
115
+ if (val & MII_BMCR_AUTOEN) {
101
+ clear_tail(d, opr_sz, simd_maxsz(desc));
116
+ s->status |= MII_BMSR_AN_COMP;
102
+}
117
}
103
+
118
}
104
+void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc)
119
break;
105
+{
120
- case 4: /* Auto-neg advertisement */
106
+ intptr_t i, opr_sz = simd_oprsz(desc);
121
- s->advertise = (val & 0x2d7f) | 0x80;
107
+ uint64_t *d = vd;
122
+ case MII_ANAR:
108
+ int16_t *n = vn, *m = vm;
123
+ s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
109
+
124
+ MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
110
+ for (i = 0; i < opr_sz / 8; ++i) {
125
+ MII_ANAR_SELECT))
111
+ d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0]
126
+ | MII_ANAR_TX;
112
+ + (int64_t)n[i * 4 + 1] * m[i * 4 + 1]
127
break;
113
+ + (int64_t)n[i * 4 + 2] * m[i * 4 + 2]
128
case 30: /* Interrupt mask */
114
+ + (int64_t)n[i * 4 + 3] * m[i * 4 + 3];
129
s->int_mask = val & 0xff;
115
+ }
130
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
116
+ clear_tail(d, opr_sz, simd_maxsz(desc));
131
/* Autonegotiation status mirrors link status. */
117
+}
132
if (link_down) {
118
+
133
trace_lan9118_phy_update_link("down");
119
+void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc)
134
- s->status &= ~0x0024;
120
+{
135
+ s->status &= ~(MII_BMSR_AN_COMP | MII_BMSR_LINK_ST);
121
+ intptr_t i, opr_sz = simd_oprsz(desc);
136
s->ints |= PHY_INT_DOWN;
122
+ uint64_t *d = vd;
137
} else {
123
+ uint16_t *n = vn, *m = vm;
138
trace_lan9118_phy_update_link("up");
124
+
139
- s->status |= 0x0024;
125
+ for (i = 0; i < opr_sz / 8; ++i) {
140
+ s->status |= MII_BMSR_AN_COMP | MII_BMSR_LINK_ST;
126
+ d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0]
141
s->ints |= PHY_INT_ENERGYON;
127
+ + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1]
142
s->ints |= PHY_INT_AUTONEG_COMPLETE;
128
+ + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2]
143
}
129
+ + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3];
144
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_reset(Lan9118PhyState *s)
130
+ }
131
+ clear_tail(d, opr_sz, simd_maxsz(desc));
132
+}
133
+
134
void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
135
void *vfpst, uint32_t desc)
136
{
145
{
137
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
146
trace_lan9118_phy_reset();
138
index XXXXXXX..XXXXXXX 100644
147
139
--- a/target/arm/sve.decode
148
- s->control = 0x3000;
140
+++ b/target/arm/sve.decode
149
- s->status = 0x7809;
141
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
150
- s->advertise = 0x01e1;
142
# SVE integer multiply immediate (unpredicated)
151
+ s->control = MII_BMCR_AUTOEN | MII_BMCR_SPEED100;
143
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
152
+ s->status = MII_BMSR_100TX_FD
144
153
+ | MII_BMSR_100TX_HD
145
+# SVE integer dot product (unpredicated)
154
+ | MII_BMSR_10T_FD
146
+DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx
155
+ | MII_BMSR_10T_HD
147
+
156
+ | MII_BMSR_AUTONEG
148
# SVE floating-point complex add (predicated)
157
+ | MII_BMSR_EXTCAP;
149
FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
158
+ s->advertise = MII_ANAR_TXFD
150
rn=%reg_movprfx
159
+ | MII_ANAR_TX
160
+ | MII_ANAR_10FD
161
+ | MII_ANAR_10
162
+ | MII_ANAR_CSMACD;
163
s->int_mask = 0;
164
s->ints = 0;
165
lan9118_phy_update_link(s, s->link_down);
151
--
166
--
152
2.17.1
167
2.34.1
153
154
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Bernhard Beschow <shentey@gmail.com>
2
3
The real device advertises 100 Mbps full duplex, and the device model already advertises
4
100 Mbps half duplex and 10 Mbps full+half duplex. So advertise this mode to
5
make the model more realistic.
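
As a rough illustration of the masking described above, the following sketch computes what a guest write to the advertisement register would store. The bit positions are the standard MII ANAR layout and the macro names mirror those in the patch; the helper name anar_write() is hypothetical and is not part of the device model.

```c
#include <assert.h>
#include <stdint.h>

/* Standard MII auto-negotiation advertisement register bits. */
#define MII_ANAR_RFAULT     (1u << 13)
#define MII_ANAR_PAUSE_ASYM (1u << 11)
#define MII_ANAR_PAUSE      (1u << 10)
#define MII_ANAR_TXFD       (1u << 8)   /* 100 Mbps full duplex */
#define MII_ANAR_TX         (1u << 7)   /* 100 Mbps half duplex */
#define MII_ANAR_10FD       (1u << 6)   /* 10 Mbps full duplex */
#define MII_ANAR_10         (1u << 5)   /* 10 Mbps half duplex */
#define MII_ANAR_SELECT     0x001fu     /* selector field */

/* Hypothetical helper: value stored when the guest writes 'val' to ANAR. */
static uint16_t anar_write(uint16_t val)
{
    /* Guest-writable bits, now including 100 Mbps full duplex. */
    const uint16_t writable = MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
                              MII_ANAR_PAUSE | MII_ANAR_TXFD |
                              MII_ANAR_10FD | MII_ANAR_10 |
                              MII_ANAR_SELECT;

    /* 100 Mbps half duplex stays set regardless of what is written. */
    return (uint16_t)((val & writable) | MII_ANAR_TX);
}
```

With this mask, writing back the reset value 0x01e1 preserves it; without MII_ANAR_TXFD in the mask the same write would have been stored as 0x00e1.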
2
6
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
5
Message-id: 20180627043328.11531-23-richard.henderson@linaro.org
9
Tested-by: Guenter Roeck <linux@roeck-us.net>
10
Message-id: 20241102125724.532843-6-shentey@gmail.com
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
12
---
8
target/arm/helper-sve.h | 4 +++
13
hw/net/lan9118_phy.c | 4 ++--
9
target/arm/sve_helper.c | 70 ++++++++++++++++++++++++++++++++++++++
14
1 file changed, 2 insertions(+), 2 deletions(-)
10
target/arm/translate-sve.c | 27 +++++++++++++++
11
target/arm/sve.decode | 3 ++
12
4 files changed, 104 insertions(+)
13
15
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
16
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
18
--- a/hw/net/lan9118_phy.c
17
+++ b/target/arm/helper-sve.h
19
+++ b/hw/net/lan9118_phy.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
19
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
21
break;
20
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
22
case MII_ANAR:
21
23
s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
22
+DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
24
- MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
23
+DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
- MII_ANAR_SELECT))
24
+DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+ MII_ANAR_PAUSE | MII_ANAR_TXFD | MII_ANAR_10FD |
25
+
27
+ MII_ANAR_10 | MII_ANAR_SELECT))
26
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
28
| MII_ANAR_TX;
27
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
29
break;
28
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
30
case 30: /* Interrupt mask */
29
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/sve_helper.c
32
+++ b/target/arm/sve_helper.c
33
@@ -XXX,XX +XXX,XX @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT)
34
DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ)
35
DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)
36
37
+/* FP Trig Multiply-Add. */
38
+
39
+void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
40
+{
41
+ static const float16 coeff[16] = {
42
+ 0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
43
+ 0x3c00, 0xb800, 0x293a, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
44
+ };
45
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16);
46
+ intptr_t x = simd_data(desc);
47
+ float16 *d = vd, *n = vn, *m = vm;
48
+ for (i = 0; i < opr_sz; i++) {
49
+ float16 mm = m[i];
50
+ intptr_t xx = x;
51
+ if (float16_is_neg(mm)) {
52
+ mm = float16_abs(mm);
53
+ xx += 8;
54
+ }
55
+ d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs);
56
+ }
57
+}
58
+
59
+void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
60
+{
61
+ static const float32 coeff[16] = {
62
+ 0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9,
63
+ 0x36369d6d, 0x00000000, 0x00000000, 0x00000000,
64
+ 0x3f800000, 0xbf000000, 0x3d2aaaa6, 0xbab60705,
65
+ 0x37cd37cc, 0x00000000, 0x00000000, 0x00000000,
66
+ };
67
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32);
68
+ intptr_t x = simd_data(desc);
69
+ float32 *d = vd, *n = vn, *m = vm;
70
+ for (i = 0; i < opr_sz; i++) {
71
+ float32 mm = m[i];
72
+ intptr_t xx = x;
73
+ if (float32_is_neg(mm)) {
74
+ mm = float32_abs(mm);
75
+ xx += 8;
76
+ }
77
+ d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs);
78
+ }
79
+}
80
+
81
+void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
82
+{
83
+ static const float64 coeff[16] = {
84
+ 0x3ff0000000000000ull, 0xbfc5555555555543ull,
85
+ 0x3f8111111110f30cull, 0xbf2a01a019b92fc6ull,
86
+ 0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull,
87
+ 0x3de5d8408868552full, 0x0000000000000000ull,
88
+ 0x3ff0000000000000ull, 0xbfe0000000000000ull,
89
+ 0x3fa5555555555536ull, 0xbf56c16c16c13a0bull,
90
+ 0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull,
91
+ 0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull,
92
+ };
93
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64);
94
+ intptr_t x = simd_data(desc);
95
+ float64 *d = vd, *n = vn, *m = vm;
96
+ for (i = 0; i < opr_sz; i++) {
97
+ float64 mm = m[i];
98
+ intptr_t xx = x;
99
+ if (float64_is_neg(mm)) {
100
+ mm = float64_abs(mm);
101
+ xx += 8;
102
+ }
103
+ d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs);
104
+ }
105
+}
106
+
107
/*
108
* Load contiguous data, protected by a governing predicate.
109
*/
110
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
111
index XXXXXXX..XXXXXXX 100644
112
--- a/target/arm/translate-sve.c
113
+++ b/target/arm/translate-sve.c
114
@@ -XXX,XX +XXX,XX @@ DO_PPZ(FCMNE_ppz0, fcmne0)
115
116
#undef DO_PPZ
117
118
+/*
119
+ *** SVE floating-point trig multiply-add coefficient
120
+ */
121
+
122
+static bool trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn)
123
+{
124
+ static gen_helper_gvec_3_ptr * const fns[3] = {
125
+ gen_helper_sve_ftmad_h,
126
+ gen_helper_sve_ftmad_s,
127
+ gen_helper_sve_ftmad_d,
128
+ };
129
+
130
+ if (a->esz == 0) {
131
+ return false;
132
+ }
133
+ if (sve_access_check(s)) {
134
+ unsigned vsz = vec_full_reg_size(s);
135
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
136
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
137
+ vec_full_reg_offset(s, a->rn),
138
+ vec_full_reg_offset(s, a->rm),
139
+ status, vsz, vsz, a->imm, fns[a->esz - 1]);
140
+ tcg_temp_free_ptr(status);
141
+ }
142
+ return true;
143
+}
144
+
145
/*
146
*** SVE Floating Point Accumulating Reduction Group
147
*/
148
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
149
index XXXXXXX..XXXXXXX 100644
150
--- a/target/arm/sve.decode
151
+++ b/target/arm/sve.decode
152
@@ -XXX,XX +XXX,XX @@ FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1
153
FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1
154
FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1
155
156
+# SVE floating-point trig multiply-add coefficient
157
+FTMAD 01100101 esz:2 010 imm:3 100000 rm:5 rd:5 rn=%reg_movprfx
158
+
159
### SVE FP Multiply-Add Group
160
161
# SVE floating-point multiply-accumulate writing addend
162
--
31
--
163
2.17.1
32
2.34.1
164
165
1
From: Richard Henderson <richard.henderson@linaro.org>
1
For IEEE fused multiply-add, the (0 * inf) + NaN case should raise
2
Invalid for the multiplication of 0 by infinity. Currently we handle
3
this in the per-architecture ifdef ladder in pickNaNMulAdd().
4
However, since this isn't really architecture specific we can hoist
5
it up to the generic code.
2
6
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
For the cases where the infzero test in pickNaNMulAdd was
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
returning 2, we can delete the check entirely and allow the
5
Message-id: 20180627043328.11531-9-richard.henderson@linaro.org
9
code to fall into the normal pick-a-NaN handling, because this
10
will return 2 anyway (input 'c' being the only NaN in this case).
11
For the cases where infzero was returning 3 to indicate "return
12
the default NaN", we must retain that "return 3".
13
14
For Arm, this looks like it might be a behaviour change because we
15
used to set float_flag_invalid | float_flag_invalid_imz only if C is
16
a quiet NaN. However, it is not, because Arm target code never looks
17
at float_flag_invalid_imz, and for the (0 * inf) + SNaN case we
18
already raised float_flag_invalid via the "abc_mask &
19
float_cmask_snan" check in pick_nan_muladd.
20
21
For any target architecture using the "default implementation" at the
22
bottom of the ifdef, this is a behaviour change but will be fixing a
23
bug (where we failed to raise the Invalid exception for (0 * inf) +
24
QNaN). The architectures using the default case are:
25
* hppa
26
* i386
27
* sh4
28
* tricore
29
30
The x86, Tricore and SH4 CPU architecture manuals are clear that this
31
should have raised Invalid; HPPA is a bit vaguer but still seems
32
clear enough.
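
The shape of the hoisting can be sketched in isolation as follows. This is a reduced model under stated assumptions, not the softfloat code: the flag values, the SketchStatus type and both function names are hypothetical, and the per-target choice is collapsed to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the float_flag_* values. */
enum {
    FLAG_INVALID      = 1 << 0,
    FLAG_INVALID_IMZ  = 1 << 1,   /* Invalid caused by inf * 0 */
    FLAG_INVALID_SNAN = 1 << 2,   /* Invalid caused by a signalling NaN */
};

/* Hypothetical, reduced stand-in for float_status. */
typedef struct {
    int flags;
} SketchStatus;

/*
 * Stand-in for the per-target picker after the change: it only chooses,
 * and raises nothing. Returns 2 for "operand c", 3 for "default NaN".
 */
static int sketch_target_pick(bool infzero, bool default_on_infzero)
{
    return (infzero && default_on_infzero) ? 3 : 2;
}

/* Generic code: raise Invalid for (0 * inf) + NaN, then ask the target. */
static int sketch_pick_nan_muladd(SketchStatus *s, bool infzero,
                                  bool any_snan, bool default_on_infzero)
{
    if (any_snan) {
        s->flags |= FLAG_INVALID | FLAG_INVALID_SNAN;
    }
    if (infzero) {
        /* Raised for every target, independent of the NaN chosen. */
        s->flags |= FLAG_INVALID | FLAG_INVALID_IMZ;
    }
    return sketch_target_pick(infzero, default_on_infzero);
}
```

The point of the arrangement is that the flag no longer depends on which branch of the per-target ladder is taken, so targets that fall through to the default case also get Invalid raised.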
33
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
34
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
35
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
36
Message-id: 20241202131347.498124-2-peter.maydell@linaro.org
7
---
37
---
8
target/arm/helper-sve.h | 7 +++++
38
fpu/softfloat-parts.c.inc | 13 +++++++------
9
target/arm/sve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++
39
fpu/softfloat-specialize.c.inc | 29 +----------------------------
10
target/arm/translate-sve.c | 45 ++++++++++++++++++++++++++++++
40
2 files changed, 8 insertions(+), 34 deletions(-)
11
target/arm/sve.decode | 5 ++++
12
4 files changed, 113 insertions(+)
13
41
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
42
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
15
index XXXXXXX..XXXXXXX 100644
43
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
44
--- a/fpu/softfloat-parts.c.inc
17
+++ b/target/arm/helper-sve.h
45
+++ b/fpu/softfloat-parts.c.inc
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
46
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
19
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
47
int ab_mask, int abc_mask)
20
void, ptr, ptr, ptr, ptr, i32)
48
{
21
49
int which;
22
+DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
50
+ bool infzero = (ab_mask == float_cmask_infzero);
23
+ i64, i64, ptr, ptr, ptr, i32)
51
24
+DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
52
if (unlikely(abc_mask & float_cmask_snan)) {
25
+ i64, i64, ptr, ptr, ptr, i32)
53
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
26
+DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
54
}
27
+ i64, i64, ptr, ptr, ptr, i32)
55
28
+
56
- which = pickNaNMulAdd(a->cls, b->cls, c->cls,
29
DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
57
- ab_mask == float_cmask_infzero, s);
30
void, ptr, ptr, ptr, ptr, ptr, i32)
58
+ if (infzero) {
31
DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
59
+ /* This is (0 * inf) + NaN or (inf * 0) + NaN */
32
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
60
+ float_raise(float_flag_invalid | float_flag_invalid_imz, s);
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/sve_helper.c
35
+++ b/target/arm/sve_helper.c
36
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
37
return predtest_ones(d, oprsz, esz_mask);
38
}
39
40
+uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
41
+ void *status, uint32_t desc)
42
+{
43
+ intptr_t i = 0, opr_sz = simd_oprsz(desc);
44
+ float16 result = nn;
45
+
46
+ do {
47
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
48
+ do {
49
+ if (pg & 1) {
50
+ float16 mm = *(float16 *)(vm + H1_2(i));
51
+ result = float16_add(result, mm, status);
52
+ }
53
+ i += sizeof(float16), pg >>= sizeof(float16);
54
+ } while (i & 15);
55
+ } while (i < opr_sz);
56
+
57
+ return result;
58
+}
59
+
60
+uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg,
61
+ void *status, uint32_t desc)
62
+{
63
+ intptr_t i = 0, opr_sz = simd_oprsz(desc);
64
+ float32 result = nn;
65
+
66
+ do {
67
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
68
+ do {
69
+ if (pg & 1) {
70
+ float32 mm = *(float32 *)(vm + H1_2(i));
71
+ result = float32_add(result, mm, status);
72
+ }
73
+ i += sizeof(float32), pg >>= sizeof(float32);
74
+ } while (i & 15);
75
+ } while (i < opr_sz);
76
+
77
+ return result;
78
+}
79
+
80
+uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg,
81
+ void *status, uint32_t desc)
82
+{
83
+ intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8;
84
+ uint64_t *m = vm;
85
+ uint8_t *pg = vg;
86
+
87
+ for (i = 0; i < opr_sz; i++) {
88
+ if (pg[H1(i)] & 1) {
89
+ nn = float64_add(nn, m[i], status);
90
+ }
91
+ }
61
+ }
92
+
62
+
93
+ return nn;
63
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
94
+}
64
65
if (s->default_nan_mode || which == 3) {
66
- /*
67
- * Note that this check is after pickNaNMulAdd so that function
68
- * has an opportunity to set the Invalid flag for infzero.
69
- */
70
parts_default_nan(a, s);
71
return a;
72
}
73
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
74
index XXXXXXX..XXXXXXX 100644
75
--- a/fpu/softfloat-specialize.c.inc
76
+++ b/fpu/softfloat-specialize.c.inc
77
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
78
* the default NaN
79
*/
80
if (infzero && is_qnan(c_cls)) {
81
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
82
return 3;
83
}
84
85
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
86
* case sets InvalidOp and returns the default NaN
87
*/
88
if (infzero) {
89
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
90
return 3;
91
}
92
/* Prefer sNaN over qNaN, in the a, b, c order. */
93
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
94
* For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
95
* case sets InvalidOp and returns the input value 'c'
96
*/
97
- if (infzero) {
98
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
99
- return 2;
100
- }
101
/* Prefer sNaN over qNaN, in the c, a, b order. */
102
if (is_snan(c_cls)) {
103
return 2;
104
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
105
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
106
* case sets InvalidOp and returns the input value 'c'
107
*/
108
- if (infzero) {
109
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
110
- return 2;
111
- }
95
+
112
+
96
/* Fully general three-operand expander, controlled by a predicate,
113
/* Prefer sNaN over qNaN, in the c, a, b order. */
97
* With the extra float_status parameter.
114
if (is_snan(c_cls)) {
98
*/
115
return 2;
99
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
116
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
100
index XXXXXXX..XXXXXXX 100644
117
* to return an input NaN if we have one (ie c) rather than generating
101
--- a/target/arm/translate-sve.c
118
* a default NaN
102
+++ b/target/arm/translate-sve.c
119
*/
103
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)
120
- if (infzero) {
104
121
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
105
#undef DO_ZZI
122
- return 2;
106
123
- }
107
+/*
124
108
+ *** SVE Floating Point Accumulating Reduction Group
125
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
109
+ */
126
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
110
+
127
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
111
+static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
128
return 1;
112
+{
129
}
113
+ typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr,
130
#elif defined(TARGET_RISCV)
114
+ TCGv_ptr, TCGv_ptr, TCGv_i32);
131
- /* For RISC-V, InvalidOp is set when multiplicands are Inf and zero */
115
+ static fadda_fn * const fns[3] = {
132
- if (infzero) {
116
+ gen_helper_sve_fadda_h,
133
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
117
+ gen_helper_sve_fadda_s,
134
- }
118
+ gen_helper_sve_fadda_d,
135
return 3; /* default NaN */
119
+ };
136
#elif defined(TARGET_S390X)
120
+ unsigned vsz = vec_full_reg_size(s);
137
if (infzero) {
121
+ TCGv_ptr t_rm, t_pg, t_fpst;
138
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
122
+ TCGv_i64 t_val;
139
return 3;
123
+ TCGv_i32 t_desc;
140
}
124
+
141
125
+ if (a->esz == 0) {
142
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
126
+ return false;
143
return 2;
127
+ }
144
}
128
+ if (!sve_access_check(s)) {
145
#elif defined(TARGET_SPARC)
129
+ return true;
146
- /* For (inf,0,nan) return c. */
130
+ }
147
- if (infzero) {
131
+
148
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
132
+ t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz);
149
- return 2;
133
+ t_rm = tcg_temp_new_ptr();
150
- }
134
+ t_pg = tcg_temp_new_ptr();
151
/* Prefer SNaN over QNaN, order C, B, A. */
135
+ tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm));
152
if (is_snan(c_cls)) {
136
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
153
return 2;
137
+ t_fpst = get_fpstatus_ptr(a->esz == MO_16);
154
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
138
+ t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
155
* For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
139
+
156
* an input NaN if we have one (ie c).
140
+ fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc);
157
*/
141
+
158
- if (infzero) {
142
+ tcg_temp_free_i32(t_desc);
159
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
143
+ tcg_temp_free_ptr(t_fpst);
160
- return 2;
144
+ tcg_temp_free_ptr(t_pg);
161
- }
145
+ tcg_temp_free_ptr(t_rm);
162
if (status->use_first_nan) {
146
+
163
if (is_nan(a_cls)) {
147
+ write_fp_dreg(s, a->rd, t_val);
164
return 0;
148
+ tcg_temp_free_i64(t_val);
149
+ return true;
150
+}
151
+
152
/*
153
*** SVE Floating Point Arithmetic - Unpredicated Group
154
*/
155
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
156
index XXXXXXX..XXXXXXX 100644
157
--- a/target/arm/sve.decode
158
+++ b/target/arm/sve.decode
159
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
160
# SVE integer multiply immediate (unpredicated)
161
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
162
163
+### SVE FP Accumulating Reduction Group
164
+
165
+# SVE floating-point serial reduction (predicated)
166
+FADDA 01100101 .. 011 000 001 ... ..... ..... @rdn_pg_rm
167
+
168
### SVE Floating Point Arithmetic - Unpredicated Group
169
170
# SVE floating-point arithmetic (unpredicated)
171
--
165
--
172
2.17.1
166
2.34.1
173
174
1
From: Richard Henderson <richard.henderson@linaro.org>
1
If the target sets default_nan_mode then we're always going to return
2
the default NaN, and pickNaNMulAdd() no longer has any side effects.
3
For consistency with pickNaN(), check for default_nan_mode before
4
calling pickNaNMulAdd().
2
5
3
For aa64 advsimd, we had been passing the pre-indexed vector.
6
When we convert pickNaNMulAdd() to allow runtime selection of the NaN
4
However, sve applies the index to each 128-bit segment, so we
7
propagation rule, this means we won't have to make the targets which
5
need to pass in the index separately.
8
use default_nan_mode also set a propagation rule.
6
9
7
For aa32 advsimd, the fp32 operation always has index 0, but
10
Since RISC-V always uses default_nan_mode, this allows us to remove
8
we failed to interpret the fp16 index correctly.
11
its ifdef case from pickNaNMulAdd().
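
A minimal sketch of the resulting control flow, assuming a reduced status struct: DemoStatus, demo_target_pick() and demo_which_nan() are hypothetical names, and the a-then-b-then-c preference inside the picker is an arbitrary placeholder rather than any target's real rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for float_status. */
typedef struct {
    bool default_nan_mode;
} DemoStatus;

/*
 * Stand-in for the per-target picker. It is allowed to assume that it
 * is never called in default-NaN mode. Returns 0, 1 or 2 for operands
 * a, b or c (placeholder preference order).
 */
static int demo_target_pick(const DemoStatus *s, bool a_nan, bool b_nan)
{
    assert(!s->default_nan_mode);
    if (a_nan) {
        return 0;
    }
    if (b_nan) {
        return 1;
    }
    return 2;
}

/* Caller: decide on the default NaN first, only then consult the target. */
static int demo_which_nan(const DemoStatus *s, bool a_nan, bool b_nan)
{
    if (s->default_nan_mode) {
        return 3;   /* default NaN; the picker is never reached */
    }
    return demo_target_pick(s, a_nan, b_nan);
}
```

Because the picker is skipped entirely in default-NaN mode, a target that always runs in that mode never needs a propagation rule of its own.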
9
12
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
13
Message-id: 20180627043328.11531-31-richard.henderson@linaro.org
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Message-id: 20241202131347.498124-3-peter.maydell@linaro.org
15
---
16
---
16
target/arm/translate-a64.c | 21 ++++++++++++---------
17
fpu/softfloat-parts.c.inc | 8 ++++++--
17
target/arm/translate.c | 32 +++++++++++++++++++++++---------
18
fpu/softfloat-specialize.c.inc | 9 +++++++--
18
target/arm/vec_helper.c | 10 ++++++----
19
2 files changed, 13 insertions(+), 4 deletions(-)
19
3 files changed, 41 insertions(+), 22 deletions(-)
20
20
21
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
21
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
22
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/translate-a64.c
23
--- a/fpu/softfloat-parts.c.inc
24
+++ b/target/arm/translate-a64.c
24
+++ b/fpu/softfloat-parts.c.inc
25
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
25
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
26
case 0x13: /* FCMLA #90 */
26
float_raise(float_flag_invalid | float_flag_invalid_imz, s);
27
case 0x15: /* FCMLA #180 */
28
case 0x17: /* FCMLA #270 */
29
- tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
30
- vec_full_reg_offset(s, rn),
31
- vec_reg_offset(s, rm, index, size), fpst,
32
- is_q ? 16 : 8, vec_full_reg_size(s),
33
- extract32(insn, 13, 2), /* rot */
34
- size == MO_64
35
- ? gen_helper_gvec_fcmlas_idx
36
- : gen_helper_gvec_fcmlah_idx);
37
- tcg_temp_free_ptr(fpst);
38
+ {
39
+ int rot = extract32(insn, 13, 2);
40
+ int data = (index << 2) | rot;
41
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
42
+ vec_full_reg_offset(s, rn),
43
+ vec_full_reg_offset(s, rm), fpst,
44
+ is_q ? 16 : 8, vec_full_reg_size(s), data,
45
+ size == MO_64
46
+ ? gen_helper_gvec_fcmlas_idx
47
+ : gen_helper_gvec_fcmlah_idx);
48
+ tcg_temp_free_ptr(fpst);
49
+ }
50
return;
51
}
27
}
52
28
53
diff --git a/target/arm/translate.c b/target/arm/translate.c
29
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
30
+ if (s->default_nan_mode) {
31
+ which = 3;
32
+ } else {
33
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
34
+ }
35
36
- if (s->default_nan_mode || which == 3) {
37
+ if (which == 3) {
38
parts_default_nan(a, s);
39
return a;
40
}
41
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
54
index XXXXXXX..XXXXXXX 100644
42
index XXXXXXX..XXXXXXX 100644
55
--- a/target/arm/translate.c
43
--- a/fpu/softfloat-specialize.c.inc
56
+++ b/target/arm/translate.c
44
+++ b/fpu/softfloat-specialize.c.inc
57
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
45
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
58
46
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
59
static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
47
bool infzero, float_status *status)
60
{
48
{
61
- int rd, rn, rm, rot, size, opr_sz;
49
+ /*
62
+ gen_helper_gvec_3_ptr *fn_gvec_ptr;
50
+ * We guarantee not to require the target to tell us how to
63
+ int rd, rn, rm, opr_sz, data;
51
+ * pick a NaN if we're always returning the default NaN.
64
TCGv_ptr fpst;
52
+ * But if we're not in default-NaN mode then the target must
65
bool q;
53
+ * specify.
66
54
+ */
67
q = extract32(insn, 6, 1);
55
+ assert(!status->default_nan_mode);
68
VFP_DREG_D(rd, insn);
56
#if defined(TARGET_ARM)
69
VFP_DREG_N(rn, insn);
57
/* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
70
- VFP_DREG_M(rm, insn);
58
* the default NaN
71
if ((rd | rn) & q) {
59
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
72
return 1;
73
}
74
75
if ((insn & 0xff000f10) == 0xfe000800) {
76
/* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
77
- rot = extract32(insn, 20, 2);
78
- size = extract32(insn, 23, 1);
79
- if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
80
- || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
81
+ int rot = extract32(insn, 20, 2);
82
+ int size = extract32(insn, 23, 1);
83
+ int index;
84
+
85
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
86
return 1;
87
}
88
+ if (size == 0) {
89
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
90
+ return 1;
91
+ }
92
+ /* For fp16, rm is just Vm, and index is M. */
93
+ rm = extract32(insn, 0, 4);
94
+ index = extract32(insn, 5, 1);
95
+ } else {
96
+ /* For fp32, rm is the usual M:Vm, and index is 0. */
97
+ VFP_DREG_M(rm, insn);
98
+ index = 0;
99
+ }
100
+ data = (index << 2) | rot;
101
+ fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
102
+ : gen_helper_gvec_fcmlah_idx);
103
} else {
60
} else {
104
return 1;
61
return 1;
105
}
62
}
106
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
63
-#elif defined(TARGET_RISCV)
107
tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
64
- return 3; /* default NaN */
108
vfp_reg_offset(1, rn),
65
#elif defined(TARGET_S390X)
109
vfp_reg_offset(1, rm), fpst,
66
if (infzero) {
110
- opr_sz, opr_sz, rot,
67
return 3;
111
- size ? gen_helper_gvec_fcmlas_idx
112
- : gen_helper_gvec_fcmlah_idx);
113
+ opr_sz, opr_sz, data, fn_gvec_ptr);
114
tcg_temp_free_ptr(fpst);
115
return 0;
116
}
117
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
118
index XXXXXXX..XXXXXXX 100644
119
--- a/target/arm/vec_helper.c
120
+++ b/target/arm/vec_helper.c
121
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
122
float_status *fpst = vfpst;
123
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
124
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
125
+ intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
126
uint32_t neg_real = flip ^ neg_imag;
127
uintptr_t i;
128
- float16 e1 = m[H2(flip)];
129
- float16 e3 = m[H2(1 - flip)];
130
+ float16 e1 = m[H2(2 * index + flip)];
131
+ float16 e3 = m[H2(2 * index + 1 - flip)];
132
133
/* Shift boolean to the sign bit so we can xor to negate. */
134
neg_real <<= 15;
135
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
136
float_status *fpst = vfpst;
137
intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
138
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
139
+ intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
140
uint32_t neg_real = flip ^ neg_imag;
141
uintptr_t i;
142
- float32 e1 = m[H4(flip)];
143
- float32 e3 = m[H4(1 - flip)];
144
+ float32 e1 = m[H4(2 * index + flip)];
145
+ float32 e3 = m[H4(2 * index + 1 - flip)];
146
147
/* Shift boolean to the sign bit so we can xor to negate. */
148
neg_real <<= 31;
149
--
68
--
150
2.17.1
69
2.34.1
151
152
1
From: Richard Henderson <richard.henderson@linaro.org>
1
IEEE 754 does not define a fixed rule for what NaN to return in
2
2
the case of a fused multiply-add of inf * 0 + NaN. Different
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
architectures thus do different things:
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
* some return the default NaN
5
Message-id: 20180627043328.11531-30-richard.henderson@linaro.org
5
* some return the input NaN
6
* Arm returns the default NaN if the input NaN is quiet,
7
and the input NaN if it is signalling
8
9
We want this logic to be selected at runtime rather than
10
hardcoded into the binary, because:
11
* this will let us have multiple targets in one QEMU binary
12
* the Arm FEAT_AFP architectural feature includes letting
13
the guest select a NaN propagation rule at runtime
14
15
In this commit we add an enum for the propagation rule, the field in
16
float_status, and the corresponding getters and setters. We change
17
pickNaNMulAdd to honour this, but because all targets still leave
18
this field at its default 0 value, the fallback logic will pick the
19
rule type with the old ifdef ladder.
20
21
Note that four architectures both use the muladd softfloat functions
22
and did not have a branch of the ifdef ladder to specify their
23
behaviour (and so were ending up with the "default" case, probably
24
wrongly): i386, HPPA, SH4 and Tricore. SH4 and Tricore both set
25
default_nan_mode, and so will never get into pickNaNMulAdd(). For
26
HPPA and i386 we retain the same behaviour as the old default-case,
27
which is to not ever return the default NaN. This might not be
28
correct but it is not a behaviour change.
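
The three behaviours listed above map naturally onto an enum stored in the status struct and consulted at runtime. The sketch below assumes that shape; every identifier in it (the SKETCH_* enumerators, SketchFloatStatus, and both functions) is hypothetical and chosen for illustration, and the transitional fallback to the old ifdef ladder is deliberately left out.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical rule enum; 0 means the target never selected a rule. */
typedef enum {
    SKETCH_INFZERONAN_NONE = 0,
    SKETCH_INFZERONAN_DNAN_NEVER,    /* always return the input NaN */
    SKETCH_INFZERONAN_DNAN_ALWAYS,   /* always return the default NaN */
    SKETCH_INFZERONAN_DNAN_IF_QNAN,  /* default NaN only for a quiet NaN */
} SketchInfZeroNaNRule;

/* Hypothetical, reduced stand-in for float_status. */
typedef struct {
    SketchInfZeroNaNRule infzeronan_rule;
} SketchFloatStatus;

/* Setter in the style of the other status accessors. */
static void sketch_set_infzeronan_rule(SketchInfZeroNaNRule rule,
                                       SketchFloatStatus *status)
{
    status->infzeronan_rule = rule;
}

/* For (0 * inf) + NaN: should the result be the default NaN? */
static bool sketch_infzero_wants_default(const SketchFloatStatus *status,
                                         bool c_is_qnan)
{
    switch (status->infzeronan_rule) {
    case SKETCH_INFZERONAN_DNAN_NEVER:
        return false;
    case SKETCH_INFZERONAN_DNAN_ALWAYS:
        return true;
    case SKETCH_INFZERONAN_DNAN_IF_QNAN:
        return c_is_qnan;
    default:
        /* No rule selected: treated here as a programming error. */
        abort();
    }
}
```

A target would call the setter once when it initialises its floating-point status; a guest-controlled feature could call it again whenever the relevant control bit changes, which is what a compile-time ifdef cannot express.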
29
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
30
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
31
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
32
Message-id: 20241202131347.498124-4-peter.maydell@linaro.org
7
---
33
---
8
target/arm/helper-sve.h | 4 +
34
include/fpu/softfloat-helpers.h | 11 ++++
9
target/arm/sve_helper.c | 162 +++++++++++++++++++++++++++++++++++++
35
include/fpu/softfloat-types.h | 23 +++++++++
10
target/arm/translate-sve.c | 37 +++++++++
36
fpu/softfloat-specialize.c.inc | 91 ++++++++++++++++++++++-----------
11
target/arm/sve.decode | 4 +
37
3 files changed, 95 insertions(+), 30 deletions(-)
12
4 files changed, 207 insertions(+)
38
13
39
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
41
--- a/include/fpu/softfloat-helpers.h
17
+++ b/target/arm/helper-sve.h
42
+++ b/include/fpu/softfloat-helpers.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
43
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
19
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
44
status->float_2nan_prop_rule = rule;
20
DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
45
}
21
46
22
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
47
+static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
23
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
48
+ float_status *status)
24
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
49
+{
25
+
50
+ status->float_infzeronan_rule = rule;
26
DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
51
+}
27
DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
52
+
28
DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
53
static inline void set_flush_to_zero(bool val, float_status *status)
29
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
54
{
55
status->flush_to_zero = val;
56
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
57
return status->float_2nan_prop_rule;
58
}
59
60
+static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
61
+{
62
+ return status->float_infzeronan_rule;
63
+}
64
+
65
static inline bool get_flush_to_zero(float_status *status)
66
{
67
return status->flush_to_zero;
68
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
30
index XXXXXXX..XXXXXXX 100644
69
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/sve_helper.c
70
--- a/include/fpu/softfloat-types.h
32
+++ b/target/arm/sve_helper.c
71
+++ b/include/fpu/softfloat-types.h
33
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
72
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
34
} while (i != 0);
73
float_2nan_prop_x87,
35
}
74
} Float2NaNPropRule;
36
75
37
+/*
76
+/*
38
+ * FP Complex Multiply
77
+ * Rule for result of fused multiply-add 0 * Inf + NaN.
78
+ * This must be a NaN, but implementations differ on whether this
79
+ * is the input NaN or the default NaN.
80
+ *
81
+ * You don't need to set this if default_nan_mode is enabled.
82
+ * When not in default-NaN mode, it is an error for the target
83
+ * not to set the rule in float_status if it uses muladd, and we
84
+ * will assert if we need to handle an input NaN and no rule was
85
+ * selected.
39
+ */
86
+ */
40
+
87
+typedef enum __attribute__((__packed__)) {
41
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32);
88
+ /* No propagation rule specified */
42
+
89
+ float_infzeronan_none = 0,
43
+void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
90
+ /* Result is never the default NaN (so always the input NaN) */
44
+{
91
+ float_infzeronan_dnan_never,
45
+ intptr_t j, i = simd_oprsz(desc);
92
+ /* Result is always the default NaN */
46
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
93
+ float_infzeronan_dnan_always,
47
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
94
+ /* Result is the default NaN if the input NaN is quiet */
48
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
95
+ float_infzeronan_dnan_if_qnan,
49
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
96
+} FloatInfZeroNaNRule;
50
+ unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
51
+ bool flip = rot & 1;
52
+ float16 neg_imag, neg_real;
53
+ void *vd = &env->vfp.zregs[rd];
54
+ void *vn = &env->vfp.zregs[rn];
55
+ void *vm = &env->vfp.zregs[rm];
56
+ void *va = &env->vfp.zregs[ra];
57
+ uint64_t *g = vg;
58
+
59
+ neg_imag = float16_set_sign(0, (rot & 2) != 0);
60
+ neg_real = float16_set_sign(0, rot == 1 || rot == 2);
61
+
62
+ do {
63
+ uint64_t pg = g[(i - 1) >> 6];
64
+ do {
65
+ float16 e1, e2, e3, e4, nr, ni, mr, mi, d;
66
+
67
+ /* I holds the real index; J holds the imag index. */
68
+ j = i - sizeof(float16);
69
+ i -= 2 * sizeof(float16);
70
+
71
+ nr = *(float16 *)(vn + H1_2(i));
72
+ ni = *(float16 *)(vn + H1_2(j));
73
+ mr = *(float16 *)(vm + H1_2(i));
74
+ mi = *(float16 *)(vm + H1_2(j));
75
+
76
+ e2 = (flip ? ni : nr);
77
+ e1 = (flip ? mi : mr) ^ neg_real;
78
+ e4 = e2;
79
+ e3 = (flip ? mr : mi) ^ neg_imag;
80
+
81
+ if (likely((pg >> (i & 63)) & 1)) {
82
+ d = *(float16 *)(va + H1_2(i));
83
+ d = float16_muladd(e2, e1, d, 0, &env->vfp.fp_status_f16);
84
+ *(float16 *)(vd + H1_2(i)) = d;
85
+ }
86
+ if (likely((pg >> (j & 63)) & 1)) {
87
+ d = *(float16 *)(va + H1_2(j));
88
+ d = float16_muladd(e4, e3, d, 0, &env->vfp.fp_status_f16);
89
+ *(float16 *)(vd + H1_2(j)) = d;
90
+ }
91
+ } while (i & 63);
92
+ } while (i != 0);
93
+}
94
+
95
+void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
96
+{
97
+ intptr_t j, i = simd_oprsz(desc);
98
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
99
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
100
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
101
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
102
+ unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
103
+ bool flip = rot & 1;
104
+ float32 neg_imag, neg_real;
105
+ void *vd = &env->vfp.zregs[rd];
106
+ void *vn = &env->vfp.zregs[rn];
107
+ void *vm = &env->vfp.zregs[rm];
108
+ void *va = &env->vfp.zregs[ra];
109
+ uint64_t *g = vg;
110
+
111
+ neg_imag = float32_set_sign(0, (rot & 2) != 0);
112
+ neg_real = float32_set_sign(0, rot == 1 || rot == 2);
113
+
114
+ do {
115
+ uint64_t pg = g[(i - 1) >> 6];
116
+ do {
117
+ float32 e1, e2, e3, e4, nr, ni, mr, mi, d;
118
+
119
+ /* I holds the real index; J holds the imag index. */
120
+ j = i - sizeof(float32);
121
+ i -= 2 * sizeof(float32);
122
+
123
+ nr = *(float32 *)(vn + H1_2(i));
124
+ ni = *(float32 *)(vn + H1_2(j));
125
+ mr = *(float32 *)(vm + H1_2(i));
126
+ mi = *(float32 *)(vm + H1_2(j));
127
+
128
+ e2 = (flip ? ni : nr);
129
+ e1 = (flip ? mi : mr) ^ neg_real;
130
+ e4 = e2;
131
+ e3 = (flip ? mr : mi) ^ neg_imag;
132
+
133
+ if (likely((pg >> (i & 63)) & 1)) {
134
+ d = *(float32 *)(va + H1_2(i));
135
+ d = float32_muladd(e2, e1, d, 0, &env->vfp.fp_status);
136
+ *(float32 *)(vd + H1_2(i)) = d;
137
+ }
138
+ if (likely((pg >> (j & 63)) & 1)) {
139
+ d = *(float32 *)(va + H1_2(j));
140
+ d = float32_muladd(e4, e3, d, 0, &env->vfp.fp_status);
141
+ *(float32 *)(vd + H1_2(j)) = d;
142
+ }
143
+ } while (i & 63);
144
+ } while (i != 0);
145
+}
146
+
147
+void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
148
+{
149
+ intptr_t j, i = simd_oprsz(desc);
150
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
151
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
152
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
153
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
154
+ unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
155
+ bool flip = rot & 1;
156
+ float64 neg_imag, neg_real;
157
+ void *vd = &env->vfp.zregs[rd];
158
+ void *vn = &env->vfp.zregs[rn];
159
+ void *vm = &env->vfp.zregs[rm];
160
+ void *va = &env->vfp.zregs[ra];
161
+ uint64_t *g = vg;
162
+
163
+ neg_imag = float64_set_sign(0, (rot & 2) != 0);
164
+ neg_real = float64_set_sign(0, rot == 1 || rot == 2);
165
+
166
+ do {
167
+ uint64_t pg = g[(i - 1) >> 6];
168
+ do {
169
+ float64 e1, e2, e3, e4, nr, ni, mr, mi, d;
170
+
171
+ /* I holds the real index; J holds the imag index. */
172
+ j = i - sizeof(float64);
173
+ i -= 2 * sizeof(float64);
174
+
175
+ nr = *(float64 *)(vn + H1_2(i));
176
+ ni = *(float64 *)(vn + H1_2(j));
177
+ mr = *(float64 *)(vm + H1_2(i));
178
+ mi = *(float64 *)(vm + H1_2(j));
179
+
180
+ e2 = (flip ? ni : nr);
181
+ e1 = (flip ? mi : mr) ^ neg_real;
182
+ e4 = e2;
183
+ e3 = (flip ? mr : mi) ^ neg_imag;
184
+
185
+ if (likely((pg >> (i & 63)) & 1)) {
186
+ d = *(float64 *)(va + H1_2(i));
187
+ d = float64_muladd(e2, e1, d, 0, &env->vfp.fp_status);
188
+ *(float64 *)(vd + H1_2(i)) = d;
189
+ }
190
+ if (likely((pg >> (j & 63)) & 1)) {
191
+ d = *(float64 *)(va + H1_2(j));
192
+ d = float64_muladd(e4, e3, d, 0, &env->vfp.fp_status);
193
+ *(float64 *)(vd + H1_2(j)) = d;
194
+ }
195
+ } while (i & 63);
196
+ } while (i != 0);
197
+}
198
+
97
+
199
/*
98
/*
200
* Load contiguous data, protected by a governing predicate.
99
* Floating Point Status. Individual architectures may maintain
201
*/
100
* several versions of float_status for different functions. The
202
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
101
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
102
FloatRoundMode float_rounding_mode;
103
FloatX80RoundPrec floatx80_rounding_precision;
104
Float2NaNPropRule float_2nan_prop_rule;
105
+ FloatInfZeroNaNRule float_infzeronan_rule;
106
bool tininess_before_rounding;
107
/* should denormalised results go to zero and set the inexact flag? */
108
bool flush_to_zero;
109
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
203
index XXXXXXX..XXXXXXX 100644
110
index XXXXXXX..XXXXXXX 100644
204
--- a/target/arm/translate-sve.c
111
--- a/fpu/softfloat-specialize.c.inc
205
+++ b/target/arm/translate-sve.c
112
+++ b/fpu/softfloat-specialize.c.inc
206
@@ -XXX,XX +XXX,XX @@ DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz)
113
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
207
114
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
208
#undef DO_FMLA
115
bool infzero, float_status *status)
209
116
{
210
+static bool trans_FCMLA_zpzzz(DisasContext *s,
117
+ FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
211
+ arg_FCMLA_zpzzz *a, uint32_t insn)
118
+
212
+{
119
/*
213
+ static gen_helper_sve_fmla * const fns[3] = {
120
* We guarantee not to require the target to tell us how to
214
+ gen_helper_sve_fcmla_zpzzz_h,
121
* pick a NaN if we're always returning the default NaN.
215
+ gen_helper_sve_fcmla_zpzzz_s,
122
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
216
+ gen_helper_sve_fcmla_zpzzz_d,
123
* specify.
217
+ };
124
*/
218
+
125
assert(!status->default_nan_mode);
219
+ if (a->esz == 0) {
126
+
220
+ return false;
127
+ if (rule == float_infzeronan_none) {
128
+ /*
129
+ * Temporarily fall back to ifdef ladder
130
+ */
131
#if defined(TARGET_ARM)
132
- /* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
133
- * the default NaN
134
- */
135
- if (infzero && is_qnan(c_cls)) {
136
- return 3;
137
+ /*
138
+ * For ARM, the (inf,zero,qnan) case returns the default NaN,
139
+ * but (inf,zero,snan) returns the input NaN.
140
+ */
141
+ rule = float_infzeronan_dnan_if_qnan;
142
+#elif defined(TARGET_MIPS)
143
+ if (snan_bit_is_one(status)) {
144
+ /*
145
+ * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
146
+ * case sets InvalidOp and returns the default NaN
147
+ */
148
+ rule = float_infzeronan_dnan_always;
149
+ } else {
150
+ /*
151
+ * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
152
+ * case sets InvalidOp and returns the input value 'c'
153
+ */
154
+ rule = float_infzeronan_dnan_never;
155
+ }
156
+#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
157
+ defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
158
+ defined(TARGET_I386) || defined(TARGET_LOONGARCH)
159
+ /*
160
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
161
+ * case sets InvalidOp and returns the input value 'c'
162
+ */
163
+ /*
164
+ * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
165
+ * to return an input NaN if we have one (ie c) rather than generating
166
+ * a default NaN
167
+ */
168
+ rule = float_infzeronan_dnan_never;
169
+#elif defined(TARGET_S390X)
170
+ rule = float_infzeronan_dnan_always;
171
+#endif
172
}
173
174
+ if (infzero) {
175
+ /*
176
+ * Inf * 0 + NaN -- some implementations return the default NaN here,
177
+ * and some return the input NaN.
178
+ */
179
+ switch (rule) {
180
+ case float_infzeronan_dnan_never:
181
+ return 2;
182
+ case float_infzeronan_dnan_always:
183
+ return 3;
184
+ case float_infzeronan_dnan_if_qnan:
185
+ return is_qnan(c_cls) ? 3 : 2;
186
+ default:
187
+ g_assert_not_reached();
188
+ }
221
+ }
189
+ }
222
+ if (sve_access_check(s)) {
190
+
223
+ unsigned vsz = vec_full_reg_size(s);
191
+#if defined(TARGET_ARM)
224
+ unsigned desc;
192
+
225
+ TCGv_i32 t_desc;
193
/* This looks different from the ARM ARM pseudocode, because the ARM ARM
226
+ TCGv_ptr pg = tcg_temp_new_ptr();
194
* puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
227
+
195
*/
228
+ /* We would need 7 operands to pass these arguments "properly".
196
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
229
+ * So we encode all the register numbers into the descriptor.
197
}
230
+ */
198
#elif defined(TARGET_MIPS)
231
+ desc = deposit32(a->rd, 5, 5, a->rn);
199
if (snan_bit_is_one(status)) {
232
+ desc = deposit32(desc, 10, 5, a->rm);
200
- /*
233
+ desc = deposit32(desc, 15, 5, a->ra);
201
- * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
234
+ desc = deposit32(desc, 20, 2, a->rot);
202
- * case sets InvalidOp and returns the default NaN
235
+ desc = sextract32(desc, 0, 22);
203
- */
236
+ desc = simd_desc(vsz, vsz, desc);
204
- if (infzero) {
237
+
205
- return 3;
238
+ t_desc = tcg_const_i32(desc);
206
- }
239
+ tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
207
/* Prefer sNaN over qNaN, in the a, b, c order. */
240
+ fns[a->esz - 1](cpu_env, pg, t_desc);
208
if (is_snan(a_cls)) {
241
+ tcg_temp_free_i32(t_desc);
209
return 0;
242
+ tcg_temp_free_ptr(pg);
210
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
243
+ }
211
return 2;
244
+ return true;
212
}
245
+}
213
} else {
246
+
214
- /*
247
/*
215
- * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
248
*** SVE Floating Point Unary Operations Predicated Group
216
- * case sets InvalidOp and returns the input value 'c'
249
*/
217
- */
250
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
218
/* Prefer sNaN over qNaN, in the c, a, b order. */
251
index XXXXXXX..XXXXXXX 100644
219
if (is_snan(c_cls)) {
252
--- a/target/arm/sve.decode
220
return 2;
253
+++ b/target/arm/sve.decode
221
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
254
@@ -XXX,XX +XXX,XX @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
222
}
255
FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
223
}
256
rn=%reg_movprfx
224
#elif defined(TARGET_LOONGARCH64)
257
225
- /*
258
+# SVE floating-point complex multiply-add (predicated)
226
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
259
+FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \
227
- * case sets InvalidOp and returns the input value 'c'
260
+ ra=%reg_movprfx
228
- */
261
+
229
-
262
### SVE FP Multiply-Add Indexed Group
230
/* Prefer sNaN over qNaN, in the c, a, b order. */
263
231
if (is_snan(c_cls)) {
264
# SVE floating-point multiply-add (indexed)
232
return 2;
233
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
234
return 1;
235
}
236
#elif defined(TARGET_PPC)
237
- /* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
238
- * to return an input NaN if we have one (ie c) rather than generating
239
- * a default NaN
240
- */
241
-
242
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
243
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
244
*/
245
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
246
return 1;
247
}
248
#elif defined(TARGET_S390X)
249
- if (infzero) {
250
- return 3;
251
- }
252
-
253
if (is_snan(a_cls)) {
254
return 0;
255
} else if (is_snan(b_cls)) {
265
--
256
--
266
2.17.1
257
2.34.1
267
268
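The comment in trans_FCMLA_zpzzz above notes that the helper would need
seven operands, so the register numbers are encoded into the descriptor
instead. The following is a minimal standalone sketch of that packing,
assuming the same layout as the patch (rd at bit 0, rn at bit 5, rm at
bit 10, ra at bit 15, rot at bit 20). The names put_field, get_field,
FcmlaArgs, fcmla_pack and fcmla_unpack are hypothetical stand-ins and are
not QEMU APIs; the sketch also leaves out the simd_desc() wrapping and the
SIMD_DATA_SHIFT offset that the real helper applies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a deposit operation: write 'len' bits of
 * 'val' into 'word' starting at bit 'start'. */
static uint32_t put_field(uint32_t word, unsigned start, unsigned len,
                          uint32_t val)
{
    assert(len > 0 && len < 32 && start + len <= 32);
    uint32_t mask = ((1u << len) - 1u) << start;
    return (word & ~mask) | ((val << start) & mask);
}

/* Hypothetical stand-in for an extract operation: read 'len' bits of
 * 'word' starting at bit 'start'. */
static uint32_t get_field(uint32_t word, unsigned start, unsigned len)
{
    assert(len > 0 && len < 32 && start + len <= 32);
    return (word >> start) & ((1u << len) - 1u);
}

/* Decoded operands: four 5-bit register numbers and a 2-bit rotation. */
typedef struct {
    unsigned rd, rn, rm, ra, rot;
} FcmlaArgs;

/* Pack all five fields into the low 22 bits of one word. */
static uint32_t fcmla_pack(FcmlaArgs a)
{
    uint32_t data = 0;
    data = put_field(data, 0, 5, a.rd);
    data = put_field(data, 5, 5, a.rn);
    data = put_field(data, 10, 5, a.rm);
    data = put_field(data, 15, 5, a.ra);
    data = put_field(data, 20, 2, a.rot);
    return data;
}

/* Recover the fields on the helper side. */
static FcmlaArgs fcmla_unpack(uint32_t data)
{
    FcmlaArgs a;
    a.rd = get_field(data, 0, 5);
    a.rn = get_field(data, 5, 5);
    a.rm = get_field(data, 10, 5);
    a.ra = get_field(data, 15, 5);
    a.rot = get_field(data, 20, 2);
    return a;
}
```

The five fields need 5 + 5 + 5 + 5 + 2 = 22 bits, which is why the helper
file carries the build-time check QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32).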
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Explicitly set a rule in the softfloat tests for the inf-zero-nan
2
muladd special case. In meson.build we put -DTARGET_ARM in fpcflags,
3
and so we should select here the Arm rule of
4
float_infzeronan_dnan_if_qnan.
2
5
3
Leave ARM_CP_SVE, removing ARM_CP_FPU; the sve_access_check
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
produced by the flag already includes fp_access_check. If
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
we also check ARM_CP_FPU, the double fp_access_check asserts.
8
Message-id: 20241202131347.498124-5-peter.maydell@linaro.org
9
---
10
tests/fp/fp-bench.c | 5 +++++
11
tests/fp/fp-test.c | 5 +++++
12
2 files changed, 10 insertions(+)
6
13
7
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
14
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
12
Message-id: 20180629001538.11415-3-richard.henderson@linaro.org
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
---
15
target/arm/helper.c | 8 ++++----
16
target/arm/translate-a64.c | 5 ++---
17
2 files changed, 6 insertions(+), 7 deletions(-)
18
19
diff --git a/target/arm/helper.c b/target/arm/helper.c
20
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.c
16
--- a/tests/fp/fp-bench.c
22
+++ b/target/arm/helper.c
17
+++ b/tests/fp/fp-bench.c
23
@@ -XXX,XX +XXX,XX @@ static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
18
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
24
static const ARMCPRegInfo zcr_el1_reginfo = {
19
{
25
.name = "ZCR_EL1", .state = ARM_CP_STATE_AA64,
20
bench_func_t f;
26
.opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 0,
21
27
- .access = PL1_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
22
+ /*
28
+ .access = PL1_RW, .type = ARM_CP_SVE,
23
+ * These implementation-defined choices for various things IEEE
29
.fieldoffset = offsetof(CPUARMState, vfp.zcr_el[1]),
24
+ * doesn't specify match those used by the Arm architecture.
30
.writefn = zcr_write, .raw_writefn = raw_write
25
+ */
31
};
26
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
32
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo zcr_el1_reginfo = {
27
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
33
static const ARMCPRegInfo zcr_el2_reginfo = {
28
34
.name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
29
f = bench_funcs[operation][precision];
35
.opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
30
g_assert(f);
36
- .access = PL2_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
31
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
37
+ .access = PL2_RW, .type = ARM_CP_SVE,
38
.fieldoffset = offsetof(CPUARMState, vfp.zcr_el[2]),
39
.writefn = zcr_write, .raw_writefn = raw_write
40
};
41
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo zcr_el2_reginfo = {
42
static const ARMCPRegInfo zcr_no_el2_reginfo = {
43
.name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
44
.opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
45
- .access = PL2_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
46
+ .access = PL2_RW, .type = ARM_CP_SVE,
47
.readfn = arm_cp_read_zero, .writefn = arm_cp_write_ignore
48
};
49
50
static const ARMCPRegInfo zcr_el3_reginfo = {
51
.name = "ZCR_EL3", .state = ARM_CP_STATE_AA64,
52
.opc0 = 3, .opc1 = 6, .crn = 1, .crm = 2, .opc2 = 0,
53
- .access = PL3_RW, .type = ARM_CP_SVE | ARM_CP_FPU,
54
+ .access = PL3_RW, .type = ARM_CP_SVE,
55
.fieldoffset = offsetof(CPUARMState, vfp.zcr_el[3]),
56
.writefn = zcr_write, .raw_writefn = raw_write
57
};
58
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
59
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/translate-a64.c
33
--- a/tests/fp/fp-test.c
61
+++ b/target/arm/translate-a64.c
34
+++ b/tests/fp/fp-test.c
62
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
35
@@ -XXX,XX +XXX,XX @@ void run_test(void)
63
default:
36
{
64
break;
37
unsigned int i;
65
}
38
66
- if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
39
+ /*
67
- return;
40
+ * These implementation-defined choices for various things IEEE
68
- }
41
+ * doesn't specify match those used by the Arm architecture.
69
if ((ri->type & ARM_CP_FPU) && !fp_access_check(s)) {
42
+ */
70
return;
43
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
71
+ } else if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
44
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
72
+ return;
45
73
}
46
genCases_setLevel(test_level);
74
47
verCases_maxErrorCount = n_max_errors;
75
if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
76
--
48
--
77
2.17.1
49
2.34.1
78
79
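The ARM_CP_SVE commit message above says that the SVE access check already
includes the FP access check, and that running the FP check a second time
asserts. Below is a simplified model of that behaviour, not the QEMU code
itself: Ctx, CP_FPU, CP_SVE, fp_check, sve_check and sysreg_access_ok are
hypothetical names, and the assumption that the FP check may run at most
once per instruction is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values standing in for ARM_CP_FPU / ARM_CP_SVE. */
enum { CP_FPU = 1 << 0, CP_SVE = 1 << 1 };

typedef struct {
    bool fp_enabled;
    bool sve_enabled;
    bool fp_access_checked;   /* set once the FP check has run */
} Ctx;

/* FP check: must run at most once per instruction. */
static bool fp_check(Ctx *c)
{
    assert(!c->fp_access_checked);
    c->fp_access_checked = true;
    return c->fp_enabled;
}

/* SVE check: fails early if SVE is off, otherwise implies the FP check. */
static bool sve_check(Ctx *c)
{
    if (!c->sve_enabled) {
        return false;
    }
    return fp_check(c);
}

/* Same shape as the patched handle_sys() test. A register type should
 * carry CP_SVE alone: with CP_SVE | CP_FPU and FP enabled, fp_check()
 * would be reached twice and the assert above would fire. */
static bool sysreg_access_ok(Ctx *c, int type)
{
    if ((type & CP_FPU) && !fp_check(c)) {
        return false;
    } else if ((type & CP_SVE) && !sve_check(c)) {
        return false;
    }
    return true;
}
```

In this model, a register marked with CP_SVE alone still gets the FP check
exactly once via sve_check(), which matches the reasoning for dropping
ARM_CP_FPU from the ZCR_ELx register definitions.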
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the FloatInfZeroNaNRule explicitly for the Arm target,
2
so we can remove the ifdef from pickNaNMulAdd().
2
3
3
This register was added to aa32 state by ARMv8.2.
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-6-peter.maydell@linaro.org
7
---
8
target/arm/cpu.c | 3 +++
9
fpu/softfloat-specialize.c.inc | 8 +-------
10
2 files changed, 4 insertions(+), 7 deletions(-)
4
11
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20180629001538.11415-6-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/cpu.h | 1 +
11
target/arm/cpu.c | 4 ++++
12
target/arm/cpu64.c | 2 ++
13
target/arm/helper.c | 5 ++---
14
4 files changed, 9 insertions(+), 3 deletions(-)
15
16
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
17
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/cpu.h
19
+++ b/target/arm/cpu.h
20
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
21
uint32_t id_isar3;
22
uint32_t id_isar4;
23
uint32_t id_isar5;
24
+ uint32_t id_isar6;
25
uint64_t id_aa64pfr0;
26
uint64_t id_aa64pfr1;
27
uint64_t id_aa64dfr0;
28
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
29
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/cpu.c
14
--- a/target/arm/cpu.c
31
+++ b/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
32
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
16
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
33
cpu->id_isar3 = 0x01111110;
17
* * tininess-before-rounding
34
cpu->id_isar4 = 0x01310102;
18
* * 2-input NaN propagation prefers SNaN over QNaN, and then
35
cpu->id_isar5 = 0x00000000;
19
* operand A over operand B (see FPProcessNaNs() pseudocode)
36
+ cpu->id_isar6 = 0x00000000;
20
+ * * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
21
+ * and the input NaN if it is signalling
22
*/
23
static void arm_set_default_fp_behaviours(float_status *s)
24
{
25
set_float_detect_tininess(float_tininess_before_rounding, s);
26
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
27
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
37
}
28
}
38
29
39
static void cortex_m4_initfn(Object *obj)
30
static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
40
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
31
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
41
cpu->id_isar3 = 0x01111110;
42
cpu->id_isar4 = 0x01310102;
43
cpu->id_isar5 = 0x00000000;
44
+ cpu->id_isar6 = 0x00000000;
45
}
46
47
static void cortex_m33_initfn(Object *obj)
48
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
49
cpu->id_isar3 = 0x01111131;
50
cpu->id_isar4 = 0x01310132;
51
cpu->id_isar5 = 0x00000000;
52
+ cpu->id_isar6 = 0x00000000;
53
cpu->clidr = 0x00000000;
54
cpu->ctr = 0x8000c000;
55
}
56
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
57
cpu->id_isar3 = 0x01112131;
58
cpu->id_isar4 = 0x0010142;
59
cpu->id_isar5 = 0x0;
60
+ cpu->id_isar6 = 0x0;
61
cpu->mp_is_up = true;
62
cpu->pmsav7_dregion = 16;
63
define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
64
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
65
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
66
--- a/target/arm/cpu64.c
33
--- a/fpu/softfloat-specialize.c.inc
67
+++ b/target/arm/cpu64.c
34
+++ b/fpu/softfloat-specialize.c.inc
68
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
35
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
69
cpu->id_isar3 = 0x01112131;
36
/*
70
cpu->id_isar4 = 0x00011142;
37
* Temporarily fall back to ifdef ladder
71
cpu->id_isar5 = 0x00011121;
38
*/
72
+ cpu->id_isar6 = 0;
39
-#if defined(TARGET_ARM)
73
cpu->id_aa64pfr0 = 0x00002222;
40
- /*
74
cpu->id_aa64dfr0 = 0x10305106;
41
- * For ARM, the (inf,zero,qnan) case returns the default NaN,
75
cpu->pmceid0 = 0x00000000;
42
- * but (inf,zero,snan) returns the input NaN.
76
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
43
- */
77
cpu->id_isar3 = 0x01112131;
44
- rule = float_infzeronan_dnan_if_qnan;
78
cpu->id_isar4 = 0x00011142;
45
-#elif defined(TARGET_MIPS)
79
cpu->id_isar5 = 0x00011121;
46
+#if defined(TARGET_MIPS)
80
+ cpu->id_isar6 = 0;
47
if (snan_bit_is_one(status)) {
81
cpu->id_aa64pfr0 = 0x00002222;
48
/*
82
cpu->id_aa64dfr0 = 0x10305106;
49
* For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
83
cpu->id_aa64isar0 = 0x00011120;
84
diff --git a/target/arm/helper.c b/target/arm/helper.c
85
index XXXXXXX..XXXXXXX 100644
86
--- a/target/arm/helper.c
87
+++ b/target/arm/helper.c
88
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
89
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6,
90
.access = PL1_R, .type = ARM_CP_CONST,
91
.resetvalue = cpu->id_mmfr4 },
92
- /* 7 is as yet unallocated and must RAZ */
93
- { .name = "ID_ISAR7_RESERVED", .state = ARM_CP_STATE_BOTH,
94
+ { .name = "ID_ISAR6", .state = ARM_CP_STATE_BOTH,
95
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 7,
96
.access = PL1_R, .type = ARM_CP_CONST,
97
- .resetvalue = 0 },
98
+ .resetvalue = cpu->id_isar6 },
99
REGINFO_SENTINEL
100
};
101
define_arm_cp_regs(cpu, v6_idregs);
102
--
50
--
103
2.17.1
51
2.34.1
104
105
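The Arm patch above moves the 0 * Inf + NaN choice out of the #ifdef ladder
and into float_status. As a rough standalone illustration of how a
runtime-selected rule drives that decision, here is a small sketch. The
types and functions (InfZeroNaNRule, FpStatus, set_infzeronan_rule,
pick_infzero_nan) are simplified, hypothetical counterparts of the ones in
the patches, and only the infzero branch of pickNaNMulAdd() is modelled.
The return values follow the patch: 2 selects the input NaN (operand c),
3 selects the default NaN.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified counterpart of FloatInfZeroNaNRule. */
typedef enum {
    INFZERONAN_NONE = 0,        /* no rule selected by the target */
    INFZERONAN_DNAN_NEVER,      /* always return the input NaN */
    INFZERONAN_DNAN_ALWAYS,     /* always return the default NaN */
    INFZERONAN_DNAN_IF_QNAN,    /* default NaN only for a quiet input NaN */
} InfZeroNaNRule;

/* Simplified counterpart of float_status: only the field we need. */
typedef struct {
    InfZeroNaNRule infzeronan_rule;
} FpStatus;

enum { PICK_INPUT_C = 2, PICK_DEFAULT_NAN = 3 };

/* Called by the target at CPU reset/init time. */
static void set_infzeronan_rule(InfZeroNaNRule rule, FpStatus *s)
{
    s->infzeronan_rule = rule;
}

/* Decide the result of (Inf * 0) + NaN from the selected rule. */
static int pick_infzero_nan(const FpStatus *s, bool c_is_qnan)
{
    switch (s->infzeronan_rule) {
    case INFZERONAN_DNAN_NEVER:
        return PICK_INPUT_C;
    case INFZERONAN_DNAN_ALWAYS:
        return PICK_DEFAULT_NAN;
    case INFZERONAN_DNAN_IF_QNAN:
        return c_is_qnan ? PICK_DEFAULT_NAN : PICK_INPUT_C;
    default:
        /* The target forgot to select a rule. */
        assert(!"no inf-zero-NaN rule selected");
        return PICK_DEFAULT_NAN;
    }
}
```

Because the rule lives in the status structure, two statuses in the same
binary can behave differently, for example one configured with the Arm
rule (default NaN only for a quiet input NaN) and one with the
always-default-NaN rule that the ifdef ladder assigns to TARGET_S390X.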
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the FloatInfZeroNaNRule explicitly for s390, so we
2
can remove the ifdef from pickNaNMulAdd().
2
3
3
We've already added the helpers with an SVE patch; all that remains
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
is to wire up the aa64 and aa32 translators. Enable the feature
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
within -cpu max for CONFIG_USER_ONLY.
6
Message-id: 20241202131347.498124-7-peter.maydell@linaro.org
7
---
8
target/s390x/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 2 insertions(+), 2 deletions(-)
6
11
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20180627043328.11531-36-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
target/arm/cpu.h | 1 +
13
linux-user/elfload.c | 1 +
14
target/arm/cpu.c | 1 +
15
target/arm/cpu64.c | 1 +
16
target/arm/translate-a64.c | 36 +++++++++++++++++++
17
target/arm/translate.c | 74 +++++++++++++++++++++++++++-----------
18
6 files changed, 93 insertions(+), 21 deletions(-)
19
20
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
21
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/cpu.h
14
--- a/target/s390x/cpu.c
23
+++ b/target/arm/cpu.h
15
+++ b/target/s390x/cpu.c
24
@@ -XXX,XX +XXX,XX @@ enum arm_features {
16
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
25
ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
17
set_float_detect_tininess(float_tininess_before_rounding,
26
ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */
18
&env->fpu_status);
27
ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
19
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
28
+ ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */
20
+ set_float_infzeronan_rule(float_infzeronan_dnan_always,
29
ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
21
+ &env->fpu_status);
30
ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions. */
22
/* fall through */
31
ARM_FEATURE_M_MAIN, /* M profile Main Extension */
23
case RESET_TYPE_S390_CPU_NORMAL:
32
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
24
env->psw.mask &= ~PSW_MASK_RI;
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
34
--- a/linux-user/elfload.c
27
--- a/fpu/softfloat-specialize.c.inc
35
+++ b/linux-user/elfload.c
28
+++ b/fpu/softfloat-specialize.c.inc
36
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
37
ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
30
* a default NaN
38
GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
31
*/
39
GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
32
rule = float_infzeronan_dnan_never;
40
+ GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP);
33
-#elif defined(TARGET_S390X)
41
GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
34
- rule = float_infzeronan_dnan_always;
42
GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
43
#undef GET_FEATURE
44
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
45
index XXXXXXX..XXXXXXX 100644
46
--- a/target/arm/cpu.c
47
+++ b/target/arm/cpu.c
48
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
49
set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
50
set_feature(&cpu->env, ARM_FEATURE_CRC);
51
set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
52
+ set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
53
set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
54
#endif
35
#endif
55
}
36
}
56
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/cpu64.c
59
+++ b/target/arm/cpu64.c
60
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
61
set_feature(&cpu->env, ARM_FEATURE_CRC);
62
set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
63
set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
64
+ set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
65
set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
66
set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
67
set_feature(&cpu->env, ARM_FEATURE_SVE);
68
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
69
index XXXXXXX..XXXXXXX 100644
70
--- a/target/arm/translate-a64.c
71
+++ b/target/arm/translate-a64.c
72
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
73
vec_full_reg_size(s), gvec_op);
74
}
75
76
+/* Expand a 3-operand operation using an out-of-line helper. */
77
+static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
78
+ int rn, int rm, int data, gen_helper_gvec_3 *fn)
79
+{
80
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
81
+ vec_full_reg_offset(s, rn),
82
+ vec_full_reg_offset(s, rm),
83
+ is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
84
+}
85
+
86
/* Expand a 3-operand + env pointer operation using
87
* an out-of-line helper.
88
*/
89
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
90
}
91
feature = ARM_FEATURE_V8_RDM;
92
break;
93
+ case 0x02: /* SDOT (vector) */
94
+ case 0x12: /* UDOT (vector) */
95
+ if (size != MO_32) {
96
+ unallocated_encoding(s);
97
+ return;
98
+ }
99
+ feature = ARM_FEATURE_V8_DOTPROD;
100
+ break;
101
case 0x8: /* FCMLA, #0 */
102
case 0x9: /* FCMLA, #90 */
103
case 0xa: /* FCMLA, #180 */
104
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
105
}
106
return;
107
108
+ case 0x2: /* SDOT / UDOT */
109
+ gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0,
110
+ u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b);
111
+ return;
112
+
113
case 0x8: /* FCMLA, #0 */
114
case 0x9: /* FCMLA, #90 */
115
case 0xa: /* FCMLA, #180 */
116
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
117
return;
118
}
119
break;
120
+ case 0x0e: /* SDOT */
121
+ case 0x1e: /* UDOT */
122
+ if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
123
+ unallocated_encoding(s);
124
+ return;
125
+ }
126
+ break;
127
case 0x11: /* FCMLA #0 */
128
case 0x13: /* FCMLA #90 */
129
case 0x15: /* FCMLA #180 */
130
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
131
}
132
133
switch (16 * u + opcode) {
134
+ case 0x0e: /* SDOT */
135
+ case 0x1e: /* UDOT */
136
+ gen_gvec_op3_ool(s, is_q, rd, rn, rm, index,
137
+ u ? gen_helper_gvec_udot_idx_b
138
+ : gen_helper_gvec_sdot_idx_b);
139
+ return;
140
case 0x11: /* FCMLA #0 */
141
case 0x13: /* FCMLA #90 */
142
case 0x15: /* FCMLA #180 */
143
diff --git a/target/arm/translate.c b/target/arm/translate.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/target/arm/translate.c
146
+++ b/target/arm/translate.c
147
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
148
*/
149
static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
150
{
151
- gen_helper_gvec_3_ptr *fn_gvec_ptr;
152
- int rd, rn, rm, rot, size, opr_sz;
153
- TCGv_ptr fpst;
154
+ gen_helper_gvec_3 *fn_gvec = NULL;
155
+ gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
156
+ int rd, rn, rm, opr_sz;
157
+ int data = 0;
158
bool q;
159
160
q = extract32(insn, 6, 1);
161
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
162
163
if ((insn & 0xfe200f10) == 0xfc200800) {
164
/* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
165
- size = extract32(insn, 20, 1);
166
- rot = extract32(insn, 23, 2);
167
+ int size = extract32(insn, 20, 1);
168
+ data = extract32(insn, 23, 2); /* rot */
169
if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
170
|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
171
return 1;
172
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
173
fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
174
} else if ((insn & 0xfea00f10) == 0xfc800800) {
175
/* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
176
- size = extract32(insn, 20, 1);
177
- rot = extract32(insn, 24, 1);
178
+ int size = extract32(insn, 20, 1);
179
+ data = extract32(insn, 24, 1); /* rot */
180
if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
181
|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
182
return 1;
183
}
184
fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
185
+ } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
186
+ /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
187
+ bool u = extract32(insn, 4, 1);
188
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
189
+ return 1;
190
+ }
191
+ fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
192
} else {
193
return 1;
194
}
195
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
196
}
197
198
opr_sz = (1 + q) * 8;
199
- fpst = get_fpstatus_ptr(1);
200
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
201
- vfp_reg_offset(1, rn),
202
- vfp_reg_offset(1, rm), fpst,
203
- opr_sz, opr_sz, rot, fn_gvec_ptr);
204
- tcg_temp_free_ptr(fpst);
205
+ if (fn_gvec_ptr) {
206
+ TCGv_ptr fpst = get_fpstatus_ptr(1);
207
+ tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
208
+ vfp_reg_offset(1, rn),
209
+ vfp_reg_offset(1, rm), fpst,
210
+ opr_sz, opr_sz, data, fn_gvec_ptr);
211
+ tcg_temp_free_ptr(fpst);
212
+ } else {
213
+ tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
214
+ vfp_reg_offset(1, rn),
215
+ vfp_reg_offset(1, rm),
216
+ opr_sz, opr_sz, data, fn_gvec);
217
+ }
218
return 0;
219
}
220
221
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
222
223
static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
224
{
225
- gen_helper_gvec_3_ptr *fn_gvec_ptr;
226
+ gen_helper_gvec_3 *fn_gvec = NULL;
227
+ gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
228
int rd, rn, rm, opr_sz, data;
229
- TCGv_ptr fpst;
230
bool q;
231
232
q = extract32(insn, 6, 1);
233
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
234
data = (index << 2) | rot;
235
fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
236
: gen_helper_gvec_fcmlah_idx);
237
+ } else if ((insn & 0xffb00f00) == 0xfe200d00) {
238
+ /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
239
+ int u = extract32(insn, 4, 1);
240
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
241
+ return 1;
242
+ }
243
+ fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
244
+ /* rm is just Vm, and index is M. */
245
+ data = extract32(insn, 5, 1); /* index */
246
+ rm = extract32(insn, 0, 4);
247
} else {
248
return 1;
249
}
250
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
251
}
252
253
opr_sz = (1 + q) * 8;
254
- fpst = get_fpstatus_ptr(1);
255
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
256
- vfp_reg_offset(1, rn),
257
- vfp_reg_offset(1, rm), fpst,
258
- opr_sz, opr_sz, data, fn_gvec_ptr);
259
- tcg_temp_free_ptr(fpst);
260
+ if (fn_gvec_ptr) {
261
+ TCGv_ptr fpst = get_fpstatus_ptr(1);
262
+ tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
263
+ vfp_reg_offset(1, rn),
264
+ vfp_reg_offset(1, rm), fpst,
265
+ opr_sz, opr_sz, data, fn_gvec_ptr);
266
+ tcg_temp_free_ptr(fpst);
267
+ } else {
268
+ tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
269
+ vfp_reg_offset(1, rn),
270
+ vfp_reg_offset(1, rm),
271
+ opr_sz, opr_sz, data, fn_gvec);
272
+ }
273
return 0;
274
}
275
37
276
--
38
--
277
2.17.1
39
2.34.1
278
279
diff view generated by jsdifflib
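For orientation, the arithmetic that the SDOT/UDOT helpers wired up in the patch above are expected to perform can be modelled in plain C. This is only a reference sketch: the names sdot_b_ref and udot_b_ref are made up for illustration and are not the gen_helper_gvec_sdot_b / gen_helper_gvec_udot_b implementations. It assumes the Armv8.2 dot-product semantics, where each 32-bit lane accumulates the products of the four 8-bit elements in the matching lane of the two sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference model (not QEMU code): for each 32-bit lane i,
 * d[i] += n[4i+0]*m[4i+0] + ... + n[4i+3]*m[4i+3], signed bytes.
 */
static void sdot_b_ref(int32_t *d, const int8_t *n, const int8_t *m,
                       size_t lanes)
{
    assert(d && n && m);
    for (size_t i = 0; i < lanes; i++) {
        int32_t acc = 0;
        for (size_t j = 0; j < 4; j++) {
            acc += (int32_t)n[4 * i + j] * (int32_t)m[4 * i + j];
        }
        /* Accumulate with wrapping (unsigned) arithmetic to avoid C UB. */
        d[i] = (int32_t)((uint32_t)d[i] + (uint32_t)acc);
    }
}

/* Unsigned variant: same shape, unsigned bytes and accumulator. */
static void udot_b_ref(uint32_t *d, const uint8_t *n, const uint8_t *m,
                       size_t lanes)
{
    assert(d && n && m);
    for (size_t i = 0; i < lanes; i++) {
        uint32_t acc = 0;
        for (size_t j = 0; j < 4; j++) {
            acc += (uint32_t)n[4 * i + j] * (uint32_t)m[4 * i + j];
        }
        d[i] += acc;
    }
}
```

With is_q set the translator passes an operation size of 16 bytes, which corresponds to lanes == 4 in this model; otherwise 8 bytes, i.e. lanes == 2.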
New patch
1
Set the FloatInfZeroNaNRule explicitly for the PPC target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-8-peter.maydell@linaro.org
7
---
8
target/ppc/cpu_init.c | 7 +++++++
9
fpu/softfloat-specialize.c.inc | 7 +------
10
2 files changed, 8 insertions(+), 6 deletions(-)
11
12
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/ppc/cpu_init.c
15
+++ b/target/ppc/cpu_init.c
16
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
17
*/
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
19
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
20
+ /*
21
+ * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
22
+ * to return an input NaN if we have one (ie c) rather than generating
23
+ * a default NaN
24
+ */
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
26
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);
27
28
for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
29
ppc_spr_t *spr = &env->spr_cb[i];
30
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
31
index XXXXXXX..XXXXXXX 100644
32
--- a/fpu/softfloat-specialize.c.inc
33
+++ b/fpu/softfloat-specialize.c.inc
34
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
35
*/
36
rule = float_infzeronan_dnan_never;
37
}
38
-#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
39
+#elif defined(TARGET_SPARC) || \
40
defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
41
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
42
/*
43
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
44
* case sets InvalidOp and returns the input value 'c'
45
*/
46
- /*
47
- * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
48
- * to return an input NaN if we have one (ie c) rather than generating
49
- * a default NaN
50
- */
51
rule = float_infzeronan_dnan_never;
52
#endif
53
}
54
--
55
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the FloatInfZeroNaNRule explicitly for the MIPS target,
2
so we can remove the ifdef from pickNaNMulAdd().
2
3
3
We already check for the same condition within the normal integer
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
sdiv and sdiv64 helpers. Use a slightly different formulation that
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
does not require deducing the expression type.
6
Message-id: 20241202131347.498124-9-peter.maydell@linaro.org
7
---
8
target/mips/fpu_helper.h | 9 +++++++++
9
target/mips/msa.c | 4 ++++
10
fpu/softfloat-specialize.c.inc | 16 +---------------
11
3 files changed, 14 insertions(+), 15 deletions(-)
6
12
7
Fixes: f97cfd596ed
13
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Message-id: 20180629001538.11415-2-richard.henderson@linaro.org
12
[PMM: reworded a comment]
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
---
15
target/arm/sve_helper.c | 20 +++++++++++++++-----
16
1 file changed, 15 insertions(+), 5 deletions(-)
17
18
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
19
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/sve_helper.c
15
--- a/target/mips/fpu_helper.h
21
+++ b/target/arm/sve_helper.c
16
+++ b/target/mips/fpu_helper.h
22
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
17
@@ -XXX,XX +XXX,XX @@ static inline void restore_flush_mode(CPUMIPSState *env)
23
#define DO_MIN(N, M) ((N) >= (M) ? (M) : (N))
18
static inline void restore_snan_bit_mode(CPUMIPSState *env)
24
#define DO_ABD(N, M) ((N) >= (M) ? (N) - (M) : (M) - (N))
19
{
25
#define DO_MUL(N, M) (N * M)
20
bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
26
-#define DO_DIV(N, M) (M ? N / M : 0)
21
+ FloatInfZeroNaNRule izn_rule;
22
23
/*
24
* With nan2008, SNaNs are silenced in the usual way.
25
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
26
*/
27
set_snan_bit_is_one(!nan2008, &env->active_fpu.fp_status);
28
set_default_nan_mode(!nan2008, &env->active_fpu.fp_status);
29
+ /*
30
+ * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
31
+ * case sets InvalidOp and returns the default NaN.
32
+ * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
33
+ * case sets InvalidOp and returns the input value 'c'.
34
+ */
35
+ izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
36
+ set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
37
}
38
39
static inline void restore_fp_status(CPUMIPSState *env)
40
diff --git a/target/mips/msa.c b/target/mips/msa.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/mips/msa.c
43
+++ b/target/mips/msa.c
44
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
45
46
/* set proper signanling bit meaning ("1" means "quiet") */
47
set_snan_bit_is_one(0, &env->active_tc.msa_fp_status);
27
+
48
+
28
+
49
+ /* Inf * 0 + NaN returns the input NaN */
29
+/*
50
+ set_float_infzeronan_rule(float_infzeronan_dnan_never,
30
+ * We must avoid the C undefined behaviour cases: division by
51
+ &env->active_tc.msa_fp_status);
31
+ * zero and signed division of INT_MIN by -1. Both of these
52
}
32
+ * have architecturally defined required results for Arm.
53
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
+ * We special case all signed divisions by -1 to avoid having
54
index XXXXXXX..XXXXXXX 100644
34
+ * to deduce the minimum integer for the type involved.
55
--- a/fpu/softfloat-specialize.c.inc
35
+ */
56
+++ b/fpu/softfloat-specialize.c.inc
36
+#define DO_SDIV(N, M) (unlikely(M == 0) ? 0 : unlikely(M == -1) ? -N : N / M)
57
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
37
+#define DO_UDIV(N, M) (unlikely(M == 0) ? 0 : N / M)
58
/*
38
59
* Temporarily fall back to ifdef ladder
39
DO_ZPZZ(sve_and_zpzz_b, uint8_t, H1, DO_AND)
60
*/
40
DO_ZPZZ(sve_and_zpzz_h, uint16_t, H1_2, DO_AND)
61
-#if defined(TARGET_MIPS)
41
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ(sve_umulh_zpzz_h, uint16_t, H1_2, do_mulh_h)
62
- if (snan_bit_is_one(status)) {
42
DO_ZPZZ(sve_umulh_zpzz_s, uint32_t, H1_4, do_mulh_s)
63
- /*
43
DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d)
64
- * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
44
65
- * case sets InvalidOp and returns the default NaN
45
-DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV)
66
- */
46
-DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
67
- rule = float_infzeronan_dnan_always;
47
+DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_SDIV)
68
- } else {
48
+DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_SDIV)
69
- /*
49
70
- * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
50
-DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
71
- * case sets InvalidOp and returns the input value 'c'
51
-DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
72
- */
52
+DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_UDIV)
73
- rule = float_infzeronan_dnan_never;
53
+DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_UDIV)
74
- }
54
75
-#elif defined(TARGET_SPARC) || \
55
/* Note that all bits of the shift are significant
76
+#if defined(TARGET_SPARC) || \
56
and not modulo the element size. */
77
defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
78
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
79
/*
57
--
80
--
58
2.17.1
81
2.34.1
59
60
diff view generated by jsdifflib
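The DO_SDIV/DO_UDIV change in the sve_helper.c patch above avoids two C undefined-behaviour cases (division by zero, and a signed minimum divided by -1) by special-casing the divisor. The same idea can be sketched as ordinary functions. The names below are hypothetical and this is not the QEMU macro; unlike the macro, the sketch negates through unsigned arithmetic so that it does not rely on how the compiler treats signed overflow (the final conversion back to int32_t is implementation-defined, and yields the wrapped value on common compilers).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: signed divide with Arm-style defined results. */
static int32_t sdiv32_sketch(int32_t n, int32_t m)
{
    if (m == 0) {
        return 0;                       /* x / 0 is defined as 0 */
    }
    if (m == -1) {
        /* Covers INT32_MIN / -1 without naming INT32_MIN: wraps to n. */
        return (int32_t)(0u - (uint32_t)n);
    }
    return n / m;
}

/* Illustrative only: unsigned divide needs just the zero check. */
static uint32_t udiv32_sketch(uint32_t n, uint32_t m)
{
    return m == 0 ? 0 : n / m;
}

/* Small self-check showing the two special cases. */
static void div_sketch_selfcheck(void)
{
    assert(sdiv32_sketch(7, 0) == 0);
    assert(sdiv32_sketch(INT32_MIN, -1) == INT32_MIN);
    assert(udiv32_sketch(7, 0) == 0);
}
```

Special-casing every division by -1, rather than only the minimum-value case, is what lets one definition serve both the 32-bit and 64-bit instantiations without knowing the element type.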
New patch
1
Set the FloatInfZeroNaNRule explicitly for the SPARC target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-10-peter.maydell@linaro.org
7
---
8
target/sparc/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 3 +--
10
2 files changed, 3 insertions(+), 2 deletions(-)
11
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/sparc/cpu.c
15
+++ b/target/sparc/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
17
* the CPU state struct so it won't get zeroed on reset.
18
*/
19
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
20
+ /* For inf * 0 + NaN, return the input NaN */
21
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
23
cpu_exec_realizefn(cs, &local_err);
24
if (local_err != NULL) {
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
/*
31
* Temporarily fall back to ifdef ladder
32
*/
33
-#if defined(TARGET_SPARC) || \
34
- defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
35
+#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
36
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
37
/*
38
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
39
--
40
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the xtensa target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-11-peter.maydell@linaro.org
7
---
8
target/xtensa/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 2 +-
10
2 files changed, 3 insertions(+), 1 deletion(-)
11
12
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/xtensa/cpu.c
15
+++ b/target/xtensa/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
17
reset_mmu(env);
18
cs->halted = env->runstall;
19
#endif
20
+ /* For inf * 0 + NaN, return the input NaN */
21
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
set_no_signaling_nans(!dfpu, &env->fp_status);
23
xtensa_use_first_nan(env, !dfpu);
24
}
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
/*
31
* Temporarily fall back to ifdef ladder
32
*/
33
-#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
34
+#if defined(TARGET_HPPA) || \
35
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
36
/*
37
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
38
--
39
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the x86 target.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-12-peter.maydell@linaro.org
6
---
7
target/i386/tcg/fpu_helper.c | 7 +++++++
8
fpu/softfloat-specialize.c.inc | 2 +-
9
2 files changed, 8 insertions(+), 1 deletion(-)
10
11
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/i386/tcg/fpu_helper.c
14
+++ b/target/i386/tcg/fpu_helper.c
15
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
16
*/
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->mmx_status);
18
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->sse_status);
19
+ /*
20
+ * Only SSE has multiply-add instructions. In the SDM Section 14.5.2
21
+ * "Fused-Multiply-ADD (FMA) Numeric Behavior" the NaN handling is
22
+ * specified -- for 0 * inf + NaN the input NaN is selected, and if
23
+ * there are multiple input NaNs they are selected in the order a, b, c.
24
+ */
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
26
}
27
28
static inline uint8_t save_exception_flags(CPUX86State *env)
29
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
30
index XXXXXXX..XXXXXXX 100644
31
--- a/fpu/softfloat-specialize.c.inc
32
+++ b/fpu/softfloat-specialize.c.inc
33
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
34
* Temporarily fall back to ifdef ladder
35
*/
36
#if defined(TARGET_HPPA) || \
37
- defined(TARGET_I386) || defined(TARGET_LOONGARCH)
38
+ defined(TARGET_LOONGARCH)
39
/*
40
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
41
* case sets InvalidOp and returns the input value 'c'
42
--
43
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the loongarch target.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-13-peter.maydell@linaro.org
6
---
7
target/loongarch/tcg/fpu_helper.c | 5 +++++
8
fpu/softfloat-specialize.c.inc | 7 +------
9
2 files changed, 6 insertions(+), 6 deletions(-)
10
11
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/loongarch/tcg/fpu_helper.c
14
+++ b/target/loongarch/tcg/fpu_helper.c
15
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
16
&env->fp_status);
17
set_flush_to_zero(0, &env->fp_status);
18
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
19
+ /*
20
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
21
+ * case sets InvalidOp and returns the input value 'c'
22
+ */
23
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
}
25
26
int ieee_ex_to_loongarch(int xcpt)
27
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
28
index XXXXXXX..XXXXXXX 100644
29
--- a/fpu/softfloat-specialize.c.inc
30
+++ b/fpu/softfloat-specialize.c.inc
31
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
32
/*
33
* Temporarily fall back to ifdef ladder
34
*/
35
-#if defined(TARGET_HPPA) || \
36
- defined(TARGET_LOONGARCH)
37
- /*
38
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
39
- * case sets InvalidOp and returns the input value 'c'
40
- */
41
+#if defined(TARGET_HPPA)
42
rule = float_infzeronan_dnan_never;
43
#endif
44
}
45
--
46
2.34.1
diff view generated by jsdifflib
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
1
Set the FloatInfZeroNaNRule explicitly for the HPPA target,
2
so we can remove the ifdef from pickNaNMulAdd().
2
3
3
Use assert() instead of error_setg(&error_abort),
4
As this is the last target to be converted to explicitly setting
4
as suggested by the "qapi/error.h" documentation:
5
the rule, we can remove the fallback code in pickNaNMulAdd()
6
entirely.
5
7
6
Please don't error_setg(&error_fatal, ...), use error_report() and
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
exit(), because that's more obvious.
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Likewise, don't error_setg(&error_abort, ...), use assert().
10
Message-id: 20241202131347.498124-14-peter.maydell@linaro.org
11
---
12
target/hppa/fpu_helper.c | 2 ++
13
fpu/softfloat-specialize.c.inc | 13 +------------
14
2 files changed, 3 insertions(+), 12 deletions(-)
9
15
10
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
16
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
11
Acked-by: John Snow <jsnow@redhat.com>
12
Message-id: 20180625165749.3910-2-f4bug@amsat.org
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
---
15
hw/block/fdc.c | 9 +--------
16
1 file changed, 1 insertion(+), 8 deletions(-)
17
18
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/hw/block/fdc.c
18
--- a/target/hppa/fpu_helper.c
21
+++ b/hw/block/fdc.c
19
+++ b/target/hppa/fpu_helper.c
22
@@ -XXX,XX +XXX,XX @@ static int pick_geometry(FDrive *drv)
20
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
23
nb_sectors,
21
* HPPA does note implement a CPU reset method at all...
24
FloppyDriveType_str(parse->drive));
22
*/
25
}
23
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
26
+ assert(type_match != -1 && "misconfigured fd_format");
24
+ /* For inf * 0 + NaN, return the input NaN */
27
match = type_match;
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
28
}
26
}
27
28
void cpu_hppa_loaded_fr0(CPUHPPAState *env)
29
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
30
index XXXXXXX..XXXXXXX 100644
31
--- a/fpu/softfloat-specialize.c.inc
32
+++ b/fpu/softfloat-specialize.c.inc
33
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
34
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
35
bool infzero, float_status *status)
36
{
37
- FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
29
-
38
-
30
- /* No match of any kind found -- fd_format is misconfigured, abort. */
39
/*
31
- if (match == -1) {
40
* We guarantee not to require the target to tell us how to
32
- error_setg(&error_abort, "No candidate geometries present in table "
41
* pick a NaN if we're always returning the default NaN.
33
- " for floppy drive type '%s'",
42
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
34
- FloppyDriveType_str(drv->drive));
43
*/
44
assert(!status->default_nan_mode);
45
46
- if (rule == float_infzeronan_none) {
47
- /*
48
- * Temporarily fall back to ifdef ladder
49
- */
50
-#if defined(TARGET_HPPA)
51
- rule = float_infzeronan_dnan_never;
52
-#endif
35
- }
53
- }
36
-
54
-
37
parse = &(fd_formats[match]);
55
if (infzero) {
38
56
/*
39
out:
57
* Inf * 0 + NaN -- some implementations return the default NaN here,
58
* and some return the input NaN.
59
*/
60
- switch (rule) {
61
+ switch (status->float_infzeronan_rule) {
62
case float_infzeronan_dnan_never:
63
return 2;
64
case float_infzeronan_dnan_always:
40
--
65
--
41
2.17.1
66
2.34.1
42
43
diff view generated by jsdifflib
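With the last ifdef fallback gone, the (inf * 0) + NaN choice depends only on the rule stored in the status structure at reset or init time. A minimal sketch of that shape follows, using simplified stand-in types. Every identifier here (InfZeroNaNRuleSketch, FloatStatusSketch, pick_infzero_nan) is hypothetical; only the return encoding (2 for operand c, 3 for the default NaN) and the dnan_never / dnan_always distinction are taken from the patches above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for the per-target rule; 0 means "never set". */
typedef enum {
    IZN_NONE = 0,
    IZN_DNAN_NEVER,     /* propagate the input NaN 'c' */
    IZN_DNAN_ALWAYS,    /* return the default NaN */
} InfZeroNaNRuleSketch;

typedef struct {
    bool default_nan_mode;
    InfZeroNaNRuleSketch infzeronan_rule;
} FloatStatusSketch;

enum { PICK_C = 2, PICK_DEFAULT_NAN = 3 };

/* Decide the result of (inf * 0) + NaN purely from runtime state. */
static int pick_infzero_nan(const FloatStatusSketch *st)
{
    /* In default-NaN mode the caller never needs to pick a NaN. */
    assert(!st->default_nan_mode);

    switch (st->infzeronan_rule) {
    case IZN_DNAN_NEVER:
        return PICK_C;
    case IZN_DNAN_ALWAYS:
        return PICK_DEFAULT_NAN;
    default:
        abort();        /* target forgot to set a rule */
    }
}
```

A target with a mode bit, such as the MIPS nan2008 case above, would then select the rule when that bit changes, for example `st.infzeronan_rule = nan2008 ? IZN_DNAN_NEVER : IZN_DNAN_ALWAYS;`.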
1
From: Eric Auger <eric.auger@redhat.com>
1
The new implementation of pickNaNMulAdd() will find it convenient
2
to know whether at least one of the three arguments to the muladd
3
was a signaling NaN. We already calculate that in the caller,
4
so pass it in as a new bool have_snan.
2
5
3
When running dtc on the guest /proc/device-tree we get the
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
following warning: "Warning (unit_address_vs_reg): Node /memory
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
has a reg or ranges property, but no unit name".
8
Message-id: 20241202131347.498124-15-peter.maydell@linaro.org
9
---
10
fpu/softfloat-parts.c.inc | 5 +++--
11
fpu/softfloat-specialize.c.inc | 2 +-
12
2 files changed, 4 insertions(+), 3 deletions(-)
6
13
7
Let's fix that by adding the unit address to the node name. We also
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
8
don't create the /memory node anymore in create_fdt(). We directly
9
create it in load_dtb. /chosen still needs to be created in create_fdt
10
as the uart needs it. In case the user provided their own dtb, we nop
11
all memory nodes found in root and create new one(s).
12
13
Signed-off-by: Eric Auger <eric.auger@redhat.com>
14
Message-id: 1530044492-24921-4-git-send-email-eric.auger@redhat.com
15
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
---
18
hw/arm/boot.c | 41 +++++++++++++++++++++++------------------
19
hw/arm/virt.c | 7 +------
20
2 files changed, 24 insertions(+), 24 deletions(-)
21
22
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
23
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
24
--- a/hw/arm/boot.c
16
--- a/fpu/softfloat-parts.c.inc
25
+++ b/hw/arm/boot.c
17
+++ b/fpu/softfloat-parts.c.inc
26
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
27
hwaddr addr_limit, AddressSpace *as)
28
{
19
{
29
void *fdt = NULL;
20
int which;
30
- int size, rc;
21
bool infzero = (ab_mask == float_cmask_infzero);
31
+ int size, rc, n = 0;
22
+ bool have_snan = (abc_mask & float_cmask_snan);
32
uint32_t acells, scells;
23
33
char *nodename;
24
- if (unlikely(abc_mask & float_cmask_snan)) {
34
unsigned int i;
25
+ if (unlikely(have_snan)) {
35
hwaddr mem_base, mem_len;
26
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
36
+ char **node_path;
37
+ Error *err = NULL;
38
39
if (binfo->dtb_filename) {
40
char *filename;
41
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
42
goto fail;
43
}
27
}
44
28
45
+ /* nop all root nodes matching /memory or /memory@unit-address */
29
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
46
+ node_path = qemu_fdt_node_unit_path(fdt, "memory", &err);
30
if (s->default_nan_mode) {
47
+ if (err) {
31
which = 3;
48
+ error_report_err(err);
49
+ goto fail;
50
+ }
51
+ while (node_path[n]) {
52
+ if (g_str_has_prefix(node_path[n], "/memory")) {
53
+ qemu_fdt_nop_node(fdt, node_path[n]);
54
+ }
55
+ n++;
56
+ }
57
+ g_strfreev(node_path);
58
+
59
if (nb_numa_nodes > 0) {
60
- /*
61
- * Turn the /memory node created before into a NOP node, then create
62
- * /memory@addr nodes for all numa nodes respectively.
63
- */
64
- qemu_fdt_nop_node(fdt, "/memory");
65
mem_base = binfo->loader_start;
66
for (i = 0; i < nb_numa_nodes; i++) {
67
mem_len = numa_info[i].node_mem;
68
@@ -XXX,XX +XXX,XX @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
69
g_free(nodename);
70
}
71
} else {
32
} else {
72
- Error *err = NULL;
33
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
73
+ nodename = g_strdup_printf("/memory@%" PRIx64, binfo->loader_start);
34
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
74
+ qemu_fdt_add_subnode(fdt, nodename);
75
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
76
77
- rc = fdt_path_offset(fdt, "/memory");
78
- if (rc < 0) {
79
- qemu_fdt_add_subnode(fdt, "/memory");
80
- }
81
-
82
- if (!qemu_fdt_getprop(fdt, "/memory", "device_type", NULL, &err)) {
83
- qemu_fdt_setprop_string(fdt, "/memory", "device_type", "memory");
84
- }
85
-
86
- rc = qemu_fdt_setprop_sized_cells(fdt, "/memory", "reg",
87
+ rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
88
acells, binfo->loader_start,
89
scells, binfo->ram_size);
90
if (rc < 0) {
91
- fprintf(stderr, "couldn't set /memory/reg\n");
92
+ fprintf(stderr, "couldn't set %s reg\n", nodename);
93
goto fail;
94
}
95
+ g_free(nodename);
96
}
35
}
97
36
98
rc = fdt_path_offset(fdt, "/chosen");
37
if (which == 3) {
99
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
38
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
100
index XXXXXXX..XXXXXXX 100644
39
index XXXXXXX..XXXXXXX 100644
101
--- a/hw/arm/virt.c
40
--- a/fpu/softfloat-specialize.c.inc
102
+++ b/hw/arm/virt.c
41
+++ b/fpu/softfloat-specialize.c.inc
103
@@ -XXX,XX +XXX,XX @@ static void create_fdt(VirtMachineState *vms)
42
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
104
qemu_fdt_setprop_cell(fdt, "/", "#address-cells", 0x2);
43
| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
105
qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
44
*----------------------------------------------------------------------------*/
106
45
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
107
- /*
46
- bool infzero, float_status *status)
108
- * /chosen and /memory nodes must exist for load_dtb
47
+ bool infzero, bool have_snan, float_status *status)
109
- * to fill in necessary properties later
48
{
110
- */
49
/*
111
+ /* /chosen must exist for load_dtb to fill in necessary properties later */
50
* We guarantee not to require the target to tell us how to
112
qemu_fdt_add_subnode(fdt, "/chosen");
113
- qemu_fdt_add_subnode(fdt, "/memory");
114
- qemu_fdt_setprop_string(fdt, "/memory", "device_type", "memory");
115
116
/* Clock node, for the benefit of the UART. The kernel device tree
117
* binding documentation claims the PL011 node clock properties are
118
--
51
--
119
2.17.1
52
2.34.1
120
121
diff view generated by jsdifflib
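The hw/arm/boot.c change above names the memory node after its unit address, which is what silences the dtc unit_address_vs_reg warning. The naming step can be sketched in standalone C. The patch itself uses g_strdup_printf() from GLib; this sketch substitutes snprintf() so that it builds without GLib, and memory_node_name is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format "/memory@<base in lowercase hex>" into buf.
 * Returns the length snprintf() would have written (excluding the NUL).
 */
static int memory_node_name(char *buf, size_t len, uint64_t base)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "/memory@%" PRIx64, base);
}
```

For a RAM base of 0x40000000 this produces the node name /memory@40000000, matching the first cell group of the node's reg property.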
1
From: Richard Henderson <richard.henderson@linaro.org>
1
IEEE 754 does not define a fixed rule for which NaN to pick as the
2
2
result if two or more operands of a 3-operand fused multiply-add operation
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
are NaNs. As a result different architectures have ended up with
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
different rules for propagating NaNs.
5
Message-id: 20180627043328.11531-21-richard.henderson@linaro.org
5
6
QEMU currently hardcodes the NaN propagation logic into the binary
7
because pickNaNMulAdd() has an ifdef ladder for different targets.
8
We want to make the propagation rule instead be selectable at
9
runtime, because:
10
* this will let us have multiple targets in one QEMU binary
11
* the Arm FEAT_AFP architectural feature includes letting
12
the guest select a NaN propagation rule at runtime
13
14
In this commit we add an enum for the propagation rule, the field in
15
float_status, and the corresponding getters and setters. We change
16
pickNaNMulAdd to honour this, but because all targets still leave
17
this field at its default 0 value, the fallback logic will pick the
18
rule type with the old ifdef ladder.
19
20
It's valid not to set a propagation rule if default_nan_mode is
21
enabled, because in that case there's no need to pick a NaN; all the
22
callers of pickNaNMulAdd() catch this case and skip calling it.
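
As a rough illustration of what a runtime-selected propagation rule looks like, the sketch below stores a rule in a status struct and consults it when picking among three operands. It is a simplified model under stated assumptions: the rule names (RULE3_ABC, RULE3_CAB, RULE3_S_ABC), the types and the functions are all hypothetical and do not reproduce the Float3NaNPropRule values or the pickNaNMulAdd() code from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { CLS_NUMBER, CLS_QNAN, CLS_SNAN } ClsSketch;

/* Hypothetical rules; 0 is "unset", like a zero-initialised status. */
typedef enum {
    RULE3_NONE = 0,
    RULE3_ABC,      /* first NaN in order a, b, c */
    RULE3_CAB,      /* first NaN in order c, a, b */
    RULE3_S_ABC,    /* prefer a signaling NaN, then order a, b, c */
} Rule3Sketch;

typedef struct { Rule3Sketch rule3; } StatusSketch;

/* Setter in the style of the softfloat helpers: rule first, status last. */
static void set_rule3(Rule3Sketch rule, StatusSketch *st)
{
    st->rule3 = rule;
}

/* Returns 0, 1 or 2 for operand a, b or c; at least one must be a NaN. */
static int pick_nan_3(const ClsSketch cls[3], const StatusSketch *st)
{
    static const int abc[3] = { 0, 1, 2 };
    static const int cab[3] = { 2, 0, 1 };
    const int *order = abc;
    bool snan_first = false;

    switch (st->rule3) {
    case RULE3_ABC:   break;
    case RULE3_CAB:   order = cab; break;
    case RULE3_S_ABC: snan_first = true; break;
    default:          abort();          /* rule was never set */
    }
    if (snan_first) {
        for (int i = 0; i < 3; i++) {
            if (cls[order[i]] == CLS_SNAN) {
                return order[i];
            }
        }
    }
    for (int i = 0; i < 3; i++) {
        if (cls[order[i]] != CLS_NUMBER) {
            return order[i];
        }
    }
    abort();                            /* caller passed no NaN at all */
}
```

Because the rule lives in per-status state rather than behind a TARGET_* ifdef, two statuses in the same binary can use different rules, which is the property the commit message is after.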
23
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
25
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
26
Message-id: 20241202131347.498124-16-peter.maydell@linaro.org
7
---
27
---
8
target/arm/helper.h | 8 +++++++
28
include/fpu/softfloat-helpers.h | 11 +++
9
target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++
29
include/fpu/softfloat-types.h | 55 +++++++++++
10
target/arm/vec_helper.c | 20 ++++++++++++++++
30
fpu/softfloat-specialize.c.inc | 167 ++++++++------------------------
11
target/arm/sve.decode | 5 ++++
31
3 files changed, 107 insertions(+), 126 deletions(-)
12
4 files changed, 80 insertions(+)
32
13
33
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
35
--- a/include/fpu/softfloat-helpers.h
17
+++ b/target/arm/helper.h
36
+++ b/include/fpu/softfloat-helpers.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
37
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
19
DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
38
status->float_2nan_prop_rule = rule;
20
void, ptr, ptr, ptr, ptr, i32)
39
}
21
40
22
+DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
41
+static inline void set_float_3nan_prop_rule(Float3NaNPropRule rule,
23
+DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
+ float_status *status)
24
+DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
43
+{
25
+
44
+ status->float_3nan_prop_rule = rule;
26
+DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
45
+}
27
+DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
46
+
28
+DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
29
+
48
float_status *status)
30
DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
49
{
31
DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
50
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
32
DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
51
return status->float_2nan_prop_rule;
33
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
52
}
53
54
+static inline Float3NaNPropRule get_float_3nan_prop_rule(float_status *status)
55
+{
56
+ return status->float_3nan_prop_rule;
57
+}
58
+
59
static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
60
{
61
return status->float_infzeronan_rule;
62
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
34
index XXXXXXX..XXXXXXX 100644
63
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/translate-sve.c
64
--- a/include/fpu/softfloat-types.h
36
+++ b/target/arm/translate-sve.c
65
+++ b/include/fpu/softfloat-types.h
37
@@ -XXX,XX +XXX,XX @@ DO_VPZ(FMAXNMV, fmaxnmv)
66
@@ -XXX,XX +XXX,XX @@ this code that are retained.
38
DO_VPZ(FMINV, fminv)
67
#ifndef SOFTFLOAT_TYPES_H
39
DO_VPZ(FMAXV, fmaxv)
68
#define SOFTFLOAT_TYPES_H
69
70
+#include "hw/registerfields.h"
71
+
72
/*
73
* Software IEC/IEEE floating-point types.
74
*/
75
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
76
float_2nan_prop_x87,
77
} Float2NaNPropRule;
40
78
41
+/*
79
+/*
42
+ *** SVE Floating Point Unary Operations - Unpredicated Group
80
+ * 3-input NaN propagation rule, for fused multiply-add. Individual
81
+ * architectures have different rules for which input NaN is
82
+ * propagated to the output when there is more than one NaN on the
83
+ * input.
84
+ *
85
+ * If default_nan_mode is enabled then it is valid not to set a NaN
86
+ * propagation rule, because the softfloat code guarantees not to try
87
+ * to pick a NaN to propagate in default NaN mode. When not in
88
+ * default-NaN mode, it is an error for the target not to set the rule
89
+ * in float_status if it uses a muladd, and we will assert if we need
90
+ * to handle an input NaN and no rule was selected.
91
+ *
92
+ * The naming scheme for Float3NaNPropRule values is:
93
+ * float_3nan_prop_s_abc:
94
+ * = "Prefer SNaN over QNaN, then operand A over B over C"
95
+ * float_3nan_prop_abc:
96
+ * = "Prefer A over B over C regardless of SNaN vs QNaN"
97
+ *
98
+ * For QEMU, the multiply-add operation is A * B + C.
43
+ */
99
+ */
44
+
100
+
45
+static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn)
101
+/*
46
+{
102
+ * We set the Float3NaNPropRule enum values up so we can select the
47
+ unsigned vsz = vec_full_reg_size(s);
103
+ * right value in pickNaNMulAdd in a data driven way.
48
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
104
+ */
49
+
105
+FIELD(3NAN, 1ST, 0, 2) /* which operand is most preferred ? */
50
+ tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd),
106
+FIELD(3NAN, 2ND, 2, 2) /* which operand is next most preferred ? */
51
+ vec_full_reg_offset(s, a->rn),
107
+FIELD(3NAN, 3RD, 4, 2) /* which operand is least preferred ? */
52
+ status, vsz, vsz, 0, fn);
108
+FIELD(3NAN, SNAN, 6, 1) /* do we prefer SNaN over QNaN ? */
53
+ tcg_temp_free_ptr(status);
109
+
54
+}
110
+#define PROPRULE(X, Y, Z) \
55
+
111
+ ((X << R_3NAN_1ST_SHIFT) | (Y << R_3NAN_2ND_SHIFT) | (Z << R_3NAN_3RD_SHIFT))
56
+static bool trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
112
+
57
+{
113
+typedef enum __attribute__((__packed__)) {
58
+ static gen_helper_gvec_2_ptr * const fns[3] = {
114
+ float_3nan_prop_none = 0, /* No propagation rule specified */
59
+ gen_helper_gvec_frecpe_h,
115
+ float_3nan_prop_abc = PROPRULE(0, 1, 2),
60
+ gen_helper_gvec_frecpe_s,
116
+ float_3nan_prop_acb = PROPRULE(0, 2, 1),
61
+ gen_helper_gvec_frecpe_d,
117
+ float_3nan_prop_bac = PROPRULE(1, 0, 2),
62
+ };
118
+ float_3nan_prop_bca = PROPRULE(1, 2, 0),
63
+ if (a->esz == 0) {
119
+ float_3nan_prop_cab = PROPRULE(2, 0, 1),
64
+ return false;
120
+ float_3nan_prop_cba = PROPRULE(2, 1, 0),
121
+ float_3nan_prop_s_abc = float_3nan_prop_abc | R_3NAN_SNAN_MASK,
122
+ float_3nan_prop_s_acb = float_3nan_prop_acb | R_3NAN_SNAN_MASK,
123
+ float_3nan_prop_s_bac = float_3nan_prop_bac | R_3NAN_SNAN_MASK,
124
+ float_3nan_prop_s_bca = float_3nan_prop_bca | R_3NAN_SNAN_MASK,
125
+ float_3nan_prop_s_cab = float_3nan_prop_cab | R_3NAN_SNAN_MASK,
126
+ float_3nan_prop_s_cba = float_3nan_prop_cba | R_3NAN_SNAN_MASK,
127
+} Float3NaNPropRule;
128
+
129
+#undef PROPRULE
130
+
131
/*
132
* Rule for result of fused multiply-add 0 * Inf + NaN.
133
* This must be a NaN, but implementations differ on whether this
134
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
135
FloatRoundMode float_rounding_mode;
136
FloatX80RoundPrec floatx80_rounding_precision;
137
Float2NaNPropRule float_2nan_prop_rule;
138
+ Float3NaNPropRule float_3nan_prop_rule;
139
FloatInfZeroNaNRule float_infzeronan_rule;
140
bool tininess_before_rounding;
141
/* should denormalised results go to zero and set the inexact flag? */
142
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
143
index XXXXXXX..XXXXXXX 100644
144
--- a/fpu/softfloat-specialize.c.inc
145
+++ b/fpu/softfloat-specialize.c.inc
146
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
147
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
148
bool infzero, bool have_snan, float_status *status)
149
{
150
+ FloatClass cls[3] = { a_cls, b_cls, c_cls };
151
+ Float3NaNPropRule rule = status->float_3nan_prop_rule;
152
+ int which;
153
+
154
/*
155
* We guarantee not to require the target to tell us how to
156
* pick a NaN if we're always returning the default NaN.
157
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
158
}
159
}
160
161
+ if (rule == float_3nan_prop_none) {
162
#if defined(TARGET_ARM)
163
-
164
- /* This looks different from the ARM ARM pseudocode, because the ARM ARM
165
- * puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
166
- */
167
- if (is_snan(c_cls)) {
168
- return 2;
169
- } else if (is_snan(a_cls)) {
170
- return 0;
171
- } else if (is_snan(b_cls)) {
172
- return 1;
173
- } else if (is_qnan(c_cls)) {
174
- return 2;
175
- } else if (is_qnan(a_cls)) {
176
- return 0;
177
- } else {
178
- return 1;
179
- }
180
+ /*
181
+ * This looks different from the ARM ARM pseudocode, because the ARM ARM
182
+ * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
183
+ */
184
+ rule = float_3nan_prop_s_cab;
185
#elif defined(TARGET_MIPS)
186
- if (snan_bit_is_one(status)) {
187
- /* Prefer sNaN over qNaN, in the a, b, c order. */
188
- if (is_snan(a_cls)) {
189
- return 0;
190
- } else if (is_snan(b_cls)) {
191
- return 1;
192
- } else if (is_snan(c_cls)) {
193
- return 2;
194
- } else if (is_qnan(a_cls)) {
195
- return 0;
196
- } else if (is_qnan(b_cls)) {
197
- return 1;
198
+ if (snan_bit_is_one(status)) {
199
+ rule = float_3nan_prop_s_abc;
200
} else {
201
- return 2;
202
+ rule = float_3nan_prop_s_cab;
203
}
204
- } else {
205
- /* Prefer sNaN over qNaN, in the c, a, b order. */
206
- if (is_snan(c_cls)) {
207
- return 2;
208
- } else if (is_snan(a_cls)) {
209
- return 0;
210
- } else if (is_snan(b_cls)) {
211
- return 1;
212
- } else if (is_qnan(c_cls)) {
213
- return 2;
214
- } else if (is_qnan(a_cls)) {
215
- return 0;
216
- } else {
217
- return 1;
218
- }
219
- }
220
#elif defined(TARGET_LOONGARCH64)
221
- /* Prefer sNaN over qNaN, in the c, a, b order. */
222
- if (is_snan(c_cls)) {
223
- return 2;
224
- } else if (is_snan(a_cls)) {
225
- return 0;
226
- } else if (is_snan(b_cls)) {
227
- return 1;
228
- } else if (is_qnan(c_cls)) {
229
- return 2;
230
- } else if (is_qnan(a_cls)) {
231
- return 0;
232
- } else {
233
- return 1;
234
- }
235
+ rule = float_3nan_prop_s_cab;
236
#elif defined(TARGET_PPC)
237
- /* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
238
- * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
239
- */
240
- if (is_nan(a_cls)) {
241
- return 0;
242
- } else if (is_nan(c_cls)) {
243
- return 2;
244
- } else {
245
- return 1;
246
- }
247
+ /*
248
+ * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
249
+ * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
250
+ */
251
+ rule = float_3nan_prop_acb;
252
#elif defined(TARGET_S390X)
253
- if (is_snan(a_cls)) {
254
- return 0;
255
- } else if (is_snan(b_cls)) {
256
- return 1;
257
- } else if (is_snan(c_cls)) {
258
- return 2;
259
- } else if (is_qnan(a_cls)) {
260
- return 0;
261
- } else if (is_qnan(b_cls)) {
262
- return 1;
263
- } else {
264
- return 2;
265
- }
266
+ rule = float_3nan_prop_s_abc;
267
#elif defined(TARGET_SPARC)
268
- /* Prefer SNaN over QNaN, order C, B, A. */
269
- if (is_snan(c_cls)) {
270
- return 2;
271
- } else if (is_snan(b_cls)) {
272
- return 1;
273
- } else if (is_snan(a_cls)) {
274
- return 0;
275
- } else if (is_qnan(c_cls)) {
276
- return 2;
277
- } else if (is_qnan(b_cls)) {
278
- return 1;
279
- } else {
280
- return 0;
281
- }
282
+ rule = float_3nan_prop_s_cba;
283
#elif defined(TARGET_XTENSA)
284
- /*
285
- * For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
286
- * an input NaN if we have one (ie c).
287
- */
288
- if (status->use_first_nan) {
289
- if (is_nan(a_cls)) {
290
- return 0;
291
- } else if (is_nan(b_cls)) {
292
- return 1;
293
+ if (status->use_first_nan) {
294
+ rule = float_3nan_prop_abc;
295
} else {
296
- return 2;
297
+ rule = float_3nan_prop_cba;
298
}
299
- } else {
300
- if (is_nan(c_cls)) {
301
- return 2;
302
- } else if (is_nan(b_cls)) {
303
- return 1;
304
- } else {
305
- return 0;
306
- }
307
- }
308
#else
309
- /* A default implementation: prefer a to b to c.
310
- * This is unlikely to actually match any real implementation.
311
- */
312
- if (is_nan(a_cls)) {
313
- return 0;
314
- } else if (is_nan(b_cls)) {
315
- return 1;
316
- } else {
317
- return 2;
318
- }
319
+ rule = float_3nan_prop_abc;
320
#endif
65
+ }
321
+ }
66
+ if (sve_access_check(s)) {
322
+
67
+ do_zz_fp(s, a, fns[a->esz - 1]);
323
+ assert(rule != float_3nan_prop_none);
324
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
325
+ /* We have at least one SNaN input and should prefer it */
326
+ do {
327
+ which = rule & R_3NAN_1ST_MASK;
328
+ rule >>= R_3NAN_1ST_LENGTH;
329
+ } while (!is_snan(cls[which]));
330
+ } else {
331
+ do {
332
+ which = rule & R_3NAN_1ST_MASK;
333
+ rule >>= R_3NAN_1ST_LENGTH;
334
+ } while (!is_nan(cls[which]));
68
+ }
335
+ }
69
+ return true;
336
+ return which;
70
+}
71
+
72
+static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
73
+{
74
+ static gen_helper_gvec_2_ptr * const fns[3] = {
75
+ gen_helper_gvec_frsqrte_h,
76
+ gen_helper_gvec_frsqrte_s,
77
+ gen_helper_gvec_frsqrte_d,
78
+ };
79
+ if (a->esz == 0) {
80
+ return false;
81
+ }
82
+ if (sve_access_check(s)) {
83
+ do_zz_fp(s, a, fns[a->esz - 1]);
84
+ }
85
+ return true;
86
+}
87
+
88
/*
89
*** SVE Floating Point Accumulating Reduction Group
90
*/
91
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
92
index XXXXXXX..XXXXXXX 100644
93
--- a/target/arm/vec_helper.c
94
+++ b/target/arm/vec_helper.c
95
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
96
clear_tail(d, opr_sz, simd_maxsz(desc));
97
}
337
}
98
338
99
+#define DO_2OP(NAME, FUNC, TYPE) \
339
/*----------------------------------------------------------------------------
100
+void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
101
+{ \
102
+ intptr_t i, oprsz = simd_oprsz(desc); \
103
+ TYPE *d = vd, *n = vn; \
104
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
105
+ d[i] = FUNC(n[i], stat); \
106
+ } \
107
+}
108
+
109
+DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16)
110
+DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32)
111
+DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64)
112
+
113
+DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
114
+DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
115
+DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
116
+
117
+#undef DO_2OP
118
+
119
/* Floating-point trigonometric starting value.
120
* See the ARM ARM pseudocode function FPTrigSMul.
121
*/
122
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
123
index XXXXXXX..XXXXXXX 100644
124
--- a/target/arm/sve.decode
125
+++ b/target/arm/sve.decode
126
@@ -XXX,XX +XXX,XX @@ FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn
127
FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn
128
FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn
129
130
+## SVE Floating Point Unary Operations - Unpredicated Group
131
+
132
+FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn
133
+FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn
134
+
135
### SVE FP Accumulating Reduction Group
136
137
# SVE floating-point serial reduction (predicated)
138
--
340
--
139
2.17.1
341
2.34.1
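
To make the data-driven selection in this patch easier to follow, here is
a standalone sketch of the same idea outside QEMU. The bit layout mirrors
the FIELD() declarations above (three 2-bit operand slots plus an SNaN
flag in bit 6), but every name in it (Cls, RULE(), pick_nan_muladd() and
so on) is an assumption made for the example, and have_snan is computed
locally rather than passed in by the caller as it is in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified operand classes; QEMU's FloatClass has more members. */
typedef enum { CLS_NORMAL, CLS_QNAN, CLS_SNAN } Cls;

/* Layout: bits 0-1 most preferred operand, 2-3 next, 4-5 least, 6 SNaN */
#define FIRST_SHIFT   0
#define SECOND_SHIFT  2
#define THIRD_SHIFT   4
#define OPERAND_MASK  0x3u
#define OPERAND_BITS  2
#define SNAN_FLAG     (1u << 6)

#define RULE(x, y, z) \
    (((x) << FIRST_SHIFT) | ((y) << SECOND_SHIFT) | ((z) << THIRD_SHIFT))

enum {
    RULE_NONE  = 0,
    RULE_ACB   = RULE(0u, 2u, 1u),
    RULE_CAB   = RULE(2u, 0u, 1u),
    RULE_S_CAB = RULE_CAB | SNAN_FLAG,
};

static bool is_snan(Cls c)
{
    return c == CLS_SNAN;
}

static bool is_nan(Cls c)
{
    return c == CLS_QNAN || c == CLS_SNAN;
}

/*
 * Return 0, 1 or 2 for operand a, b or c of a * b + c.
 * Precondition: at least one operand is a NaN, so each loop below
 * finds a match within the three encoded slots.
 */
static int pick_nan_muladd(Cls a, Cls b, Cls c, unsigned rule)
{
    Cls cls[3] = { a, b, c };
    bool have_snan = is_snan(a) || is_snan(b) || is_snan(c);
    int which;

    assert(rule != RULE_NONE);
    assert(is_nan(a) || is_nan(b) || is_nan(c));

    if (have_snan && (rule & SNAN_FLAG)) {
        /* Walk the preference order looking only at SNaNs */
        do {
            which = rule & OPERAND_MASK;
            rule >>= OPERAND_BITS;
        } while (!is_snan(cls[which]));
    } else {
        /* Walk the preference order accepting any NaN */
        do {
            which = rule & OPERAND_MASK;
            rule >>= OPERAND_BITS;
        } while (!is_nan(cls[which]));
    }
    return which;
}
```

With a QNaN in a and c and an SNaN in b, the sketch returns 1 for
RULE_S_CAB (the SNaN wins) and 2 for RULE_CAB (c is simply first in the
order). Encoding the order in the rule value is what lets one loop
replace the per-target if/else chains.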
140
141
New patch
1
Explicitly set a rule in the softfloat tests for propagating NaNs in
2
the muladd case. In meson.build we put -DTARGET_ARM in fpcflags, and
3
so we should select here the Arm rule of float_3nan_prop_s_cab.
1
4
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20241202131347.498124-17-peter.maydell@linaro.org
8
---
9
tests/fp/fp-bench.c | 1 +
10
tests/fp/fp-test.c | 1 +
11
2 files changed, 2 insertions(+)
12
13
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tests/fp/fp-bench.c
16
+++ b/tests/fp/fp-bench.c
17
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
18
* doesn't specify match those used by the Arm architecture.
19
*/
20
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
21
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
22
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
23
24
f = bench_funcs[operation][precision];
25
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/tests/fp/fp-test.c
28
+++ b/tests/fp/fp-test.c
29
@@ -XXX,XX +XXX,XX @@ void run_test(void)
30
* doesn't specify match those used by the Arm architecture.
31
*/
32
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
33
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
34
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
35
36
genCases_setLevel(test_level);
37
--
38
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the Float3NaNPropRule explicitly for Arm, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
There is no need to re-set these 3 features already
4
implied by the call to aarch64_a15_initfn.
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20180629001538.11415-5-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-18-peter.maydell@linaro.org
11
---
7
---
12
target/arm/cpu.c | 3 ---
8
target/arm/cpu.c | 5 +++++
13
1 file changed, 3 deletions(-)
9
fpu/softfloat-specialize.c.inc | 8 +-------
10
2 files changed, 6 insertions(+), 7 deletions(-)
14
11
15
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.c
14
--- a/target/arm/cpu.c
18
+++ b/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
19
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
16
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
20
* since we don't correctly set the ID registers to advertise them,
17
* * tininess-before-rounding
21
*/
18
* * 2-input NaN propagation prefers SNaN over QNaN, and then
22
set_feature(&cpu->env, ARM_FEATURE_V8);
19
* operand A over operand B (see FPProcessNaNs() pseudocode)
23
- set_feature(&cpu->env, ARM_FEATURE_VFP4);
20
+ * * 3-input NaN propagation prefers SNaN over QNaN, and then
24
- set_feature(&cpu->env, ARM_FEATURE_NEON);
21
+ * operand C over A over B (see FPProcessNaNs3() pseudocode,
25
- set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
22
+ * but note that for QEMU muladd is a * b + c, whereas for
26
set_feature(&cpu->env, ARM_FEATURE_V8_AES);
23
+ * the pseudocode function the arguments are in the order c, a, b.)
27
set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
24
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
28
set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
25
* and the input NaN if it is signalling
26
*/
27
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
28
{
29
set_float_detect_tininess(float_tininess_before_rounding, s);
30
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
31
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
32
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
33
}
34
35
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
36
index XXXXXXX..XXXXXXX 100644
37
--- a/fpu/softfloat-specialize.c.inc
38
+++ b/fpu/softfloat-specialize.c.inc
39
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
40
}
41
42
if (rule == float_3nan_prop_none) {
43
-#if defined(TARGET_ARM)
44
- /*
45
- * This looks different from the ARM ARM pseudocode, because the ARM ARM
46
- * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
47
- */
48
- rule = float_3nan_prop_s_cab;
49
-#elif defined(TARGET_MIPS)
50
+#if defined(TARGET_MIPS)
51
if (snan_bit_is_one(status)) {
52
rule = float_3nan_prop_s_abc;
53
} else {
29
--
54
--
30
2.17.1
55
2.34.1
31
32
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the Float3NaNPropRule explicitly for loongarch, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
Enable ARM_FEATURE_SVE for the generic "max" cpu.
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-19-peter.maydell@linaro.org
7
---
8
target/loongarch/tcg/fpu_helper.c | 1 +
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 1 insertion(+), 2 deletions(-)
4
11
5
Tested-by: Alex Bennée <alex.bennee@linaro.org>
12
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20180627043328.11531-35-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
linux-user/elfload.c | 1 +
12
target/arm/cpu.c | 7 +++++++
13
target/arm/cpu64.c | 1 +
14
3 files changed, 9 insertions(+)
15
16
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
17
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
18
--- a/linux-user/elfload.c
14
--- a/target/loongarch/tcg/fpu_helper.c
19
+++ b/linux-user/elfload.c
15
+++ b/target/loongarch/tcg/fpu_helper.c
20
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
16
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
21
GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
17
* case sets InvalidOp and returns the input value 'c'
22
GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
18
*/
23
GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
+ GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
20
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
25
#undef GET_FEATURE
21
}
26
22
27
return hwcaps;
23
int ieee_ex_to_loongarch(int xcpt)
28
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/cpu.c
26
--- a/fpu/softfloat-specialize.c.inc
31
+++ b/target/arm/cpu.c
27
+++ b/fpu/softfloat-specialize.c.inc
32
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
28
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
33
env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
29
} else {
34
/* and to the FP/Neon instructions */
30
rule = float_3nan_prop_s_cab;
35
env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
31
}
36
+ /* and to the SVE instructions */
32
-#elif defined(TARGET_LOONGARCH64)
37
+ env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
33
- rule = float_3nan_prop_s_cab;
38
+ env->cp15.cptr_el[3] |= CPTR_EZ;
34
#elif defined(TARGET_PPC)
39
+ /* with maximum vector length */
35
/*
40
+ env->vfp.zcr_el[1] = ARM_MAX_VQ - 1;
36
* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
41
+ env->vfp.zcr_el[2] = ARM_MAX_VQ - 1;
42
+ env->vfp.zcr_el[3] = ARM_MAX_VQ - 1;
43
#else
44
/* Reset into the highest available EL */
45
if (arm_feature(env, ARM_FEATURE_EL3)) {
46
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/cpu64.c
49
+++ b/target/arm/cpu64.c
50
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
51
set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
52
set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
53
set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
54
+ set_feature(&cpu->env, ARM_FEATURE_SVE);
55
/* For usermode -cpu max we can use a larger and more efficient DCZ
56
* blocksize since we don't have to follow what the hardware does.
57
*/
58
--
37
--
59
2.17.1
38
2.34.1
60
61
New patch
1
Set the Float3NaNPropRule explicitly for PPC, and remove the
2
ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-20-peter.maydell@linaro.org
7
---
8
target/ppc/cpu_init.c | 8 ++++++++
9
fpu/softfloat-specialize.c.inc | 6 ------
10
2 files changed, 8 insertions(+), 6 deletions(-)
11
12
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/ppc/cpu_init.c
15
+++ b/target/ppc/cpu_init.c
16
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
17
*/
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
19
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
20
+ /*
21
+ * NaN propagation for fused multiply-add:
22
+ * if fRA is a NaN return it; otherwise if fRB is a NaN return it;
23
+ * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB
24
+ * whereas QEMU labels the operands as (a * b) + c.
25
+ */
26
+ set_float_3nan_prop_rule(float_3nan_prop_acb, &env->fp_status);
27
+ set_float_3nan_prop_rule(float_3nan_prop_acb, &env->vec_status);
28
/*
29
* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
30
* to return an input NaN if we have one (ie c) rather than generating
31
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
32
index XXXXXXX..XXXXXXX 100644
33
--- a/fpu/softfloat-specialize.c.inc
34
+++ b/fpu/softfloat-specialize.c.inc
35
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
36
} else {
37
rule = float_3nan_prop_s_cab;
38
}
39
-#elif defined(TARGET_PPC)
40
- /*
41
- * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
42
- * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
43
- */
44
- rule = float_3nan_prop_acb;
45
#elif defined(TARGET_S390X)
46
rule = float_3nan_prop_s_abc;
47
#elif defined(TARGET_SPARC)
48
--
49
2.34.1
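
The choice of float_3nan_prop_acb for PPC follows from the operand
relabelling mentioned in the comment. A small sketch, assuming only what
the comment states (PPC computes (fRA * fRC) + fRB, QEMU calls the
operands a * b + c, and the preference is fRA, then fRB, then fRC), shows
how the register order turns into operand indices. PpcReg,
ppc_reg_to_operand() and ppc_pick_nan() are hypothetical helpers for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* QEMU operand indices for the muladd a * b + c */
enum { OP_A = 0, OP_B = 1, OP_C = 2 };

/* PPC register roles in (fRA * fRC) + fRB */
typedef enum { FRA, FRB, FRC } PpcReg;

/* Map a PPC register role onto the QEMU operand index it occupies. */
static int ppc_reg_to_operand(PpcReg reg)
{
    switch (reg) {
    case FRA:
        return OP_A;    /* first multiplicand */
    case FRC:
        return OP_B;    /* second multiplicand */
    default:
        return OP_C;    /* fRB is the addend */
    }
}

/*
 * Preference in register terms: fRA, then fRB, then fRC.
 * Precondition: at least one of the three is a NaN.
 */
static int ppc_pick_nan(bool fra_is_nan, bool frb_is_nan, bool frc_is_nan)
{
    assert(fra_is_nan || frb_is_nan || frc_is_nan);

    if (fra_is_nan) {
        return ppc_reg_to_operand(FRA);
    }
    if (frb_is_nan) {
        return ppc_reg_to_operand(FRB);
    }
    return ppc_reg_to_operand(FRC);
}
```

Read in QEMU operand terms the resulting order is a, then c, then b,
which is the "acb" in the rule name.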
New patch
1
Set the Float3NaNPropRule explicitly for s390x, and remove the
2
ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-21-peter.maydell@linaro.org
7
---
8
target/s390x/cpu.c | 1 +
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 1 insertion(+), 2 deletions(-)
11
12
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/s390x/cpu.c
15
+++ b/target/s390x/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
17
set_float_detect_tininess(float_tininess_before_rounding,
18
&env->fpu_status);
19
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
20
+ set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
21
set_float_infzeronan_rule(float_infzeronan_dnan_always,
22
&env->fpu_status);
23
/* fall through */
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
25
index XXXXXXX..XXXXXXX 100644
26
--- a/fpu/softfloat-specialize.c.inc
27
+++ b/fpu/softfloat-specialize.c.inc
28
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
29
} else {
30
rule = float_3nan_prop_s_cab;
31
}
32
-#elif defined(TARGET_S390X)
33
- rule = float_3nan_prop_s_abc;
34
#elif defined(TARGET_SPARC)
35
rule = float_3nan_prop_s_cba;
36
#elif defined(TARGET_XTENSA)
37
--
38
2.34.1
1
We don't actually implement SD command CRC checking, because
1
Set the Float3NaNPropRule explicitly for SPARC, and remove the
2
for almost all of our SD controllers the CRC generation is
2
ifdef from pickNaNMulAdd().
3
done in hardware, and so modelling CRC generation and checking
4
would be a bit pointless. (The exception is that milkymist-memcard
5
makes the guest software compute the CRC.)
6
7
As a result almost all of our SD controller models don't bother
8
to set the SDRequest crc field, and the SD card model doesn't
9
check it. So the tracing of it in sdbus_do_command() provokes
10
Coverity warnings about use of uninitialized data.
11
12
Drop the CRC field from the trace; we can always add it back
13
if and when we do anything useful with the CRC.
14
15
Fixes Coverity issues 1386072, 1386074, 1386076, 1390571.
16
3
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
19
Message-id: 20180626180324.5537-1-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-22-peter.maydell@linaro.org
20
---
7
---
21
hw/sd/core.c | 2 +-
8
target/sparc/cpu.c | 2 ++
22
hw/sd/trace-events | 2 +-
9
fpu/softfloat-specialize.c.inc | 2 --
23
2 files changed, 2 insertions(+), 2 deletions(-)
10
2 files changed, 2 insertions(+), 2 deletions(-)
24
11
25
diff --git a/hw/sd/core.c b/hw/sd/core.c
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
26
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
27
--- a/hw/sd/core.c
14
--- a/target/sparc/cpu.c
28
+++ b/hw/sd/core.c
15
+++ b/target/sparc/cpu.c
29
@@ -XXX,XX +XXX,XX @@ int sdbus_do_command(SDBus *sdbus, SDRequest *req, uint8_t *response)
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
30
{
17
* the CPU state struct so it won't get zeroed on reset.
31
SDState *card = get_card(sdbus);
18
*/
32
19
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
33
- trace_sdbus_command(sdbus_name(sdbus), req->cmd, req->arg, req->crc);
20
+ /* For fused multiply-add, prefer SNaN over QNaN, then C->B->A */
34
+ trace_sdbus_command(sdbus_name(sdbus), req->cmd, req->arg);
21
+ set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
35
if (card) {
22
/* For inf * 0 + NaN, return the input NaN */
36
SDCardClass *sc = SD_CARD_GET_CLASS(card);
23
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
37
24
38
diff --git a/hw/sd/trace-events b/hw/sd/trace-events
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
39
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
40
--- a/hw/sd/trace-events
27
--- a/fpu/softfloat-specialize.c.inc
41
+++ b/hw/sd/trace-events
28
+++ b/fpu/softfloat-specialize.c.inc
42
@@ -XXX,XX +XXX,XX @@ bcm2835_sdhost_edm_change(const char *why, uint32_t edm) "(%s) EDM now 0x%x"
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
43
bcm2835_sdhost_update_irq(uint32_t irq) "IRQ bits 0x%x\n"
30
} else {
44
31
rule = float_3nan_prop_s_cab;
45
# hw/sd/core.c
32
}
46
-sdbus_command(const char *bus_name, uint8_t cmd, uint32_t arg, uint8_t crc) "@%s CMD%02d arg 0x%08x crc 0x%02x"
33
-#elif defined(TARGET_SPARC)
47
+sdbus_command(const char *bus_name, uint8_t cmd, uint32_t arg) "@%s CMD%02d arg 0x%08x"
34
- rule = float_3nan_prop_s_cba;
48
sdbus_read(const char *bus_name, uint8_t value) "@%s value 0x%02x"
35
#elif defined(TARGET_XTENSA)
49
sdbus_write(const char *bus_name, uint8_t value) "@%s value 0x%02x"
36
if (status->use_first_nan) {
50
sdbus_set_voltage(const char *bus_name, uint16_t millivolts) "@%s %u (mV)"
37
rule = float_3nan_prop_abc;
51
--
38
--
52
2.17.1
39
2.34.1
53
54
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the Float3NaNPropRule explicitly for MIPS, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-28-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-23-peter.maydell@linaro.org
7
---
7
---
8
target/arm/translate-sve.c | 60 +++++++++++++++++++++++++++++++++++++-
8
target/mips/fpu_helper.h | 4 ++++
9
target/arm/sve.decode | 7 +++++
9
target/mips/msa.c | 3 +++
10
2 files changed, 66 insertions(+), 1 deletion(-)
10
fpu/softfloat-specialize.c.inc | 8 +-------
11
3 files changed, 8 insertions(+), 7 deletions(-)
11
12
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
13
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
13
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
15
--- a/target/mips/fpu_helper.h
15
+++ b/target/arm/translate-sve.c
16
+++ b/target/mips/fpu_helper.h
16
@@ -XXX,XX +XXX,XX @@ static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn)
17
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
17
return true;
18
{
19
bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
20
FloatInfZeroNaNRule izn_rule;
21
+ Float3NaNPropRule nan3_rule;
22
23
/*
24
* With nan2008, SNaNs are silenced in the usual way.
25
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
26
*/
27
izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
28
set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
29
+ nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
30
+ set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
31
+
18
}
32
}
19
33
20
+/* Select active elements from Zn and inactive elements from Zm,
34
static inline void restore_fp_status(CPUMIPSState *env)
21
+ * storing the result in Zd.
35
diff --git a/target/mips/msa.c b/target/mips/msa.c
22
+ */
36
index XXXXXXX..XXXXXXX 100644
23
+static void do_sel_z(DisasContext *s, int rd, int rn, int rm, int pg, int esz)
37
--- a/target/mips/msa.c
24
+{
38
+++ b/target/mips/msa.c
25
+ static gen_helper_gvec_4 * const fns[4] = {
39
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
26
+ gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h,
40
set_float_2nan_prop_rule(float_2nan_prop_s_ab,
27
+ gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d
41
&env->active_tc.msa_fp_status);
28
+ };
42
29
+ unsigned vsz = vec_full_reg_size(s);
43
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab,
30
+ tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
44
+ &env->active_tc.msa_fp_status);
31
+ vec_full_reg_offset(s, rn),
32
+ vec_full_reg_offset(s, rm),
33
+ pred_full_reg_offset(s, pg),
34
+ vsz, vsz, 0, fns[esz]);
35
+}
36
+
45
+
37
#define DO_ZPZZ(NAME, name) \
46
/* clear float_status exception flags */
38
static bool trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \
47
set_float_exception_flags(0, &env->active_tc.msa_fp_status);
39
uint32_t insn) \
48
40
@@ -XXX,XX +XXX,XX @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
49
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
41
return do_zpzz_ool(s, a, fns[a->esz]);
42
}
43
44
-DO_ZPZZ(SEL, sel)
45
+static bool trans_SEL_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
46
+{
47
+ if (sve_access_check(s)) {
48
+ do_sel_z(s, a->rd, a->rn, a->rm, a->pg, a->esz);
49
+ }
50
+ return true;
51
+}
52
53
#undef DO_ZPZZ
54
55
@@ -XXX,XX +XXX,XX @@ static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
56
sve_access_check(s);
57
return true;
58
}
59
+
60
+/*
61
+ * Move Prefix
62
+ *
63
+ * TODO: The implementation so far could handle predicated merging movprfx.
64
+ * The helper functions as written take an extra source register to
65
+ * use in the operation, but the result is only written when predication
66
+ * succeeds. For unpredicated movprfx, we need to rearrange the helpers
67
+ * to allow the final write back to the destination to be unconditional.
68
+ * For predicated zeroing movprfx, we need to rearrange the helpers to
69
+ * allow the final write back to zero inactives.
70
+ *
71
+ * In the meantime, just emit the moves.
72
+ */
73
+
74
+static bool trans_MOVPRFX(DisasContext *s, arg_MOVPRFX *a, uint32_t insn)
75
+{
76
+ return do_mov_z(s, a->rd, a->rn);
77
+}
78
+
79
+static bool trans_MOVPRFX_m(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
80
+{
81
+ if (sve_access_check(s)) {
82
+ do_sel_z(s, a->rd, a->rn, a->rd, a->pg, a->esz);
83
+ }
84
+ return true;
85
+}
86
+
87
+static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
88
+{
89
+ if (sve_access_check(s)) {
90
+ do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz);
91
+ }
92
+ return true;
93
+}
94
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
95
index XXXXXXX..XXXXXXX 100644
50
index XXXXXXX..XXXXXXX 100644
96
--- a/target/arm/sve.decode
51
--- a/fpu/softfloat-specialize.c.inc
97
+++ b/target/arm/sve.decode
52
+++ b/fpu/softfloat-specialize.c.inc
98
@@ -XXX,XX +XXX,XX @@ ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn
53
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
99
EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn
54
}
100
ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn
55
101
56
if (rule == float_3nan_prop_none) {
102
+# SVE constructive prefix (predicated)
57
-#if defined(TARGET_MIPS)
103
+MOVPRFX_z 00000100 .. 010 000 001 ... ..... ..... @rd_pg_rn
58
- if (snan_bit_is_one(status)) {
104
+MOVPRFX_m 00000100 .. 010 001 001 ... ..... ..... @rd_pg_rn
59
- rule = float_3nan_prop_s_abc;
105
+
60
- } else {
106
# SVE integer add reduction (predicated)
61
- rule = float_3nan_prop_s_cab;
107
# Note that saddv requires size != 3.
62
- }
108
UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn
63
-#elif defined(TARGET_XTENSA)
109
@@ -XXX,XX +XXX,XX @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm
64
+#if defined(TARGET_XTENSA)
110
65
if (status->use_first_nan) {
111
### SVE Integer Misc - Unpredicated Group
66
rule = float_3nan_prop_abc;
112
67
} else {
113
+# SVE constructive prefix (unpredicated)
114
+MOVPRFX 00000100 00 1 00000 101111 rn:5 rd:5
115
+
116
# SVE floating-point exponential accelerator
117
# Note esz != 0
118
FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn
119
--
68
--
120
2.17.1
69
2.34.1
121
122
1
From: Eric Auger <eric.auger@redhat.com>
1
Set the Float3NaNPropRule explicitly for xtensa, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
When running dtc on the guest /proc/device-tree we get the
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
following warnings: "Warning (unit_address_vs_reg): Node <name>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
has a reg or ranges property, but no unit name", with name:
6
Message-id: 20241202131347.498124-24-peter.maydell@linaro.org
6
/intc, /intc/its, /intc/v2m.
7
---
8
target/xtensa/fpu_helper.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 8 --------
10
2 files changed, 2 insertions(+), 8 deletions(-)
7
11
8
Nodes should have a name in the form <name>[@<unit-address>] where
12
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
9
unit-address is the primary address used to access the device, listed
10
in the node's reg property. This fix seems to make dtc happy.
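
The naming rule above can be sketched in isolation. This is only an illustration: the helper name `toy_fdt_node_name` is hypothetical, it uses plain `snprintf()` rather than the `g_strdup_printf()` call the patch uses, and the base address in the usage note is an example value rather than one taken from the machine's memory map.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "<path>@<unit-address>" into buf.
 * The unit address is printed as lowercase hex with no "0x" prefix,
 * which is the form dtc expects for a node that has a reg property.
 * Returns the number of characters written (excluding the NUL).
 */
static int toy_fdt_node_name(char *buf, size_t len,
                             const char *path, uint64_t base)
{
    int n = snprintf(buf, len, "%s@%" PRIx64, path, base);

    /* Treat truncation as a programming error in this sketch. */
    assert(n >= 0 && (size_t)n < len);
    return n;
}
```

Calling `toy_fdt_node_name(buf, sizeof(buf), "/intc/its", 0x8080000)` leaves `/intc/its@8080000` in `buf`; the same string then has to be used for every later property set on that node, which is why the patch keeps it in a `nodename` variable.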
11
12
Signed-off-by: Eric Auger <eric.auger@redhat.com>
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 1530044492-24921-3-git-send-email-eric.auger@redhat.com
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
17
hw/arm/virt.c | 63 +++++++++++++++++++++++++++++++--------------------
18
1 file changed, 39 insertions(+), 24 deletions(-)
19
20
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
21
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
22
--- a/hw/arm/virt.c
14
--- a/target/xtensa/fpu_helper.c
23
+++ b/hw/arm/virt.c
15
+++ b/target/xtensa/fpu_helper.c
24
@@ -XXX,XX +XXX,XX @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
16
@@ -XXX,XX +XXX,XX @@ void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
25
17
set_use_first_nan(use_first, &env->fp_status);
26
static void fdt_add_its_gic_node(VirtMachineState *vms)
18
set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
27
{
19
&env->fp_status);
28
+ char *nodename;
20
+ set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
29
+
21
+ &env->fp_status);
30
vms->msi_phandle = qemu_fdt_alloc_phandle(vms->fdt);
31
- qemu_fdt_add_subnode(vms->fdt, "/intc/its");
32
- qemu_fdt_setprop_string(vms->fdt, "/intc/its", "compatible",
33
+ nodename = g_strdup_printf("/intc/its@%" PRIx64,
34
+ vms->memmap[VIRT_GIC_ITS].base);
35
+ qemu_fdt_add_subnode(vms->fdt, nodename);
36
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
37
"arm,gic-v3-its");
38
- qemu_fdt_setprop(vms->fdt, "/intc/its", "msi-controller", NULL, 0);
39
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc/its", "reg",
40
+ qemu_fdt_setprop(vms->fdt, nodename, "msi-controller", NULL, 0);
41
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
42
2, vms->memmap[VIRT_GIC_ITS].base,
43
2, vms->memmap[VIRT_GIC_ITS].size);
44
- qemu_fdt_setprop_cell(vms->fdt, "/intc/its", "phandle", vms->msi_phandle);
45
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->msi_phandle);
46
+ g_free(nodename);
47
}
22
}
48
23
49
static void fdt_add_v2m_gic_node(VirtMachineState *vms)
24
void HELPER(wur_fpu2k_fcr)(CPUXtensaState *env, uint32_t v)
50
{
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
51
+ char *nodename;
26
index XXXXXXX..XXXXXXX 100644
52
+
27
--- a/fpu/softfloat-specialize.c.inc
53
+ nodename = g_strdup_printf("/intc/v2m@%" PRIx64,
28
+++ b/fpu/softfloat-specialize.c.inc
54
+ vms->memmap[VIRT_GIC_V2M].base);
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
55
vms->msi_phandle = qemu_fdt_alloc_phandle(vms->fdt);
56
- qemu_fdt_add_subnode(vms->fdt, "/intc/v2m");
57
- qemu_fdt_setprop_string(vms->fdt, "/intc/v2m", "compatible",
58
+ qemu_fdt_add_subnode(vms->fdt, nodename);
59
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
60
"arm,gic-v2m-frame");
61
- qemu_fdt_setprop(vms->fdt, "/intc/v2m", "msi-controller", NULL, 0);
62
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc/v2m", "reg",
63
+ qemu_fdt_setprop(vms->fdt, nodename, "msi-controller", NULL, 0);
64
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
65
2, vms->memmap[VIRT_GIC_V2M].base,
66
2, vms->memmap[VIRT_GIC_V2M].size);
67
- qemu_fdt_setprop_cell(vms->fdt, "/intc/v2m", "phandle", vms->msi_phandle);
68
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->msi_phandle);
69
+ g_free(nodename);
70
}
71
72
static void fdt_add_gic_node(VirtMachineState *vms)
73
{
74
+ char *nodename;
75
+
76
vms->gic_phandle = qemu_fdt_alloc_phandle(vms->fdt);
77
qemu_fdt_setprop_cell(vms->fdt, "/", "interrupt-parent", vms->gic_phandle);
78
79
- qemu_fdt_add_subnode(vms->fdt, "/intc");
80
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "#interrupt-cells", 3);
81
- qemu_fdt_setprop(vms->fdt, "/intc", "interrupt-controller", NULL, 0);
82
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "#address-cells", 0x2);
83
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "#size-cells", 0x2);
84
- qemu_fdt_setprop(vms->fdt, "/intc", "ranges", NULL, 0);
85
+ nodename = g_strdup_printf("/intc@%" PRIx64,
86
+ vms->memmap[VIRT_GIC_DIST].base);
87
+ qemu_fdt_add_subnode(vms->fdt, nodename);
88
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 3);
89
+ qemu_fdt_setprop(vms->fdt, nodename, "interrupt-controller", NULL, 0);
90
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "#address-cells", 0x2);
91
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "#size-cells", 0x2);
92
+ qemu_fdt_setprop(vms->fdt, nodename, "ranges", NULL, 0);
93
if (vms->gic_version == 3) {
94
int nb_redist_regions = virt_gicv3_redist_region_count(vms);
95
96
- qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
97
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
98
"arm,gic-v3");
99
100
- qemu_fdt_setprop_cell(vms->fdt, "/intc",
101
+ qemu_fdt_setprop_cell(vms->fdt, nodename,
102
"#redistributor-regions", nb_redist_regions);
103
104
if (nb_redist_regions == 1) {
105
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
106
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
107
2, vms->memmap[VIRT_GIC_DIST].base,
108
2, vms->memmap[VIRT_GIC_DIST].size,
109
2, vms->memmap[VIRT_GIC_REDIST].base,
110
2, vms->memmap[VIRT_GIC_REDIST].size);
111
} else {
112
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
113
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
114
2, vms->memmap[VIRT_GIC_DIST].base,
115
2, vms->memmap[VIRT_GIC_DIST].size,
116
2, vms->memmap[VIRT_GIC_REDIST].base,
117
@@ -XXX,XX +XXX,XX @@ static void fdt_add_gic_node(VirtMachineState *vms)
118
}
119
120
if (vms->virt) {
121
- qemu_fdt_setprop_cells(vms->fdt, "/intc", "interrupts",
122
+ qemu_fdt_setprop_cells(vms->fdt, nodename, "interrupts",
123
GIC_FDT_IRQ_TYPE_PPI, ARCH_GICV3_MAINT_IRQ,
124
GIC_FDT_IRQ_FLAGS_LEVEL_HI);
125
}
126
} else {
127
/* 'cortex-a15-gic' means 'GIC v2' */
128
- qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
129
+ qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
130
"arm,cortex-a15-gic");
131
- qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
132
+ qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
133
2, vms->memmap[VIRT_GIC_DIST].base,
134
2, vms->memmap[VIRT_GIC_DIST].size,
135
2, vms->memmap[VIRT_GIC_CPU].base,
136
2, vms->memmap[VIRT_GIC_CPU].size);
137
}
30
}
138
31
139
- qemu_fdt_setprop_cell(vms->fdt, "/intc", "phandle", vms->gic_phandle);
32
if (rule == float_3nan_prop_none) {
140
+ qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->gic_phandle);
33
-#if defined(TARGET_XTENSA)
141
+ g_free(nodename);
34
- if (status->use_first_nan) {
142
}
35
- rule = float_3nan_prop_abc;
143
36
- } else {
144
static void fdt_add_pmu_nodes(const VirtMachineState *vms)
37
- rule = float_3nan_prop_cba;
38
- }
39
-#else
40
rule = float_3nan_prop_abc;
41
-#endif
42
}
43
44
assert(rule != float_3nan_prop_none);
145
--
45
--
146
2.17.1
46
2.34.1
147
148
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the Float3NaNPropRule explicitly for i386. We had no
2
i386-specific behaviour in the old ifdef ladder, so we were using the
3
default "prefer a then b then c" fallback; this is actually the
4
correct per-the-spec handling for i386.
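
As a rough illustration of what "prefer a then b then c" means for a three-operand muladd, the sketch below picks which operand's NaN would be propagated. The names `toy_3nan_rule` and `toy_pick_nan` are hypothetical; this is not QEMU's pickNaNMulAdd(), and it deliberately ignores SNaN preference and the inf * 0 + NaN case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a runtime-selected 3-operand NaN rule. */
typedef enum {
    TOY_3NAN_NONE = 0,  /* the target never set a rule */
    TOY_3NAN_ABC,       /* prefer a, then b, then c */
    TOY_3NAN_CBA,       /* prefer c, then b, then a */
} toy_3nan_rule;

/*
 * Return 0, 1 or 2 to say that the NaN in operand a, b or c should be
 * propagated, or -1 if none of the operands is a NaN.
 */
static int toy_pick_nan(toy_3nan_rule rule,
                        bool a_nan, bool b_nan, bool c_nan)
{
    const bool is_nan[3] = { a_nan, b_nan, c_nan };
    int i;

    /* With no compile-time fallback, an unset rule is a bug. */
    assert(rule != TOY_3NAN_NONE);

    for (i = 0; i < 3; i++) {
        /* Walk the operands in the order the rule prescribes. */
        int idx = (rule == TOY_3NAN_ABC) ? i : 2 - i;
        if (is_nan[idx]) {
            return idx;
        }
    }
    return -1;
}
```

With b and c both NaN, the "abc" rule returns 1 (operand b) and the "cba" rule returns 2 (operand c).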
2
5
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-27-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241202131347.498124-25-peter.maydell@linaro.org
7
---
9
---
8
target/arm/helper-sve.h | 14 ++++++++++++++
10
target/i386/tcg/fpu_helper.c | 1 +
9
target/arm/sve_helper.c | 8 ++++++++
11
1 file changed, 1 insertion(+)
10
target/arm/translate-sve.c | 26 ++++++++++++++++++++++++++
11
target/arm/sve.decode | 4 ++++
12
4 files changed, 52 insertions(+)
13
12
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
13
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
15
--- a/target/i386/tcg/fpu_helper.c
17
+++ b/target/arm/helper-sve.h
16
+++ b/target/i386/tcg/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
17
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
19
DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
18
* there are multiple input NaNs they are selected in the order a, b, c.
20
void, ptr, ptr, ptr, ptr, i32)
19
*/
21
20
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
22
+DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
21
+ set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
23
+ void, ptr, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG,
30
+ void, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, i32)
35
+
36
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
37
void, ptr, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
39
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/sve_helper.c
42
+++ b/target/arm/sve_helper.c
43
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
44
DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
45
DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int)
46
47
+DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16)
48
+DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32)
49
+DO_ZPZ_FP(sve_frecpx_d, uint64_t, , helper_frecpx_f64)
50
+
51
+DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt)
52
+DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt)
53
+DO_ZPZ_FP(sve_fsqrt_d, uint64_t, , float64_sqrt)
54
+
55
DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
56
DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
57
DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
58
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/translate-sve.c
61
+++ b/target/arm/translate-sve.c
62
@@ -XXX,XX +XXX,XX @@ static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
63
return do_frint_mode(s, a, float_round_ties_away);
64
}
22
}
65
23
66
+static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
24
static inline uint8_t save_exception_flags(CPUX86State *env)
67
+{
68
+ static gen_helper_gvec_3_ptr * const fns[3] = {
69
+ gen_helper_sve_frecpx_h,
70
+ gen_helper_sve_frecpx_s,
71
+ gen_helper_sve_frecpx_d
72
+ };
73
+ if (a->esz == 0) {
74
+ return false;
75
+ }
76
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
77
+}
78
+
79
+static bool trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
80
+{
81
+ static gen_helper_gvec_3_ptr * const fns[3] = {
82
+ gen_helper_sve_fsqrt_h,
83
+ gen_helper_sve_fsqrt_s,
84
+ gen_helper_sve_fsqrt_d
85
+ };
86
+ if (a->esz == 0) {
87
+ return false;
88
+ }
89
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
90
+}
91
+
92
static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
93
{
94
return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
95
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
96
index XXXXXXX..XXXXXXX 100644
97
--- a/target/arm/sve.decode
98
+++ b/target/arm/sve.decode
99
@@ -XXX,XX +XXX,XX @@ FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn
100
FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn
101
FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn
102
103
+# SVE floating-point unary operations
104
+FRECPX 01100101 .. 001 100 101 ... ..... ..... @rd_pg_rn
105
+FSQRT 01100101 .. 001 101 101 ... ..... ..... @rd_pg_rn
106
+
107
# SVE integer convert to floating-point
108
SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
109
SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
110
--
25
--
111
2.17.1
26
2.34.1
112
113
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
1
Set the Float3NaNPropRule explicitly for HPPA, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
Use error_report() + exit() instead of error_setg(&error_fatal),
4
HPPA is the only target that was using the default branch of the
4
as suggested by the "qapi/error.h" documentation:
5
ifdef ladder (other targets either do not use muladd or set
6
default_nan_mode), so we can remove the ifdef fallback entirely now
7
(allowing the "rule not set" case to fall into the default of the
8
switch statement and assert).
5
9
6
Please don't error_setg(&error_fatal, ...), use error_report() and
10
We add a TODO note that the HPPA rule is probably wrong; this is
7
exit(), because that's more obvious.
11
not a behavioural change for this refactoring.
8
12
9
This fixes CID 1352173:
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
"Passing null pointer dt_name to qemu_fdt_node_path, which dereferences it."
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Message-id: 20241202131347.498124-26-peter.maydell@linaro.org
16
---
17
target/hppa/fpu_helper.c | 8 ++++++++
18
fpu/softfloat-specialize.c.inc | 4 ----
19
2 files changed, 8 insertions(+), 4 deletions(-)
11
20
12
And this also fixes:
21
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
13
14
hw/arm/sysbus-fdt.c:322:9: warning: Array access (from variable 'node_path') results in a null pointer dereference
15
if (node_path[1]) {
16
^~~~~~~~~~~~
17
18
Fixes: Coverity CID 1352173 (Dereference after null check)
19
Suggested-by: Eric Blake <eblake@redhat.com>
20
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
21
Reviewed-by: Eric Auger <eric.auger@redhat.com>
22
Message-id: 20180625165749.3910-3-f4bug@amsat.org
23
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
---
25
hw/arm/sysbus-fdt.c | 53 +++++++++++++++++++++++++--------------------
26
1 file changed, 30 insertions(+), 23 deletions(-)
27
28
diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
29
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
30
--- a/hw/arm/sysbus-fdt.c
23
--- a/target/hppa/fpu_helper.c
31
+++ b/hw/arm/sysbus-fdt.c
24
+++ b/target/hppa/fpu_helper.c
32
@@ -XXX,XX +XXX,XX @@ static void copy_properties_from_host(HostProperty *props, int nb_props,
25
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
33
r = qemu_fdt_getprop(host_fdt, node_path,
26
* HPPA does not implement a CPU reset method at all...
34
props[i].name,
27
*/
35
&prop_len,
28
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
36
- props[i].optional ? &err : &error_fatal);
29
+ /*
37
+ &err);
30
+ * TODO: The HPPA architecture reference only documents its NaN
38
if (r) {
31
+ * propagation rule for 2-operand operations. Testing on real hardware
39
qemu_fdt_setprop(guest_fdt, nodename,
32
+ * might be necessary to confirm whether this order for muladd is correct.
40
props[i].name, r, prop_len);
33
+ * Not preferring the SNaN is almost certainly incorrect as it diverges
41
} else {
34
+ * from the documented rules for 2-operand operations.
42
- if (prop_len != -FDT_ERR_NOTFOUND) {
35
+ */
43
- /* optional property not returned although property exists */
36
+ set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
44
- error_report_err(err);
37
/* For inf * 0 + NaN, return the input NaN */
45
- } else {
38
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
46
+ if (props[i].optional && prop_len == -FDT_ERR_NOTFOUND) {
39
}
47
+ /* optional property does not exist */
40
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
48
error_free(err);
41
index XXXXXXX..XXXXXXX 100644
49
+ } else {
42
--- a/fpu/softfloat-specialize.c.inc
50
+ error_report_err(err);
43
+++ b/fpu/softfloat-specialize.c.inc
51
+ }
44
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
52
+ if (!props[i].optional) {
53
+ /* mandatory property not found: bail out */
54
+ exit(1);
55
}
56
}
45
}
57
}
46
}
58
@@ -XXX,XX +XXX,XX @@ static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
47
59
48
- if (rule == float_3nan_prop_none) {
60
node_offset = fdt_node_offset_by_phandle(host_fdt, host_phandle);
49
- rule = float_3nan_prop_abc;
61
if (node_offset <= 0) {
50
- }
62
- error_setg(&error_fatal,
51
-
63
- "not able to locate clock handle %d in host device tree",
52
assert(rule != float_3nan_prop_none);
64
- host_phandle);
53
if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
65
+ error_report("not able to locate clock handle %d in host device tree",
54
/* We have at least one SNaN input and should prefer it */
66
+ host_phandle);
67
+ exit(1);
68
}
69
node_path = g_malloc(path_len);
70
while ((ret = fdt_get_path(host_fdt, node_offset, node_path, path_len))
71
@@ -XXX,XX +XXX,XX @@ static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
72
node_path = g_realloc(node_path, path_len);
73
}
74
if (ret < 0) {
75
- error_setg(&error_fatal,
76
- "not able to retrieve node path for clock handle %d",
77
- host_phandle);
78
+ error_report("not able to retrieve node path for clock handle %d",
79
+ host_phandle);
80
+ exit(1);
81
}
82
83
r = qemu_fdt_getprop(host_fdt, node_path, "compatible", &prop_len,
84
&error_fatal);
85
if (strcmp(r, "fixed-clock")) {
86
- error_setg(&error_fatal,
87
- "clock handle %d is not a fixed clock", host_phandle);
88
+ error_report("clock handle %d is not a fixed clock", host_phandle);
89
+ exit(1);
90
}
91
92
nodename = strrchr(node_path, '/');
93
@@ -XXX,XX +XXX,XX @@ static int add_amd_xgbe_fdt_node(SysBusDevice *sbdev, void *opaque)
94
95
dt_name = sysfs_to_dt_name(vbasedev->name);
96
if (!dt_name) {
97
- error_setg(&error_fatal, "%s incorrect sysfs device name %s",
98
- __func__, vbasedev->name);
99
+ error_report("%s incorrect sysfs device name %s",
100
+ __func__, vbasedev->name);
101
+ exit(1);
102
}
103
node_path = qemu_fdt_node_path(host_fdt, dt_name, vdev->compat,
104
&error_fatal);
105
if (!node_path || !node_path[0]) {
106
- error_setg(&error_fatal, "%s unable to retrieve node path for %s/%s",
107
- __func__, dt_name, vdev->compat);
108
+ error_report("%s unable to retrieve node path for %s/%s",
109
+ __func__, dt_name, vdev->compat);
110
+ exit(1);
111
}
112
113
if (node_path[1]) {
114
- error_setg(&error_fatal, "%s more than one node matching %s/%s!",
115
- __func__, dt_name, vdev->compat);
116
+ error_report("%s more than one node matching %s/%s!",
117
+ __func__, dt_name, vdev->compat);
118
+ exit(1);
119
}
120
121
g_free(dt_name);
122
123
if (vbasedev->num_regions != 5) {
124
- error_setg(&error_fatal, "%s Does the host dt node combine XGBE/PHY?",
125
- __func__);
126
+ error_report("%s Does the host dt node combine XGBE/PHY?", __func__);
127
+ exit(1);
128
}
129
130
/* generate nodes for DMA_CLK and PTP_CLK */
131
r = qemu_fdt_getprop(host_fdt, node_path[0], "clocks",
132
&prop_len, &error_fatal);
133
if (prop_len != 8) {
134
- error_setg(&error_fatal, "%s clocks property should contain 2 handles",
135
- __func__);
136
+ error_report("%s clocks property should contain 2 handles", __func__);
137
+ exit(1);
138
}
139
host_clock_phandles = (uint32_t *)r;
140
guest_clock_phandles[0] = qemu_fdt_alloc_phandle(guest_fdt);
141
--
55
--
142
2.17.1
56
2.34.1
143
144
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The use_first_nan field in float_status was an xtensa-specific way to
2
select at runtime from two different NaN propagation rules. Now that
3
xtensa is using the target-agnostic NaN propagation rule selection
4
that we've just added, we can remove use_first_nan, because there is
5
no longer any code that reads it.
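
The shape of this cleanup can be sketched with a cut-down status structure. All names here (`toy_float_status`, `toy_set_3nan_rule`, `toy_target_use_first_nan`) are hypothetical stand-ins, not the softfloat API: the point is only that the target translates its control bit into a generic rule once, so the status structure no longer needs a target-specific boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical generic propagation rules. */
typedef enum {
    TOY_PROP_NONE = 0,
    TOY_PROP_ABC,
    TOY_PROP_CBA,
} toy_prop_rule;

/* Cut-down status: it stores the rule and has no use_first_nan flag. */
typedef struct {
    toy_prop_rule nan3_rule;
} toy_float_status;

/* Generic setter, analogous in spirit to a softfloat helper. */
static void toy_set_3nan_rule(toy_prop_rule rule, toy_float_status *st)
{
    st->nan3_rule = rule;
}

/*
 * Target side: map the target's "use first NaN" control onto the
 * generic rule at configuration time, so that the generic code only
 * ever reads nan3_rule.
 */
static void toy_target_use_first_nan(toy_float_status *st, bool use_first)
{
    toy_set_3nan_rule(use_first ? TOY_PROP_ABC : TOY_PROP_CBA, st);
}
```

Once every reader consults only the stored rule, a field like use_first_nan has no remaining users and can be deleted along with its setter.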
2
6
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-19-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20241202131347.498124-27-peter.maydell@linaro.org
7
---
10
---
8
target/arm/helper.h | 14 +++++++++++
11
include/fpu/softfloat-helpers.h | 5 -----
9
target/arm/translate-sve.c | 50 ++++++++++++++++++++++++++++++++++++++
12
include/fpu/softfloat-types.h | 1 -
10
target/arm/vec_helper.c | 48 ++++++++++++++++++++++++++++++++++++
13
target/xtensa/fpu_helper.c | 1 -
11
target/arm/sve.decode | 19 +++++++++++++++
14
3 files changed, 7 deletions(-)
12
4 files changed, 131 insertions(+)
13
15
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.h
18
--- a/include/fpu/softfloat-helpers.h
17
+++ b/target/arm/helper.h
19
+++ b/include/fpu/softfloat-helpers.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
20
@@ -XXX,XX +XXX,XX @@ static inline void set_snan_bit_is_one(bool val, float_status *status)
19
DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
21
status->snan_bit_is_one = val;
20
void, ptr, ptr, ptr, ptr, i32)
22
}
21
23
22
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG,
24
-static inline void set_use_first_nan(bool val, float_status *status)
23
+ void, ptr, ptr, ptr, ptr, i32)
25
-{
24
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG,
26
- status->use_first_nan = val;
25
+ void, ptr, ptr, ptr, ptr, i32)
27
-}
26
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG,
28
-
27
+ void, ptr, ptr, ptr, ptr, i32)
29
static inline void set_no_signaling_nans(bool val, float_status *status)
28
+
30
{
29
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG,
31
status->no_signaling_nans = val;
30
+ void, ptr, ptr, ptr, ptr, ptr, i32)
32
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
31
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, ptr, i32)
35
+
36
#ifdef TARGET_AARCH64
37
#include "helper-a64.h"
38
#include "helper-sve.h"
39
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
40
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/translate-sve.c
34
--- a/include/fpu/softfloat-types.h
42
+++ b/target/arm/translate-sve.c
35
+++ b/include/fpu/softfloat-types.h
43
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)
36
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
44
37
* softfloat-specialize.inc.c)
45
#undef DO_ZZI
38
*/
46
39
bool snan_bit_is_one;
47
+/*
40
- bool use_first_nan;
48
+ *** SVE Floating Point Multiply-Add Indexed Group
41
bool no_signaling_nans;
49
+ */
42
/* should overflowed results subtract re_bias to its exponent? */
50
+
43
bool rebias_overflow;
51
+static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t insn)
44
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
52
+{
53
+ static gen_helper_gvec_4_ptr * const fns[3] = {
54
+ gen_helper_gvec_fmla_idx_h,
55
+ gen_helper_gvec_fmla_idx_s,
56
+ gen_helper_gvec_fmla_idx_d,
57
+ };
58
+
59
+ if (sve_access_check(s)) {
60
+ unsigned vsz = vec_full_reg_size(s);
61
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
62
+ tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
63
+ vec_full_reg_offset(s, a->rn),
64
+ vec_full_reg_offset(s, a->rm),
65
+ vec_full_reg_offset(s, a->ra),
66
+ status, vsz, vsz, (a->index << 1) | a->sub,
67
+ fns[a->esz - 1]);
68
+ tcg_temp_free_ptr(status);
69
+ }
70
+ return true;
71
+}
72
+
73
+/*
74
+ *** SVE Floating Point Multiply Indexed Group
75
+ */
76
+
77
+static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn)
78
+{
79
+ static gen_helper_gvec_3_ptr * const fns[3] = {
80
+ gen_helper_gvec_fmul_idx_h,
81
+ gen_helper_gvec_fmul_idx_s,
82
+ gen_helper_gvec_fmul_idx_d,
83
+ };
84
+
85
+ if (sve_access_check(s)) {
86
+ unsigned vsz = vec_full_reg_size(s);
87
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
88
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
89
+ vec_full_reg_offset(s, a->rn),
90
+ vec_full_reg_offset(s, a->rm),
91
+ status, vsz, vsz, a->index, fns[a->esz - 1]);
92
+ tcg_temp_free_ptr(status);
93
+ }
94
+ return true;
95
+}
96
+
97
/*
98
*** SVE Floating Point Accumulating Reduction Group
99
*/
100
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
101
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
102
--- a/target/arm/vec_helper.c
46
--- a/target/xtensa/fpu_helper.c
103
+++ b/target/arm/vec_helper.c
47
+++ b/target/xtensa/fpu_helper.c
104
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)
48
@@ -XXX,XX +XXX,XX @@ static const struct {
105
49
106
#endif
50
void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
107
#undef DO_3OP
51
{
108
+
52
- set_use_first_nan(use_first, &env->fp_status);
109
+/* For the indexed ops, SVE applies the index per 128-bit vector segment.
53
set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
110
+ * For AdvSIMD, there is of course only one such vector segment.
54
&env->fp_status);
111
+ */
55
set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
112
+
113
+#define DO_MUL_IDX(NAME, TYPE, H) \
114
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
115
+{ \
116
+ intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
117
+ intptr_t idx = simd_data(desc); \
118
+ TYPE *d = vd, *n = vn, *m = vm; \
119
+ for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
120
+ TYPE mm = m[H(i + idx)]; \
121
+ for (j = 0; j < segment; j++) { \
122
+ d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
123
+ } \
124
+ } \
125
+}
126
+
127
+DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
128
+DO_MUL_IDX(gvec_fmul_idx_s, float32, H4)
129
+DO_MUL_IDX(gvec_fmul_idx_d, float64, )
130
+
131
+#undef DO_MUL_IDX
132
+
133
+#define DO_FMLA_IDX(NAME, TYPE, H) \
134
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
135
+ void *stat, uint32_t desc) \
136
+{ \
137
+ intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
138
+ TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \
139
+ intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \
140
+ TYPE *d = vd, *n = vn, *m = vm, *a = va; \
141
+ op1_neg <<= (8 * sizeof(TYPE) - 1); \
142
+ for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
143
+ TYPE mm = m[H(i + idx)]; \
144
+ for (j = 0; j < segment; j++) { \
145
+ d[i + j] = TYPE##_muladd(n[i + j] ^ op1_neg, \
146
+ mm, a[i + j], 0, stat); \
147
+ } \
148
+ } \
149
+}
150
+
151
+DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
152
+DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4)
153
+DO_FMLA_IDX(gvec_fmla_idx_d, float64, )
154
+
155
+#undef DO_FMLA_IDX
156
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
157
index XXXXXXX..XXXXXXX 100644
158
--- a/target/arm/sve.decode
159
+++ b/target/arm/sve.decode
160
@@ -XXX,XX +XXX,XX @@
161
%imm9_16_10 16:s6 10:3
162
%size_23 23:2
163
%dtype_23_13 23:2 13:2
164
+%index3_22_19 22:1 19:2
165
166
# A combination of tsz:imm3 -- extract esize.
167
%tszimm_esz 22:2 5:5 !function=tszimm_esz
168
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
169
# SVE integer multiply immediate (unpredicated)
170
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
171
172
+### SVE FP Multiply-Add Indexed Group
173
+
174
+# SVE floating-point multiply-add (indexed)
175
+FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \
176
+ ra=%reg_movprfx index=%index3_22_19 esz=1
177
+FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \
178
+ ra=%reg_movprfx esz=2
179
+FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \
180
+ ra=%reg_movprfx esz=3
181
+
182
+### SVE FP Multiply Indexed Group
183
+
184
+# SVE floating-point multiply (indexed)
185
+FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \
186
+ index=%index3_22_19 esz=1
187
+FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2
188
+FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3
189
+
190
### SVE FP Accumulating Reduction Group
191
192
# SVE floating-point serial reduction (predicated)
193
--
56
--
194
2.17.1
57
2.34.1
195
196
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Currently m68k_cpu_reset_hold() calls floatx80_default_nan(NULL)
2
to get the NaN bit pattern to reset the FPU registers. This
3
works because it happens that our implementation of
4
floatx80_default_nan() doesn't actually look at the float_status
5
pointer except for TARGET_MIPS. However, this isn't guaranteed,
6
and to be able to remove the ifdef in floatx80_default_nan()
7
we're going to need a real float_status here.
2
8
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Rearrange m68k_cpu_reset_hold() so that we initialize env->fp_status
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
earlier, and thus can pass it to floatx80_default_nan().
5
Message-id: 20180627043328.11531-18-richard.henderson@linaro.org
11
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Message-id: 20241202131347.498124-28-peter.maydell@linaro.org
7
---
15
---
8
target/arm/helper-sve.h | 56 ++++++++++++++++++++++++++++
16
target/m68k/cpu.c | 12 +++++++-----
9
target/arm/sve_helper.c | 69 +++++++++++++++++++++++++++++++++++
17
1 file changed, 7 insertions(+), 5 deletions(-)
10
target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 14 +++++++
12
4 files changed, 214 insertions(+)
13
18
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
19
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
15
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
21
--- a/target/m68k/cpu.c
17
+++ b/target/arm/helper-sve.h
22
+++ b/target/m68k/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
23
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
24
CPUState *cs = CPU(obj);
20
void, ptr, ptr, ptr, ptr, ptr, i32)
25
M68kCPUClass *mcc = M68K_CPU_GET_CLASS(obj);
21
26
CPUM68KState *env = cpu_env(cs);
22
+DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG,
27
- floatx80 nan = floatx80_default_nan(NULL);
23
+ void, ptr, ptr, ptr, i64, ptr, i32)
28
+ floatx80 nan;
24
+DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG,
29
int i;
25
+ void, ptr, ptr, ptr, i64, ptr, i32)
30
26
+DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG,
31
if (mcc->parent_phases.hold) {
27
+ void, ptr, ptr, ptr, i64, ptr, i32)
32
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
33
#else
34
cpu_m68k_set_sr(env, SR_S | SR_I);
35
#endif
36
- for (i = 0; i < 8; i++) {
37
- env->fregs[i].d = nan;
38
- }
39
- cpu_m68k_set_fpcr(env, 0);
40
/*
41
* M68000 FAMILY PROGRAMMER'S REFERENCE MANUAL
42
* 3.4 FLOATING-POINT INSTRUCTION DETAILS
43
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
44
* preceding paragraph for nonsignaling NaNs.
45
*/
46
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
28
+
47
+
29
+DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG,
48
+ nan = floatx80_default_nan(&env->fp_status);
30
+ void, ptr, ptr, ptr, i64, ptr, i32)
49
+ for (i = 0; i < 8; i++) {
31
+DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG,
50
+ env->fregs[i].d = nan;
32
+ void, ptr, ptr, ptr, i64, ptr, i32)
51
+ }
33
+DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG,
52
+ cpu_m68k_set_fpcr(env, 0);
34
+ void, ptr, ptr, ptr, i64, ptr, i32)
53
env->fpsr = 0;
35
+
54
36
+DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG,
55
/* TODO: We should set PC from the interrupt vector. */
37
+ void, ptr, ptr, ptr, i64, ptr, i32)
38
+DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, i64, ptr, i32)
40
+DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, i64, ptr, i32)
42
+
43
+DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG,
44
+ void, ptr, ptr, ptr, i64, ptr, i32)
45
+DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG,
46
+ void, ptr, ptr, ptr, i64, ptr, i32)
47
+DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG,
48
+ void, ptr, ptr, ptr, i64, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG,
51
+ void, ptr, ptr, ptr, i64, ptr, i32)
52
+DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG,
53
+ void, ptr, ptr, ptr, i64, ptr, i32)
54
+DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG,
55
+ void, ptr, ptr, ptr, i64, ptr, i32)
56
+
57
+DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG,
58
+ void, ptr, ptr, ptr, i64, ptr, i32)
59
+DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG,
60
+ void, ptr, ptr, ptr, i64, ptr, i32)
61
+DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG,
62
+ void, ptr, ptr, ptr, i64, ptr, i32)
63
+
64
+DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG,
65
+ void, ptr, ptr, ptr, i64, ptr, i32)
66
+DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG,
67
+ void, ptr, ptr, ptr, i64, ptr, i32)
68
+DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG,
69
+ void, ptr, ptr, ptr, i64, ptr, i32)
70
+
71
+DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG,
72
+ void, ptr, ptr, ptr, i64, ptr, i32)
73
+DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
74
+ void, ptr, ptr, ptr, i64, ptr, i32)
75
+DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
76
+ void, ptr, ptr, ptr, i64, ptr, i32)
77
+
78
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
79
void, ptr, ptr, ptr, ptr, i32)
80
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
81
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
82
index XXXXXXX..XXXXXXX 100644
83
--- a/target/arm/sve_helper.c
84
+++ b/target/arm/sve_helper.c
85
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd)
86
87
#undef DO_ZPZZ_FP
88
89
+/* Three-operand expander, with one scalar operand, controlled by
90
+ * a predicate, with the extra float_status parameter.
91
+ */
92
+#define DO_ZPZS_FP(NAME, TYPE, H, OP) \
93
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \
94
+ void *status, uint32_t desc) \
95
+{ \
96
+ intptr_t i = simd_oprsz(desc); \
97
+ uint64_t *g = vg; \
98
+ TYPE mm = scalar; \
99
+ do { \
100
+ uint64_t pg = g[(i - 1) >> 6]; \
101
+ do { \
102
+ i -= sizeof(TYPE); \
103
+ if (likely((pg >> (i & 63)) & 1)) { \
104
+ TYPE nn = *(TYPE *)(vn + H(i)); \
105
+ *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \
106
+ } \
107
+ } while (i & 63); \
108
+ } while (i != 0); \
109
+}
110
+
111
+DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add)
112
+DO_ZPZS_FP(sve_fadds_s, float32, H1_4, float32_add)
113
+DO_ZPZS_FP(sve_fadds_d, float64, , float64_add)
114
+
115
+DO_ZPZS_FP(sve_fsubs_h, float16, H1_2, float16_sub)
116
+DO_ZPZS_FP(sve_fsubs_s, float32, H1_4, float32_sub)
117
+DO_ZPZS_FP(sve_fsubs_d, float64, , float64_sub)
118
+
119
+DO_ZPZS_FP(sve_fmuls_h, float16, H1_2, float16_mul)
120
+DO_ZPZS_FP(sve_fmuls_s, float32, H1_4, float32_mul)
121
+DO_ZPZS_FP(sve_fmuls_d, float64, , float64_mul)
122
+
123
+static inline float16 subr_h(float16 a, float16 b, float_status *s)
124
+{
125
+ return float16_sub(b, a, s);
126
+}
127
+
128
+static inline float32 subr_s(float32 a, float32 b, float_status *s)
129
+{
130
+ return float32_sub(b, a, s);
131
+}
132
+
133
+static inline float64 subr_d(float64 a, float64 b, float_status *s)
134
+{
135
+ return float64_sub(b, a, s);
136
+}
137
+
138
+DO_ZPZS_FP(sve_fsubrs_h, float16, H1_2, subr_h)
139
+DO_ZPZS_FP(sve_fsubrs_s, float32, H1_4, subr_s)
140
+DO_ZPZS_FP(sve_fsubrs_d, float64, , subr_d)
141
+
142
+DO_ZPZS_FP(sve_fmaxnms_h, float16, H1_2, float16_maxnum)
143
+DO_ZPZS_FP(sve_fmaxnms_s, float32, H1_4, float32_maxnum)
144
+DO_ZPZS_FP(sve_fmaxnms_d, float64, , float64_maxnum)
145
+
146
+DO_ZPZS_FP(sve_fminnms_h, float16, H1_2, float16_minnum)
147
+DO_ZPZS_FP(sve_fminnms_s, float32, H1_4, float32_minnum)
148
+DO_ZPZS_FP(sve_fminnms_d, float64, , float64_minnum)
149
+
150
+DO_ZPZS_FP(sve_fmaxs_h, float16, H1_2, float16_max)
151
+DO_ZPZS_FP(sve_fmaxs_s, float32, H1_4, float32_max)
152
+DO_ZPZS_FP(sve_fmaxs_d, float64, , float64_max)
153
+
154
+DO_ZPZS_FP(sve_fmins_h, float16, H1_2, float16_min)
155
+DO_ZPZS_FP(sve_fmins_s, float32, H1_4, float32_min)
156
+DO_ZPZS_FP(sve_fmins_d, float64, , float64_min)
157
+
158
/* Fully general two-operand expander, controlled by a predicate,
159
* With the extra float_status parameter.
160
*/
161
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
162
index XXXXXXX..XXXXXXX 100644
163
--- a/target/arm/translate-sve.c
164
+++ b/target/arm/translate-sve.c
165
@@ -XXX,XX +XXX,XX @@
166
#include "exec/log.h"
167
#include "trace-tcg.h"
168
#include "translate-a64.h"
169
+#include "fpu/softfloat.h"
170
171
172
typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t,
173
@@ -XXX,XX +XXX,XX @@ DO_FP3(FMULX, fmulx)
174
175
#undef DO_FP3
176
177
+typedef void gen_helper_sve_fp2scalar(TCGv_ptr, TCGv_ptr, TCGv_ptr,
178
+ TCGv_i64, TCGv_ptr, TCGv_i32);
179
+
180
+static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16,
181
+ TCGv_i64 scalar, gen_helper_sve_fp2scalar *fn)
182
+{
183
+ unsigned vsz = vec_full_reg_size(s);
184
+ TCGv_ptr t_zd, t_zn, t_pg, status;
185
+ TCGv_i32 desc;
186
+
187
+ t_zd = tcg_temp_new_ptr();
188
+ t_zn = tcg_temp_new_ptr();
189
+ t_pg = tcg_temp_new_ptr();
190
+ tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, zd));
191
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, zn));
192
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
193
+
194
+ status = get_fpstatus_ptr(is_fp16);
195
+ desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
196
+ fn(t_zd, t_zn, t_pg, scalar, status, desc);
197
+
198
+ tcg_temp_free_i32(desc);
199
+ tcg_temp_free_ptr(status);
200
+ tcg_temp_free_ptr(t_pg);
201
+ tcg_temp_free_ptr(t_zn);
202
+ tcg_temp_free_ptr(t_zd);
203
+}
204
+
205
+static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm,
206
+ gen_helper_sve_fp2scalar *fn)
207
+{
208
+ TCGv_i64 temp = tcg_const_i64(imm);
209
+ do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16, temp, fn);
210
+ tcg_temp_free_i64(temp);
211
+}
212
+
213
+#define DO_FP_IMM(NAME, name, const0, const1) \
214
+static bool trans_##NAME##_zpzi(DisasContext *s, arg_rpri_esz *a, \
215
+ uint32_t insn) \
216
+{ \
217
+ static gen_helper_sve_fp2scalar * const fns[3] = { \
218
+ gen_helper_sve_##name##_h, \
219
+ gen_helper_sve_##name##_s, \
220
+ gen_helper_sve_##name##_d \
221
+ }; \
222
+ static uint64_t const val[3][2] = { \
223
+ { float16_##const0, float16_##const1 }, \
224
+ { float32_##const0, float32_##const1 }, \
225
+ { float64_##const0, float64_##const1 }, \
226
+ }; \
227
+ if (a->esz == 0) { \
228
+ return false; \
229
+ } \
230
+ if (sve_access_check(s)) { \
231
+ do_fp_imm(s, a, val[a->esz - 1][a->imm], fns[a->esz - 1]); \
232
+ } \
233
+ return true; \
234
+}
235
+
236
+#define float16_two make_float16(0x4000)
237
+#define float32_two make_float32(0x40000000)
238
+#define float64_two make_float64(0x4000000000000000ULL)
239
+
240
+DO_FP_IMM(FADD, fadds, half, one)
241
+DO_FP_IMM(FSUB, fsubs, half, one)
242
+DO_FP_IMM(FMUL, fmuls, half, two)
243
+DO_FP_IMM(FSUBR, fsubrs, half, one)
244
+DO_FP_IMM(FMAXNM, fmaxnms, zero, one)
245
+DO_FP_IMM(FMINNM, fminnms, zero, one)
246
+DO_FP_IMM(FMAX, fmaxs, zero, one)
247
+DO_FP_IMM(FMIN, fmins, zero, one)
248
+
249
+#undef DO_FP_IMM
250
+
251
static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a,
252
gen_helper_gvec_4_ptr *fn)
253
{
254
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
255
index XXXXXXX..XXXXXXX 100644
256
--- a/target/arm/sve.decode
257
+++ b/target/arm/sve.decode
258
@@ -XXX,XX +XXX,XX @@
259
@rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \
260
&rpri_esz rn=%reg_movprfx
261
262
+# Two register operand, one one-bit floating-point operand.
263
+@rdn_i1 ........ esz:2 ......... pg:3 .... imm:1 rd:5 \
264
+ &rpri_esz rn=%reg_movprfx
265
+
266
# Two register operand, one encoded bitmask.
267
@rdn_dbm ........ .. .... dbm:13 rd:5 \
268
&rr_dbm rn=%reg_movprfx
269
@@ -XXX,XX +XXX,XX @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
270
FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR
271
FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm
272
273
+# SVE floating-point arithmetic with immediate (predicated)
274
+FADD_zpzi 01100101 .. 011 000 100 ... 0000 . ..... @rdn_i1
275
+FSUB_zpzi 01100101 .. 011 001 100 ... 0000 . ..... @rdn_i1
276
+FMUL_zpzi 01100101 .. 011 010 100 ... 0000 . ..... @rdn_i1
277
+FSUBR_zpzi 01100101 .. 011 011 100 ... 0000 . ..... @rdn_i1
278
+FMAXNM_zpzi 01100101 .. 011 100 100 ... 0000 . ..... @rdn_i1
279
+FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1
280
+FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1
281
+FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1
282
+
283
### SVE FP Multiply-Add Group
284
285
# SVE floating-point multiply-accumulate writing addend
286
--
56
--
287
2.17.1
57
2.34.1
288
289
1
From: Richard Henderson <richard.henderson@linaro.org>
1
We create our 128-bit default NaN by calling parts64_default_nan()
2
and then adjusting the result. We can do the same trick for creating
3
the floatx80 default NaN, which lets us drop a target ifdef.
2
4
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
floatx80 is used only by:
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
i386
5
Message-id: 20180627043328.11531-26-richard.henderson@linaro.org
7
m68k
8
arm nwfpe old floating-point emulation support
9
(which is essentially dead, especially the parts involving floatx80)
10
PPC (only in the xsrqpxp instruction, which just rounds an input
11
value by converting to floatx80 and back, so will never generate
12
the default NaN)
13
14
The floatx80 default NaN as currently implemented is:
15
m68k: sign = 0, exp = 1...1, int = 1, frac = 1....1
16
i386: sign = 1, exp = 1...1, int = 1, frac = 10...0
17
18
These are the same as the parts64_default_nan for these architectures.
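
As a rough illustration of why the two formats line up, here is a
standalone C sketch of the construction. The demo_* names are
hypothetical stand-ins rather than QEMU's own types, and the value 63
for DECOMPOSED_BINARY_POINT is an assumption about softfloat's
decomposed layout (fraction in bits [62:0], binary point at bit 63).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the decomposed fraction keeps its binary point at bit 63. */
#define DECOMPOSED_BINARY_POINT 63

/* Fractions from the table above, as 63-bit decomposed values. */
#define DEMO_FRAC_M68K ((UINT64_C(1) << 63) - 1)   /* frac = 1...1 */
#define DEMO_FRAC_I386 (UINT64_C(1) << 62)         /* frac = 10...0 */

/* Hypothetical stand-in for floatx80. */
typedef struct {
    uint64_t low;    /* explicit integer bit plus 63-bit fraction */
    uint16_t high;   /* sign bit plus 15-bit exponent */
} demo_floatx80;

/*
 * Build a floatx80 default NaN from the sign and fraction that the
 * 64-bit default NaN would use: exponent all ones, integer bit set.
 */
static demo_floatx80 demo_floatx80_default_nan(bool sign, uint64_t frac)
{
    demo_floatx80 r;

    /* The fraction must not overlap the explicit integer bit. */
    assert((frac >> DECOMPOSED_BINARY_POINT) == 0);

    r.high = (uint16_t)(0x7FFF | ((unsigned)sign << 15));
    r.low = (UINT64_C(1) << DECOMPOSED_BINARY_POINT) | frac;
    return r;
}
```

Called as demo_floatx80_default_nan(false, DEMO_FRAC_M68K) this gives
high = 0x7FFF, low = 0xFFFFFFFFFFFFFFFF, and called as
demo_floatx80_default_nan(true, DEMO_FRAC_I386) it gives high = 0xFFFF,
low = 0xC000000000000000, which are the constants in the ifdef this
patch removes.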
19
20
This is technically a possible behaviour change for arm linux-user
21
nwfpe emulation, because the default NaN will now have the
22
sign bit clear. But we were already generating a different floatx80
23
default NaN from the real kernel emulation we are supposedly
24
following, which appears to use an all-bits-1 value:
25
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L267
26
27
This won't affect the only "real" use of the nwfpe emulation, which
28
is ancient binaries that used it as part of the old floating point
29
calling convention; that only uses loads and stores of 32 and 64 bit
30
floats, not any of the floatx80 behaviour the original hardware had.
31
We also get the nwfpe float64 default NaN value wrong:
32
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L166
33
so if we ever cared about this obscure corner the right fix would be
34
to correct that so nwfpe used its own default-NaN setting rather
35
than the Arm VFP one.
36
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
37
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
38
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
39
Message-id: 20241202131347.498124-29-peter.maydell@linaro.org
7
---
40
---
8
target/arm/helper-sve.h | 14 +++++++
41
fpu/softfloat-specialize.c.inc | 20 ++++++++++----------
9
target/arm/sve_helper.c | 8 ++++
42
1 file changed, 10 insertions(+), 10 deletions(-)
10
target/arm/translate-sve.c | 77 ++++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 9 +++++
12
4 files changed, 108 insertions(+)
13
43
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
44
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
15
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
46
--- a/fpu/softfloat-specialize.c.inc
17
+++ b/target/arm/helper-sve.h
47
+++ b/fpu/softfloat-specialize.c.inc
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
48
@@ -XXX,XX +XXX,XX @@ static void parts128_silence_nan(FloatParts128 *p, float_status *status)
19
DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
49
floatx80 floatx80_default_nan(float_status *status)
20
void, ptr, ptr, ptr, ptr, i32)
50
{
21
51
floatx80 r;
22
+DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG,
52
+ /*
23
+ void, ptr, ptr, ptr, ptr, i32)
53
+ * Extrapolate from the choices made by parts64_default_nan to fill
24
+DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG,
54
+ * in the floatx80 format. We assume that floatx80's explicit
25
+ void, ptr, ptr, ptr, ptr, i32)
55
+ * integer bit is always set (this is true for i386 and m68k,
26
+DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG,
56
+ * which are the only real users of this format).
27
+ void, ptr, ptr, ptr, ptr, i32)
57
+ */
28
+
58
+ FloatParts64 p64;
29
+DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG,
59
+ parts64_default_nan(&p64, status);
30
+ void, ptr, ptr, ptr, ptr, i32)
60
31
+DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
61
- /* None of the targets that have snan_bit_is_one use floatx80. */
32
+ void, ptr, ptr, ptr, ptr, i32)
62
- assert(!snan_bit_is_one(status));
33
+DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
63
-#if defined(TARGET_M68K)
34
+ void, ptr, ptr, ptr, ptr, i32)
64
- r.low = UINT64_C(0xFFFFFFFFFFFFFFFF);
35
+
65
- r.high = 0x7FFF;
36
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
66
-#else
37
void, ptr, ptr, ptr, ptr, i32)
67
- /* X86 */
38
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
68
- r.low = UINT64_C(0xC000000000000000);
39
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
69
- r.high = 0xFFFF;
40
index XXXXXXX..XXXXXXX 100644
70
-#endif
41
--- a/target/arm/sve_helper.c
71
+ r.high = 0x7FFF | (p64.sign << 15);
42
+++ b/target/arm/sve_helper.c
72
+ r.low = (1ULL << DECOMPOSED_BINARY_POINT) | p64.frac;
43
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz)
73
return r;
44
DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd)
45
DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz)
46
47
+DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth)
48
+DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints)
49
+DO_ZPZ_FP(sve_frint_d, uint64_t, , helper_rintd)
50
+
51
+DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
52
+DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
53
+DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int)
54
+
55
DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
56
DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
57
DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
58
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/translate-sve.c
61
+++ b/target/arm/translate-sve.c
62
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
63
return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd);
64
}
74
}
65
75
66
+static gen_helper_gvec_3_ptr * const frint_fns[3] = {
67
+ gen_helper_sve_frint_h,
68
+ gen_helper_sve_frint_s,
69
+ gen_helper_sve_frint_d
70
+};
71
+
72
+static bool trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
73
+{
74
+ if (a->esz == 0) {
75
+ return false;
76
+ }
77
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16,
78
+ frint_fns[a->esz - 1]);
79
+}
80
+
81
+static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
82
+{
83
+ static gen_helper_gvec_3_ptr * const fns[3] = {
84
+ gen_helper_sve_frintx_h,
85
+ gen_helper_sve_frintx_s,
86
+ gen_helper_sve_frintx_d
87
+ };
88
+ if (a->esz == 0) {
89
+ return false;
90
+ }
91
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
92
+}
93
+
94
+static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode)
95
+{
96
+ if (a->esz == 0) {
97
+ return false;
98
+ }
99
+ if (sve_access_check(s)) {
100
+ unsigned vsz = vec_full_reg_size(s);
101
+ TCGv_i32 tmode = tcg_const_i32(mode);
102
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
103
+
104
+ gen_helper_set_rmode(tmode, tmode, status);
105
+
106
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
107
+ vec_full_reg_offset(s, a->rn),
108
+ pred_full_reg_offset(s, a->pg),
109
+ status, vsz, vsz, 0, frint_fns[a->esz - 1]);
110
+
111
+ gen_helper_set_rmode(tmode, tmode, status);
112
+ tcg_temp_free_i32(tmode);
113
+ tcg_temp_free_ptr(status);
114
+ }
115
+ return true;
116
+}
117
+
118
+static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
119
+{
120
+ return do_frint_mode(s, a, float_round_nearest_even);
121
+}
122
+
123
+static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
124
+{
125
+ return do_frint_mode(s, a, float_round_up);
126
+}
127
+
128
+static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
129
+{
130
+ return do_frint_mode(s, a, float_round_down);
131
+}
132
+
133
+static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
134
+{
135
+ return do_frint_mode(s, a, float_round_to_zero);
136
+}
137
+
138
+static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
139
+{
140
+ return do_frint_mode(s, a, float_round_ties_away);
141
+}
142
+
143
static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
144
{
145
return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
146
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
147
index XXXXXXX..XXXXXXX 100644
148
--- a/target/arm/sve.decode
149
+++ b/target/arm/sve.decode
150
@@ -XXX,XX +XXX,XX @@ FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
151
FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0
152
FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0
153
154
+# SVE floating-point round to integral value
155
+FRINTN 01100101 .. 000 000 101 ... ..... ..... @rd_pg_rn
156
+FRINTP 01100101 .. 000 001 101 ... ..... ..... @rd_pg_rn
157
+FRINTM 01100101 .. 000 010 101 ... ..... ..... @rd_pg_rn
158
+FRINTZ 01100101 .. 000 011 101 ... ..... ..... @rd_pg_rn
159
+FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn
160
+FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn
161
+FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn
162
+
163
# SVE integer convert to floating-point
164
SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
165
SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
166
--
76
--
167
2.17.1
77
2.34.1
168
169
New patch
1
In target/loongarch's helper_fclass_s() and helper_fclass_d() we pass
2
a zero-initialized float_status struct to float32_is_quiet_nan() and
3
float64_is_quiet_nan(), with the cryptic comment "for
4
snan_bit_is_one".
1
5
6
This pattern appears to have been copied from target/riscv, where it
7
is used because the functions there do not have ready access to the
8
CPU state struct. The comment presumably refers to the fact that the
9
main reason the is_quiet_nan() functions want the float_status is
10
because they want to know about the snan_bit_is_one config.
11
12
In the loongarch helpers, though, we have the CPU state struct
13
to hand. Use the usual env->fp_status here. This avoids our needing
14
to track that we need to update the initializer of the local
15
float_status structs when the core softfloat code adds new
16
options for targets to configure their behaviour.
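
To make the dependency on snan_bit_is_one concrete, here is a minimal
standalone sketch of a binary32 quiet-NaN test. It is not the softfloat
implementation; demo_float_status and the demo_* functions are
hypothetical, and the only assumption carried over is that the flag
flips the meaning of the top fraction bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for float_status. */
typedef struct {
    bool snan_bit_is_one;
} demo_float_status;

/* Top fraction bit set means quiet (what a zeroed status selects). */
static const demo_float_status demo_msb_quiet = { false };
/* Top fraction bit set means signaling. */
static const demo_float_status demo_msb_signaling = { true };

/* True if the binary32 bit pattern is any NaN (exp all ones, frac != 0). */
static bool demo_float32_is_any_nan(uint32_t a)
{
    return (a & 0x7FFFFFFFu) > 0x7F800000u;
}

/* Classify a NaN as quiet according to the configured convention. */
static bool demo_float32_is_quiet_nan(uint32_t a, const demo_float_status *s)
{
    bool msb_set = (a >> 22) & 1;   /* most significant fraction bit */

    assert(s);
    if (!demo_float32_is_any_nan(a)) {
        return false;
    }
    return s->snan_bit_is_one ? !msb_set : msb_set;
}
```

The same bit pattern, for example 0x7FC00000, is quiet under
demo_msb_quiet and signaling under demo_msb_signaling, so the answer
depends entirely on which status the caller passes in.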
17
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
20
Message-id: 20241202131347.498124-30-peter.maydell@linaro.org
21
---
22
target/loongarch/tcg/fpu_helper.c | 6 ++----
23
1 file changed, 2 insertions(+), 4 deletions(-)
24
25
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/loongarch/tcg/fpu_helper.c
28
+++ b/target/loongarch/tcg/fpu_helper.c
29
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_s(CPULoongArchState *env, uint64_t fj)
30
} else if (float32_is_zero_or_denormal(f)) {
31
return sign ? 1 << 4 : 1 << 8;
32
} else if (float32_is_any_nan(f)) {
33
- float_status s = { }; /* for snan_bit_is_one */
34
- return float32_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
35
+ return float32_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
36
} else {
37
return sign ? 1 << 3 : 1 << 7;
38
}
39
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_d(CPULoongArchState *env, uint64_t fj)
40
} else if (float64_is_zero_or_denormal(f)) {
41
return sign ? 1 << 4 : 1 << 8;
42
} else if (float64_is_any_nan(f)) {
43
- float_status s = { }; /* for snan_bit_is_one */
44
- return float64_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
45
+ return float64_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
46
} else {
47
return sign ? 1 << 3 : 1 << 7;
48
}
49
--
50
2.34.1
New patch
1
In the frem helper, we have a local float_status because we want to
2
execute the floatx80_div() with a custom rounding mode. Instead of
3
zero-initializing the local float_status and then having to set it up
4
with the m68k standard behaviour (including the NaN propagation rule
5
and copying the rounding precision from env->fp_status), initialize
6
it as a complete copy of env->fp_status. This will avoid our having
7
to add new code in this function for every new config knob we add
8
to fp_status.
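
The copy-then-override pattern can be sketched in isolation as below.
Everything here is hypothetical: demo_float_status is a cut-down
stand-in whose fields only loosely mirror the knobs named above, and
future_knob stands for any option added later.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DEMO_ROUND_NEAREST_EVEN, DEMO_ROUND_TO_ZERO } demo_round_mode;
typedef enum { DEMO_2NAN_UNSET, DEMO_2NAN_AB } demo_2nan_rule;

/* Hypothetical, cut-down stand-in for float_status. */
typedef struct {
    demo_round_mode rounding_mode;
    demo_2nan_rule nan_rule;
    int x80_precision;
    bool future_knob;   /* any config option added after this code was written */
} demo_float_status;

/* Example CPU state: every field deliberately differs from zero-init. */
static const demo_float_status demo_cpu = {
    DEMO_ROUND_TO_ZERO, DEMO_2NAN_AB, 64, true
};

/*
 * Start from a full copy of the CPU's status and change only the
 * rounding mode; all other configuration is inherited unchanged.
 */
static demo_float_status demo_local_status(const demo_float_status *cpu)
{
    demo_float_status local;

    assert(cpu);
    local = *cpu;
    local.rounding_mode = DEMO_ROUND_NEAREST_EVEN;
    return local;
}
```

With a zero-initialized local, each inherited field would instead need
its own explicit setter call, which is the maintenance burden this
change avoids.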
1
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241202131347.498124-31-peter.maydell@linaro.org
13
---
14
target/m68k/fpu_helper.c | 6 ++----
15
1 file changed, 2 insertions(+), 4 deletions(-)
16
17
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/m68k/fpu_helper.c
20
+++ b/target/m68k/fpu_helper.c
21
@@ -XXX,XX +XXX,XX @@ void HELPER(frem)(CPUM68KState *env, FPReg *res, FPReg *val0, FPReg *val1)
22
23
fp_rem = floatx80_rem(val1->d, val0->d, &env->fp_status);
24
if (!floatx80_is_any_nan(fp_rem)) {
25
- float_status fp_status = { };
26
+ /* Use local temporary fp_status to set different rounding mode */
27
+ float_status fp_status = env->fp_status;
28
uint32_t quotient;
29
int sign;
30
31
/* Calculate quotient directly using round to nearest mode */
32
- set_float_2nan_prop_rule(float_2nan_prop_ab, &fp_status);
33
set_float_rounding_mode(float_round_nearest_even, &fp_status);
34
- set_floatx80_rounding_precision(
35
- get_floatx80_rounding_precision(&env->fp_status), &fp_status);
36
fp_quot.d = floatx80_div(val1->d, val0->d, &fp_status);
37
38
sign = extractFloatx80Sign(fp_quot.d);
39
--
40
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we do the conversion
2
from float64 to floatx80 using a scratch float_status, because we
3
don't want the conversion to affect the CPU's floating point exception
4
status. Currently we use a zero-initialized float_status. This will
5
get steadily more awkward as we add config knobs to float_status
6
that the target must initialize. Avoid having to add any of that
7
configuration here by instead initializing our local float_status
8
from the env->fp_status.
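
A small sketch of why the scratch copy keeps the CPU's exception state
untouched follows. The types and functions are hypothetical stand-ins
(demo_convert only models a conversion that may raise "inexact"); they
are not the softfloat or gdbstub APIs.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FLAG_INEXACT 0x01u

/* Hypothetical, cut-down stand-in for float_status. */
typedef struct {
    uint8_t exception_flags;   /* accumulated by arithmetic and conversions */
    uint8_t config;            /* stands for the target-configured options */
} demo_float_status;

/* Example CPU state with no exceptions pending. */
static demo_float_status demo_cpu = { 0, 0x5A };

/* Narrowing conversion that records "inexact" in the status it is given. */
static double demo_convert(long double v, demo_float_status *s)
{
    double r = (double)v;

    if ((long double)r != v) {
        s->exception_flags |= DEMO_FLAG_INEXACT;
    }
    return r;
}

/* Debugger-style read: same configuration as the CPU, private flags. */
static double demo_read_reg(const demo_float_status *cpu, long double reg)
{
    demo_float_status scratch;

    assert(cpu);
    scratch = *cpu;
    return demo_convert(reg, &scratch);
}
```

Whatever flags the conversion raises land in scratch and are discarded
on return, so demo_cpu.exception_flags is unchanged after the read.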
2
9
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-4-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241202131347.498124-32-peter.maydell@linaro.org
7
---
13
---
8
target/arm/helper-sve.h | 29 +++++
14
target/m68k/helper.c | 6 ++++--
9
target/arm/sve_helper.c | 211 +++++++++++++++++++++++++++++++++++++
15
1 file changed, 4 insertions(+), 2 deletions(-)
10
target/arm/translate-sve.c | 65 ++++++++++++
11
target/arm/sve.decode | 38 +++++++
12
4 files changed, 343 insertions(+)
13
16
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
17
diff --git a/target/m68k/helper.c b/target/m68k/helper.c
15
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
19
--- a/target/m68k/helper.c
17
+++ b/target/arm/helper-sve.h
20
+++ b/target/m68k/helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
21
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_get_reg(CPUState *cs, GByteArray *mem_buf, int n)
19
DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
22
CPUM68KState *env = &cpu->env;
20
23
21
DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
24
if (n < 8) {
22
+
25
- float_status s = {};
23
+DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
26
+ /* Use scratch float_status so any exceptions don't change CPU state */
24
+DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
27
+ float_status s = env->fp_status;
25
+DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
28
return gdb_get_reg64(mem_buf, floatx80_to_float64(env->fregs[n].d, &s));
26
+DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
27
+
28
+DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
29
+DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
30
+DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
31
+DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
32
+
33
+DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
35
+DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
37
+
38
+DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
39
+DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
41
+DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
42
+
43
+DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
44
+DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
45
+DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
46
+
47
+DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
49
+
50
+DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
51
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/target/arm/sve_helper.c
54
+++ b/target/arm/sve_helper.c
55
@@ -XXX,XX +XXX,XX @@ DO_LDNF1(sds_r)
56
DO_LDNF1(dd_r)
57
58
#undef DO_LDNF1
59
+
60
+/*
61
+ * Store contiguous data, protected by a governing predicate.
62
+ */
63
+#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \
64
+void HELPER(NAME)(CPUARMState *env, void *vg, \
65
+ target_ulong addr, uint32_t desc) \
66
+{ \
67
+ intptr_t i, oprsz = simd_oprsz(desc); \
68
+ intptr_t ra = GETPC(); \
69
+ unsigned rd = simd_data(desc); \
70
+ void *vd = &env->vfp.zregs[rd]; \
71
+ for (i = 0; i < oprsz; ) { \
72
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
73
+ do { \
74
+ if (pg & 1) { \
75
+ TYPEM m = *(TYPEE *)(vd + H(i)); \
76
+ FN(env, addr, m, ra); \
77
+ } \
78
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
79
+ addr += sizeof(TYPEM); \
80
+ } while (i & 15); \
81
+ } \
82
+}
83
+
84
+#define DO_ST1_D(NAME, FN, TYPEM) \
85
+void HELPER(NAME)(CPUARMState *env, void *vg, \
86
+ target_ulong addr, uint32_t desc) \
87
+{ \
88
+ intptr_t i, oprsz = simd_oprsz(desc) / 8; \
89
+ intptr_t ra = GETPC(); \
90
+ unsigned rd = simd_data(desc); \
91
+ uint64_t *d = &env->vfp.zregs[rd].d[0]; \
92
+ uint8_t *pg = vg; \
93
+ for (i = 0; i < oprsz; i += 1) { \
94
+ if (pg[H1(i)] & 1) { \
95
+ FN(env, addr, d[i], ra); \
96
+ } \
97
+ addr += sizeof(TYPEM); \
98
+ } \
99
+}
100
+
101
+#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \
102
+void HELPER(NAME)(CPUARMState *env, void *vg, \
103
+ target_ulong addr, uint32_t desc) \
104
+{ \
105
+ intptr_t i, oprsz = simd_oprsz(desc); \
106
+ intptr_t ra = GETPC(); \
107
+ unsigned rd = simd_data(desc); \
108
+ void *d1 = &env->vfp.zregs[rd]; \
109
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
110
+ for (i = 0; i < oprsz; ) { \
111
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
112
+ do { \
113
+ if (pg & 1) { \
114
+ TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
115
+ TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
116
+ FN(env, addr, m1, ra); \
117
+ FN(env, addr + sizeof(TYPEM), m2, ra); \
118
+ } \
119
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
120
+ addr += 2 * sizeof(TYPEM); \
121
+ } while (i & 15); \
122
+ } \
123
+}
124
+
125
+#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \
126
+void HELPER(NAME)(CPUARMState *env, void *vg, \
127
+ target_ulong addr, uint32_t desc) \
128
+{ \
129
+ intptr_t i, oprsz = simd_oprsz(desc); \
130
+ intptr_t ra = GETPC(); \
131
+ unsigned rd = simd_data(desc); \
132
+ void *d1 = &env->vfp.zregs[rd]; \
133
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
134
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
135
+ for (i = 0; i < oprsz; ) { \
136
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
137
+ do { \
138
+ if (pg & 1) { \
139
+ TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
140
+ TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
141
+ TYPEM m3 = *(TYPEE *)(d3 + H(i)); \
142
+ FN(env, addr, m1, ra); \
143
+ FN(env, addr + sizeof(TYPEM), m2, ra); \
144
+ FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \
145
+ } \
146
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
147
+ addr += 3 * sizeof(TYPEM); \
148
+ } while (i & 15); \
149
+ } \
150
+}
151
+
152
+#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \
153
+void HELPER(NAME)(CPUARMState *env, void *vg, \
154
+ target_ulong addr, uint32_t desc) \
155
+{ \
156
+ intptr_t i, oprsz = simd_oprsz(desc); \
157
+ intptr_t ra = GETPC(); \
158
+ unsigned rd = simd_data(desc); \
159
+ void *d1 = &env->vfp.zregs[rd]; \
160
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
161
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
162
+ void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \
163
+ for (i = 0; i < oprsz; ) { \
164
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
165
+ do { \
166
+ if (pg & 1) { \
167
+ TYPEM m1 = *(TYPEE *)(d1 + H(i)); \
168
+ TYPEM m2 = *(TYPEE *)(d2 + H(i)); \
169
+ TYPEM m3 = *(TYPEE *)(d3 + H(i)); \
170
+ TYPEM m4 = *(TYPEE *)(d4 + H(i)); \
171
+ FN(env, addr, m1, ra); \
172
+ FN(env, addr + sizeof(TYPEM), m2, ra); \
173
+ FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \
174
+ FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \
175
+ } \
176
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
177
+ addr += 4 * sizeof(TYPEM); \
178
+ } while (i & 15); \
179
+ } \
180
+}
181
+
182
+DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2)
183
+DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4)
184
+DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t)
185
+
186
+DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4)
187
+DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t)
188
+
189
+DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t)
190
+
191
+DO_ST1(sve_st1bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
192
+DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
193
+DO_ST3(sve_st3bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
194
+DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1)
195
+
196
+DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
197
+DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
198
+DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
199
+DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2)
200
+
201
+DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
202
+DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
203
+DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
204
+DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4)
205
+
206
+DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t)
207
+
208
+void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg,
209
+ target_ulong addr, uint32_t desc)
210
+{
211
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
212
+ intptr_t ra = GETPC();
213
+ unsigned rd = simd_data(desc);
214
+ uint64_t *d1 = &env->vfp.zregs[rd].d[0];
215
+ uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
216
+ uint8_t *pg = vg;
217
+
218
+ for (i = 0; i < oprsz; i += 1) {
219
+ if (pg[H1(i)] & 1) {
220
+ cpu_stq_data_ra(env, addr, d1[i], ra);
221
+ cpu_stq_data_ra(env, addr + 8, d2[i], ra);
222
+ }
223
+ addr += 2 * 8;
224
+ }
225
+}
226
+
227
+void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg,
228
+ target_ulong addr, uint32_t desc)
229
+{
230
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
231
+ intptr_t ra = GETPC();
232
+ unsigned rd = simd_data(desc);
233
+ uint64_t *d1 = &env->vfp.zregs[rd].d[0];
234
+ uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
235
+ uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0];
236
+ uint8_t *pg = vg;
237
+
238
+ for (i = 0; i < oprsz; i += 1) {
239
+ if (pg[H1(i)] & 1) {
240
+ cpu_stq_data_ra(env, addr, d1[i], ra);
241
+ cpu_stq_data_ra(env, addr + 8, d2[i], ra);
242
+ cpu_stq_data_ra(env, addr + 16, d3[i], ra);
243
+ }
244
+ addr += 3 * 8;
245
+ }
246
+}
247
+
248
+void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
249
+ target_ulong addr, uint32_t desc)
250
+{
251
+ intptr_t i, oprsz = simd_oprsz(desc) / 8;
252
+ intptr_t ra = GETPC();
253
+ unsigned rd = simd_data(desc);
254
+ uint64_t *d1 = &env->vfp.zregs[rd].d[0];
255
+ uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0];
256
+ uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0];
257
+ uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0];
258
+ uint8_t *pg = vg;
259
+
260
+ for (i = 0; i < oprsz; i += 1) {
261
+ if (pg[H1(i)] & 1) {
262
+ cpu_stq_data_ra(env, addr, d1[i], ra);
263
+ cpu_stq_data_ra(env, addr + 8, d2[i], ra);
264
+ cpu_stq_data_ra(env, addr + 16, d3[i], ra);
265
+ cpu_stq_data_ra(env, addr + 24, d4[i], ra);
266
+ }
267
+ addr += 4 * 8;
268
+ }
269
+}
270
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
271
index XXXXXXX..XXXXXXX 100644
272
--- a/target/arm/translate-sve.c
273
+++ b/target/arm/translate-sve.c
274
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
275
}
29
}
276
return true;
30
switch (n) {
277
}
31
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_set_reg(CPUState *cs, uint8_t *mem_buf, int n)
278
+
32
CPUM68KState *env = &cpu->env;
279
+static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
33
280
+ int msz, int esz, int nreg)
34
if (n < 8) {
281
+{
35
- float_status s = {};
282
+ static gen_helper_gvec_mem * const fn_single[4][4] = {
36
+ /* Use scratch float_status so any exceptions don't change CPU state */
283
+ { gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r,
37
+ float_status s = env->fp_status;
284
+ gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r },
38
env->fregs[n].d = float64_to_floatx80(ldq_be_p(mem_buf), &s);
285
+ { NULL, gen_helper_sve_st1hh_r,
39
return 8;
286
+ gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r },
40
}
287
+ { NULL, NULL,
288
+ gen_helper_sve_st1ss_r, gen_helper_sve_st1sd_r },
289
+ { NULL, NULL, NULL, gen_helper_sve_st1dd_r },
290
+ };
291
+ static gen_helper_gvec_mem * const fn_multiple[3][4] = {
292
+ { gen_helper_sve_st2bb_r, gen_helper_sve_st2hh_r,
293
+ gen_helper_sve_st2ss_r, gen_helper_sve_st2dd_r },
294
+ { gen_helper_sve_st3bb_r, gen_helper_sve_st3hh_r,
295
+ gen_helper_sve_st3ss_r, gen_helper_sve_st3dd_r },
296
+ { gen_helper_sve_st4bb_r, gen_helper_sve_st4hh_r,
297
+ gen_helper_sve_st4ss_r, gen_helper_sve_st4dd_r },
298
+ };
299
+ gen_helper_gvec_mem *fn;
300
+
301
+ if (nreg == 0) {
302
+ /* ST1 */
303
+ fn = fn_single[msz][esz];
304
+ } else {
305
+ /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
306
+ assert(msz == esz);
307
+ fn = fn_multiple[nreg - 1][msz];
308
+ }
309
+ assert(fn != NULL);
310
+ do_mem_zpa(s, zt, pg, addr, fn);
311
+}
312
+
313
+static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn)
314
+{
315
+ if (a->rm == 31 || a->msz > a->esz) {
316
+ return false;
317
+ }
318
+ if (sve_access_check(s)) {
319
+ TCGv_i64 addr = new_tmp_a64(s);
320
+ tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz);
321
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
322
+ do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
323
+ }
324
+ return true;
325
+}
326
+
327
+static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
328
+{
329
+ if (a->msz > a->esz) {
330
+ return false;
331
+ }
332
+ if (sve_access_check(s)) {
333
+ int vsz = vec_full_reg_size(s);
334
+ int elements = vsz >> a->esz;
335
+ TCGv_i64 addr = new_tmp_a64(s);
336
+
337
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn),
338
+ (a->imm * elements * (a->nreg + 1)) << a->msz);
339
+ do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
340
+ }
341
+ return true;
342
+}
343
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
344
index XXXXXXX..XXXXXXX 100644
345
--- a/target/arm/sve.decode
346
+++ b/target/arm/sve.decode
347
@@ -XXX,XX +XXX,XX @@
348
%imm7_22_16 22:2 16:5
349
%imm8_16_10 16:5 10:3
350
%imm9_16_10 16:s6 10:3
351
+%size_23 23:2
352
353
# A combination of tsz:imm3 -- extract esize.
354
%tszimm_esz 22:2 5:5 !function=tszimm_esz
355
@@ -XXX,XX +XXX,XX @@
356
&incdec2_pred rd rn pg esz d u
357
&rprr_load rd pg rn rm dtype nreg
358
&rpri_load rd pg rn imm dtype nreg
359
+&rprr_store rd pg rn rm msz esz nreg
360
+&rpri_store rd pg rn imm msz esz nreg
361
362
###########################################################################
363
# Named instruction formats. These are generally used to
364
@@ -XXX,XX +XXX,XX @@
365
@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
366
&rpri_load dtype=%msz_dtype
367
368
+# Stores; user must fill in ESZ, MSZ, NREG as needed.
369
+@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store
370
+@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
371
+@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \
372
+ &rprr_store nreg=0
373
+
374
###########################################################################
375
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
376
377
@@ -XXX,XX +XXX,XX @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz
378
# SVE load multiple structures (scalar plus immediate)
379
# LD2B, LD2H, LD2W, LD2D; etc.
380
LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz
381
+
382
+### SVE Memory Store Group
383
+
384
+# SVE contiguous store (scalar plus immediate)
385
+# ST1B, ST1H, ST1W, ST1D; require msz <= esz
386
+ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \
387
+ @rpri_store_msz nreg=0
388
+
389
+# SVE contiguous store (scalar plus scalar)
390
+# ST1B, ST1H, ST1W, ST1D; require msz <= esz
391
+# Enumerate msz lest we conflict with STR_zri.
392
+ST_zprr 1110010 00 .. ..... 010 ... ..... ..... \
393
+ @rprr_store_esz_n0 msz=0
394
+ST_zprr 1110010 01 .. ..... 010 ... ..... ..... \
395
+ @rprr_store_esz_n0 msz=1
396
+ST_zprr 1110010 10 .. ..... 010 ... ..... ..... \
397
+ @rprr_store_esz_n0 msz=2
398
+ST_zprr 1110010 11 11 ..... 010 ... ..... ..... \
399
+ @rprr_store msz=3 esz=3 nreg=0
400
+
401
+# SVE contiguous non-temporal store (scalar plus immediate) (nreg == 0)
402
+# SVE store multiple structures (scalar plus immediate) (nreg != 0)
403
+ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \
404
+ @rpri_store_msz esz=%size_23
405
+
406
+# SVE contiguous non-temporal store (scalar plus scalar) (nreg == 0)
407
+# SVE store multiple structures (scalar plus scalar) (nreg != 0)
408
+ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \
409
+ @rprr_store esz=%size_23
410
--
41
--
411
2.17.1
42
2.34.1
412
413
1
From: Richard Henderson <richard.henderson@linaro.org>
1
In the helper functions flcmps and flcmpd we use a scratch float_status
so that we don't change the CPU state if the comparison raises any
floating point exception flags. Instead of zero-initializing this
scratch float_status, initialize it as a copy of env->fp_status. This
avoids the need to explicitly initialize settings like the NaN
propagation rule or others we might add to softfloat in future.
2
7
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
To do this we need to pass the CPU env pointer in to the helper.
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Message-id: 20180627043328.11531-14-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241202131347.498124-33-peter.maydell@linaro.org
8
---
13
---
9
target/arm/helper-sve.h | 67 +++++++++++++++++++++++++
14
target/sparc/helper.h | 4 ++--
10
target/arm/sve_helper.c | 77 ++++++++++++++++++++++++++++
15
target/sparc/fop_helper.c | 8 ++++----
11
target/arm/translate-sve.c | 100 +++++++++++++++++++++++++++++++++++++
16
target/sparc/translate.c | 4 ++--
12
target/arm/sve.decode | 57 +++++++++++++++++++++
17
3 files changed, 8 insertions(+), 8 deletions(-)
13
4 files changed, 301 insertions(+)
14
18
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
19
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
16
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-sve.h
21
--- a/target/sparc/helper.h
18
+++ b/target/arm/helper-sve.h
22
+++ b/target/sparc/helper.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(fcmpd, TCG_CALL_NO_WG, i32, env, f64, f64)
20
24
DEF_HELPER_FLAGS_3(fcmped, TCG_CALL_NO_WG, i32, env, f64, f64)
21
DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
25
DEF_HELPER_FLAGS_3(fcmpq, TCG_CALL_NO_WG, i32, env, i128, i128)
22
26
DEF_HELPER_FLAGS_3(fcmpeq, TCG_CALL_NO_WG, i32, env, i128, i128)
23
+DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
27
-DEF_HELPER_FLAGS_2(flcmps, TCG_CALL_NO_RWG_SE, i32, f32, f32)
24
+ void, env, ptr, ptr, ptr, tl, i32)
28
-DEF_HELPER_FLAGS_2(flcmpd, TCG_CALL_NO_RWG_SE, i32, f64, f64)
25
+DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG,
29
+DEF_HELPER_FLAGS_3(flcmps, TCG_CALL_NO_RWG_SE, i32, env, f32, f32)
26
+ void, env, ptr, ptr, ptr, tl, i32)
30
+DEF_HELPER_FLAGS_3(flcmpd, TCG_CALL_NO_RWG_SE, i32, env, f64, f64)
27
+DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG,
31
DEF_HELPER_2(raise_exception, noreturn, env, int)
28
+ void, env, ptr, ptr, ptr, tl, i32)
32
29
+DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG,
33
DEF_HELPER_FLAGS_3(faddd, TCG_CALL_NO_WG, f64, env, f64, f64)
30
+ void, env, ptr, ptr, ptr, tl, i32)
34
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
31
+DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG,
32
+ void, env, ptr, ptr, ptr, tl, i32)
33
+
34
+DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG,
35
+ void, env, ptr, ptr, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG,
37
+ void, env, ptr, ptr, ptr, tl, i32)
38
+DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG,
39
+ void, env, ptr, ptr, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG,
41
+ void, env, ptr, ptr, ptr, tl, i32)
42
+DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG,
43
+ void, env, ptr, ptr, ptr, tl, i32)
44
+
45
+DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG,
46
+ void, env, ptr, ptr, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG,
48
+ void, env, ptr, ptr, ptr, tl, i32)
49
+DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG,
50
+ void, env, ptr, ptr, ptr, tl, i32)
51
+DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG,
52
+ void, env, ptr, ptr, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG,
54
+ void, env, ptr, ptr, ptr, tl, i32)
55
+DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG,
56
+ void, env, ptr, ptr, ptr, tl, i32)
57
+DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG,
58
+ void, env, ptr, ptr, ptr, tl, i32)
59
+
60
+DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG,
61
+ void, env, ptr, ptr, ptr, tl, i32)
62
+DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG,
63
+ void, env, ptr, ptr, ptr, tl, i32)
64
+DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG,
65
+ void, env, ptr, ptr, ptr, tl, i32)
66
+DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG,
67
+ void, env, ptr, ptr, ptr, tl, i32)
68
+DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG,
69
+ void, env, ptr, ptr, ptr, tl, i32)
70
+DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG,
71
+ void, env, ptr, ptr, ptr, tl, i32)
72
+DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG,
73
+ void, env, ptr, ptr, ptr, tl, i32)
74
+
75
+DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG,
76
+ void, env, ptr, ptr, ptr, tl, i32)
77
+DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG,
78
+ void, env, ptr, ptr, ptr, tl, i32)
79
+DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG,
80
+ void, env, ptr, ptr, ptr, tl, i32)
81
+DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG,
82
+ void, env, ptr, ptr, ptr, tl, i32)
83
+DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG,
84
+ void, env, ptr, ptr, ptr, tl, i32)
85
+DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
86
+ void, env, ptr, ptr, ptr, tl, i32)
87
+DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
88
+ void, env, ptr, ptr, ptr, tl, i32)
89
+
90
DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
91
void, env, ptr, ptr, ptr, tl, i32)
92
DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
93
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
94
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/sve_helper.c
36
--- a/target/sparc/fop_helper.c
96
+++ b/target/arm/sve_helper.c
37
+++ b/target/sparc/fop_helper.c
97
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
38
@@ -XXX,XX +XXX,XX @@ uint32_t helper_fcmpeq(CPUSPARCState *env, Int128 src1, Int128 src2)
98
}
39
return finish_fcmp(env, r, GETPC());
99
}
40
}
100
41
101
+/* Loads with a vector index. */
42
-uint32_t helper_flcmps(float32 src1, float32 src2)
102
+
43
+uint32_t helper_flcmps(CPUSPARCState *env, float32 src1, float32 src2)
103
+#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
44
{
104
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
45
/*
105
+ target_ulong base, uint32_t desc) \
46
* FLCMP never raises an exception nor modifies any FSR fields.
106
+{ \
47
* Perform the comparison with a dummy fp environment.
107
+ intptr_t i, oprsz = simd_oprsz(desc); \
48
*/
108
+ unsigned scale = simd_data(desc); \
49
- float_status discard = { };
109
+ uintptr_t ra = GETPC(); \
50
+ float_status discard = env->fp_status;
110
+ for (i = 0; i < oprsz; i++) { \
51
FloatRelation r;
111
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
52
112
+ do { \
53
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
113
+ TYPEM m = 0; \
54
@@ -XXX,XX +XXX,XX @@ uint32_t helper_flcmps(float32 src1, float32 src2)
114
+ if (pg & 1) { \
55
g_assert_not_reached();
115
+ target_ulong off = *(TYPEI *)(vm + H1_4(i)); \
56
}
116
+ m = FN(env, base + (off << scale), ra); \
57
117
+ } \
58
-uint32_t helper_flcmpd(float64 src1, float64 src2)
118
+ *(uint32_t *)(vd + H1_4(i)) = m; \
59
+uint32_t helper_flcmpd(CPUSPARCState *env, float64 src1, float64 src2)
119
+ i += 4, pg >>= 4; \
60
{
120
+ } while (i & 15); \
61
- float_status discard = { };
121
+ } \
62
+ float_status discard = env->fp_status;
122
+}
63
FloatRelation r;
123
+
64
124
+#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
65
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
125
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
66
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
126
+ target_ulong base, uint32_t desc) \
127
+{ \
128
+ intptr_t i, oprsz = simd_oprsz(desc) / 8; \
129
+ unsigned scale = simd_data(desc); \
130
+ uintptr_t ra = GETPC(); \
131
+ uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
132
+ for (i = 0; i < oprsz; i++) { \
133
+ TYPEM mm = 0; \
134
+ if (pg[H1(i)] & 1) { \
135
+ target_ulong off = (TYPEI)m[i]; \
136
+ mm = FN(env, base + (off << scale), ra); \
137
+ } \
138
+ d[i] = mm; \
139
+ } \
140
+}
141
+
142
+DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
143
+DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
144
+DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
145
+DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
146
+DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
147
+
148
+DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
149
+DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
150
+DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
151
+DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra)
152
+DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra)
153
+
154
+DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
155
+DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
156
+DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
157
+DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
158
+DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
159
+DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
160
+DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra)
161
+
162
+DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
163
+DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
164
+DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
165
+DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
166
+DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra)
167
+DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra)
168
+DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra)
169
+
170
+DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra)
171
+DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
172
+DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
173
+DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
174
+DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
175
+DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
176
+DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
177
+
178
/* Stores with a vector index. */
179
180
#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
181
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
182
index XXXXXXX..XXXXXXX 100644
67
index XXXXXXX..XXXXXXX 100644
183
--- a/target/arm/translate-sve.c
68
--- a/target/sparc/translate.c
184
+++ b/target/arm/translate-sve.c
69
+++ b/target/sparc/translate.c
185
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
70
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPs(DisasContext *dc, arg_FLCMPs *a)
186
tcg_temp_free_i32(desc);
71
72
src1 = gen_load_fpr_F(dc, a->rs1);
73
src2 = gen_load_fpr_F(dc, a->rs2);
74
- gen_helper_flcmps(cpu_fcc[a->cc], src1, src2);
75
+ gen_helper_flcmps(cpu_fcc[a->cc], tcg_env, src1, src2);
76
return advance_pc(dc);
187
}
77
}
188
78
189
+/* Indexed by [ff][xs][u][msz]. */
79
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPd(DisasContext *dc, arg_FLCMPd *a)
190
+static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = {
80
191
+ { { { gen_helper_sve_ldbss_zsu,
81
src1 = gen_load_fpr_D(dc, a->rs1);
192
+ gen_helper_sve_ldhss_zsu,
82
src2 = gen_load_fpr_D(dc, a->rs2);
193
+ NULL, },
83
- gen_helper_flcmpd(cpu_fcc[a->cc], src1, src2);
194
+ { gen_helper_sve_ldbsu_zsu,
84
+ gen_helper_flcmpd(cpu_fcc[a->cc], tcg_env, src1, src2);
195
+ gen_helper_sve_ldhsu_zsu,
85
return advance_pc(dc);
196
+ gen_helper_sve_ldssu_zsu, } },
86
}
197
+ { { gen_helper_sve_ldbss_zss,
198
+ gen_helper_sve_ldhss_zss,
199
+ NULL, },
200
+ { gen_helper_sve_ldbsu_zss,
201
+ gen_helper_sve_ldhsu_zss,
202
+ gen_helper_sve_ldssu_zss, } } },
203
+ /* TODO fill in first-fault handlers */
204
+};
205
+
206
+/* Note that we overload xs=2 to indicate 64-bit offset. */
207
+static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = {
208
+ { { { gen_helper_sve_ldbds_zsu,
209
+ gen_helper_sve_ldhds_zsu,
210
+ gen_helper_sve_ldsds_zsu,
211
+ NULL, },
212
+ { gen_helper_sve_ldbdu_zsu,
213
+ gen_helper_sve_ldhdu_zsu,
214
+ gen_helper_sve_ldsdu_zsu,
215
+ gen_helper_sve_ldddu_zsu, } },
216
+ { { gen_helper_sve_ldbds_zss,
217
+ gen_helper_sve_ldhds_zss,
218
+ gen_helper_sve_ldsds_zss,
219
+ NULL, },
220
+ { gen_helper_sve_ldbdu_zss,
221
+ gen_helper_sve_ldhdu_zss,
222
+ gen_helper_sve_ldsdu_zss,
223
+ gen_helper_sve_ldddu_zss, } },
224
+ { { gen_helper_sve_ldbds_zd,
225
+ gen_helper_sve_ldhds_zd,
226
+ gen_helper_sve_ldsds_zd,
227
+ NULL, },
228
+ { gen_helper_sve_ldbdu_zd,
229
+ gen_helper_sve_ldhdu_zd,
230
+ gen_helper_sve_ldsdu_zd,
231
+ gen_helper_sve_ldddu_zd, } } },
232
+ /* TODO fill in first-fault handlers */
233
+};
234
+
235
+static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
236
+{
237
+ gen_helper_gvec_mem_scatter *fn = NULL;
238
+
239
+ if (!sve_access_check(s)) {
240
+ return true;
241
+ }
242
+
243
+ switch (a->esz) {
244
+ case MO_32:
245
+ fn = gather_load_fn32[a->ff][a->xs][a->u][a->msz];
246
+ break;
247
+ case MO_64:
248
+ fn = gather_load_fn64[a->ff][a->xs][a->u][a->msz];
249
+ break;
250
+ }
251
+ assert(fn != NULL);
252
+
253
+ do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
254
+ cpu_reg_sp(s, a->rn), fn);
255
+ return true;
256
+}
257
+
258
+static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
259
+{
260
+ gen_helper_gvec_mem_scatter *fn = NULL;
261
+ TCGv_i64 imm;
262
+
263
+ if (a->esz < a->msz || (a->esz == a->msz && !a->u)) {
264
+ return false;
265
+ }
266
+ if (!sve_access_check(s)) {
267
+ return true;
268
+ }
269
+
270
+ switch (a->esz) {
271
+ case MO_32:
272
+ fn = gather_load_fn32[a->ff][0][a->u][a->msz];
273
+ break;
274
+ case MO_64:
275
+ fn = gather_load_fn64[a->ff][2][a->u][a->msz];
276
+ break;
277
+ }
278
+ assert(fn != NULL);
279
+
280
+ /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x])
281
+ * by loading the immediate into the scalar parameter.
282
+ */
283
+ imm = tcg_const_i64(a->imm << a->msz);
284
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
285
+ tcg_temp_free_i64(imm);
286
+ return true;
287
+}
288
+
289
static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
290
{
291
/* Indexed by [xs][msz]. */
292
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
293
index XXXXXXX..XXXXXXX 100644
294
--- a/target/arm/sve.decode
295
+++ b/target/arm/sve.decode
296
@@ -XXX,XX +XXX,XX @@
297
&rpri_load rd pg rn imm dtype nreg
298
&rprr_store rd pg rn rm msz esz nreg
299
&rpri_store rd pg rn imm msz esz nreg
300
+&rprr_gather_load rd pg rn rm esz msz u ff xs scale
301
+&rpri_gather_load rd pg rn imm esz msz u ff
302
&rprr_scatter_store rd pg rn rm esz msz xs scale
303
304
###########################################################################
305
@@ -XXX,XX +XXX,XX @@
306
@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
307
&rpri_load dtype=%msz_dtype
308
309
+# Gather Loads.
310
+@rprr_g_load_u ....... .. . . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
311
+ &rprr_gather_load xs=2
312
+@rprr_g_load_xs_u ....... .. xs:1 . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
313
+ &rprr_gather_load
314
+@rprr_g_load_xs_u_sc ....... .. xs:1 scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
315
+ &rprr_gather_load
316
+@rprr_g_load_xs_sc ....... .. xs:1 scale:1 rm:5 . . ff:1 pg:3 rn:5 rd:5 \
317
+ &rprr_gather_load
318
+@rprr_g_load_u_sc ....... .. . scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
319
+ &rprr_gather_load xs=2
320
+@rprr_g_load_sc ....... .. . scale:1 rm:5 . . ff:1 pg:3 rn:5 rd:5 \
321
+ &rprr_gather_load xs=2
322
+@rpri_g_load ....... msz:2 .. imm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
323
+ &rpri_gather_load
324
+
325
# Stores; user must fill in ESZ, MSZ, NREG as needed.
326
@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store
327
@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
328
@@ -XXX,XX +XXX,XX @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
329
LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \
330
&rpri_load dtype=%dtype_23_13 nreg=0
331
332
+# SVE 32-bit gather load (scalar plus 32-bit unscaled offsets)
333
+# SVE 32-bit gather load (scalar plus 32-bit scaled offsets)
334
+LD1_zprz 1000010 00 .0 ..... 0.. ... ..... ..... \
335
+ @rprr_g_load_xs_u esz=2 msz=0 scale=0
336
+LD1_zprz 1000010 01 .. ..... 0.. ... ..... ..... \
337
+ @rprr_g_load_xs_u_sc esz=2 msz=1
338
+LD1_zprz 1000010 10 .. ..... 01. ... ..... ..... \
339
+ @rprr_g_load_xs_sc esz=2 msz=2 u=1
340
+
341
+# SVE 32-bit gather load (vector plus immediate)
342
+LD1_zpiz 1000010 .. 01 ..... 1.. ... ..... ..... \
343
+ @rpri_g_load esz=2
344
+
345
### SVE Memory Contiguous Load Group
346
347
# SVE contiguous load (scalar plus scalar)
348
@@ -XXX,XX +XXX,XX @@ PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ----
349
350
### SVE Memory 64-bit Gather Group
351
352
+# SVE 64-bit gather load (scalar plus 32-bit unpacked unscaled offsets)
353
+# SVE 64-bit gather load (scalar plus 32-bit unpacked scaled offsets)
354
+LD1_zprz 1100010 00 .0 ..... 0.. ... ..... ..... \
355
+ @rprr_g_load_xs_u esz=3 msz=0 scale=0
356
+LD1_zprz 1100010 01 .. ..... 0.. ... ..... ..... \
357
+ @rprr_g_load_xs_u_sc esz=3 msz=1
358
+LD1_zprz 1100010 10 .. ..... 0.. ... ..... ..... \
359
+ @rprr_g_load_xs_u_sc esz=3 msz=2
360
+LD1_zprz 1100010 11 .. ..... 01. ... ..... ..... \
361
+ @rprr_g_load_xs_sc esz=3 msz=3 u=1
362
+
363
+# SVE 64-bit gather load (scalar plus 64-bit unscaled offsets)
364
+# SVE 64-bit gather load (scalar plus 64-bit scaled offsets)
365
+LD1_zprz 1100010 00 10 ..... 1.. ... ..... ..... \
366
+ @rprr_g_load_u esz=3 msz=0 scale=0
367
+LD1_zprz 1100010 01 1. ..... 1.. ... ..... ..... \
368
+ @rprr_g_load_u_sc esz=3 msz=1
369
+LD1_zprz 1100010 10 1. ..... 1.. ... ..... ..... \
370
+ @rprr_g_load_u_sc esz=3 msz=2
371
+LD1_zprz 1100010 11 1. ..... 11. ... ..... ..... \
372
+ @rprr_g_load_sc esz=3 msz=3 u=1
373
+
374
+# SVE 64-bit gather load (vector plus immediate)
375
+LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \
376
+ @rpri_g_load esz=3
377
+
378
# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
379
PRF 1100010 00 11 ----- 1-- --- ----- 0 ----
380
87
381
--
88
--
382
2.17.1
89
2.34.1
383
384
New patch
In the helper_compute_fprf functions, we pass a dummy float_status
in to the is_signaling_nan() function. This is unnecessary, because
we have convenient access to the CPU env pointer here and that
is already set up with the correct values for the snan_bit_is_one
and no_signaling_nans config settings. is_signaling_nan() doesn't
ever update the fp_status with any exception flags, so there is
no reason not to use env->fp_status here.

Use env->fp_status instead of the dummy fp_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-34-peter.maydell@linaro.org
---
target/ppc/fpu_helper.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void helper_compute_fprf_##tp(CPUPPCState *env, tp arg) \
} else if (tp##_is_infinity(arg)) { \
fprf = neg ? 0x09 << FPSCR_FPRF : 0x05 << FPSCR_FPRF; \
} else { \
- float_status dummy = { }; /* snan_bit_is_one = 0 */ \
- if (tp##_is_signaling_nan(arg, &dummy)) { \
+ if (tp##_is_signaling_nan(arg, &env->fp_status)) { \
fprf = 0x00 << FPSCR_FPRF; \
} else { \
fprf = 0x11 << FPSCR_FPRF; \
--
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Now that float_status has a bunch of fp parameters,
4
it is easier to copy an existing structure than create
5
one from scratch. Begin by copying the structure that
6
corresponds to the FPSR and make only the adjustments
7
required for BFloat16 semantics.
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241203203949.483774-2-richard.henderson@linaro.org
5
Message-id: 20180627043328.11531-25-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
14
---
8
target/arm/helper-sve.h | 30 +++++++++++++
15
target/arm/tcg/vec_helper.c | 20 +++++++-------------
9
target/arm/helper.h | 12 +++---
16
1 file changed, 7 insertions(+), 13 deletions(-)
10
target/arm/helper.c | 2 +-
11
target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++
12
target/arm/translate-sve.c | 70 ++++++++++++++++++++++++++++++
13
target/arm/sve.decode | 16 +++++++
14
6 files changed, 211 insertions(+), 7 deletions(-)
15
17
16
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
18
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
17
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper-sve.h
20
--- a/target/arm/tcg/vec_helper.c
19
+++ b/target/arm/helper-sve.h
21
+++ b/target/arm/tcg/vec_helper.c
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
22
@@ -XXX,XX +XXX,XX @@ bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
21
DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
23
* no effect on AArch32 instructions.
22
void, ptr, ptr, ptr, ptr, i32)
24
*/
23
25
bool ebf = is_a64(env) && env->vfp.fpcr & FPCR_EBF;
24
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG,
26
- *statusp = (float_status){
25
+ void, ptr, ptr, ptr, ptr, i32)
27
- .tininess_before_rounding = float_tininess_before_rounding,
26
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG,
28
- .float_rounding_mode = float_round_to_odd_inf,
27
+ void, ptr, ptr, ptr, ptr, i32)
29
- .flush_to_zero = true,
28
+DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG,
30
- .flush_inputs_to_zero = true,
29
+ void, ptr, ptr, ptr, ptr, i32)
31
- .default_nan_mode = true,
30
+DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG,
32
- };
31
+ void, ptr, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG,
33
+ void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG,
35
+ void, ptr, ptr, ptr, ptr, i32)
36
+DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, i32)
38
+
33
+
39
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG,
34
+ *statusp = env->vfp.fp_status;
40
+ void, ptr, ptr, ptr, ptr, i32)
35
+ set_default_nan_mode(true, statusp);
41
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG,
36
42
+ void, ptr, ptr, ptr, ptr, i32)
37
if (ebf) {
43
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG,
38
- float_status *fpst = &env->vfp.fp_status;
44
+ void, ptr, ptr, ptr, ptr, i32)
39
- set_flush_to_zero(get_flush_to_zero(fpst), statusp);
45
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG,
40
- set_flush_inputs_to_zero(get_flush_inputs_to_zero(fpst), statusp);
46
+ void, ptr, ptr, ptr, ptr, i32)
41
- set_float_rounding_mode(get_float_rounding_mode(fpst), statusp);
47
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG,
42
-
48
+ void, ptr, ptr, ptr, ptr, i32)
43
/* EBF=1 needs to do a step with round-to-odd semantics */
49
+DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
44
*oddstatusp = *statusp;
50
+ void, ptr, ptr, ptr, ptr, i32)
45
set_float_rounding_mode(float_round_to_odd, oddstatusp);
51
+DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
46
+ } else {
52
+ void, ptr, ptr, ptr, ptr, i32)
47
+ set_flush_to_zero(true, statusp);
53
+
48
+ set_flush_inputs_to_zero(true, statusp);
54
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
49
+ set_float_rounding_mode(float_round_to_odd_inf, statusp);
55
void, ptr, ptr, ptr, ptr, i32)
50
}
56
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
51
-
57
diff --git a/target/arm/helper.h b/target/arm/helper.h
52
return ebf;
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/helper.h
60
+++ b/target/arm/helper.h
61
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(vfp_touid, i32, f64, ptr)
62
DEF_HELPER_2(vfp_touizh, i32, f16, ptr)
63
DEF_HELPER_2(vfp_touizs, i32, f32, ptr)
64
DEF_HELPER_2(vfp_touizd, i32, f64, ptr)
65
-DEF_HELPER_2(vfp_tosih, i32, f16, ptr)
66
-DEF_HELPER_2(vfp_tosis, i32, f32, ptr)
67
-DEF_HELPER_2(vfp_tosid, i32, f64, ptr)
68
-DEF_HELPER_2(vfp_tosizh, i32, f16, ptr)
69
-DEF_HELPER_2(vfp_tosizs, i32, f32, ptr)
70
-DEF_HELPER_2(vfp_tosizd, i32, f64, ptr)
71
+DEF_HELPER_2(vfp_tosih, s32, f16, ptr)
72
+DEF_HELPER_2(vfp_tosis, s32, f32, ptr)
73
+DEF_HELPER_2(vfp_tosid, s32, f64, ptr)
74
+DEF_HELPER_2(vfp_tosizh, s32, f16, ptr)
75
+DEF_HELPER_2(vfp_tosizs, s32, f32, ptr)
76
+DEF_HELPER_2(vfp_tosizd, s32, f64, ptr)
77
78
DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr)
79
DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr)
80
diff --git a/target/arm/helper.c b/target/arm/helper.c
81
index XXXXXXX..XXXXXXX 100644
82
--- a/target/arm/helper.c
83
+++ b/target/arm/helper.c
84
@@ -XXX,XX +XXX,XX @@ ftype HELPER(name)(uint32_t x, void *fpstp) \
85
}
53
}
86
54
87
#define CONV_FTOI(name, ftype, fsz, sign, round) \
88
-uint32_t HELPER(name)(ftype x, void *fpstp) \
89
+sign##int32_t HELPER(name)(ftype x, void *fpstp) \
90
{ \
91
float_status *fpst = fpstp; \
92
if (float##fsz##_is_any_nan(x)) { \
93
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/sve_helper.c
96
+++ b/target/arm/sve_helper.c
97
@@ -XXX,XX +XXX,XX @@ static inline float16 sve_f64_to_f16(float64 f, float_status *fpst)
98
return ret;
99
}
100
101
+static inline int16_t vfp_float16_to_int16_rtz(float16 f, float_status *s)
102
+{
103
+ if (float16_is_any_nan(f)) {
104
+ float_raise(float_flag_invalid, s);
105
+ return 0;
106
+ }
107
+ return float16_to_int16_round_to_zero(f, s);
108
+}
109
+
110
+static inline int64_t vfp_float16_to_int64_rtz(float16 f, float_status *s)
111
+{
112
+ if (float16_is_any_nan(f)) {
113
+ float_raise(float_flag_invalid, s);
114
+ return 0;
115
+ }
116
+ return float16_to_int64_round_to_zero(f, s);
117
+}
118
+
119
+static inline int64_t vfp_float32_to_int64_rtz(float32 f, float_status *s)
120
+{
121
+ if (float32_is_any_nan(f)) {
122
+ float_raise(float_flag_invalid, s);
123
+ return 0;
124
+ }
125
+ return float32_to_int64_round_to_zero(f, s);
126
+}
127
+
128
+static inline int64_t vfp_float64_to_int64_rtz(float64 f, float_status *s)
129
+{
130
+ if (float64_is_any_nan(f)) {
131
+ float_raise(float_flag_invalid, s);
132
+ return 0;
133
+ }
134
+ return float64_to_int64_round_to_zero(f, s);
135
+}
136
+
137
+static inline uint16_t vfp_float16_to_uint16_rtz(float16 f, float_status *s)
138
+{
139
+ if (float16_is_any_nan(f)) {
140
+ float_raise(float_flag_invalid, s);
141
+ return 0;
142
+ }
143
+ return float16_to_uint16_round_to_zero(f, s);
144
+}
145
+
146
+static inline uint64_t vfp_float16_to_uint64_rtz(float16 f, float_status *s)
147
+{
148
+ if (float16_is_any_nan(f)) {
149
+ float_raise(float_flag_invalid, s);
150
+ return 0;
151
+ }
152
+ return float16_to_uint64_round_to_zero(f, s);
153
+}
154
+
155
+static inline uint64_t vfp_float32_to_uint64_rtz(float32 f, float_status *s)
156
+{
157
+ if (float32_is_any_nan(f)) {
158
+ float_raise(float_flag_invalid, s);
159
+ return 0;
160
+ }
161
+ return float32_to_uint64_round_to_zero(f, s);
162
+}
163
+
164
+static inline uint64_t vfp_float64_to_uint64_rtz(float64 f, float_status *s)
165
+{
166
+ if (float64_is_any_nan(f)) {
167
+ float_raise(float_flag_invalid, s);
168
+ return 0;
169
+ }
170
+ return float64_to_uint64_round_to_zero(f, s);
171
+}
172
+
173
DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16)
174
DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32)
175
DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16)
176
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64)
177
DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32)
178
DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64)
179
180
+DO_ZPZ_FP(sve_fcvtzs_hh, uint16_t, H1_2, vfp_float16_to_int16_rtz)
181
+DO_ZPZ_FP(sve_fcvtzs_hs, uint32_t, H1_4, helper_vfp_tosizh)
182
+DO_ZPZ_FP(sve_fcvtzs_ss, uint32_t, H1_4, helper_vfp_tosizs)
183
+DO_ZPZ_FP(sve_fcvtzs_hd, uint64_t, , vfp_float16_to_int64_rtz)
184
+DO_ZPZ_FP(sve_fcvtzs_sd, uint64_t, , vfp_float32_to_int64_rtz)
185
+DO_ZPZ_FP(sve_fcvtzs_ds, uint64_t, , helper_vfp_tosizd)
186
+DO_ZPZ_FP(sve_fcvtzs_dd, uint64_t, , vfp_float64_to_int64_rtz)
187
+
188
+DO_ZPZ_FP(sve_fcvtzu_hh, uint16_t, H1_2, vfp_float16_to_uint16_rtz)
189
+DO_ZPZ_FP(sve_fcvtzu_hs, uint32_t, H1_4, helper_vfp_touizh)
190
+DO_ZPZ_FP(sve_fcvtzu_ss, uint32_t, H1_4, helper_vfp_touizs)
191
+DO_ZPZ_FP(sve_fcvtzu_hd, uint64_t, , vfp_float16_to_uint64_rtz)
192
+DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz)
193
+DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd)
194
+DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz)
195
+
196
DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
197
DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
198
DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
199
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
200
index XXXXXXX..XXXXXXX 100644
201
--- a/target/arm/translate-sve.c
202
+++ b/target/arm/translate-sve.c
203
@@ -XXX,XX +XXX,XX @@ static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
204
return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd);
205
}
206
207
+static bool trans_FCVTZS_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
208
+{
209
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hh);
210
+}
211
+
212
+static bool trans_FCVTZU_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
213
+{
214
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hh);
215
+}
216
+
217
+static bool trans_FCVTZS_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
218
+{
219
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hs);
220
+}
221
+
222
+static bool trans_FCVTZU_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
223
+{
224
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hs);
225
+}
226
+
227
+static bool trans_FCVTZS_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
228
+{
229
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hd);
230
+}
231
+
232
+static bool trans_FCVTZU_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
233
+{
234
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hd);
235
+}
236
+
237
+static bool trans_FCVTZS_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
238
+{
239
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ss);
240
+}
241
+
242
+static bool trans_FCVTZU_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
243
+{
244
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ss);
245
+}
246
+
247
+static bool trans_FCVTZS_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
248
+{
249
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_sd);
250
+}
251
+
252
+static bool trans_FCVTZU_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
253
+{
254
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_sd);
255
+}
256
+
257
+static bool trans_FCVTZS_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
258
+{
259
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ds);
260
+}
261
+
262
+static bool trans_FCVTZU_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
263
+{
264
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ds);
265
+}
266
+
267
+static bool trans_FCVTZS_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
268
+{
269
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_dd);
270
+}
271
+
272
+static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
273
+{
274
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd);
275
+}
276
+
277
static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
278
{
279
return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
280
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
281
index XXXXXXX..XXXXXXX 100644
282
--- a/target/arm/sve.decode
283
+++ b/target/arm/sve.decode
284
@@ -XXX,XX +XXX,XX @@ FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0
285
FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0
286
FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0
287
288
+# SVE floating-point convert to integer
289
+FCVTZS_hh 01100101 01 011 01 0 101 ... ..... ..... @rd_pg_rn_e0
290
+FCVTZU_hh 01100101 01 011 01 1 101 ... ..... ..... @rd_pg_rn_e0
291
+FCVTZS_hs 01100101 01 011 10 0 101 ... ..... ..... @rd_pg_rn_e0
292
+FCVTZU_hs 01100101 01 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
293
+FCVTZS_hd 01100101 01 011 11 0 101 ... ..... ..... @rd_pg_rn_e0
294
+FCVTZU_hd 01100101 01 011 11 1 101 ... ..... ..... @rd_pg_rn_e0
295
+FCVTZS_ss 01100101 10 011 10 0 101 ... ..... ..... @rd_pg_rn_e0
296
+FCVTZU_ss 01100101 10 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
297
+FCVTZS_ds 01100101 11 011 00 0 101 ... ..... ..... @rd_pg_rn_e0
298
+FCVTZU_ds 01100101 11 011 00 1 101 ... ..... ..... @rd_pg_rn_e0
299
+FCVTZS_sd 01100101 11 011 10 0 101 ... ..... ..... @rd_pg_rn_e0
300
+FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
301
+FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0
302
+FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0
303
+
304
# SVE integer convert to floating-point
305
SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
306
SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
307
--
55
--
308
2.17.1
56
2.34.1
309
57
310
58
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Currently we hardcode the default NaN value in parts64_default_nan()
2
using a compile-time ifdef ladder. This is awkward for two cases:
3
* for single-QEMU-binary we can't hard-code target-specifics like this
4
* for Arm FEAT_AFP the default NaN value depends on FPCR.AH
5
(specifically the sign bit is different)
2
6
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Add a field to float_status to specify the default NaN value; fall
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
back to the old ifdef behaviour if it is not set.
5
Message-id: 20180627043328.11531-20-richard.henderson@linaro.org
9
10
The default NaN value is specified by setting a uint8_t to a
11
pattern corresponding to the sign and upper fraction parts of
12
the NaN; the lower bits of the fraction are set from bit 0 of
13
the pattern.
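
As a rough illustration of the encoding described above, the sketch below expands an 8-bit pattern directly into raw IEEE binary32 and binary64 bit patterns. It is not the QEMU implementation: the names dnan_expand(), dnan_float32() and dnan_float64() are hypothetical, and it works on the final IEEE encodings rather than on the decomposed FloatParts64 form that softfloat uses internally.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only; these helpers are hypothetical and not part of
 * QEMU's softfloat API.
 *
 * Expand an 8-bit default-NaN pattern into a raw IEEE encoding:
 *   bit 7      -> sign bit
 *   bits 6..0  -> the seven most significant fraction bits
 *   bit 0      -> also replicated into every lower fraction bit
 * The exponent field is always all ones, as for any NaN.
 */
static inline uint64_t dnan_expand(uint8_t pattern, unsigned exp_bits,
                                   unsigned frac_bits)
{
    uint64_t sign = pattern >> 7;
    uint64_t top = pattern & 0x7f;
    unsigned low_bits = frac_bits - 7;   /* fraction bits below the top 7 */
    uint64_t low = (pattern & 1) ? ((UINT64_C(1) << low_bits) - 1) : 0;
    uint64_t exp = (UINT64_C(1) << exp_bits) - 1;

    /* A pattern of 0 would not encode a NaN, so treat it as "unset". */
    assert(pattern != 0);

    return (sign << (exp_bits + frac_bits)) | (exp << frac_bits)
         | (top << low_bits) | low;
}

/* binary64: 11 exponent bits, 52 fraction bits */
static inline uint64_t dnan_float64(uint8_t pattern)
{
    return dnan_expand(pattern, 11, 52);
}

/* binary32: 8 exponent bits, 23 fraction bits */
static inline uint32_t dnan_float32(uint8_t pattern)
{
    return (uint32_t)dnan_expand(pattern, 8, 23);
}
```

Under these assumptions the pattern 0b01000000 (sign clear, fraction msb set) expands to 0x7FC00000 for binary32 and 0x7FF8000000000000 for binary64, while 0b11111111 expands to all ones in both formats.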
14
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
17
Message-id: 20241202131347.498124-35-peter.maydell@linaro.org
7
---
18
---
8
target/arm/helper-sve.h | 35 ++++++++++++++++++++++
19
include/fpu/softfloat-helpers.h | 11 +++++++
9
target/arm/sve_helper.c | 61 ++++++++++++++++++++++++++++++++++++++
20
include/fpu/softfloat-types.h | 10 ++++++
10
target/arm/translate-sve.c | 57 +++++++++++++++++++++++++++++++++++
21
fpu/softfloat-specialize.c.inc | 55 ++++++++++++++++++++-------------
11
target/arm/sve.decode | 8 +++++
22
3 files changed, 54 insertions(+), 22 deletions(-)
12
4 files changed, 161 insertions(+)
13
23
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
24
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
15
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
26
--- a/include/fpu/softfloat-helpers.h
17
+++ b/target/arm/helper-sve.h
27
+++ b/include/fpu/softfloat-helpers.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
28
@@ -XXX,XX +XXX,XX @@ static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
19
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
29
status->float_infzeronan_rule = rule;
20
void, ptr, ptr, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG,
23
+ i64, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG,
25
+ i64, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG,
27
+ i64, ptr, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG,
30
+ i64, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG,
32
+ i64, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG,
34
+ i64, ptr, ptr, ptr, i32)
35
+
36
+DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG,
37
+ i64, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG,
39
+ i64, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG,
41
+ i64, ptr, ptr, ptr, i32)
42
+
43
+DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG,
44
+ i64, ptr, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG,
46
+ i64, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG,
48
+ i64, ptr, ptr, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG,
51
+ i64, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG,
53
+ i64, ptr, ptr, ptr, i32)
54
+DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG,
55
+ i64, ptr, ptr, ptr, i32)
56
+
57
DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
58
i64, i64, ptr, ptr, ptr, i32)
59
DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
60
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/sve_helper.c
63
+++ b/target/arm/sve_helper.c
64
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
65
return predtest_ones(d, oprsz, esz_mask);
66
}
30
}
67
31
68
+/* Recursive reduction on a function;
32
+static inline void set_float_default_nan_pattern(uint8_t dnan_pattern,
69
+ * Cf. the ARM ARM function ReducePredicated.
33
+ float_status *status)
70
+ *
34
+{
71
+ * While it would be possible to write this without the DATA temporary,
35
+ status->default_nan_pattern = dnan_pattern;
72
+ * it is much simpler to process the predicate register this way.
73
+ * The recursion is bounded to depth 7 (128 fp16 elements), so there's
74
+ * little to gain with a more complex non-recursive form.
75
+ */
76
+#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \
77
+static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \
78
+{ \
79
+ if (n == 1) { \
80
+ return *data; \
81
+ } else { \
82
+ uintptr_t half = n / 2; \
83
+ TYPE lo = NAME##_reduce(data, status, half); \
84
+ TYPE hi = NAME##_reduce(data + half, status, half); \
85
+ return TYPE##_##FUNC(lo, hi, status); \
86
+ } \
87
+} \
88
+uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \
89
+{ \
90
+ uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \
91
+ TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \
92
+ for (i = 0; i < oprsz; ) { \
93
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
94
+ do { \
95
+ TYPE nn = *(TYPE *)(vn + H(i)); \
96
+ *(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT); \
97
+ i += sizeof(TYPE), pg >>= sizeof(TYPE); \
98
+ } while (i & 15); \
99
+ } \
100
+ for (; i < maxsz; i += sizeof(TYPE)) { \
101
+ *(TYPE *)((void *)data + i) = IDENT; \
102
+ } \
103
+ return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \
104
+}
36
+}
105
+
37
+
106
+DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero)
38
static inline void set_flush_to_zero(bool val, float_status *status)
107
+DO_REDUCE(sve_faddv_s, float32, H1_4, add, float32_zero)
108
+DO_REDUCE(sve_faddv_d, float64, , add, float64_zero)
109
+
110
+/* Identity is floatN_default_nan, without the function call. */
111
+DO_REDUCE(sve_fminnmv_h, float16, H1_2, minnum, 0x7E00)
112
+DO_REDUCE(sve_fminnmv_s, float32, H1_4, minnum, 0x7FC00000)
113
+DO_REDUCE(sve_fminnmv_d, float64, , minnum, 0x7FF8000000000000ULL)
114
+
115
+DO_REDUCE(sve_fmaxnmv_h, float16, H1_2, maxnum, 0x7E00)
116
+DO_REDUCE(sve_fmaxnmv_s, float32, H1_4, maxnum, 0x7FC00000)
117
+DO_REDUCE(sve_fmaxnmv_d, float64, , maxnum, 0x7FF8000000000000ULL)
118
+
119
+DO_REDUCE(sve_fminv_h, float16, H1_2, min, float16_infinity)
120
+DO_REDUCE(sve_fminv_s, float32, H1_4, min, float32_infinity)
121
+DO_REDUCE(sve_fminv_d, float64, , min, float64_infinity)
122
+
123
+DO_REDUCE(sve_fmaxv_h, float16, H1_2, max, float16_chs(float16_infinity))
124
+DO_REDUCE(sve_fmaxv_s, float32, H1_4, max, float32_chs(float32_infinity))
125
+DO_REDUCE(sve_fmaxv_d, float64, , max, float64_chs(float64_infinity))
126
+
127
+#undef DO_REDUCE
128
+
129
uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
130
void *status, uint32_t desc)
131
{
39
{
132
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
40
status->flush_to_zero = val;
133
index XXXXXXX..XXXXXXX 100644
41
@@ -XXX,XX +XXX,XX @@ static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status
134
--- a/target/arm/translate-sve.c
42
return status->float_infzeronan_rule;
135
+++ b/target/arm/translate-sve.c
136
@@ -XXX,XX +XXX,XX @@ static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn)
137
return true;
138
}
43
}
139
44
140
+/*
45
+static inline uint8_t get_float_default_nan_pattern(float_status *status)
141
+ *** SVE Floating Point Fast Reduction Group
142
+ */
143
+
144
+typedef void gen_helper_fp_reduce(TCGv_i64, TCGv_ptr, TCGv_ptr,
145
+ TCGv_ptr, TCGv_i32);
146
+
147
+static void do_reduce(DisasContext *s, arg_rpr_esz *a,
148
+ gen_helper_fp_reduce *fn)
149
+{
46
+{
150
+ unsigned vsz = vec_full_reg_size(s);
47
+ return status->default_nan_pattern;
151
+ unsigned p2vsz = pow2ceil(vsz);
152
+ TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, p2vsz, 0));
153
+ TCGv_ptr t_zn, t_pg, status;
154
+ TCGv_i64 temp;
155
+
156
+ temp = tcg_temp_new_i64();
157
+ t_zn = tcg_temp_new_ptr();
158
+ t_pg = tcg_temp_new_ptr();
159
+
160
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn));
161
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
162
+ status = get_fpstatus_ptr(a->esz == MO_16);
163
+
164
+ fn(temp, t_zn, t_pg, status, t_desc);
165
+ tcg_temp_free_ptr(t_zn);
166
+ tcg_temp_free_ptr(t_pg);
167
+ tcg_temp_free_ptr(status);
168
+ tcg_temp_free_i32(t_desc);
169
+
170
+ write_fp_dreg(s, a->rd, temp);
171
+ tcg_temp_free_i64(temp);
172
+}
48
+}
173
+
49
+
174
+#define DO_VPZ(NAME, name) \
50
static inline bool get_flush_to_zero(float_status *status)
175
+static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
51
{
176
+{ \
52
return status->flush_to_zero;
177
+ static gen_helper_fp_reduce * const fns[3] = { \
53
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
178
+ gen_helper_sve_##name##_h, \
54
index XXXXXXX..XXXXXXX 100644
179
+ gen_helper_sve_##name##_s, \
55
--- a/include/fpu/softfloat-types.h
180
+ gen_helper_sve_##name##_d, \
56
+++ b/include/fpu/softfloat-types.h
181
+ }; \
57
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
182
+ if (a->esz == 0) { \
58
/* should denormalised inputs go to zero and set the input_denormal flag? */
183
+ return false; \
59
bool flush_inputs_to_zero;
184
+ } \
60
bool default_nan_mode;
185
+ if (sve_access_check(s)) { \
61
+ /*
186
+ do_reduce(s, a, fns[a->esz - 1]); \
62
+ * The pattern to use for the default NaN. Here the high bit specifies
187
+ } \
63
+ * the default NaN's sign bit, and bits 6..0 specify the high bits of the
188
+ return true; \
64
+ * fractional part. The low bits of the fractional part are copies of bit 0.
189
+}
65
+ * The exponent of the default NaN is (as for any NaN) always all 1s.
66
+ * Note that a value of 0 here is not a valid NaN. The target must set
67
+ * this to the correct non-zero value, or we will assert when trying to
68
+ * create a default NaN.
69
+ */
70
+ uint8_t default_nan_pattern;
71
/*
72
* The flags below are not used on all specializations and may
73
* constant fold away (see snan_bit_is_one()/no_signalling_nans() in
74
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
75
index XXXXXXX..XXXXXXX 100644
76
--- a/fpu/softfloat-specialize.c.inc
77
+++ b/fpu/softfloat-specialize.c.inc
78
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
79
{
80
bool sign = 0;
81
uint64_t frac;
82
+ uint8_t dnan_pattern = status->default_nan_pattern;
83
84
+ if (dnan_pattern == 0) {
85
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
86
- /* !snan_bit_is_one, set all bits */
87
- frac = (1ULL << DECOMPOSED_BINARY_POINT) - 1;
88
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
89
+ /* Sign bit clear, all frac bits set */
90
+ dnan_pattern = 0b01111111;
91
+#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
92
|| defined(TARGET_MICROBLAZE)
93
- /* !snan_bit_is_one, set sign and msb */
94
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
95
- sign = 1;
96
+ /* Sign bit set, most significant frac bit set */
97
+ dnan_pattern = 0b11000000;
98
#elif defined(TARGET_HPPA)
99
- /* snan_bit_is_one, set msb-1. */
100
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 2);
101
+ /* Sign bit clear, msb-1 frac bit set */
102
+ dnan_pattern = 0b00100000;
103
#elif defined(TARGET_HEXAGON)
104
- sign = 1;
105
- frac = ~0ULL;
106
+ /* Sign bit set, all frac bits set. */
107
+ dnan_pattern = 0b11111111;
108
#else
109
- /*
110
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
111
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
112
- * do not have floating-point.
113
- */
114
- if (snan_bit_is_one(status)) {
115
- /* set all bits other than msb */
116
- frac = (1ULL << (DECOMPOSED_BINARY_POINT - 1)) - 1;
117
- } else {
118
- /* set msb */
119
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
120
- }
121
+ /*
122
+ * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
123
+ * S390, SH4, TriCore, and Xtensa. Our other supported targets
124
+ * do not have floating-point.
125
+ */
126
+ if (snan_bit_is_one(status)) {
127
+ /* sign bit clear, set all frac bits other than msb */
128
+ dnan_pattern = 0b00111111;
129
+ } else {
130
+ /* sign bit clear, set frac msb */
131
+ dnan_pattern = 0b01000000;
132
+ }
133
#endif
134
+ }
135
+ assert(dnan_pattern != 0);
190
+
136
+
191
+DO_VPZ(FADDV, faddv)
137
+ sign = dnan_pattern >> 7;
192
+DO_VPZ(FMINNMV, fminnmv)
138
+ /*
193
+DO_VPZ(FMAXNMV, fmaxnmv)
139
+ * Place default_nan_pattern [6:0] into bits [62:56],
194
+DO_VPZ(FMINV, fminv)
140
+ * and replicate bit [0] down into [55:0]
195
+DO_VPZ(FMAXV, fmaxv)
141
+ */
196
+
142
+ frac = deposit64(0, DECOMPOSED_BINARY_POINT - 7, 7, dnan_pattern);
197
/*
143
+ frac = deposit64(frac, 0, DECOMPOSED_BINARY_POINT - 7, -(dnan_pattern & 1));
198
*** SVE Floating Point Accumulating Reduction Group
144
199
*/
145
*p = (FloatParts64) {
200
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
146
.cls = float_class_qnan,
201
index XXXXXXX..XXXXXXX 100644
202
--- a/target/arm/sve.decode
203
+++ b/target/arm/sve.decode
204
@@ -XXX,XX +XXX,XX @@ FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \
205
FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2
206
FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3
207
208
+### SVE FP Fast Reduction Group
209
+
210
+FADDV 01100101 .. 000 000 001 ... ..... ..... @rd_pg_rn
211
+FMAXNMV 01100101 .. 000 100 001 ... ..... ..... @rd_pg_rn
212
+FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn
213
+FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn
214
+FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn
215
+
216
### SVE FP Accumulating Reduction Group
217
218
# SVE floating-point serial reduction (predicated)
219
--
147
--
220
2.17.1
148
2.34.1
221
222
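The expansion performed at the end of parts64_default_nan() can be sketched in isolation. This is a minimal illustration rather than the QEMU code: sketch_deposit64(), SketchNaN and sketch_expand_dnan() are hypothetical names, and the binary point value of 63 is an assumption inferred from the "[62:56]" wording in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: "[62:56]" in the patch comment implies a binary point at bit 63. */
#define SKETCH_BINARY_POINT 63

/* Hypothetical stand-in for a deposit helper: write the low `len` bits
 * of `field` into `value` starting at bit `start` (1 <= len <= 64). */
uint64_t sketch_deposit64(uint64_t value, int start, int len, uint64_t field)
{
    uint64_t mask = (~0ULL >> (64 - len)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

typedef struct {
    int sign;
    uint64_t frac;
} SketchNaN;

/* Expand an 8-bit default NaN pattern: bit 7 is the sign, bits [6:0] are
 * the top seven fraction bits, and bit 0 is replicated into the rest. */
SketchNaN sketch_expand_dnan(uint8_t dnan_pattern)
{
    SketchNaN r;

    assert(dnan_pattern != 0);
    r.sign = dnan_pattern >> 7;
    r.frac = sketch_deposit64(0, SKETCH_BINARY_POINT - 7, 7, dnan_pattern);
    r.frac = sketch_deposit64(r.frac, 0, SKETCH_BINARY_POINT - 7,
                              -(uint64_t)(dnan_pattern & 1));
    return r;
}
```

Under these assumptions, 0b01000000 yields sign 0 with only bit 62 set, and 0b00111111 yields sign 0 with bits [61:0] set. Both match the fraction values that the removed lines computed for the two snan_bit_is_one() cases.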
New patch
1
Set the default NaN pattern explicitly for the tests/fp code.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-36-peter.maydell@linaro.org
6
---
7
tests/fp/fp-bench.c | 1 +
8
tests/fp/fp-test-log2.c | 1 +
9
tests/fp/fp-test.c | 1 +
10
3 files changed, 3 insertions(+)
11
12
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tests/fp/fp-bench.c
15
+++ b/tests/fp/fp-bench.c
16
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
17
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
18
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
19
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
20
+ set_float_default_nan_pattern(0b01000000, &soft_status);
21
22
f = bench_funcs[operation][precision];
23
g_assert(f);
24
diff --git a/tests/fp/fp-test-log2.c b/tests/fp/fp-test-log2.c
25
index XXXXXXX..XXXXXXX 100644
26
--- a/tests/fp/fp-test-log2.c
27
+++ b/tests/fp/fp-test-log2.c
28
@@ -XXX,XX +XXX,XX @@ int main(int ac, char **av)
29
int i;
30
31
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
32
+ set_float_default_nan_pattern(0b01000000, &qsf);
33
set_float_rounding_mode(float_round_nearest_even, &qsf);
34
35
test.d = 0.0;
36
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/tests/fp/fp-test.c
39
+++ b/tests/fp/fp-test.c
40
@@ -XXX,XX +XXX,XX @@ void run_test(void)
41
*/
42
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
43
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
44
+ set_float_default_nan_pattern(0b01000000, &qsf);
45
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
46
47
genCases_setLevel(test_level);
48
--
49
2.34.1
New patch
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
parts64_default_nan().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-37-peter.maydell@linaro.org
7
---
8
target/microblaze/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 3 +--
10
2 files changed, 3 insertions(+), 2 deletions(-)
11
12
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/microblaze/cpu.c
15
+++ b/target/microblaze/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_reset_hold(Object *obj, ResetType type)
17
* this architecture.
18
*/
19
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
20
+ /* Default NaN: sign bit set, most significant frac bit set */
21
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
22
23
#if defined(CONFIG_USER_ONLY)
24
/* start in user mode with interrupts enabled. */
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
30
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
31
/* Sign bit clear, all frac bits set */
32
dnan_pattern = 0b01111111;
33
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
34
- || defined(TARGET_MICROBLAZE)
35
+#elif defined(TARGET_I386) || defined(TARGET_X86_64)
36
/* Sign bit set, most significant frac bit set */
37
dnan_pattern = 0b11000000;
38
#elif defined(TARGET_HPPA)
39
--
40
2.34.1
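To see what the 8-bit patterns used in these patches mean for a concrete format, the sketch below builds single-precision default NaN bits from a pattern. It is an illustration only: sketch_float32_default_nan() is a hypothetical helper, and it assumes the standard IEEE 754 binary32 layout (1 sign bit, 8 exponent bits, 23 fraction bits) with the pattern supplying the top seven fraction bits and bit 0 replicated into the remaining sixteen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: default NaN bits for IEEE 754 binary32,
 * derived from an 8-bit pattern (bit 7 = sign, bits [6:0] = top of
 * the fraction, bit 0 replicated into the lower fraction bits). */
uint32_t sketch_float32_default_nan(uint8_t dnan_pattern)
{
    uint32_t sign, exp, frac_top, frac_low;

    assert(dnan_pattern != 0);
    sign = (uint32_t)(dnan_pattern >> 7) << 31;
    exp = 0xFFu << 23;                              /* all-ones exponent */
    frac_top = (uint32_t)(dnan_pattern & 0x7F) << 16;
    frac_low = (dnan_pattern & 1) ? 0xFFFFu : 0;    /* replicate bit 0 */
    return sign | exp | frac_top | frac_low;
}
```

With this layout, the 0b11000000 pattern set for microblaze gives 0xFFC00000, and the 0b01000000 pattern used in tests/fp gives 0x7FC00000.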
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
parts64_default_nan().
2
3
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-15-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-38-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-sve.h | 67 +++++++++++++++++++++++++++++
8
target/i386/tcg/fpu_helper.c | 4 ++++
9
target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++
9
fpu/softfloat-specialize.c.inc | 3 ---
10
target/arm/translate-sve.c | 40 ++++++++++++++++-
10
2 files changed, 4 insertions(+), 3 deletions(-)
11
3 files changed, 193 insertions(+), 2 deletions(-)
12
11
13
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
12
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-sve.h
14
--- a/target/i386/tcg/fpu_helper.c
16
+++ b/target/arm/helper-sve.h
15
+++ b/target/i386/tcg/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
16
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
18
DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
17
*/
19
void, env, ptr, ptr, ptr, tl, i32)
18
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
20
19
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
21
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
20
+ /* Default NaN: sign bit set, most significant frac bit set */
22
+ void, env, ptr, ptr, ptr, tl, i32)
21
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
23
+DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG,
22
+ set_float_default_nan_pattern(0b11000000, &env->mmx_status);
24
+ void, env, ptr, ptr, ptr, tl, i32)
23
+ set_float_default_nan_pattern(0b11000000, &env->sse_status);
25
+DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG,
24
}
26
+ void, env, ptr, ptr, ptr, tl, i32)
25
27
+DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG,
26
static inline uint8_t save_exception_flags(CPUX86State *env)
28
+ void, env, ptr, ptr, ptr, tl, i32)
27
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
+DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG,
30
+ void, env, ptr, ptr, ptr, tl, i32)
31
+
32
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG,
33
+ void, env, ptr, ptr, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG,
35
+ void, env, ptr, ptr, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG,
37
+ void, env, ptr, ptr, ptr, tl, i32)
38
+DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG,
39
+ void, env, ptr, ptr, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG,
41
+ void, env, ptr, ptr, ptr, tl, i32)
42
+
43
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG,
44
+ void, env, ptr, ptr, ptr, tl, i32)
45
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG,
46
+ void, env, ptr, ptr, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG,
48
+ void, env, ptr, ptr, ptr, tl, i32)
49
+DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG,
50
+ void, env, ptr, ptr, ptr, tl, i32)
51
+DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG,
52
+ void, env, ptr, ptr, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG,
54
+ void, env, ptr, ptr, ptr, tl, i32)
55
+DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG,
56
+ void, env, ptr, ptr, ptr, tl, i32)
57
+
58
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG,
59
+ void, env, ptr, ptr, ptr, tl, i32)
60
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG,
61
+ void, env, ptr, ptr, ptr, tl, i32)
62
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG,
63
+ void, env, ptr, ptr, ptr, tl, i32)
64
+DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG,
65
+ void, env, ptr, ptr, ptr, tl, i32)
66
+DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG,
67
+ void, env, ptr, ptr, ptr, tl, i32)
68
+DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG,
69
+ void, env, ptr, ptr, ptr, tl, i32)
70
+DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG,
71
+ void, env, ptr, ptr, ptr, tl, i32)
72
+
73
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG,
74
+ void, env, ptr, ptr, ptr, tl, i32)
75
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG,
76
+ void, env, ptr, ptr, ptr, tl, i32)
77
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG,
78
+ void, env, ptr, ptr, ptr, tl, i32)
79
+DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG,
80
+ void, env, ptr, ptr, ptr, tl, i32)
81
+DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG,
82
+ void, env, ptr, ptr, ptr, tl, i32)
83
+DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG,
84
+ void, env, ptr, ptr, ptr, tl, i32)
85
+DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG,
86
+ void, env, ptr, ptr, ptr, tl, i32)
87
+
88
DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
89
void, env, ptr, ptr, ptr, tl, i32)
90
DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
91
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
92
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
93
--- a/target/arm/sve_helper.c
29
--- a/fpu/softfloat-specialize.c.inc
94
+++ b/target/arm/sve_helper.c
30
+++ b/fpu/softfloat-specialize.c.inc
95
@@ -XXX,XX +XXX,XX @@ DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
31
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
96
DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
32
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
97
DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
33
/* Sign bit clear, all frac bits set */
98
34
dnan_pattern = 0b01111111;
99
+/* First fault loads with a vector index. */
35
-#elif defined(TARGET_I386) || defined(TARGET_X86_64)
100
+
36
- /* Sign bit set, most significant frac bit set */
101
+#ifdef CONFIG_USER_ONLY
37
- dnan_pattern = 0b11000000;
102
+
38
#elif defined(TARGET_HPPA)
103
+#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \
39
/* Sign bit clear, msb-1 frac bit set */
104
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
40
dnan_pattern = 0b00100000;
105
+ target_ulong base, uint32_t desc) \
106
+{ \
107
+ intptr_t i, oprsz = simd_oprsz(desc); \
108
+ unsigned scale = simd_data(desc); \
109
+ uintptr_t ra = GETPC(); \
110
+ bool first = true; \
111
+ mmap_lock(); \
112
+ for (i = 0; i < oprsz; i++) { \
113
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
114
+ do { \
115
+ TYPEM m = 0; \
116
+ if (pg & 1) { \
117
+ target_ulong off = *(TYPEI *)(vm + H(i)); \
118
+ target_ulong addr = base + (off << scale); \
119
+ if (!first && \
120
+ page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \
121
+ record_fault(env, i, oprsz); \
122
+ goto exit; \
123
+ } \
124
+ m = FN(env, addr, ra); \
125
+ first = false; \
126
+ } \
127
+ *(TYPEE *)(vd + H(i)) = m; \
128
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
129
+ } while (i & 15); \
130
+ } \
131
+ exit: \
132
+ mmap_unlock(); \
133
+}
134
+
135
+#else
136
+
137
+#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \
138
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
139
+ target_ulong base, uint32_t desc) \
140
+{ \
141
+ g_assert_not_reached(); \
142
+}
143
+
144
+#endif
145
+
146
+#define DO_LDFF1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
147
+ DO_LDFF1_ZPZ(NAME, uint32_t, TYPEI, TYPEM, FN, H1_4)
148
+#define DO_LDFF1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
149
+ DO_LDFF1_ZPZ(NAME, uint64_t, TYPEI, TYPEM, FN, )
150
+
151
+DO_LDFF1_ZPZ_S(sve_ldffbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
152
+DO_LDFF1_ZPZ_S(sve_ldffhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
153
+DO_LDFF1_ZPZ_S(sve_ldffssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
154
+DO_LDFF1_ZPZ_S(sve_ldffbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
155
+DO_LDFF1_ZPZ_S(sve_ldffhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
156
+
157
+DO_LDFF1_ZPZ_S(sve_ldffbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
158
+DO_LDFF1_ZPZ_S(sve_ldffhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
159
+DO_LDFF1_ZPZ_S(sve_ldffssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
160
+DO_LDFF1_ZPZ_S(sve_ldffbss_zss, int32_t, int8_t, cpu_ldub_data_ra)
161
+DO_LDFF1_ZPZ_S(sve_ldffhss_zss, int32_t, int16_t, cpu_lduw_data_ra)
162
+
163
+DO_LDFF1_ZPZ_D(sve_ldffbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
164
+DO_LDFF1_ZPZ_D(sve_ldffhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
165
+DO_LDFF1_ZPZ_D(sve_ldffsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
166
+DO_LDFF1_ZPZ_D(sve_ldffddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
167
+DO_LDFF1_ZPZ_D(sve_ldffbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
168
+DO_LDFF1_ZPZ_D(sve_ldffhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
169
+DO_LDFF1_ZPZ_D(sve_ldffsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra)
170
+
171
+DO_LDFF1_ZPZ_D(sve_ldffbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
172
+DO_LDFF1_ZPZ_D(sve_ldffhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
173
+DO_LDFF1_ZPZ_D(sve_ldffsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
174
+DO_LDFF1_ZPZ_D(sve_ldffddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
175
+DO_LDFF1_ZPZ_D(sve_ldffbds_zss, int32_t, int8_t, cpu_ldub_data_ra)
176
+DO_LDFF1_ZPZ_D(sve_ldffhds_zss, int32_t, int16_t, cpu_lduw_data_ra)
177
+DO_LDFF1_ZPZ_D(sve_ldffsds_zss, int32_t, int32_t, cpu_ldl_data_ra)
178
+
179
+DO_LDFF1_ZPZ_D(sve_ldffbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra)
180
+DO_LDFF1_ZPZ_D(sve_ldffhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
181
+DO_LDFF1_ZPZ_D(sve_ldffsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
182
+DO_LDFF1_ZPZ_D(sve_ldffddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
183
+DO_LDFF1_ZPZ_D(sve_ldffbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
184
+DO_LDFF1_ZPZ_D(sve_ldffhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
185
+DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
186
+
187
/* Stores with a vector index. */
188
189
#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
190
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
191
index XXXXXXX..XXXXXXX 100644
192
--- a/target/arm/translate-sve.c
193
+++ b/target/arm/translate-sve.c
194
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = {
195
{ gen_helper_sve_ldbsu_zss,
196
gen_helper_sve_ldhsu_zss,
197
gen_helper_sve_ldssu_zss, } } },
198
- /* TODO fill in first-fault handlers */
199
+
200
+ { { { gen_helper_sve_ldffbss_zsu,
201
+ gen_helper_sve_ldffhss_zsu,
202
+ NULL, },
203
+ { gen_helper_sve_ldffbsu_zsu,
204
+ gen_helper_sve_ldffhsu_zsu,
205
+ gen_helper_sve_ldffssu_zsu, } },
206
+ { { gen_helper_sve_ldffbss_zss,
207
+ gen_helper_sve_ldffhss_zss,
208
+ NULL, },
209
+ { gen_helper_sve_ldffbsu_zss,
210
+ gen_helper_sve_ldffhsu_zss,
211
+ gen_helper_sve_ldffssu_zss, } } }
212
};
213
214
/* Note that we overload xs=2 to indicate 64-bit offset. */
215
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = {
216
gen_helper_sve_ldhdu_zd,
217
gen_helper_sve_ldsdu_zd,
218
gen_helper_sve_ldddu_zd, } } },
219
- /* TODO fill in first-fault handlers */
220
+
221
+ { { { gen_helper_sve_ldffbds_zsu,
222
+ gen_helper_sve_ldffhds_zsu,
223
+ gen_helper_sve_ldffsds_zsu,
224
+ NULL, },
225
+ { gen_helper_sve_ldffbdu_zsu,
226
+ gen_helper_sve_ldffhdu_zsu,
227
+ gen_helper_sve_ldffsdu_zsu,
228
+ gen_helper_sve_ldffddu_zsu, } },
229
+ { { gen_helper_sve_ldffbds_zss,
230
+ gen_helper_sve_ldffhds_zss,
231
+ gen_helper_sve_ldffsds_zss,
232
+ NULL, },
233
+ { gen_helper_sve_ldffbdu_zss,
234
+ gen_helper_sve_ldffhdu_zss,
235
+ gen_helper_sve_ldffsdu_zss,
236
+ gen_helper_sve_ldffddu_zss, } },
237
+ { { gen_helper_sve_ldffbds_zd,
238
+ gen_helper_sve_ldffhds_zd,
239
+ gen_helper_sve_ldffsds_zd,
240
+ NULL, },
241
+ { gen_helper_sve_ldffbdu_zd,
242
+ gen_helper_sve_ldffhdu_zd,
243
+ gen_helper_sve_ldffsdu_zd,
244
+ gen_helper_sve_ldffddu_zd, } } }
245
};
246
247
static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
248
--
41
--
249
2.17.1
42
2.34.1
250
251
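The control flow of the DO_LDFF1_ZPZ macro in the first-fault gather patch can be sketched without any QEMU infrastructure. The following is a simplified model under stated assumptions: the flat sketch_mem array, sketch_readable() and sketch_ldff_gather() are hypothetical, a plain bool array stands in for the predicate register, and the return value stands in for both record_fault() and a real exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MEM_SIZE 16

/* Hypothetical flat memory; any address >= SKETCH_MEM_SIZE is unreadable. */
const uint8_t sketch_mem[SKETCH_MEM_SIZE] = {
    10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
};

bool sketch_readable(size_t addr)
{
    return addr < SKETCH_MEM_SIZE;
}

/*
 * First-fault gather of n byte elements from base + off[i].
 * Returns n if every element completed, the index of the element where
 * loading stopped if a later active element was unreadable, or -1 if the
 * first active element itself was unreadable (a real fault in hardware).
 * Inactive elements are written as zero; elements from the stopping
 * index onwards are left untouched.
 */
int sketch_ldff_gather(uint8_t *dst, const bool *pred, const size_t *off,
                       size_t base, int n)
{
    bool first = true;

    assert(dst && pred && off);
    for (int i = 0; i < n; i++) {
        uint8_t m = 0;
        if (pred[i]) {
            size_t addr = base + off[i];
            if (!sketch_readable(addr)) {
                return first ? -1 : i;
            }
            m = sketch_mem[addr];
            first = false;
        }
        dst[i] = m;
    }
    return n;
}
```

The key point the model preserves is the `first` flag: only the first active element may fault normally, while a bad address on any later active element ends the load early instead.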
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
parts64_default_nan().
2
3
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-13-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-39-peter.maydell@linaro.org
7
---
7
---
8
target/arm/translate-sve.c | 21 +++++++++++++++++++++
8
target/hppa/fpu_helper.c | 2 ++
9
target/arm/sve.decode | 23 +++++++++++++++++++++++
9
fpu/softfloat-specialize.c.inc | 3 ---
10
2 files changed, 44 insertions(+)
10
2 files changed, 2 insertions(+), 3 deletions(-)
11
11
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
12
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
13
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
14
--- a/target/hppa/fpu_helper.c
15
+++ b/target/arm/translate-sve.c
15
+++ b/target/hppa/fpu_helper.c
16
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
16
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
17
cpu_reg_sp(s, a->rn), fn);
17
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
18
return true;
18
/* For inf * 0 + NaN, return the input NaN */
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
20
+ /* Default NaN: sign bit clear, msb-1 frac bit set */
21
+ set_float_default_nan_pattern(0b00100000, &env->fp_status);
19
}
22
}
20
+
23
21
+/*
24
void cpu_hppa_loaded_fr0(CPUHPPAState *env)
22
+ * Prefetches
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
23
+ */
24
+
25
+static bool trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn)
26
+{
27
+ /* Prefetch is a nop within QEMU. */
28
+ sve_access_check(s);
29
+ return true;
30
+}
31
+
32
+static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
33
+{
34
+ if (a->rm == 31) {
35
+ return false;
36
+ }
37
+ /* Prefetch is a nop within QEMU. */
38
+ sve_access_check(s);
39
+ return true;
40
+}
41
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
42
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/sve.decode
27
--- a/fpu/softfloat-specialize.c.inc
44
+++ b/target/arm/sve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
45
@@ -XXX,XX +XXX,XX @@ LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
46
LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
30
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
47
@rpri_load_msz nreg=0
31
/* Sign bit clear, all frac bits set */
48
32
dnan_pattern = 0b01111111;
49
+# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
33
-#elif defined(TARGET_HPPA)
50
+PRF 1000010 00 -1 ----- 0-- --- ----- 0 ----
34
- /* Sign bit clear, msb-1 frac bit set */
51
+
35
- dnan_pattern = 0b00100000;
52
+# SVE 32-bit gather prefetch (vector plus immediate)
36
#elif defined(TARGET_HEXAGON)
53
+PRF 1000010 -- 00 ----- 111 --- ----- 0 ----
37
/* Sign bit set, all frac bits set. */
54
+
38
dnan_pattern = 0b11111111;
55
+# SVE contiguous prefetch (scalar plus immediate)
56
+PRF 1000010 11 1- ----- 0-- --- ----- 0 ----
57
+
58
+# SVE contiguous prefetch (scalar plus scalar)
59
+PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ----
60
+
61
+### SVE Memory 64-bit Gather Group
62
+
63
+# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
64
+PRF 1100010 00 11 ----- 1-- --- ----- 0 ----
65
+
66
+# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets)
67
+PRF 1100010 00 -1 ----- 0-- --- ----- 0 ----
68
+
69
+# SVE 64-bit gather prefetch (vector plus immediate)
70
+PRF 1100010 -- 00 ----- 111 --- ----- 0 ----
71
+
72
### SVE Memory Store Group
73
74
# SVE store predicate register
75
--
39
--
76
2.17.1
40
2.34.1
77
78
New patch
1
Set the default NaN pattern explicitly for the alpha target.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-40-peter.maydell@linaro.org
6
---
7
target/alpha/cpu.c | 2 ++
8
1 file changed, 2 insertions(+)
9
10
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/target/alpha/cpu.c
13
+++ b/target/alpha/cpu.c
14
@@ -XXX,XX +XXX,XX @@ static void alpha_cpu_initfn(Object *obj)
15
* operand in Fa. That is float_2nan_prop_ba.
16
*/
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
18
+ /* Default NaN: sign bit clear, msb frac bit set */
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
20
#if defined(CONFIG_USER_ONLY)
21
env->flags = ENV_FLAG_PS_USER | ENV_FLAG_FEN;
22
cpu_alpha_store_fpcr(env, (uint64_t)(FPCR_INVD | FPCR_DZED | FPCR_OVFD
23
--
24
2.34.1
1
From: Aaron Lindsay <alindsay@codeaurora.org>
1
Set the default NaN pattern explicitly for the arm target.
2
This includes setting it for the old linux-user nwfpe emulation.
3
For nwfpe, our default doesn't match the real kernel, but we
4
avoid making a behaviour change in this commit.
2
5
3
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
4
Message-id: 1529699547-17044-5-git-send-email-alindsay@codeaurora.org
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241202131347.498124-41-peter.maydell@linaro.org
6
---
9
---
7
target/arm/cpu.h | 1 +
10
linux-user/arm/nwfpe/fpa11.c | 5 +++++
8
target/arm/cpu.c | 21 ++++++++++++++-------
11
target/arm/cpu.c | 2 ++
9
target/arm/kvm32.c | 8 ++++----
12
2 files changed, 7 insertions(+)
10
3 files changed, 19 insertions(+), 11 deletions(-)
11
13
12
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
14
diff --git a/linux-user/arm/nwfpe/fpa11.c b/linux-user/arm/nwfpe/fpa11.c
13
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.h
16
--- a/linux-user/arm/nwfpe/fpa11.c
15
+++ b/target/arm/cpu.h
17
+++ b/linux-user/arm/nwfpe/fpa11.c
16
@@ -XXX,XX +XXX,XX @@ enum arm_features {
18
@@ -XXX,XX +XXX,XX @@ void resetFPA11(void)
17
ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling. */
19
* this late date.
18
ARM_FEATURE_THUMB2EE,
20
*/
19
ARM_FEATURE_V7MP, /* v7 Multiprocessing Extensions */
21
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &fpa11->fp_status);
20
+ ARM_FEATURE_V7VE, /* v7 Virtualization Extensions (non-EL2 parts) */
22
+ /*
21
ARM_FEATURE_V4T,
23
+ * Use the same default NaN value as Arm VFP. This doesn't match
22
ARM_FEATURE_V5,
24
+ * the Linux kernel's nwfpe emulation, which uses an all-1s value.
23
ARM_FEATURE_STRONGARM,
25
+ */
26
+ set_float_default_nan_pattern(0b01000000, &fpa11->fp_status);
27
}
28
29
void SetRoundingMode(const unsigned int opcode)
24
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
30
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
25
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/cpu.c
32
--- a/target/arm/cpu.c
27
+++ b/target/arm/cpu.c
33
+++ b/target/arm/cpu.c
28
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
34
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
29
35
* the pseudocode function the arguments are in the order c, a, b.
30
/* Some features automatically imply others: */
36
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
31
if (arm_feature(env, ARM_FEATURE_V8)) {
37
* and the input NaN if it is signalling
32
- set_feature(env, ARM_FEATURE_V7);
38
+ * * Default NaN has sign bit clear, msb frac bit set
33
+ set_feature(env, ARM_FEATURE_V7VE);
39
*/
34
+ }
40
static void arm_set_default_fp_behaviours(float_status *s)
35
+ if (arm_feature(env, ARM_FEATURE_V7VE)) {
41
{
36
+ /* v7 Virtualization Extensions. In real hardware this implies
42
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
37
+ * EL2 and also the presence of the Security Extensions.
43
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
38
+ * For QEMU, for backwards-compatibility we implement some
44
set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
39
+ * CPUs or CPU configs which have no actual EL2 or EL3 but do
45
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
40
+ * include the various other features that V7VE implies.
46
+ set_float_default_nan_pattern(0b01000000, s);
41
+ * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
47
}
42
+ * Security Extensions is ARM_FEATURE_EL3.
48
43
+ */
49
static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
44
set_feature(env, ARM_FEATURE_ARM_DIV);
45
set_feature(env, ARM_FEATURE_LPAE);
46
+ set_feature(env, ARM_FEATURE_V7);
47
}
48
if (arm_feature(env, ARM_FEATURE_V7)) {
49
set_feature(env, ARM_FEATURE_VAPA);
50
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
51
ARMCPU *cpu = ARM_CPU(obj);
52
53
cpu->dtb_compatible = "arm,cortex-a7";
54
- set_feature(&cpu->env, ARM_FEATURE_V7);
55
+ set_feature(&cpu->env, ARM_FEATURE_V7VE);
56
set_feature(&cpu->env, ARM_FEATURE_VFP4);
57
set_feature(&cpu->env, ARM_FEATURE_NEON);
58
set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
59
- set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
60
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
61
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
62
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
63
- set_feature(&cpu->env, ARM_FEATURE_LPAE);
64
set_feature(&cpu->env, ARM_FEATURE_EL3);
65
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7;
66
cpu->midr = 0x410fc075;
67
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
68
ARMCPU *cpu = ARM_CPU(obj);
69
70
cpu->dtb_compatible = "arm,cortex-a15";
71
- set_feature(&cpu->env, ARM_FEATURE_V7);
72
+ set_feature(&cpu->env, ARM_FEATURE_V7VE);
73
set_feature(&cpu->env, ARM_FEATURE_VFP4);
74
set_feature(&cpu->env, ARM_FEATURE_NEON);
75
set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
76
- set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
77
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
78
set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
79
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
80
- set_feature(&cpu->env, ARM_FEATURE_LPAE);
81
set_feature(&cpu->env, ARM_FEATURE_EL3);
82
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A15;
83
cpu->midr = 0x412fc0f1;
84
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
85
index XXXXXXX..XXXXXXX 100644
86
--- a/target/arm/kvm32.c
87
+++ b/target/arm/kvm32.c
88
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
89
/* Now we've retrieved all the register information we can
90
* set the feature bits based on the ID register fields.
91
* We can assume any KVM supporting CPU is at least a v7
92
- * with VFPv3, LPAE and the generic timers; this in turn implies
93
- * most of the other feature bits, but a few must be tested.
94
+ * with VFPv3, virtualization extensions, and the generic
95
+ * timers; this in turn implies most of the other feature
96
+ * bits, but a few must be tested.
97
*/
98
- set_feature(&features, ARM_FEATURE_V7);
99
+ set_feature(&features, ARM_FEATURE_V7VE);
100
set_feature(&features, ARM_FEATURE_VFP3);
101
- set_feature(&features, ARM_FEATURE_LPAE);
102
set_feature(&features, ARM_FEATURE_GENERIC_TIMER);
103
104
switch (extract32(id_isar0, 24, 4)) {
105
--
50
--
106
2.17.1
51
2.34.1
107
108
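The "some features automatically imply others" chain in the v7VE patch (V8 implies V7VE, which implies ARM_DIV, LPAE and V7) can be illustrated with a small bitmask model. This is a sketch with hypothetical names (sketch_feature, sketch_set, sketch_has, sketch_resolve), not the arm_cpu_realizefn() code, and it covers only the feature bits named in that hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of feature bits, for illustration only. */
enum sketch_feature {
    SKETCH_V7,
    SKETCH_V7VE,
    SKETCH_V8,
    SKETCH_ARM_DIV,
    SKETCH_LPAE,
    SKETCH_FEATURE_COUNT
};

uint64_t sketch_set(uint64_t features, enum sketch_feature bit)
{
    assert(bit < SKETCH_FEATURE_COUNT);
    return features | (1ULL << bit);
}

bool sketch_has(uint64_t features, enum sketch_feature bit)
{
    return (features >> bit) & 1;
}

/* Apply the implications in dependency order so that a single pass
 * is enough: V8 -> V7VE, then V7VE -> ARM_DIV, LPAE, V7. */
uint64_t sketch_resolve(uint64_t features)
{
    if (sketch_has(features, SKETCH_V8)) {
        features = sketch_set(features, SKETCH_V7VE);
    }
    if (sketch_has(features, SKETCH_V7VE)) {
        features = sketch_set(features, SKETCH_ARM_DIV);
        features = sketch_set(features, SKETCH_LPAE);
        features = sketch_set(features, SKETCH_V7);
    }
    return features;
}
```

Because the checks run in dependency order, a CPU init function only needs to set the strongest bit, which is why the cortex_a7 and cortex_a15 hunks can drop their explicit ARM_DIV and LPAE lines.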
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for loongarch.
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-24-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-42-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-sve.h | 13 +++++++++
7
target/loongarch/tcg/fpu_helper.c | 2 ++
9
target/arm/sve_helper.c | 55 ++++++++++++++++++++++++++++++++++++++
8
1 file changed, 2 insertions(+)
10
target/arm/translate-sve.c | 30 +++++++++++++++++++++
11
target/arm/sve.decode | 8 ++++++
12
4 files changed, 106 insertions(+)
13
9
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
10
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
12
--- a/target/loongarch/tcg/fpu_helper.c
17
+++ b/target/arm/helper-sve.h
13
+++ b/target/loongarch/tcg/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
14
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
19
DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
15
*/
20
void, ptr, ptr, ptr, i64, ptr, i32)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
21
17
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
22
+DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG,
18
+ /* Default NaN: sign bit clear, msb frac bit set */
23
+ void, ptr, ptr, ptr, ptr, i32)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
+DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG,
29
+ void, ptr, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
31
+ void, ptr, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
33
+ void, ptr, ptr, ptr, ptr, i32)
34
+
35
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
36
void, ptr, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
38
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/sve_helper.c
41
+++ b/target/arm/sve_helper.c
42
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
43
} while (i != 0); \
44
}
20
}
45
21
46
+/* SVE fp16 conversions always use IEEE mode. Like AdvSIMD, they ignore
22
int ieee_ex_to_loongarch(int xcpt)
47
+ * FZ16. When converting from fp16, this affects flushing input denormals;
48
+ * when converting to fp16, this affects flushing output denormals.
49
+ */
50
+static inline float32 sve_f16_to_f32(float16 f, float_status *fpst)
51
+{
52
+ flag save = get_flush_inputs_to_zero(fpst);
53
+ float32 ret;
54
+
55
+ set_flush_inputs_to_zero(false, fpst);
56
+ ret = float16_to_float32(f, true, fpst);
57
+ set_flush_inputs_to_zero(save, fpst);
58
+ return ret;
59
+}
60
+
61
+static inline float64 sve_f16_to_f64(float16 f, float_status *fpst)
62
+{
63
+ flag save = get_flush_inputs_to_zero(fpst);
64
+ float64 ret;
65
+
66
+ set_flush_inputs_to_zero(false, fpst);
67
+ ret = float16_to_float64(f, true, fpst);
68
+ set_flush_inputs_to_zero(save, fpst);
69
+ return ret;
70
+}
71
+
72
+static inline float16 sve_f32_to_f16(float32 f, float_status *fpst)
73
+{
74
+ flag save = get_flush_to_zero(fpst);
75
+ float16 ret;
76
+
77
+ set_flush_to_zero(false, fpst);
78
+ ret = float32_to_float16(f, true, fpst);
79
+ set_flush_to_zero(save, fpst);
80
+ return ret;
81
+}
82
+
83
+static inline float16 sve_f64_to_f16(float64 f, float_status *fpst)
84
+{
85
+ flag save = get_flush_to_zero(fpst);
86
+ float16 ret;
87
+
88
+ set_flush_to_zero(false, fpst);
89
+ ret = float64_to_float16(f, true, fpst);
90
+ set_flush_to_zero(save, fpst);
91
+ return ret;
92
+}
93
+
94
+DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16)
95
+DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32)
96
+DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16)
97
+DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64)
98
+DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32)
99
+DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64)
100
+
101
DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
102
DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
103
DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
104
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
105
index XXXXXXX..XXXXXXX 100644
106
--- a/target/arm/translate-sve.c
107
+++ b/target/arm/translate-sve.c
108
@@ -XXX,XX +XXX,XX @@ static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg,
109
return true;
110
}
111
112
+static bool trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
113
+{
114
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh);
115
+}
116
+
117
+static bool trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
118
+{
119
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs);
120
+}
121
+
122
+static bool trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
123
+{
124
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh);
125
+}
126
+
127
+static bool trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
128
+{
129
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hd);
130
+}
131
+
132
+static bool trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
133
+{
134
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds);
135
+}
136
+
137
+static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
138
+{
139
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd);
140
+}
141
+
142
static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
143
{
144
return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
145
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
146
index XXXXXXX..XXXXXXX 100644
147
--- a/target/arm/sve.decode
148
+++ b/target/arm/sve.decode
149
@@ -XXX,XX +XXX,XX @@ FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra
150
151
### SVE FP Unary Operations Predicated Group
152
153
+# SVE floating-point convert precision
154
+FCVT_sh 01100101 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0
155
+FCVT_hs 01100101 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0
156
+FCVT_dh 01100101 11 0010 00 101 ... ..... ..... @rd_pg_rn_e0
157
+FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0
158
+FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0
159
+FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0
160
+
161
# SVE integer convert to floating-point
162
SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
163
SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
164
--
23
--
165
2.17.1
24
2.34.1
166
167
1
Set the default NaN pattern explicitly for m68k.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-43-peter.maydell@linaro.org
6
---
7
target/m68k/cpu.c | 2 ++
8
fpu/softfloat-specialize.c.inc | 2 +-
9
2 files changed, 3 insertions(+), 1 deletion(-)
10
11
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/m68k/cpu.c
14
+++ b/target/m68k/cpu.c
15
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
16
* preceding paragraph for nonsignaling NaNs.
17
*/
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
19
+ /* Default NaN: sign bit clear, all frac bits set */
20
+ set_float_default_nan_pattern(0b01111111, &env->fp_status);
21
22
nan = floatx80_default_nan(&env->fp_status);
23
for (i = 0; i < 8; i++) {
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
25
index XXXXXXX..XXXXXXX 100644
26
--- a/fpu/softfloat-specialize.c.inc
27
+++ b/fpu/softfloat-specialize.c.inc
28
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
29
uint8_t dnan_pattern = status->default_nan_pattern;
30
31
if (dnan_pattern == 0) {
32
-#if defined(TARGET_SPARC) || defined(TARGET_M68K)
33
+#if defined(TARGET_SPARC)
34
/* Sign bit clear, all frac bits set */
35
dnan_pattern = 0b01111111;
36
#elif defined(TARGET_HEXAGON)
37
--
38
2.34.1
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
1
Set the default NaN pattern explicitly for MIPS. Note that this
2
is our only target which currently changes the default NaN
3
at runtime (which it was previously doing indirectly when it
4
changed the snan_bit_is_one setting).
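
As a rough illustration of the 8-bit patterns used throughout this series, the sketch below expands a pattern into a 32-bit single-precision NaN. It assumes the encoding implied by the patch comments: bit 7 is the sign, bits 6..0 are the top seven fraction bits, and the remaining fraction bits are copies of bit 0. The function names are hypothetical and are not part of softfloat.

```c
#include <stdint.h>

/*
 * Illustration only: expand an 8-bit default-NaN pattern into an
 * IEEE single-precision bit pattern.
 * Assumed encoding: bit 7 = sign, bits 6..0 = top seven fraction bits,
 * lower fraction bits = copies of bit 0. The exponent is always all ones.
 */
uint32_t example_expand_default_nan32(uint8_t pattern)
{
    uint32_t sign = (uint32_t)(pattern >> 7) << 31;
    uint32_t exponent = 0xffu << 23;
    uint32_t frac_hi = (uint32_t)(pattern & 0x7f) << 16; /* fraction bits 22..16 */
    uint32_t frac_lo = (pattern & 1) ? 0xffffu : 0u;     /* fraction bits 15..0 */

    return sign | exponent | frac_hi | frac_lo;
}

/*
 * Hypothetical helper mirroring the MIPS choice in this patch:
 * nan2008 selects "frac msb set"; legacy mode selects
 * "all frac bits except the msb set".
 */
uint8_t example_mips_default_nan_pattern(int nan2008)
{
    return nan2008 ? 0x40 : 0x3f; /* 0b01000000 : 0b00111111 */
}
```

Under these assumptions the nan2008 pattern expands to 0x7fc00000 and the legacy pattern to 0x7fbfffff, which is why the pattern has to be re-selected whenever the MIPS NaN mode changes at runtime.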
2
5
3
Use error_report() + exit() instead of error_setg(&error_fatal),
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
as suggested by the "qapi/error.h" documentation:
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241202131347.498124-44-peter.maydell@linaro.org
9
---
10
target/mips/fpu_helper.h | 7 +++++++
11
target/mips/msa.c | 3 +++
12
2 files changed, 10 insertions(+)
5
13
6
Please don't error_setg(&error_fatal, ...), use error_report() and
14
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
7
exit(), because that's more obvious.
8
9
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Reviewed-by: Eric Auger <eric.auger@redhat.com>
11
Reviewed-by: Markus Armbruster <armbru@redhat.com>
12
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
13
Message-id: 20180625165749.3910-4-f4bug@amsat.org
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
16
device_tree.c | 23 +++++++++++++----------
17
1 file changed, 13 insertions(+), 10 deletions(-)
18
19
diff --git a/device_tree.c b/device_tree.c
20
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
21
--- a/device_tree.c
16
--- a/target/mips/fpu_helper.h
22
+++ b/device_tree.c
17
+++ b/target/mips/fpu_helper.h
23
@@ -XXX,XX +XXX,XX @@ static void read_fstree(void *fdt, const char *dirname)
18
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
24
const char *parent_node;
19
set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
25
20
nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
26
if (strstr(dirname, root_dir) != dirname) {
21
set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
27
- error_setg(&error_fatal, "%s: %s must be searched within %s",
22
+ /*
28
- __func__, dirname, root_dir);
23
+ * With nan2008, the default NaN value has the sign bit clear and the
29
+ error_report("%s: %s must be searched within %s",
24
+ * frac msb set; with the older mode, the sign bit is clear, and all
30
+ __func__, dirname, root_dir);
25
+ * frac bits except the msb are set.
31
+ exit(1);
26
+ */
32
}
27
+ set_float_default_nan_pattern(nan2008 ? 0b01000000 : 0b00111111,
33
parent_node = &dirname[strlen(SYSFS_DT_BASEDIR)];
28
+ &env->active_fpu.fp_status);
34
29
35
d = opendir(dirname);
30
}
36
if (!d) {
31
37
- error_setg(&error_fatal, "%s cannot open %s", __func__, dirname);
32
diff --git a/target/mips/msa.c b/target/mips/msa.c
38
- return;
33
index XXXXXXX..XXXXXXX 100644
39
+ error_report("%s cannot open %s", __func__, dirname);
34
--- a/target/mips/msa.c
40
+ exit(1);
35
+++ b/target/mips/msa.c
41
}
36
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
42
37
/* Inf * 0 + NaN returns the input NaN */
43
while ((de = readdir(d)) != NULL) {
38
set_float_infzeronan_rule(float_infzeronan_dnan_never,
44
@@ -XXX,XX +XXX,XX @@ static void read_fstree(void *fdt, const char *dirname)
39
&env->active_tc.msa_fp_status);
45
tmpnam = g_strdup_printf("%s/%s", dirname, de->d_name);
40
+ /* Default NaN: sign bit clear, frac msb set */
46
41
+ set_float_default_nan_pattern(0b01000000,
47
if (lstat(tmpnam, &st) < 0) {
42
+ &env->active_tc.msa_fp_status);
48
- error_setg(&error_fatal, "%s cannot lstat %s", __func__, tmpnam);
49
+ error_report("%s cannot lstat %s", __func__, tmpnam);
50
+ exit(1);
51
}
52
53
if (S_ISREG(st.st_mode)) {
54
@@ -XXX,XX +XXX,XX @@ static void read_fstree(void *fdt, const char *dirname)
55
gsize len;
56
57
if (!g_file_get_contents(tmpnam, &val, &len, NULL)) {
58
- error_setg(&error_fatal, "%s not able to extract info from %s",
59
- __func__, tmpnam);
60
+ error_report("%s not able to extract info from %s",
61
+ __func__, tmpnam);
62
+ exit(1);
63
}
64
65
if (strlen(parent_node) > 0) {
66
@@ -XXX,XX +XXX,XX @@ void *load_device_tree_from_sysfs(void)
67
host_fdt = create_device_tree(&host_fdt_size);
68
read_fstree(host_fdt, SYSFS_DT_BASEDIR);
69
if (fdt_check_header(host_fdt)) {
70
- error_setg(&error_fatal,
71
- "%s host device tree extracted into memory is invalid",
72
- __func__);
73
+ error_report("%s host device tree extracted into memory is invalid",
74
+ __func__);
75
+ exit(1);
76
}
77
return host_fdt;
78
}
43
}
79
--
44
--
80
2.17.1
45
2.34.1
81
82
1
Set the default NaN pattern explicitly for openrisc.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-45-peter.maydell@linaro.org
6
---
7
target/openrisc/cpu.c | 2 ++
8
1 file changed, 2 insertions(+)
9
10
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/target/openrisc/cpu.c
13
+++ b/target/openrisc/cpu.c
14
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_reset_hold(Object *obj, ResetType type)
15
*/
16
set_float_2nan_prop_rule(float_2nan_prop_x87, &cpu->env.fp_status);
17
18
+ /* Default NaN: sign bit clear, frac msb set */
19
+ set_float_default_nan_pattern(0b01000000, &cpu->env.fp_status);
20
21
#ifndef CONFIG_USER_ONLY
22
cpu->env.picmr = 0x00000000;
23
--
24
2.34.1
1
From: Alex Bennée <alex.bennee@linaro.org>
1
Set the default NaN pattern explicitly for ppc.
2
2
3
Since kernel commit a86bd139f2 (arm64: arch_timer: Enable CNTVCT_EL0
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
trap..), released in kernel version v4.12, user-space has been able
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
to read these system registers. As we can't use QEMUTimer's in
5
Message-id: 20241202131347.498124-46-peter.maydell@linaro.org
6
linux-user mode we just directly call cpu_get_clock().
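
A minimal sketch of the arithmetic behind this change, assuming the clock source returns nanoseconds and the counter is that value divided by a fixed scale. The constant 16 and all names below are hypothetical stand-ins; the patch itself uses GTIMER_SCALE and NANOSECONDS_PER_SECOND without stating their values here.

```c
#include <stdint.h>

/* Hypothetical stand-ins for NANOSECONDS_PER_SECOND and GTIMER_SCALE. */
#define EXAMPLE_NS_PER_SECOND 1000000000ULL
#define EXAMPLE_GTIMER_SCALE  16ULL /* assumed value, for illustration */

/* Counter value seen by the guest for a given clock reading in ns. */
uint64_t example_counter_from_ns(uint64_t clock_ns)
{
    return clock_ns / EXAMPLE_GTIMER_SCALE;
}

/* Frequency the guest should see, so counter/frequency gives seconds. */
uint64_t example_counter_frequency(void)
{
    return EXAMPLE_NS_PER_SECOND / EXAMPLE_GTIMER_SCALE;
}
```

Deriving both the counter and the reported frequency from the same scale keeps them consistent: one second of clock time always advances the counter by exactly the advertised frequency.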
6
---
7
target/ppc/cpu_init.c | 4 ++++
8
1 file changed, 4 insertions(+)
7
9
8
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
10
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20180625160009.17437-2-alex.bennee@linaro.org
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
target/arm/helper.c | 27 ++++++++++++++++++++++++---
15
1 file changed, 24 insertions(+), 3 deletions(-)
16
17
diff --git a/target/arm/helper.c b/target/arm/helper.c
18
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.c
12
--- a/target/ppc/cpu_init.c
20
+++ b/target/arm/helper.c
13
+++ b/target/ppc/cpu_init.c
21
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
14
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
22
};
15
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
23
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);
24
#else
17
25
-/* In user-mode none of the generic timer registers are accessible,
18
+ /* Default NaN: sign bit clear, set frac msb */
26
- * and their implementation depends on QEMU_CLOCK_VIRTUAL and qdev gpio outputs,
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
27
- * so instead just don't register any of them.
20
+ set_float_default_nan_pattern(0b01000000, &env->vec_status);
28
+
21
+
29
+/* In user-mode most of the generic timer registers are inaccessible
22
for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
30
+ * however modern kernels (4.12+) allow access to cntvct_el0
23
ppc_spr_t *spr = &env->spr_cb[i];
31
*/
32
+
33
+static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
34
+{
35
+ /* Currently we have no support for QEMUTimer in linux-user so we
36
+ * can't call gt_get_countervalue(env), instead we directly
37
+ * call the lower level functions.
38
+ */
39
+ return cpu_get_clock() / GTIMER_SCALE;
40
+}
41
+
42
static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
43
+ { .name = "CNTFRQ_EL0", .state = ARM_CP_STATE_AA64,
44
+ .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 0, .opc2 = 0,
45
+ .type = ARM_CP_CONST, .access = PL0_R /* no PL1_RW in linux-user */,
46
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_cntfrq),
47
+ .resetvalue = NANOSECONDS_PER_SECOND / GTIMER_SCALE,
48
+ },
49
+ { .name = "CNTVCT_EL0", .state = ARM_CP_STATE_AA64,
50
+ .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 0, .opc2 = 2,
51
+ .access = PL0_R, .type = ARM_CP_NO_RAW | ARM_CP_IO,
52
+ .readfn = gt_virt_cnt_read,
53
+ },
54
REGINFO_SENTINEL
55
};
56
24
57
--
25
--
58
2.17.1
26
2.34.1
59
60
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for sh4. Note that sh4
2
is one of the only three targets (the others being HPPA and
3
sometimes MIPS) that has snan_bit_is_one set.
2
4
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-22-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20241202131347.498124-47-peter.maydell@linaro.org
7
---
8
---
8
target/arm/helper-sve.h | 42 +++++++++++++++++++++++++++++++++++++
9
target/sh4/cpu.c | 2 ++
9
target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++++
10
1 file changed, 2 insertions(+)
10
target/arm/translate-sve.c | 43 ++++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 10 +++++++++
12
4 files changed, 138 insertions(+)
13
11
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
12
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
14
--- a/target/sh4/cpu.c
17
+++ b/target/arm/helper-sve.h
15
+++ b/target/sh4/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
16
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
17
set_flush_to_zero(1, &env->fp_status);
20
i64, i64, ptr, ptr, ptr, i32)
18
#endif
21
19
set_default_nan_mode(1, &env->fp_status);
22
+DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG,
20
+ /* sign bit clear, set all frac bits other than msb */
23
+ void, ptr, ptr, ptr, ptr, i32)
21
+ set_float_default_nan_pattern(0b00111111, &env->fp_status);
24
+DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG,
30
+ void, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, i32)
35
+
36
+DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, ptr, i32)
42
+
43
+DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG,
44
+ void, ptr, ptr, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG,
46
+ void, ptr, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG,
48
+ void, ptr, ptr, ptr, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG,
51
+ void, ptr, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG,
53
+ void, ptr, ptr, ptr, ptr, i32)
54
+DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG,
55
+ void, ptr, ptr, ptr, ptr, i32)
56
+
57
+DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG,
58
+ void, ptr, ptr, ptr, ptr, i32)
59
+DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG,
60
+ void, ptr, ptr, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG,
62
+ void, ptr, ptr, ptr, ptr, i32)
63
+
64
DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
65
void, ptr, ptr, ptr, ptr, ptr, i32)
66
DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
67
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
68
index XXXXXXX..XXXXXXX 100644
69
--- a/target/arm/sve_helper.c
70
+++ b/target/arm/sve_helper.c
71
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
72
73
#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0
74
#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0
75
+#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0
76
+#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0
77
#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0
78
#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0
79
#define DO_FCMUO(TYPE, X, Y, ST) \
80
@@ -XXX,XX +XXX,XX @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
81
#undef DO_FPCMP_PPZZ_H
82
#undef DO_FPCMP_PPZZ
83
84
+/* One operand floating-point comparison against zero, controlled
85
+ * by a predicate.
86
+ */
87
+#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \
88
+void HELPER(NAME)(void *vd, void *vn, void *vg, \
89
+ void *status, uint32_t desc) \
90
+{ \
91
+ intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \
92
+ uint64_t *d = vd, *g = vg; \
93
+ do { \
94
+ uint64_t out = 0, pg = g[j]; \
95
+ do { \
96
+ i -= sizeof(TYPE), out <<= sizeof(TYPE); \
97
+ if ((pg >> (i & 63)) & 1) { \
98
+ TYPE nn = *(TYPE *)(vn + H(i)); \
99
+ out |= OP(TYPE, nn, 0, status); \
100
+ } \
101
+ } while (i & 63); \
102
+ d[j--] = out; \
103
+ } while (i > 0); \
104
+}
105
+
106
+#define DO_FPCMP_PPZ0_H(NAME, OP) \
107
+ DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP)
108
+#define DO_FPCMP_PPZ0_S(NAME, OP) \
109
+ DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP)
110
+#define DO_FPCMP_PPZ0_D(NAME, OP) \
111
+ DO_FPCMP_PPZ0(NAME##_d, float64, , OP)
112
+
113
+#define DO_FPCMP_PPZ0_ALL(NAME, OP) \
114
+ DO_FPCMP_PPZ0_H(NAME, OP) \
115
+ DO_FPCMP_PPZ0_S(NAME, OP) \
116
+ DO_FPCMP_PPZ0_D(NAME, OP)
117
+
118
+DO_FPCMP_PPZ0_ALL(sve_fcmge0, DO_FCMGE)
119
+DO_FPCMP_PPZ0_ALL(sve_fcmgt0, DO_FCMGT)
120
+DO_FPCMP_PPZ0_ALL(sve_fcmle0, DO_FCMLE)
121
+DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT)
122
+DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ)
123
+DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)
124
+
125
/*
126
* Load contiguous data, protected by a governing predicate.
127
*/
128
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
129
index XXXXXXX..XXXXXXX 100644
130
--- a/target/arm/translate-sve.c
131
+++ b/target/arm/translate-sve.c
132
@@ -XXX,XX +XXX,XX @@ static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
133
return true;
134
}
22
}
135
23
136
+/*
24
static void superh_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
137
+ *** SVE Floating Point Compare with Zero Group
138
+ */
139
+
140
+static void do_ppz_fp(DisasContext *s, arg_rpr_esz *a,
141
+ gen_helper_gvec_3_ptr *fn)
142
+{
143
+ unsigned vsz = vec_full_reg_size(s);
144
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
145
+
146
+ tcg_gen_gvec_3_ptr(pred_full_reg_offset(s, a->rd),
147
+ vec_full_reg_offset(s, a->rn),
148
+ pred_full_reg_offset(s, a->pg),
149
+ status, vsz, vsz, 0, fn);
150
+ tcg_temp_free_ptr(status);
151
+}
152
+
153
+#define DO_PPZ(NAME, name) \
154
+static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
155
+{ \
156
+ static gen_helper_gvec_3_ptr * const fns[3] = { \
157
+ gen_helper_sve_##name##_h, \
158
+ gen_helper_sve_##name##_s, \
159
+ gen_helper_sve_##name##_d, \
160
+ }; \
161
+ if (a->esz == 0) { \
162
+ return false; \
163
+ } \
164
+ if (sve_access_check(s)) { \
165
+ do_ppz_fp(s, a, fns[a->esz - 1]); \
166
+ } \
167
+ return true; \
168
+}
169
+
170
+DO_PPZ(FCMGE_ppz0, fcmge0)
171
+DO_PPZ(FCMGT_ppz0, fcmgt0)
172
+DO_PPZ(FCMLE_ppz0, fcmle0)
173
+DO_PPZ(FCMLT_ppz0, fcmlt0)
174
+DO_PPZ(FCMEQ_ppz0, fcmeq0)
175
+DO_PPZ(FCMNE_ppz0, fcmne0)
176
+
177
+#undef DO_PPZ
178
+
179
/*
180
*** SVE Floating Point Accumulating Reduction Group
181
*/
182
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
183
index XXXXXXX..XXXXXXX 100644
184
--- a/target/arm/sve.decode
185
+++ b/target/arm/sve.decode
186
@@ -XXX,XX +XXX,XX @@
187
# One register operand, with governing predicate, vector element size
188
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
189
@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz
190
+@pd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 . rd:4 &rpr_esz
191
192
# One register operand, with governing predicate, no vector element size
193
@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0
194
@@ -XXX,XX +XXX,XX @@ FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn
195
FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn
196
FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn
197
198
+### SVE FP Compare with Zero Group
199
+
200
+FCMGE_ppz0 01100101 .. 0100 00 001 ... ..... 0 .... @pd_pg_rn
201
+FCMGT_ppz0 01100101 .. 0100 00 001 ... ..... 1 .... @pd_pg_rn
202
+FCMLT_ppz0 01100101 .. 0100 01 001 ... ..... 0 .... @pd_pg_rn
203
+FCMLE_ppz0 01100101 .. 0100 01 001 ... ..... 1 .... @pd_pg_rn
204
+FCMEQ_ppz0 01100101 .. 0100 10 001 ... ..... 0 .... @pd_pg_rn
205
+FCMNE_ppz0 01100101 .. 0100 11 001 ... ..... 0 .... @pd_pg_rn
206
+
207
### SVE FP Accumulating Reduction Group
208
209
# SVE floating-point serial reduction (predicated)
210
--
25
--
211
2.17.1
26
2.34.1
212
213
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for rx.
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-17-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-48-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-sve.h | 49 ++++++++++++++++++++++++++++++
7
target/rx/cpu.c | 2 ++
9
target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++++++++
8
1 file changed, 2 insertions(+)
10
target/arm/translate-sve.c | 40 ++++++++++++++++++++++++
11
target/arm/sve.decode | 11 +++++++
12
4 files changed, 162 insertions(+)
13
9
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
10
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
12
--- a/target/rx/cpu.c
17
+++ b/target/arm/helper-sve.h
13
+++ b/target/rx/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
14
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
15
* then prefer dest over source", which is float_2nan_prop_s_ab.
20
void, ptr, ptr, ptr, ptr, i32)
16
*/
21
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
22
+DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG,
18
+ /* Default NaN value: sign bit clear, set frac msb */
23
+ void, ptr, ptr, ptr, ptr, ptr, i32)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
+DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG,
30
+ void, ptr, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, ptr, i32)
35
+
36
+DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, ptr, ptr, i32)
42
+
43
+DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG,
44
+ void, ptr, ptr, ptr, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG,
46
+ void, ptr, ptr, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG,
48
+ void, ptr, ptr, ptr, ptr, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG,
51
+ void, ptr, ptr, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG,
53
+ void, ptr, ptr, ptr, ptr, ptr, i32)
54
+DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG,
55
+ void, ptr, ptr, ptr, ptr, ptr, i32)
56
+
57
+DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG,
58
+ void, ptr, ptr, ptr, ptr, ptr, i32)
59
+DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG,
60
+ void, ptr, ptr, ptr, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG,
62
+ void, ptr, ptr, ptr, ptr, ptr, i32)
63
+
64
+DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG,
65
+ void, ptr, ptr, ptr, ptr, ptr, i32)
66
+DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
67
+ void, ptr, ptr, ptr, ptr, ptr, i32)
68
+DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
69
+ void, ptr, ptr, ptr, ptr, ptr, i32)
70
+
71
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
72
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
73
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
74
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
75
index XXXXXXX..XXXXXXX 100644
76
--- a/target/arm/sve_helper.c
77
+++ b/target/arm/sve_helper.c
78
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
79
do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN);
80
}
20
}
81
21
82
+/* Two operand floating-point comparison controlled by a predicate.
22
static ObjectClass *rx_cpu_class_by_name(const char *cpu_model)
83
+ * Unlike the integer version, we are not allowed to optimistically
84
+ * compare operands, since the comparison may have side effects wrt
85
+ * the FPSR.
86
+ */
87
+#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \
88
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
89
+ void *status, uint32_t desc) \
90
+{ \
91
+ intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \
92
+ uint64_t *d = vd, *g = vg; \
93
+ do { \
94
+ uint64_t out = 0, pg = g[j]; \
95
+ do { \
96
+ i -= sizeof(TYPE), out <<= sizeof(TYPE); \
97
+ if (likely((pg >> (i & 63)) & 1)) { \
98
+ TYPE nn = *(TYPE *)(vn + H(i)); \
99
+ TYPE mm = *(TYPE *)(vm + H(i)); \
100
+ out |= OP(TYPE, nn, mm, status); \
101
+ } \
102
+ } while (i & 63); \
103
+ d[j--] = out; \
104
+ } while (i > 0); \
105
+}
106
+
107
+#define DO_FPCMP_PPZZ_H(NAME, OP) \
108
+ DO_FPCMP_PPZZ(NAME##_h, float16, H1_2, OP)
109
+#define DO_FPCMP_PPZZ_S(NAME, OP) \
110
+ DO_FPCMP_PPZZ(NAME##_s, float32, H1_4, OP)
111
+#define DO_FPCMP_PPZZ_D(NAME, OP) \
112
+ DO_FPCMP_PPZZ(NAME##_d, float64, , OP)
113
+
114
+#define DO_FPCMP_PPZZ_ALL(NAME, OP) \
115
+ DO_FPCMP_PPZZ_H(NAME, OP) \
116
+ DO_FPCMP_PPZZ_S(NAME, OP) \
117
+ DO_FPCMP_PPZZ_D(NAME, OP)
118
+
119
+#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0
120
+#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0
121
+#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0
122
+#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0
123
+#define DO_FCMUO(TYPE, X, Y, ST) \
124
+ TYPE##_compare_quiet(X, Y, ST) == float_relation_unordered
125
+#define DO_FACGE(TYPE, X, Y, ST) \
126
+ TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) <= 0
127
+#define DO_FACGT(TYPE, X, Y, ST) \
128
+ TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) < 0
129
+
130
+DO_FPCMP_PPZZ_ALL(sve_fcmge, DO_FCMGE)
131
+DO_FPCMP_PPZZ_ALL(sve_fcmgt, DO_FCMGT)
132
+DO_FPCMP_PPZZ_ALL(sve_fcmeq, DO_FCMEQ)
133
+DO_FPCMP_PPZZ_ALL(sve_fcmne, DO_FCMNE)
134
+DO_FPCMP_PPZZ_ALL(sve_fcmuo, DO_FCMUO)
135
+DO_FPCMP_PPZZ_ALL(sve_facge, DO_FACGE)
136
+DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
137
+
138
+#undef DO_FPCMP_PPZZ_ALL
139
+#undef DO_FPCMP_PPZZ_D
140
+#undef DO_FPCMP_PPZZ_S
141
+#undef DO_FPCMP_PPZZ_H
142
+#undef DO_FPCMP_PPZZ
143
+
144
/*
145
* Load contiguous data, protected by a governing predicate.
146
*/
147
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
148
index XXXXXXX..XXXXXXX 100644
149
--- a/target/arm/translate-sve.c
150
+++ b/target/arm/translate-sve.c
151
@@ -XXX,XX +XXX,XX @@ DO_FP3(FMULX, fmulx)
152
153
#undef DO_FP3
154
155
+static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a,
156
+ gen_helper_gvec_4_ptr *fn)
157
+{
158
+ if (fn == NULL) {
159
+ return false;
160
+ }
161
+ if (sve_access_check(s)) {
162
+ unsigned vsz = vec_full_reg_size(s);
163
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
164
+ tcg_gen_gvec_4_ptr(pred_full_reg_offset(s, a->rd),
165
+ vec_full_reg_offset(s, a->rn),
166
+ vec_full_reg_offset(s, a->rm),
167
+ pred_full_reg_offset(s, a->pg),
168
+ status, vsz, vsz, 0, fn);
169
+ tcg_temp_free_ptr(status);
170
+ }
171
+ return true;
172
+}
173
+
174
+#define DO_FPCMP(NAME, name) \
175
+static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \
176
+ uint32_t insn) \
177
+{ \
178
+ static gen_helper_gvec_4_ptr * const fns[4] = { \
179
+ NULL, gen_helper_sve_##name##_h, \
180
+ gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
181
+ }; \
182
+ return do_fp_cmp(s, a, fns[a->esz]); \
183
+}
184
+
185
+DO_FPCMP(FCMGE, fcmge)
186
+DO_FPCMP(FCMGT, fcmgt)
187
+DO_FPCMP(FCMEQ, fcmeq)
188
+DO_FPCMP(FCMNE, fcmne)
189
+DO_FPCMP(FCMUO, fcmuo)
190
+DO_FPCMP(FACGE, facge)
191
+DO_FPCMP(FACGT, facgt)
192
+
193
+#undef DO_FPCMP
194
+
195
typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
196
197
static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
198
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
199
index XXXXXXX..XXXXXXX 100644
200
--- a/target/arm/sve.decode
201
+++ b/target/arm/sve.decode
202
@@ -XXX,XX +XXX,XX @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn
203
SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn
204
UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn
205
206
+### SVE Floating Point Compare - Vectors Group
207
+
208
+# SVE floating-point compare vectors
209
+FCMGE_ppzz 01100101 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm
210
+FCMGT_ppzz 01100101 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm
211
+FCMEQ_ppzz 01100101 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm
212
+FCMNE_ppzz 01100101 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm
213
+FCMUO_ppzz 01100101 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm
214
+FACGE_ppzz 01100101 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm
215
+FACGT_ppzz 01100101 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm
216
+
217
### SVE Integer Multiply-Add Group
218
219
# SVE integer multiply-add writing addend (predicated)
220
--
23
--
221
2.17.1
24
2.34.1
222
223
diff view generated by jsdifflib
New patch
Set the default NaN pattern explicitly for s390x.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-49-peter.maydell@linaro.org
---
target/s390x/cpu.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
set_float_infzeronan_rule(float_infzeronan_dnan_always,
&env->fpu_status);
+ /* Default NaN value: sign bit clear, frac msb set */
+ set_float_default_nan_pattern(0b01000000, &env->fpu_status);
/* fall through */
case RESET_TYPE_S390_CPU_NORMAL:
env->psw.mask &= ~PSW_MASK_RI;
--
2.34.1
diff view generated by jsdifflib
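As an illustration of the 8-bit value passed to set_float_default_nan_pattern() in this series, the sketch below expands such a pattern into a single-precision bit pattern. It assumes that bit 7 is the sign, bits [6:0] are the top seven fraction bits, and bit 0 is replicated through the remaining fraction bits; that reading is consistent with the "sign bit clear, frac msb set" style comments in these patches but is an assumption here. expand_default_nan32() is a hypothetical helper written for this example, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): expand an 8-bit default NaN
 * pattern into an IEEE 754 binary32 bit pattern.
 * Assumed layout: bit 7 = sign, bits [6:0] = top seven fraction bits,
 * bit 0 replicated into the remaining 16 fraction bits.
 */
static uint32_t expand_default_nan32(uint8_t pattern)
{
    assert(pattern != 0);                  /* 0 is treated as "unset" */

    uint32_t sign = (uint32_t)(pattern >> 7) << 31;
    uint32_t exp = 0xffu << 23;            /* all-ones exponent field */
    uint32_t frac = (uint32_t)(pattern & 0x7f) << 16;

    if (pattern & 1) {
        frac |= 0xffffu;                   /* replicate bit 0 downwards */
    }
    return sign | exp | frac;
}
```

Under these assumptions the pattern 0b01000000 expands to 0x7fc00000, 0b01111111 to 0x7fffffff, and 0b11111111 to 0xffffffff.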
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
1
Set the default NaN pattern explicitly for SPARC, and remove
2
the ifdef from parts64_default_nan.
2
3
3
The load/store API will ease further code movement.
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-50-peter.maydell@linaro.org
7
---
8
target/sparc/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 5 +----
10
2 files changed, 3 insertions(+), 4 deletions(-)
4
11
5
Per the Physical Layer Simplified Spec. "3.6 Bus Protocol":
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
6
7
"In the CMD line the Most Significant Bit (MSB) is transmitted
8
first, the Least Significant Bit (LSB) is the last."
9
10
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
hw/sd/bcm2835_sdhost.c | 13 +++++--------
15
hw/sd/milkymist-memcard.c | 3 +--
16
hw/sd/omap_mmc.c | 6 ++----
17
hw/sd/pl181.c | 11 ++++-------
18
hw/sd/sdhci.c | 15 +++++----------
19
hw/sd/ssi-sd.c | 6 ++----
20
6 files changed, 19 insertions(+), 35 deletions(-)
21
22
diff --git a/hw/sd/bcm2835_sdhost.c b/hw/sd/bcm2835_sdhost.c
23
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
24
--- a/hw/sd/bcm2835_sdhost.c
14
--- a/target/sparc/cpu.c
25
+++ b/hw/sd/bcm2835_sdhost.c
15
+++ b/target/sparc/cpu.c
26
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_send_command(BCM2835SDHostState *s)
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
27
goto error;
17
set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
28
}
18
/* For inf * 0 + NaN, return the input NaN */
29
if (!(s->cmd & SDCMD_NO_RESPONSE)) {
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
30
-#define RWORD(n) (((uint32_t)rsp[n] << 24) | (rsp[n + 1] << 16) \
20
+ /* Default NaN value: sign bit clear, all frac bits set */
31
- | (rsp[n + 2] << 8) | rsp[n + 3])
21
+ set_float_default_nan_pattern(0b01111111, &env->fp_status);
32
if (rlen == 0 || (rlen == 4 && (s->cmd & SDCMD_LONG_RESPONSE))) {
22
33
goto error;
23
cpu_exec_realizefn(cs, &local_err);
34
}
24
if (local_err != NULL) {
35
@@ -XXX,XX +XXX,XX @@ static void bcm2835_sdhost_send_command(BCM2835SDHostState *s)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
36
goto error;
37
}
38
if (rlen == 4) {
39
- s->rsp[0] = RWORD(0);
40
+ s->rsp[0] = ldl_be_p(&rsp[0]);
41
s->rsp[1] = s->rsp[2] = s->rsp[3] = 0;
42
} else {
43
- s->rsp[0] = RWORD(12);
44
- s->rsp[1] = RWORD(8);
45
- s->rsp[2] = RWORD(4);
46
- s->rsp[3] = RWORD(0);
47
+ s->rsp[0] = ldl_be_p(&rsp[12]);
48
+ s->rsp[1] = ldl_be_p(&rsp[8]);
49
+ s->rsp[2] = ldl_be_p(&rsp[4]);
50
+ s->rsp[3] = ldl_be_p(&rsp[0]);
51
}
52
-#undef RWORD
53
}
54
/* We never really delay commands, so if this was a 'busywait' command
55
* then we've completed it now and can raise the interrupt.
56
diff --git a/hw/sd/milkymist-memcard.c b/hw/sd/milkymist-memcard.c
57
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
58
--- a/hw/sd/milkymist-memcard.c
27
--- a/fpu/softfloat-specialize.c.inc
59
+++ b/hw/sd/milkymist-memcard.c
28
+++ b/fpu/softfloat-specialize.c.inc
60
@@ -XXX,XX +XXX,XX @@ static void memcard_sd_command(MilkymistMemcardState *s)
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
61
SDRequest req;
30
uint8_t dnan_pattern = status->default_nan_pattern;
62
31
63
req.cmd = s->command[0] & 0x3f;
32
if (dnan_pattern == 0) {
64
- req.arg = (s->command[1] << 24) | (s->command[2] << 16)
33
-#if defined(TARGET_SPARC)
65
- | (s->command[3] << 8) | s->command[4];
34
- /* Sign bit clear, all frac bits set */
66
+ req.arg = ldl_be_p(s->command + 1);
35
- dnan_pattern = 0b01111111;
67
req.crc = s->command[5];
36
-#elif defined(TARGET_HEXAGON)
68
37
+#if defined(TARGET_HEXAGON)
69
s->response[0] = req.cmd;
38
/* Sign bit set, all frac bits set. */
70
diff --git a/hw/sd/omap_mmc.c b/hw/sd/omap_mmc.c
39
dnan_pattern = 0b11111111;
71
index XXXXXXX..XXXXXXX 100644
40
#else
72
--- a/hw/sd/omap_mmc.c
73
+++ b/hw/sd/omap_mmc.c
74
@@ -XXX,XX +XXX,XX @@ static void omap_mmc_command(struct omap_mmc_s *host, int cmd, int dir,
75
CID_CSD_OVERWRITE;
76
if (host->sdio & (1 << 13))
77
mask |= AKE_SEQ_ERROR;
78
- rspstatus = (response[0] << 24) | (response[1] << 16) |
79
- (response[2] << 8) | (response[3] << 0);
80
+ rspstatus = ldl_be_p(response);
81
break;
82
83
case sd_r2:
84
@@ -XXX,XX +XXX,XX @@ static void omap_mmc_command(struct omap_mmc_s *host, int cmd, int dir,
85
}
86
rsplen = 4;
87
88
- rspstatus = (response[0] << 24) | (response[1] << 16) |
89
- (response[2] << 8) | (response[3] << 0);
90
+ rspstatus = ldl_be_p(response);
91
if (rspstatus & 0x80000000)
92
host->status &= 0xe000;
93
else
94
diff --git a/hw/sd/pl181.c b/hw/sd/pl181.c
95
index XXXXXXX..XXXXXXX 100644
96
--- a/hw/sd/pl181.c
97
+++ b/hw/sd/pl181.c
98
@@ -XXX,XX +XXX,XX @@ static void pl181_send_command(PL181State *s)
99
if (rlen < 0)
100
goto error;
101
if (s->cmd & PL181_CMD_RESPONSE) {
102
-#define RWORD(n) (((uint32_t)response[n] << 24) | (response[n + 1] << 16) \
103
- | (response[n + 2] << 8) | response[n + 3])
104
if (rlen == 0 || (rlen == 4 && (s->cmd & PL181_CMD_LONGRESP)))
105
goto error;
106
if (rlen != 4 && rlen != 16)
107
goto error;
108
- s->response[0] = RWORD(0);
109
+ s->response[0] = ldl_be_p(&response[0]);
110
if (rlen == 4) {
111
s->response[1] = s->response[2] = s->response[3] = 0;
112
} else {
113
- s->response[1] = RWORD(4);
114
- s->response[2] = RWORD(8);
115
- s->response[3] = RWORD(12) & ~1;
116
+ s->response[1] = ldl_be_p(&response[4]);
117
+ s->response[2] = ldl_be_p(&response[8]);
118
+ s->response[3] = ldl_be_p(&response[12]) & ~1;
119
}
120
DPRINTF("Response received\n");
121
s->status |= PL181_STATUS_CMDRESPEND;
122
-#undef RWORD
123
} else {
124
DPRINTF("Command sent\n");
125
s->status |= PL181_STATUS_CMDSENT;
126
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
127
index XXXXXXX..XXXXXXX 100644
128
--- a/hw/sd/sdhci.c
129
+++ b/hw/sd/sdhci.c
130
@@ -XXX,XX +XXX,XX @@ static void sdhci_send_command(SDHCIState *s)
131
132
if (s->cmdreg & SDHC_CMD_RESPONSE) {
133
if (rlen == 4) {
134
- s->rspreg[0] = (response[0] << 24) | (response[1] << 16) |
135
- (response[2] << 8) | response[3];
136
+ s->rspreg[0] = ldl_be_p(response);
137
s->rspreg[1] = s->rspreg[2] = s->rspreg[3] = 0;
138
trace_sdhci_response4(s->rspreg[0]);
139
} else if (rlen == 16) {
140
- s->rspreg[0] = (response[11] << 24) | (response[12] << 16) |
141
- (response[13] << 8) | response[14];
142
- s->rspreg[1] = (response[7] << 24) | (response[8] << 16) |
143
- (response[9] << 8) | response[10];
144
- s->rspreg[2] = (response[3] << 24) | (response[4] << 16) |
145
- (response[5] << 8) | response[6];
146
+ s->rspreg[0] = ldl_be_p(&response[11]);
147
+ s->rspreg[1] = ldl_be_p(&response[7]);
148
+ s->rspreg[2] = ldl_be_p(&response[3]);
149
s->rspreg[3] = (response[0] << 16) | (response[1] << 8) |
150
response[2];
151
trace_sdhci_response16(s->rspreg[3], s->rspreg[2],
152
@@ -XXX,XX +XXX,XX @@ static void sdhci_end_transfer(SDHCIState *s)
153
trace_sdhci_end_transfer(request.cmd, request.arg);
154
sdbus_do_command(&s->sdbus, &request, response);
155
/* Auto CMD12 response goes to the upper Response register */
156
- s->rspreg[3] = (response[0] << 24) | (response[1] << 16) |
157
- (response[2] << 8) | response[3];
158
+ s->rspreg[3] = ldl_be_p(response);
159
}
160
161
s->prnsts &= ~(SDHC_DOING_READ | SDHC_DOING_WRITE |
162
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
163
index XXXXXXX..XXXXXXX 100644
164
--- a/hw/sd/ssi-sd.c
165
+++ b/hw/sd/ssi-sd.c
166
@@ -XXX,XX +XXX,XX @@ static uint32_t ssi_sd_transfer(SSISlave *dev, uint32_t val)
167
uint8_t longresp[16];
168
/* FIXME: Check CRC. */
169
request.cmd = s->cmd;
170
- request.arg = (s->cmdarg[0] << 24) | (s->cmdarg[1] << 16)
171
- | (s->cmdarg[2] << 8) | s->cmdarg[3];
172
+ request.arg = ldl_be_p(s->cmdarg);
173
DPRINTF("CMD%d arg 0x%08x\n", s->cmd, request.arg);
174
s->arglen = sdbus_do_command(&s->sdbus, &request, longresp);
175
if (s->arglen <= 0) {
176
@@ -XXX,XX +XXX,XX @@ static uint32_t ssi_sd_transfer(SSISlave *dev, uint32_t val)
177
/* CMD13 returns a 2-byte statuse work. Other commands
178
only return the first byte. */
179
s->arglen = (s->cmd == 13) ? 2 : 1;
180
- cardstatus = (longresp[0] << 24) | (longresp[1] << 16)
181
- | (longresp[2] << 8) | longresp[3];
182
+ cardstatus = ldl_be_p(longresp);
183
status = 0;
184
if (((cardstatus >> 9) & 0xf) < 4)
185
status |= SSI_SDR_IDLE;
186
--
41
--
187
2.17.1
42
2.34.1
188
189
diff view generated by jsdifflib
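The SD conversion above replaces open-coded shift-and-or sequences with ldl_be_p(). A minimal sketch of the equivalence follows, using a stand-in load_be32() so that the example is self-contained rather than depending on QEMU headers; both function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * Stand-in for a big-endian 32-bit load (hypothetical helper for this
 * example): the byte at the lowest address is the most significant.
 */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Open-coded form in the style the patch removes, kept for comparison. */
static uint32_t shift_or_be32(const uint8_t *rsp, int n)
{
    return ((uint32_t)rsp[n] << 24) | (rsp[n + 1] << 16)
         | (rsp[n + 2] << 8) | rsp[n + 3];
}
```

Both forms read the same four bytes in the same order, so load_be32(&rsp[n]) and shift_or_be32(rsp, n) return the same value for any valid n.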
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for xtensa.
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-7-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-51-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-sve.h | 77 +++++++++++++++++++++++++++++++++
7
target/xtensa/cpu.c | 2 ++
9
target/arm/sve_helper.c | 89 ++++++++++++++++++++++++++++++++++++++
8
1 file changed, 2 insertions(+)
10
target/arm/translate-sve.c | 46 ++++++++++++++++++++
11
target/arm/sve.decode | 17 ++++++++
12
4 files changed, 229 insertions(+)
13
9
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
10
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
12
--- a/target/xtensa/cpu.c
17
+++ b/target/arm/helper-sve.h
13
+++ b/target/xtensa/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
14
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
15
/* For inf * 0 + NaN, return the input NaN */
20
void, ptr, ptr, ptr, ptr, i32)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
21
17
set_no_signaling_nans(!dfpu, &env->fp_status);
22
+DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
18
+ /* Default NaN value: sign bit clear, set frac msb */
23
+ void, ptr, ptr, ptr, ptr, ptr, i32)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
+DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
20
xtensa_use_first_nan(env, !dfpu);
25
+ void, ptr, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG,
30
+ void, ptr, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, ptr, i32)
35
+
36
+DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, ptr, ptr, i32)
42
+
43
+DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG,
44
+ void, ptr, ptr, ptr, ptr, ptr, i32)
45
+DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG,
46
+ void, ptr, ptr, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG,
48
+ void, ptr, ptr, ptr, ptr, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG,
51
+ void, ptr, ptr, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG,
53
+ void, ptr, ptr, ptr, ptr, ptr, i32)
54
+DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG,
55
+ void, ptr, ptr, ptr, ptr, ptr, i32)
56
+
57
+DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG,
58
+ void, ptr, ptr, ptr, ptr, ptr, i32)
59
+DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG,
60
+ void, ptr, ptr, ptr, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG,
62
+ void, ptr, ptr, ptr, ptr, ptr, i32)
63
+
64
+DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG,
65
+ void, ptr, ptr, ptr, ptr, ptr, i32)
66
+DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG,
67
+ void, ptr, ptr, ptr, ptr, ptr, i32)
68
+DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG,
69
+ void, ptr, ptr, ptr, ptr, ptr, i32)
70
+
71
+DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG,
72
+ void, ptr, ptr, ptr, ptr, ptr, i32)
73
+DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG,
74
+ void, ptr, ptr, ptr, ptr, ptr, i32)
75
+DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG,
76
+ void, ptr, ptr, ptr, ptr, ptr, i32)
77
+
78
+DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG,
79
+ void, ptr, ptr, ptr, ptr, ptr, i32)
80
+DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG,
81
+ void, ptr, ptr, ptr, ptr, ptr, i32)
82
+DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG,
83
+ void, ptr, ptr, ptr, ptr, ptr, i32)
84
+
85
+DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG,
86
+ void, ptr, ptr, ptr, ptr, ptr, i32)
87
+DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG,
88
+ void, ptr, ptr, ptr, ptr, ptr, i32)
89
+DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG,
90
+ void, ptr, ptr, ptr, ptr, ptr, i32)
91
+
92
+DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG,
93
+ void, ptr, ptr, ptr, ptr, ptr, i32)
94
+DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
95
+ void, ptr, ptr, ptr, ptr, ptr, i32)
96
+DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
97
+ void, ptr, ptr, ptr, ptr, ptr, i32)
98
+
99
DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
100
void, ptr, ptr, ptr, ptr, i32)
101
DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
102
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
103
index XXXXXXX..XXXXXXX 100644
104
--- a/target/arm/sve_helper.c
105
+++ b/target/arm/sve_helper.c
106
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
107
return predtest_ones(d, oprsz, esz_mask);
108
}
21
}
109
22
110
+/* Fully general three-operand expander, controlled by a predicate,
111
+ * With the extra float_status parameter.
112
+ */
113
+#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \
114
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
115
+ void *status, uint32_t desc) \
116
+{ \
117
+ intptr_t i = simd_oprsz(desc); \
118
+ uint64_t *g = vg; \
119
+ do { \
120
+ uint64_t pg = g[(i - 1) >> 6]; \
121
+ do { \
122
+ i -= sizeof(TYPE); \
123
+ if (likely((pg >> (i & 63)) & 1)) { \
124
+ TYPE nn = *(TYPE *)(vn + H(i)); \
125
+ TYPE mm = *(TYPE *)(vm + H(i)); \
126
+ *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \
127
+ } \
128
+ } while (i & 63); \
129
+ } while (i != 0); \
130
+}
131
+
132
+DO_ZPZZ_FP(sve_fadd_h, uint16_t, H1_2, float16_add)
133
+DO_ZPZZ_FP(sve_fadd_s, uint32_t, H1_4, float32_add)
134
+DO_ZPZZ_FP(sve_fadd_d, uint64_t, , float64_add)
135
+
136
+DO_ZPZZ_FP(sve_fsub_h, uint16_t, H1_2, float16_sub)
137
+DO_ZPZZ_FP(sve_fsub_s, uint32_t, H1_4, float32_sub)
138
+DO_ZPZZ_FP(sve_fsub_d, uint64_t, , float64_sub)
139
+
140
+DO_ZPZZ_FP(sve_fmul_h, uint16_t, H1_2, float16_mul)
141
+DO_ZPZZ_FP(sve_fmul_s, uint32_t, H1_4, float32_mul)
142
+DO_ZPZZ_FP(sve_fmul_d, uint64_t, , float64_mul)
143
+
144
+DO_ZPZZ_FP(sve_fdiv_h, uint16_t, H1_2, float16_div)
145
+DO_ZPZZ_FP(sve_fdiv_s, uint32_t, H1_4, float32_div)
146
+DO_ZPZZ_FP(sve_fdiv_d, uint64_t, , float64_div)
147
+
148
+DO_ZPZZ_FP(sve_fmin_h, uint16_t, H1_2, float16_min)
149
+DO_ZPZZ_FP(sve_fmin_s, uint32_t, H1_4, float32_min)
150
+DO_ZPZZ_FP(sve_fmin_d, uint64_t, , float64_min)
151
+
152
+DO_ZPZZ_FP(sve_fmax_h, uint16_t, H1_2, float16_max)
153
+DO_ZPZZ_FP(sve_fmax_s, uint32_t, H1_4, float32_max)
154
+DO_ZPZZ_FP(sve_fmax_d, uint64_t, , float64_max)
155
+
156
+DO_ZPZZ_FP(sve_fminnum_h, uint16_t, H1_2, float16_minnum)
157
+DO_ZPZZ_FP(sve_fminnum_s, uint32_t, H1_4, float32_minnum)
158
+DO_ZPZZ_FP(sve_fminnum_d, uint64_t, , float64_minnum)
159
+
160
+DO_ZPZZ_FP(sve_fmaxnum_h, uint16_t, H1_2, float16_maxnum)
161
+DO_ZPZZ_FP(sve_fmaxnum_s, uint32_t, H1_4, float32_maxnum)
162
+DO_ZPZZ_FP(sve_fmaxnum_d, uint64_t, , float64_maxnum)
163
+
164
+static inline float16 abd_h(float16 a, float16 b, float_status *s)
165
+{
166
+ return float16_abs(float16_sub(a, b, s));
167
+}
168
+
169
+static inline float32 abd_s(float32 a, float32 b, float_status *s)
170
+{
171
+ return float32_abs(float32_sub(a, b, s));
172
+}
173
+
174
+static inline float64 abd_d(float64 a, float64 b, float_status *s)
175
+{
176
+ return float64_abs(float64_sub(a, b, s));
177
+}
178
+
179
+DO_ZPZZ_FP(sve_fabd_h, uint16_t, H1_2, abd_h)
180
+DO_ZPZZ_FP(sve_fabd_s, uint32_t, H1_4, abd_s)
181
+DO_ZPZZ_FP(sve_fabd_d, uint64_t, , abd_d)
182
+
183
+static inline float64 scalbn_d(float64 a, int64_t b, float_status *s)
184
+{
185
+ int b_int = MIN(MAX(b, INT_MIN), INT_MAX);
186
+ return float64_scalbn(a, b_int, s);
187
+}
188
+
189
+DO_ZPZZ_FP(sve_fscalbn_h, int16_t, H1_2, float16_scalbn)
190
+DO_ZPZZ_FP(sve_fscalbn_s, int32_t, H1_4, float32_scalbn)
191
+DO_ZPZZ_FP(sve_fscalbn_d, int64_t, , scalbn_d)
192
+
193
+DO_ZPZZ_FP(sve_fmulx_h, uint16_t, H1_2, helper_advsimd_mulxh)
194
+DO_ZPZZ_FP(sve_fmulx_s, uint32_t, H1_4, helper_vfp_mulxs)
195
+DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd)
196
+
197
+#undef DO_ZPZZ_FP
198
+
199
/* Fully general two-operand expander, controlled by a predicate,
200
* With the extra float_status parameter.
201
*/
202
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
203
index XXXXXXX..XXXXXXX 100644
204
--- a/target/arm/translate-sve.c
205
+++ b/target/arm/translate-sve.c
206
@@ -XXX,XX +XXX,XX @@ DO_FP3(FRSQRTS, rsqrts)
207
208
#undef DO_FP3
209
210
+/*
211
+ *** SVE Floating Point Arithmetic - Predicated Group
212
+ */
213
+
214
+static bool do_zpzz_fp(DisasContext *s, arg_rprr_esz *a,
215
+ gen_helper_gvec_4_ptr *fn)
216
+{
217
+ if (fn == NULL) {
218
+ return false;
219
+ }
220
+ if (sve_access_check(s)) {
221
+ unsigned vsz = vec_full_reg_size(s);
222
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
223
+ tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
224
+ vec_full_reg_offset(s, a->rn),
225
+ vec_full_reg_offset(s, a->rm),
226
+ pred_full_reg_offset(s, a->pg),
227
+ status, vsz, vsz, 0, fn);
228
+ tcg_temp_free_ptr(status);
229
+ }
230
+ return true;
231
+}
232
+
233
+#define DO_FP3(NAME, name) \
234
+static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \
235
+{ \
236
+ static gen_helper_gvec_4_ptr * const fns[4] = { \
237
+ NULL, gen_helper_sve_##name##_h, \
238
+ gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
239
+ }; \
240
+ return do_zpzz_fp(s, a, fns[a->esz]); \
241
+}
242
+
243
+DO_FP3(FADD_zpzz, fadd)
244
+DO_FP3(FSUB_zpzz, fsub)
245
+DO_FP3(FMUL_zpzz, fmul)
246
+DO_FP3(FMIN_zpzz, fmin)
247
+DO_FP3(FMAX_zpzz, fmax)
248
+DO_FP3(FMINNM_zpzz, fminnum)
249
+DO_FP3(FMAXNM_zpzz, fmaxnum)
250
+DO_FP3(FABD, fabd)
251
+DO_FP3(FSCALE, fscalbn)
252
+DO_FP3(FDIV, fdiv)
253
+DO_FP3(FMULX, fmulx)
254
+
255
+#undef DO_FP3
256
257
/*
258
*** SVE Floating Point Unary Operations Predicated Group
259
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
260
index XXXXXXX..XXXXXXX 100644
261
--- a/target/arm/sve.decode
262
+++ b/target/arm/sve.decode
263
@@ -XXX,XX +XXX,XX @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm
264
FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm
265
FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm
266
267
+### SVE FP Arithmetic Predicated Group
268
+
269
+# SVE floating-point arithmetic (predicated)
270
+FADD_zpzz 01100101 .. 00 0000 100 ... ..... ..... @rdn_pg_rm
271
+FSUB_zpzz 01100101 .. 00 0001 100 ... ..... ..... @rdn_pg_rm
272
+FMUL_zpzz 01100101 .. 00 0010 100 ... ..... ..... @rdn_pg_rm
273
+FSUB_zpzz 01100101 .. 00 0011 100 ... ..... ..... @rdm_pg_rn # FSUBR
274
+FMAXNM_zpzz 01100101 .. 00 0100 100 ... ..... ..... @rdn_pg_rm
275
+FMINNM_zpzz 01100101 .. 00 0101 100 ... ..... ..... @rdn_pg_rm
276
+FMAX_zpzz 01100101 .. 00 0110 100 ... ..... ..... @rdn_pg_rm
277
+FMIN_zpzz 01100101 .. 00 0111 100 ... ..... ..... @rdn_pg_rm
278
+FABD 01100101 .. 00 1000 100 ... ..... ..... @rdn_pg_rm
279
+FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm
280
+FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
281
+FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR
282
+FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm
283
+
284
### SVE FP Unary Operations Predicated Group
285
286
# SVE integer convert to floating-point
287
--
23
--
288
2.17.1
24
2.34.1
289
290
diff view generated by jsdifflib
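The DO_ZPZZ_FP expander above applies an operation only to elements whose governing predicate bit is set. A simplified sketch of that idea for 32-bit elements follows. It assumes one predicate bit per vector byte, with the bit at an element's first byte controlling that element, a little-endian host (so the H macros are omitted), and plain C float addition in place of the softfloat call with its float_status argument. predicated_fadd32() is a hypothetical name used only here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch: d[i] = n[i] + m[i] for each 32-bit element whose
 * predicate bit is set. oprsz is the vector size in bytes.
 */
static void predicated_fadd32(void *vd, const void *vn, const void *vm,
                              const uint64_t *pg, size_t oprsz)
{
    for (size_t i = 0; i < oprsz; i += sizeof(float)) {
        /* One predicate bit per byte; 64 bytes per predicate word. */
        if ((pg[i >> 6] >> (i & 63)) & 1) {
            float n, m, d;

            memcpy(&n, (const char *)vn + i, sizeof(n));
            memcpy(&m, (const char *)vm + i, sizeof(m));
            d = n + m;
            memcpy((char *)vd + i, &d, sizeof(d));
        }
        /* Inactive elements of vd are left unchanged. */
    }
}
```

The macro in the patch walks the vector from the top down and reloads the predicate word every 64 bytes; the forward loop here is chosen for readability and visits the same elements.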
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for hexagon.
2
Remove the ifdef from parts64_default_nan(); the only
3
remaining unconverted targets all use the default case.
2
4
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-5-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20241202131347.498124-52-peter.maydell@linaro.org
7
---
8
---
8
target/arm/translate-sve.c | 52 ++++++++++++++++++++++++++++++++++++++
9
target/hexagon/cpu.c | 2 ++
9
target/arm/sve.decode | 9 +++++++
10
fpu/softfloat-specialize.c.inc | 5 -----
10
2 files changed, 61 insertions(+)
11
2 files changed, 2 insertions(+), 5 deletions(-)
11
12
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
13
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
15
--- a/target/hexagon/cpu.c
15
+++ b/target/arm/translate-sve.c
16
+++ b/target/hexagon/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
17
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_reset_hold(Object *obj, ResetType type)
17
return true;
18
19
set_default_nan_mode(1, &env->fp_status);
20
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
21
+ /* Default NaN value: sign bit set, all frac bits set */
22
+ set_float_default_nan_pattern(0b11111111, &env->fp_status);
18
}
23
}
19
24
20
+static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
25
static void hexagon_cpu_disas_set_info(CPUState *s, disassemble_info *info)
21
+{
26
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
22
+ static gen_helper_gvec_mem * const fns[4] = {
23
+ gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r,
24
+ gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r,
25
+ };
26
+ unsigned vsz = vec_full_reg_size(s);
27
+ TCGv_ptr t_pg;
28
+ TCGv_i32 desc;
29
+
30
+ /* Load the first quadword using the normal predicated load helpers. */
31
+ desc = tcg_const_i32(simd_desc(16, 16, zt));
32
+ t_pg = tcg_temp_new_ptr();
33
+
34
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
35
+ fns[msz](cpu_env, t_pg, addr, desc);
36
+
37
+ tcg_temp_free_ptr(t_pg);
38
+ tcg_temp_free_i32(desc);
39
+
40
+ /* Replicate that first quadword. */
41
+ if (vsz > 16) {
42
+ unsigned dofs = vec_full_reg_offset(s, zt);
43
+ tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16);
44
+ }
45
+}
46
+
47
+static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
48
+{
49
+ if (a->rm == 31) {
50
+ return false;
51
+ }
52
+ if (sve_access_check(s)) {
53
+ int msz = dtype_msz(a->dtype);
54
+ TCGv_i64 addr = new_tmp_a64(s);
55
+ tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz);
56
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
57
+ do_ldrq(s, a->rd, a->pg, addr, msz);
58
+ }
59
+ return true;
60
+}
61
+
62
+static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
63
+{
64
+ if (sve_access_check(s)) {
65
+ TCGv_i64 addr = new_tmp_a64(s);
66
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16);
67
+ do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype));
68
+ }
69
+ return true;
70
+}
71
+
72
static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
73
int msz, int esz, int nreg)
74
{
75
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
76
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
77
--- a/target/arm/sve.decode
28
--- a/fpu/softfloat-specialize.c.inc
78
+++ b/target/arm/sve.decode
29
+++ b/fpu/softfloat-specialize.c.inc
79
@@ -XXX,XX +XXX,XX @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz
30
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
80
# LD2B, LD2H, LD2W, LD2D; etc.
31
uint8_t dnan_pattern = status->default_nan_pattern;
81
LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz
32
82
33
if (dnan_pattern == 0) {
83
+# SVE load and broadcast quadword (scalar plus scalar)
34
-#if defined(TARGET_HEXAGON)
84
+LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \
35
- /* Sign bit set, all frac bits set. */
85
+ @rprr_load_msz nreg=0
36
- dnan_pattern = 0b11111111;
86
+
37
-#else
87
+# SVE load and broadcast quadword (scalar plus immediate)
38
/*
88
+# LD1RQB, LD1RQH, LD1RQS, LD1RQD
39
* This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
89
+LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
40
* S390, SH4, TriCore, and Xtensa. Our other supported targets
90
+ @rpri_load_msz nreg=0
41
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
91
+
42
/* sign bit clear, set frac msb */
92
### SVE Memory Store Group
43
dnan_pattern = 0b01000000;
93
44
}
94
# SVE contiguous store (scalar plus immediate)
45
-#endif
46
}
47
assert(dnan_pattern != 0);
48
95
--
49
--
96
2.17.1
50
2.34.1
97
98
diff view generated by jsdifflib
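do_ldrq() above loads one 16-byte quadword and then copies it across the rest of the vector register. A sketch of the replication step on a plain byte buffer follows, assuming the vector size is a multiple of 16 bytes. replicate_quadword() is a hypothetical name; this shows the effect only and is not how tcg_gen_gvec_dup_mem() is implemented.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copy the first 16 bytes of vec into every
 * following 16-byte slot, up to vsz bytes in total.
 */
static void replicate_quadword(unsigned char *vec, size_t vsz)
{
    for (size_t ofs = 16; ofs + 16 <= vsz; ofs += 16) {
        memcpy(vec + ofs, vec, 16);
    }
}
```

With vsz equal to 16 the loop body never runs, which mirrors the "if (vsz > 16)" guard in the patch.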
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
1
Set the default NaN pattern explicitly for riscv.
2
2
3
The qdev_get_gpio_in() function accepts an int as its second parameter.
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-53-peter.maydell@linaro.org
6
---
7
target/riscv/cpu.c | 2 ++
8
1 file changed, 2 insertions(+)
4
9
5
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
10
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
6
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
hw/arm/fsl-imx7.c | 6 +++---
11
1 file changed, 3 insertions(+), 3 deletions(-)
12
13
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/hw/arm/fsl-imx7.c
12
--- a/target/riscv/cpu.c
16
+++ b/hw/arm/fsl-imx7.c
13
+++ b/target/riscv/cpu.c
17
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
14
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type)
18
FSL_IMX7_ECSPI4_ADDR,
15
cs->exception_index = RISCV_EXCP_NONE;
19
};
16
env->load_res = -1;
20
17
set_default_nan_mode(1, &env->fp_status);
21
- static const hwaddr FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
18
+ /* Default NaN value: sign bit clear, frac msb set */
22
+ static const int FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
23
FSL_IMX7_ECSPI1_IRQ,
20
env->vill = true;
24
FSL_IMX7_ECSPI2_IRQ,
21
25
FSL_IMX7_ECSPI3_IRQ,
22
#ifndef CONFIG_USER_ONLY
26
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
27
FSL_IMX7_I2C4_ADDR,
28
};
29
30
- static const hwaddr FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
31
+ static const int FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
32
FSL_IMX7_I2C1_IRQ,
33
FSL_IMX7_I2C2_IRQ,
34
FSL_IMX7_I2C3_IRQ,
35
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
36
FSL_IMX7_USB3_ADDR,
37
};
38
39
- static const hwaddr FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
40
+ static const int FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
41
FSL_IMX7_USB1_IRQ,
42
FSL_IMX7_USB2_IRQ,
43
FSL_IMX7_USB3_IRQ,
44
--
23
--
45
2.17.1
24
2.34.1
46
47
diff view generated by jsdifflib
1
From: Eric Auger <eric.auger@redhat.com>
1
Set the default NaN pattern explicitly for tricore.
2
2
3
This helper retrieves the paths of nodes whose names
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
match node-name or node-name@unit-address patterns.
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-54-peter.maydell@linaro.org
6
---
7
target/tricore/helper.c | 2 ++
8
1 file changed, 2 insertions(+)
5
9
6
Signed-off-by: Eric Auger <eric.auger@redhat.com>
10
diff --git a/target/tricore/helper.c b/target/tricore/helper.c
7
Message-id: 1530044492-24921-2-git-send-email-eric.auger@redhat.com
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
include/sysemu/device_tree.h | 16 +++++++++++
12
device_tree.c | 55 ++++++++++++++++++++++++++++++++++++
13
2 files changed, 71 insertions(+)
14
15
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
16
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
17
--- a/include/sysemu/device_tree.h
12
--- a/target/tricore/helper.c
18
+++ b/include/sysemu/device_tree.h
13
+++ b/target/tricore/helper.c
19
@@ -XXX,XX +XXX,XX @@ void *load_device_tree_from_sysfs(void);
14
@@ -XXX,XX +XXX,XX @@ void fpu_set_state(CPUTriCoreState *env)
20
char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
15
set_flush_to_zero(1, &env->fp_status);
21
Error **errp);
16
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
22
17
set_default_nan_mode(1, &env->fp_status);
23
+/**
18
+ /* Default NaN pattern: sign bit clear, frac msb set */
24
+ * qemu_fdt_node_unit_path: return the paths of nodes matching a given
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
25
+ * node-name, ie. node-name and node-name@unit-address
26
+ * @fdt: pointer to the dt blob
27
+ * @name: node name
28
+ * @errp: handle to an error object
29
+ *
30
+ * returns a newly allocated NULL-terminated array of node paths.
31
+ * Use g_strfreev() to free it. If one or more nodes were found, the
32
+ * array contains the path of each node and the last element equals to
33
+ * NULL. If there is no error but no matching node was found, the
34
+ * returned array contains a single element equal to NULL. If an error
35
+ * was encountered when parsing the blob, the function returns NULL
36
+ */
37
+char **qemu_fdt_node_unit_path(void *fdt, const char *name, Error **errp);
38
+
39
int qemu_fdt_setprop(void *fdt, const char *node_path,
40
const char *property, const void *val, int size);
41
int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
42
diff --git a/device_tree.c b/device_tree.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/device_tree.c
45
+++ b/device_tree.c
46
@@ -XXX,XX +XXX,XX @@ static int findnode_nofail(void *fdt, const char *node_path)
47
return offset;
48
}
20
}
49
21
50
+char **qemu_fdt_node_unit_path(void *fdt, const char *name, Error **errp)
22
uint32_t psw_read(CPUTriCoreState *env)
51
+{
52
+ char *prefix = g_strdup_printf("%s@", name);
53
+ unsigned int path_len = 16, n = 0;
54
+ GSList *path_list = NULL, *iter;
55
+ const char *iter_name;
56
+ int offset, len, ret;
57
+ char **path_array;
58
+
59
+ offset = fdt_next_node(fdt, -1, NULL);
60
+
61
+ while (offset >= 0) {
62
+ iter_name = fdt_get_name(fdt, offset, &len);
63
+ if (!iter_name) {
64
+ offset = len;
65
+ break;
66
+ }
67
+ if (!strcmp(iter_name, name) || g_str_has_prefix(iter_name, prefix)) {
68
+ char *path;
69
+
70
+ path = g_malloc(path_len);
71
+ while ((ret = fdt_get_path(fdt, offset, path, path_len))
72
+ == -FDT_ERR_NOSPACE) {
73
+ path_len += 16;
74
+ path = g_realloc(path, path_len);
75
+ }
76
+ path_list = g_slist_prepend(path_list, path);
77
+ n++;
78
+ }
79
+ offset = fdt_next_node(fdt, offset, NULL);
80
+ }
81
+ g_free(prefix);
82
+
83
+ if (offset < 0 && offset != -FDT_ERR_NOTFOUND) {
84
+ error_setg(errp, "%s: abort parsing dt for %s node units: %s",
85
+ __func__, name, fdt_strerror(offset));
86
+ for (iter = path_list; iter; iter = iter->next) {
87
+ g_free(iter->data);
88
+ }
89
+ g_slist_free(path_list);
90
+ return NULL;
91
+ }
92
+
93
+ path_array = g_new(char *, n + 1);
94
+ path_array[n--] = NULL;
95
+
96
+ for (iter = path_list; iter; iter = iter->next) {
97
+ path_array[n--] = iter->data;
98
+ }
99
+
100
+ g_slist_free(path_list);
101
+
102
+ return path_array;
103
+}
104
+
105
char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
106
Error **errp)
107
{
108
--
23
--
109
2.17.1
24
2.34.1
110
111
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
1
Now that all our targets have been converted to explicitly specify
2
their pattern for the default NaN value we can remove the remaining
3
fallback code in parts64_default_nan().
2
4
3
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20241202131347.498124-55-peter.maydell@linaro.org
6
---
8
---
7
hw/arm/fsl-imx7.c | 2 +-
9
fpu/softfloat-specialize.c.inc | 14 --------------
8
1 file changed, 1 insertion(+), 1 deletion(-)
10
1 file changed, 14 deletions(-)
9
11
10
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
12
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
11
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
12
--- a/hw/arm/fsl-imx7.c
14
--- a/fpu/softfloat-specialize.c.inc
13
+++ b/hw/arm/fsl-imx7.c
15
+++ b/fpu/softfloat-specialize.c.inc
14
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
16
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
15
/*
17
uint64_t frac;
16
* SRC
18
uint8_t dnan_pattern = status->default_nan_pattern;
17
*/
19
18
- create_unimplemented_device("sdma", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);
20
- if (dnan_pattern == 0) {
19
+ create_unimplemented_device("src", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);
21
- /*
20
22
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
21
/*
23
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
22
* Watchdog
24
- * do not have floating-point.
25
- */
26
- if (snan_bit_is_one(status)) {
27
- /* sign bit clear, set all frac bits other than msb */
28
- dnan_pattern = 0b00111111;
29
- } else {
30
- /* sign bit clear, set frac msb */
31
- dnan_pattern = 0b01000000;
32
- }
33
- }
34
assert(dnan_pattern != 0);
35
36
sign = dnan_pattern >> 7;
23
--
37
--
24
2.17.1
38
2.34.1
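For readers unfamiliar with the 8-bit default NaN pattern used in the patches above, here is a minimal sketch of how such a pattern might expand to a float32 bit pattern. It is not QEMU code: the helper name f32_default_nan_from_pattern is hypothetical, and the layout is an assumption inferred from the removed fallback comments (bit 7 is the sign, bits 6..0 are the top seven fraction bits, and bit 0 is copied into every lower fraction bit).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: expand an 8-bit default NaN
 * pattern into IEEE binary32 bits.
 *
 * Assumed layout of the pattern:
 *   bit 7      sign
 *   bits 6..0  top seven bits of the fraction (bit 6 is the fraction msb)
 *   bit 0      also replicated into all remaining lower fraction bits
 */
static uint32_t f32_default_nan_from_pattern(uint8_t pattern)
{
    uint32_t sign, frac;

    /* A zero pattern would mean "target never set one". */
    assert(pattern != 0);

    sign = pattern >> 7;
    /* binary32 has 23 fraction bits; the top seven are bits 22..16. */
    frac = (uint32_t)(pattern & 0x7f) << 16;
    if (pattern & 1) {
        /* Replicate bit 0 into fraction bits 15..0. */
        frac |= 0xffff;
    }

    /* Exponent is all ones for any NaN. */
    return (sign << 31) | (0xffu << 23) | frac;
}
```

Under these assumptions the pattern 0b01000000 (written 0x40 here) gives 0x7fc00000, and 0b00111111 (0x3f) gives 0x7fbfffff, which matches the two cases described in the removed fallback code.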
25
26
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Inline pickNaNMulAdd into its only caller. This makes
4
one assert redundant with the immediately preceding IF.
5
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-11-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-3-richard.henderson@linaro.org
9
[PMM: keep comment from old code in new location]
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/translate-sve.c | 103 +++++++++++++++++++++++++++++++++++++
12
fpu/softfloat-parts.c.inc | 41 +++++++++++++++++++++++++-
9
target/arm/sve.decode | 6 +++
13
fpu/softfloat-specialize.c.inc | 54 ----------------------------------
10
2 files changed, 109 insertions(+)
14
2 files changed, 40 insertions(+), 55 deletions(-)
11
15
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
13
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
18
--- a/fpu/softfloat-parts.c.inc
15
+++ b/target/arm/translate-sve.c
19
+++ b/fpu/softfloat-parts.c.inc
16
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
17
tcg_temp_free_i64(t0);
21
}
18
}
22
19
23
if (s->default_nan_mode) {
20
+/* Similarly for stores. */
24
+ /*
21
+static void do_str(DisasContext *s, uint32_t vofs, uint32_t len,
25
+ * We guarantee not to require the target to tell us how to
22
+ int rn, int imm)
26
+ * pick a NaN if we're always returning the default NaN.
23
+{
27
+ * But if we're not in default-NaN mode then the target must
24
+ uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
28
+ * specify.
25
+ uint32_t len_remain = len % 8;
26
+ uint32_t nparts = len / 8 + ctpop8(len_remain);
27
+ int midx = get_mem_index(s);
28
+ TCGv_i64 addr, t0;
29
+
30
+ addr = tcg_temp_new_i64();
31
+ t0 = tcg_temp_new_i64();
32
+
33
+ /* Note that unpredicated load/store of vector/predicate registers
34
+ * are defined as a stream of bytes, which equates to little-endian
35
+ * operations on larger quantities. There is no nice way to force
36
+ * a little-endian store for aarch64_be-linux-user out of line.
37
+ *
38
+ * Attempt to keep code expansion to a minimum by limiting the
39
+ * amount of unrolling done.
40
+ */
41
+ if (nparts <= 4) {
42
+ int i;
43
+
44
+ for (i = 0; i < len_align; i += 8) {
45
+ tcg_gen_ld_i64(t0, cpu_env, vofs + i);
46
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
47
+ tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
48
+ }
49
+ } else {
50
+ TCGLabel *loop = gen_new_label();
51
+ TCGv_ptr t2, i = tcg_const_local_ptr(0);
52
+
53
+ gen_set_label(loop);
54
+
55
+ t2 = tcg_temp_new_ptr();
56
+ tcg_gen_add_ptr(t2, cpu_env, i);
57
+ tcg_gen_ld_i64(t0, t2, vofs);
58
+
59
+ /* Minimize the number of local temps that must be re-read from
60
+ * the stack each iteration. Instead, re-compute values other
61
+ * than the loop counter.
62
+ */
29
+ */
63
+ tcg_gen_addi_ptr(t2, i, imm);
30
which = 3;
64
+ tcg_gen_extu_ptr_i64(addr, t2);
31
+ } else if (infzero) {
65
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
32
+ /*
66
+ tcg_temp_free_ptr(t2);
33
+ * Inf * 0 + NaN -- some implementations return the
67
+
34
+ * default NaN here, and some return the input NaN.
68
+ tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
35
+ */
69
+
36
+ switch (s->float_infzeronan_rule) {
70
+ tcg_gen_addi_ptr(i, i, 8);
37
+ case float_infzeronan_dnan_never:
71
+
38
+ which = 2;
72
+ tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
73
+ tcg_temp_free_ptr(i);
74
+ }
75
+
76
+ /* Predicate register stores can be any multiple of 2. */
77
+ if (len_remain) {
78
+ tcg_gen_ld_i64(t0, cpu_env, vofs + len_align);
79
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
80
+
81
+ switch (len_remain) {
82
+ case 2:
83
+ case 4:
84
+ case 8:
85
+ tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
86
+ break;
39
+ break;
87
+
40
+ case float_infzeronan_dnan_always:
88
+ case 6:
41
+ which = 3;
89
+ tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL);
90
+ tcg_gen_addi_i64(addr, addr, 4);
91
+ tcg_gen_shri_i64(t0, t0, 32);
92
+ tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW);
93
+ break;
42
+ break;
94
+
43
+ case float_infzeronan_dnan_if_qnan:
44
+ which = is_qnan(c->cls) ? 3 : 2;
45
+ break;
95
+ default:
46
+ default:
96
+ g_assert_not_reached();
47
+ g_assert_not_reached();
97
+ }
48
+ }
98
+ }
49
} else {
99
+ tcg_temp_free_i64(addr);
50
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
100
+ tcg_temp_free_i64(t0);
51
+ FloatClass cls[3] = { a->cls, b->cls, c->cls };
101
+}
52
+ Float3NaNPropRule rule = s->float_3nan_prop_rule;
102
+
53
+
103
static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
54
+ assert(rule != float_3nan_prop_none);
104
{
55
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
105
if (sve_access_check(s)) {
56
+ /* We have at least one SNaN input and should prefer it */
106
@@ -XXX,XX +XXX,XX @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
57
+ do {
107
return true;
58
+ which = rule & R_3NAN_1ST_MASK;
59
+ rule >>= R_3NAN_1ST_LENGTH;
60
+ } while (!is_snan(cls[which]));
61
+ } else {
62
+ do {
63
+ which = rule & R_3NAN_1ST_MASK;
64
+ rule >>= R_3NAN_1ST_LENGTH;
65
+ } while (!is_nan(cls[which]));
66
+ }
67
}
68
69
if (which == 3) {
70
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
71
index XXXXXXX..XXXXXXX 100644
72
--- a/fpu/softfloat-specialize.c.inc
73
+++ b/fpu/softfloat-specialize.c.inc
74
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
75
}
108
}
76
}
109
77
110
+static bool trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
78
-/*----------------------------------------------------------------------------
111
+{
79
-| Select which NaN to propagate for a three-input operation.
112
+ if (sve_access_check(s)) {
80
-| For the moment we assume that no CPU needs the 'larger significand'
113
+ int size = vec_full_reg_size(s);
81
-| information.
114
+ int off = vec_full_reg_offset(s, a->rd);
82
-| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
115
+ do_str(s, off, size, a->rn, a->imm * size);
83
-*----------------------------------------------------------------------------*/
116
+ }
84
-static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
117
+ return true;
85
- bool infzero, bool have_snan, float_status *status)
118
+}
86
-{
119
+
87
- FloatClass cls[3] = { a_cls, b_cls, c_cls };
120
+static bool trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
88
- Float3NaNPropRule rule = status->float_3nan_prop_rule;
121
+{
89
- int which;
122
+ if (sve_access_check(s)) {
90
-
123
+ int size = pred_full_reg_size(s);
91
- /*
124
+ int off = pred_full_reg_offset(s, a->rd);
92
- * We guarantee not to require the target to tell us how to
125
+ do_str(s, off, size, a->rn, a->imm * size);
93
- * pick a NaN if we're always returning the default NaN.
126
+ }
94
- * But if we're not in default-NaN mode then the target must
127
+ return true;
95
- * specify.
128
+}
96
- */
129
+
97
- assert(!status->default_nan_mode);
130
/*
98
-
131
*** SVE Memory - Contiguous Load Group
99
- if (infzero) {
132
*/
100
- /*
133
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
101
- * Inf * 0 + NaN -- some implementations return the default NaN here,
134
index XXXXXXX..XXXXXXX 100644
102
- * and some return the input NaN.
135
--- a/target/arm/sve.decode
103
- */
136
+++ b/target/arm/sve.decode
104
- switch (status->float_infzeronan_rule) {
137
@@ -XXX,XX +XXX,XX @@ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
105
- case float_infzeronan_dnan_never:
138
106
- return 2;
139
### SVE Memory Store Group
107
- case float_infzeronan_dnan_always:
140
108
- return 3;
141
+# SVE store predicate register
109
- case float_infzeronan_dnan_if_qnan:
142
+STR_pri 1110010 11 0. ..... 000 ... ..... 0 .... @pd_rn_i9
110
- return is_qnan(c_cls) ? 3 : 2;
143
+
111
- default:
144
+# SVE store vector register
112
- g_assert_not_reached();
145
+STR_zri 1110010 11 0. ..... 010 ... ..... ..... @rd_rn_i9
113
- }
146
+
114
- }
147
# SVE contiguous store (scalar plus immediate)
115
-
148
# ST1B, ST1H, ST1W, ST1D; require msz <= esz
116
- assert(rule != float_3nan_prop_none);
149
ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \
117
- if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
118
- /* We have at least one SNaN input and should prefer it */
119
- do {
120
- which = rule & R_3NAN_1ST_MASK;
121
- rule >>= R_3NAN_1ST_LENGTH;
122
- } while (!is_snan(cls[which]));
123
- } else {
124
- do {
125
- which = rule & R_3NAN_1ST_MASK;
126
- rule >>= R_3NAN_1ST_LENGTH;
127
- } while (!is_nan(cls[which]));
128
- }
129
- return which;
130
-}
131
-
132
/*----------------------------------------------------------------------------
133
| Returns 1 if the double-precision floating-point value `a' is a quiet
134
| NaN; otherwise returns 0.
150
--
135
--
151
2.17.1
136
2.34.1
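The three-input selection loop that this patch inlines can be illustrated in isolation. The sketch below is not the QEMU implementation: the SK_* constants, the SK_RULE macro, the nan_class enum and sk_pick_nan3 are all hypothetical stand-ins, and the rule encoding (three 2-bit operand indices in preference order plus one "prefer SNaN" flag bit) is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand classification, standing in for FloatClass. */
enum nan_class { NOT_NAN, QNAN, SNAN };

/* Assumed rule encoding: three 2-bit operand indices, then a flag bit. */
#define SK_FIELD_MASK  0x3u
#define SK_FIELD_LEN   2
#define SK_PREFER_SNAN (1u << 6)
#define SK_RULE(first, second, third, snan) \
    ((unsigned)(first) | ((unsigned)(second) << 2) | \
     ((unsigned)(third) << 4) | ((snan) ? SK_PREFER_SNAN : 0u))

/*
 * Return the index (0 = a, 1 = b, 2 = c) of the NaN operand to propagate.
 * The rule must list each of 0, 1 and 2 exactly once, and at least one
 * operand must be a NaN, so each loop stops within three iterations.
 */
static int sk_pick_nan3(const enum nan_class cls[3], unsigned rule)
{
    bool have_snan = cls[0] == SNAN || cls[1] == SNAN || cls[2] == SNAN;
    bool have_nan = cls[0] != NOT_NAN || cls[1] != NOT_NAN ||
                    cls[2] != NOT_NAN;
    unsigned which;

    assert(have_nan);
    if (have_snan && (rule & SK_PREFER_SNAN)) {
        /* Walk the preference order until we hit a signaling NaN. */
        do {
            which = rule & SK_FIELD_MASK;
            rule >>= SK_FIELD_LEN;
        } while (cls[which] != SNAN);
    } else {
        /* Walk the preference order until we hit any NaN. */
        do {
            which = rule & SK_FIELD_MASK;
            rule >>= SK_FIELD_LEN;
        } while (cls[which] == NOT_NAN);
    }
    return (int)which;
}
```

With operands classified as { QNAN, NOT_NAN, SNAN }, an a-b-c order without the flag picks operand 0, while the same order with the flag set skips the quiet NaN and picks operand 2.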
152
137
153
138
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Remove "3" as a special case for which and simply
4
branch to return the desired value.
5
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-12-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-4-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
target/arm/helper-sve.h | 41 +++++++++++++++++++++
11
fpu/softfloat-parts.c.inc | 20 ++++++++++----------
9
target/arm/sve_helper.c | 61 +++++++++++++++++++++++++++++++
12
1 file changed, 10 insertions(+), 10 deletions(-)
10
target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
11
target/arm/sve.decode | 39 ++++++++++++++++++++
12
4 files changed, 216 insertions(+)
13
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
15
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
16
--- a/fpu/softfloat-parts.c.inc
17
+++ b/target/arm/helper-sve.h
17
+++ b/fpu/softfloat-parts.c.inc
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
19
DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
19
* But if we're not in default-NaN mode then the target must
20
20
* specify.
21
DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
21
*/
22
- which = 3;
23
+ goto default_nan;
24
} else if (infzero) {
25
/*
26
* Inf * 0 + NaN -- some implementations return the
27
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
28
*/
29
switch (s->float_infzeronan_rule) {
30
case float_infzeronan_dnan_never:
31
- which = 2;
32
break;
33
case float_infzeronan_dnan_always:
34
- which = 3;
35
- break;
36
+ goto default_nan;
37
case float_infzeronan_dnan_if_qnan:
38
- which = is_qnan(c->cls) ? 3 : 2;
39
+ if (is_qnan(c->cls)) {
40
+ goto default_nan;
41
+ }
42
break;
43
default:
44
g_assert_not_reached();
45
}
46
+ which = 2;
47
} else {
48
FloatClass cls[3] = { a->cls, b->cls, c->cls };
49
Float3NaNPropRule rule = s->float_3nan_prop_rule;
50
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
51
}
52
}
53
54
- if (which == 3) {
55
- parts_default_nan(a, s);
56
- return a;
57
- }
58
-
59
switch (which) {
60
case 0:
61
break;
62
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
63
parts_silence_nan(a, s);
64
}
65
return a;
22
+
66
+
23
+DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
67
+ default_nan:
24
+ void, env, ptr, ptr, ptr, tl, i32)
68
+ parts_default_nan(a, s);
25
+DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
69
+ return a;
26
+ void, env, ptr, ptr, ptr, tl, i32)
27
+DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG,
28
+ void, env, ptr, ptr, ptr, tl, i32)
29
+
30
+DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG,
31
+ void, env, ptr, ptr, ptr, tl, i32)
32
+DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG,
33
+ void, env, ptr, ptr, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG,
35
+ void, env, ptr, ptr, ptr, tl, i32)
36
+
37
+DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG,
38
+ void, env, ptr, ptr, ptr, tl, i32)
39
+DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG,
40
+ void, env, ptr, ptr, ptr, tl, i32)
41
+DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG,
42
+ void, env, ptr, ptr, ptr, tl, i32)
43
+DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG,
44
+ void, env, ptr, ptr, ptr, tl, i32)
45
+
46
+DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG,
47
+ void, env, ptr, ptr, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG,
49
+ void, env, ptr, ptr, ptr, tl, i32)
50
+DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG,
51
+ void, env, ptr, ptr, ptr, tl, i32)
52
+DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG,
53
+ void, env, ptr, ptr, ptr, tl, i32)
54
+
55
+DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG,
56
+ void, env, ptr, ptr, ptr, tl, i32)
57
+DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG,
58
+ void, env, ptr, ptr, ptr, tl, i32)
59
+DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG,
60
+ void, env, ptr, ptr, ptr, tl, i32)
61
+DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG,
62
+ void, env, ptr, ptr, ptr, tl, i32)
63
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/sve_helper.c
66
+++ b/target/arm/sve_helper.c
67
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
68
addr += 4 * 8;
69
}
70
}
70
}
71
+
72
+/* Stores with a vector index. */
73
+
74
+#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
75
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
76
+ target_ulong base, uint32_t desc) \
77
+{ \
78
+ intptr_t i, oprsz = simd_oprsz(desc); \
79
+ unsigned scale = simd_data(desc); \
80
+ uintptr_t ra = GETPC(); \
81
+ for (i = 0; i < oprsz; ) { \
82
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
83
+ do { \
84
+ if (likely(pg & 1)) { \
85
+ target_ulong off = *(TYPEI *)(vm + H1_4(i)); \
86
+ uint32_t d = *(uint32_t *)(vd + H1_4(i)); \
87
+ FN(env, base + (off << scale), d, ra); \
88
+ } \
89
+ i += sizeof(uint32_t), pg >>= sizeof(uint32_t); \
90
+ } while (i & 15); \
91
+ } \
92
+}
93
+
94
+#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \
95
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
96
+ target_ulong base, uint32_t desc) \
97
+{ \
98
+ intptr_t i, oprsz = simd_oprsz(desc) / 8; \
99
+ unsigned scale = simd_data(desc); \
100
+ uintptr_t ra = GETPC(); \
101
+ uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
102
+ for (i = 0; i < oprsz; i++) { \
103
+ if (likely(pg[H1(i)] & 1)) { \
104
+ target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \
105
+ FN(env, base + off, d[i], ra); \
106
+ } \
107
+ } \
108
+}
109
+
110
+DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra)
111
+DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra)
112
+DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra)
113
+
114
+DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra)
115
+DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra)
116
+DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra)
117
+
118
+DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra)
119
+DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra)
120
+DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra)
121
+DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra)
122
+
123
+DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra)
124
+DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra)
125
+DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra)
126
+DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra)
127
+
128
+DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, cpu_stb_data_ra)
129
+DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra)
130
+DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra)
131
+DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra)
132
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
133
index XXXXXXX..XXXXXXX 100644
134
--- a/target/arm/translate-sve.c
135
+++ b/target/arm/translate-sve.c
136
@@ -XXX,XX +XXX,XX @@ typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
137
TCGv_ptr, TCGv_ptr, TCGv_i32);
138
139
typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32);
140
+typedef void gen_helper_gvec_mem_scatter(TCGv_env, TCGv_ptr, TCGv_ptr,
141
+ TCGv_ptr, TCGv_i64, TCGv_i32);
142
71
143
/*
72
/*
144
* Helpers for extracting complex instruction fields.
145
@@ -XXX,XX +XXX,XX @@ static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
146
}
147
return true;
148
}
149
+
150
+/*
151
+ *** SVE gather loads / scatter stores
152
+ */
153
+
154
+static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
155
+ TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn)
156
+{
157
+ unsigned vsz = vec_full_reg_size(s);
158
+ TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, scale));
159
+ TCGv_ptr t_zm = tcg_temp_new_ptr();
160
+ TCGv_ptr t_pg = tcg_temp_new_ptr();
161
+ TCGv_ptr t_zt = tcg_temp_new_ptr();
162
+
163
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
164
+ tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm));
165
+ tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt));
166
+ fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc);
167
+
168
+ tcg_temp_free_ptr(t_zt);
169
+ tcg_temp_free_ptr(t_zm);
170
+ tcg_temp_free_ptr(t_pg);
171
+ tcg_temp_free_i32(desc);
172
+}
173
+
174
+static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
175
+{
176
+ /* Indexed by [xs][msz]. */
177
+ static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
178
+ { gen_helper_sve_stbs_zsu,
179
+ gen_helper_sve_sths_zsu,
180
+ gen_helper_sve_stss_zsu, },
181
+ { gen_helper_sve_stbs_zss,
182
+ gen_helper_sve_sths_zss,
183
+ gen_helper_sve_stss_zss, },
184
+ };
185
+ /* Note that we overload xs=2 to indicate 64-bit offset. */
186
+ static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
187
+ { gen_helper_sve_stbd_zsu,
188
+ gen_helper_sve_sthd_zsu,
189
+ gen_helper_sve_stsd_zsu,
190
+ gen_helper_sve_stdd_zsu, },
191
+ { gen_helper_sve_stbd_zss,
192
+ gen_helper_sve_sthd_zss,
193
+ gen_helper_sve_stsd_zss,
194
+ gen_helper_sve_stdd_zss, },
195
+ { gen_helper_sve_stbd_zd,
196
+ gen_helper_sve_sthd_zd,
197
+ gen_helper_sve_stsd_zd,
198
+ gen_helper_sve_stdd_zd, },
199
+ };
200
+ gen_helper_gvec_mem_scatter *fn;
201
+
202
+ if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
203
+ return false;
204
+ }
205
+ if (!sve_access_check(s)) {
206
+ return true;
207
+ }
208
+ switch (a->esz) {
209
+ case MO_32:
210
+ fn = fn32[a->xs][a->msz];
211
+ break;
212
+ case MO_64:
213
+ fn = fn64[a->xs][a->msz];
214
+ break;
215
+ default:
216
+ g_assert_not_reached();
217
+ }
218
+ do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
219
+ cpu_reg_sp(s, a->rn), fn);
220
+ return true;
221
+}
222
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
223
index XXXXXXX..XXXXXXX 100644
224
--- a/target/arm/sve.decode
225
+++ b/target/arm/sve.decode
226
@@ -XXX,XX +XXX,XX @@
227
&rpri_load rd pg rn imm dtype nreg
228
&rprr_store rd pg rn rm msz esz nreg
229
&rpri_store rd pg rn imm msz esz nreg
230
+&rprr_scatter_store rd pg rn rm esz msz xs scale
231
232
###########################################################################
233
# Named instruction formats. These are generally used to
234
@@ -XXX,XX +XXX,XX @@
235
@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
236
@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \
237
&rprr_store nreg=0
238
+@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
239
+ &rprr_scatter_store
240
241
###########################################################################
242
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
243
@@ -XXX,XX +XXX,XX @@ ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \
244
# SVE store multiple structures (scalar plus scalar) (nreg != 0)
245
ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \
246
@rprr_store esz=%size_23
247
+
248
+# SVE 32-bit scatter store (scalar plus 32-bit scaled offsets)
249
+# Require msz > 0 && msz <= esz.
250
+ST1_zprz 1110010 .. 11 ..... 100 ... ..... ..... \
251
+ @rprr_scatter_store xs=0 esz=2 scale=1
252
+ST1_zprz 1110010 .. 11 ..... 110 ... ..... ..... \
253
+ @rprr_scatter_store xs=1 esz=2 scale=1
254
+
255
+# SVE 32-bit scatter store (scalar plus 32-bit unscaled offsets)
256
+# Require msz <= esz.
257
+ST1_zprz 1110010 .. 10 ..... 100 ... ..... ..... \
258
+ @rprr_scatter_store xs=0 esz=2 scale=0
259
+ST1_zprz 1110010 .. 10 ..... 110 ... ..... ..... \
260
+ @rprr_scatter_store xs=1 esz=2 scale=0
261
+
262
+# SVE 64-bit scatter store (scalar plus 64-bit scaled offset)
263
+# Require msz > 0
264
+ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \
265
+ @rprr_scatter_store xs=2 esz=3 scale=1
266
+
267
+# SVE 64-bit scatter store (scalar plus 64-bit unscaled offset)
268
+ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \
269
+ @rprr_scatter_store xs=2 esz=3 scale=0
270
+
271
+# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset)
272
+# Require msz > 0
273
+ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \
274
+ @rprr_scatter_store xs=0 esz=3 scale=1
275
+ST1_zprz 1110010 .. 01 ..... 110 ... ..... ..... \
276
+ @rprr_scatter_store xs=1 esz=3 scale=1
277
+
278
+# SVE 64-bit scatter store (scalar plus unpacked 32-bit unscaled offset)
279
+ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \
280
+ @rprr_scatter_store xs=0 esz=3 scale=0
281
+ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \
282
+ @rprr_scatter_store xs=1 esz=3 scale=0
283
--
73
--
284
2.17.1
74
2.34.1
285
75
286
76
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
There is no need to re-set these 9 features already
3
Assign the pointer return value to 'a' directly,
4
implied by the call to aarch64_a57_initfn.
4
rather than going through an intermediary index.
5
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20241203203949.483774-5-richard.henderson@linaro.org
9
Message-id: 20180629001538.11415-4-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/cpu64.c | 9 ---------
11
fpu/softfloat-parts.c.inc | 32 ++++++++++----------------------
13
1 file changed, 9 deletions(-)
12
1 file changed, 10 insertions(+), 22 deletions(-)
14
13
15
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
16
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu64.c
16
--- a/fpu/softfloat-parts.c.inc
18
+++ b/target/arm/cpu64.c
17
+++ b/fpu/softfloat-parts.c.inc
19
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
20
* whereas the architecture requires them to be present in both if
19
FloatPartsN *c, float_status *s,
21
* present in either.
20
int ab_mask, int abc_mask)
22
*/
21
{
23
- set_feature(&cpu->env, ARM_FEATURE_V8);
22
- int which;
24
- set_feature(&cpu->env, ARM_FEATURE_VFP4);
23
bool infzero = (ab_mask == float_cmask_infzero);
25
- set_feature(&cpu->env, ARM_FEATURE_NEON);
24
bool have_snan = (abc_mask & float_cmask_snan);
26
- set_feature(&cpu->env, ARM_FEATURE_AARCH64);
25
+ FloatPartsN *ret;
27
- set_feature(&cpu->env, ARM_FEATURE_V8_AES);
26
28
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
27
if (unlikely(have_snan)) {
29
- set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
28
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
30
set_feature(&cpu->env, ARM_FEATURE_V8_SHA512);
29
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
31
set_feature(&cpu->env, ARM_FEATURE_V8_SHA3);
30
default:
32
set_feature(&cpu->env, ARM_FEATURE_V8_SM3);
31
g_assert_not_reached();
33
set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
32
}
34
- set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
33
- which = 2;
35
- set_feature(&cpu->env, ARM_FEATURE_CRC);
34
+ ret = c;
36
set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
35
} else {
37
set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
36
- FloatClass cls[3] = { a->cls, b->cls, c->cls };
38
set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
37
+ FloatPartsN *val[3] = { a, b, c };
38
Float3NaNPropRule rule = s->float_3nan_prop_rule;
39
40
assert(rule != float_3nan_prop_none);
41
if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
42
/* We have at least one SNaN input and should prefer it */
43
do {
44
- which = rule & R_3NAN_1ST_MASK;
45
+ ret = val[rule & R_3NAN_1ST_MASK];
46
rule >>= R_3NAN_1ST_LENGTH;
47
- } while (!is_snan(cls[which]));
48
+ } while (!is_snan(ret->cls));
49
} else {
50
do {
51
- which = rule & R_3NAN_1ST_MASK;
52
+ ret = val[rule & R_3NAN_1ST_MASK];
53
rule >>= R_3NAN_1ST_LENGTH;
54
- } while (!is_nan(cls[which]));
55
+ } while (!is_nan(ret->cls));
56
}
57
}
58
59
- switch (which) {
60
- case 0:
61
- break;
62
- case 1:
63
- a = b;
64
- break;
65
- case 2:
66
- a = c;
67
- break;
68
- default:
69
- g_assert_not_reached();
70
+ if (is_snan(ret->cls)) {
71
+ parts_silence_nan(ret, s);
72
}
73
- if (is_snan(a->cls)) {
74
- parts_silence_nan(a, s);
75
- }
76
- return a;
77
+ return ret;
78
79
default_nan:
80
parts_default_nan(a, s);
39
--
81
--
40
2.17.1
82
2.34.1
41
83
42
84
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
While all indices into val[] should be in [0-2], the mask
4
applied is two bits. To help static analysis see there is
5
no possibility of read beyond the end of the array, pad the
6
array to 4 entries, with the final being (implicitly) NULL.
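
A minimal standalone sketch of the padding idea described above, not the softfloat code itself: `MASK` and `pick` are hypothetical names, with `MASK` standing in for a two-bit mask such as R_3NAN_1ST_MASK and plain `int` pointers standing in for the FloatPartsN operands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a two-bit selector mask. */
#define MASK 3

/*
 * Return the operand selected by the low two bits of rule.
 * The array is sized from the mask, so every masked index is in
 * bounds; the fourth slot is implicitly initialized to NULL.
 */
static const int *pick(const int *a, const int *b, const int *c,
                       unsigned rule)
{
    const int *val[MASK + 1] = { a, b, c };

    return val[rule & MASK];
}
```

With the array sized this way an analyzer can prove the read is in range without knowing that valid rules only ever encode 0, 1 or 2; an out-of-range selector yields NULL rather than a read past the end.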
7
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
Tested-by: Alex Bennée <alex.bennee@linaro.org>
10
Message-id: 20241203203949.483774-6-richard.henderson@linaro.org
6
Message-id: 20180627043328.11531-3-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
12
---
9
target/arm/helper-sve.h | 40 ++++++++++
13
fpu/softfloat-parts.c.inc | 2 +-
10
target/arm/sve_helper.c | 157 +++++++++++++++++++++++++++++++++++++
14
1 file changed, 1 insertion(+), 1 deletion(-)
11
target/arm/translate-sve.c | 69 ++++++++++++++++
12
target/arm/sve.decode | 6 ++
13
4 files changed, 272 insertions(+)
14
15
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-sve.h
18
--- a/fpu/softfloat-parts.c.inc
18
+++ b/target/arm/helper-sve.h
19
+++ b/fpu/softfloat-parts.c.inc
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
20
21
}
21
DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
22
ret = c;
22
DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
23
} else {
23
+
24
- FloatPartsN *val[3] = { a, b, c };
24
+DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
25
+ FloatPartsN *val[R_3NAN_1ST_MASK + 1] = { a, b, c };
25
+DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
26
Float3NaNPropRule rule = s->float_3nan_prop_rule;
26
+DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
27
27
+DEF_HELPER_FLAGS_4(sve_ldff1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
28
assert(rule != float_3nan_prop_none);
28
+DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
29
+DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
30
+DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
31
+
32
+DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
33
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
35
+DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
37
+
38
+DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
39
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
41
+
42
+DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
43
+
44
+DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
45
+DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
46
+DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
49
+DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
50
+DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
51
+
52
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
54
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
55
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
56
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
57
+
58
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
59
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
60
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
61
+
62
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
63
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/sve_helper.c
66
+++ b/target/arm/sve_helper.c
67
@@ -XXX,XX +XXX,XX @@ DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
68
#undef DO_LD2
69
#undef DO_LD3
70
#undef DO_LD4
71
+
72
+/*
73
+ * Load contiguous data, first-fault and no-fault.
74
+ */
75
+
76
+#ifdef CONFIG_USER_ONLY
77
+
78
+/* Fault on byte I. All bits in FFR from I are cleared. The vector
79
+ * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE
80
+ * option, which leaves subsequent data unchanged.
81
+ */
82
+static void record_fault(CPUARMState *env, uintptr_t i, uintptr_t oprsz)
83
+{
84
+ uint64_t *ffr = env->vfp.pregs[FFR_PRED_NUM].p;
85
+
86
+ if (i & 63) {
87
+ ffr[i / 64] &= MAKE_64BIT_MASK(0, i & 63);
88
+ i = ROUND_UP(i, 64);
89
+ }
90
+ for (; i < oprsz; i += 64) {
91
+ ffr[i / 64] = 0;
92
+ }
93
+}
94
+
95
+/* Hold the mmap lock during the operation so that there is no race
96
+ * between page_check_range and the load operation. We expect the
97
+ * usual case to have no faults at all, so we check the whole range
98
+ * first and if successful defer to the normal load operation.
99
+ *
100
+ * TODO: Change mmap_lock to a rwlock so that multiple readers
101
+ * can run simultaneously. This will probably help other uses
102
+ * within QEMU as well.
103
+ */
104
+#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
105
+static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg, \
106
+ target_ulong addr, intptr_t oprsz, \
107
+ bool first, uintptr_t ra) \
108
+{ \
109
+ intptr_t i = 0; \
110
+ do { \
111
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
112
+ do { \
113
+ TYPEM m = 0; \
114
+ if (pg & 1) { \
115
+ if (!first && \
116
+ unlikely(page_check_range(addr, sizeof(TYPEM), \
117
+ PAGE_READ))) { \
118
+ record_fault(env, i, oprsz); \
119
+ return; \
120
+ } \
121
+ m = FN(env, addr, ra); \
122
+ first = false; \
123
+ } \
124
+ *(TYPEE *)(vd + H(i)) = m; \
125
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
126
+ addr += sizeof(TYPEM); \
127
+ } while (i & 15); \
128
+ } while (i < oprsz); \
129
+} \
130
+void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \
131
+ target_ulong addr, uint32_t desc) \
132
+{ \
133
+ intptr_t oprsz = simd_oprsz(desc); \
134
+ unsigned rd = simd_data(desc); \
135
+ void *vd = &env->vfp.zregs[rd]; \
136
+ mmap_lock(); \
137
+ if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \
138
+ do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \
139
+ } else { \
140
+ do_sve_ldff1##PART(env, vd, vg, addr, oprsz, true, GETPC()); \
141
+ } \
142
+ mmap_unlock(); \
143
+}
144
+
145
+/* No-fault loads are like first-fault loads without the
146
+ * first faulting special case.
147
+ */
148
+#define DO_LDNF1(PART) \
149
+void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \
150
+ target_ulong addr, uint32_t desc) \
151
+{ \
152
+ intptr_t oprsz = simd_oprsz(desc); \
153
+ unsigned rd = simd_data(desc); \
154
+ void *vd = &env->vfp.zregs[rd]; \
155
+ mmap_lock(); \
156
+ if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \
157
+ do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \
158
+ } else { \
159
+ do_sve_ldff1##PART(env, vd, vg, addr, oprsz, false, GETPC()); \
160
+ } \
161
+ mmap_unlock(); \
162
+}
163
+
164
+#else
165
+
166
+/* TODO: System mode is not yet supported.
167
+ * This would probably use tlb_vaddr_to_host.
168
+ */
169
+#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
170
+void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \
171
+ target_ulong addr, uint32_t desc) \
172
+{ \
173
+ g_assert_not_reached(); \
174
+}
175
+
176
+#define DO_LDNF1(PART) \
177
+void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \
178
+ target_ulong addr, uint32_t desc) \
179
+{ \
180
+ g_assert_not_reached(); \
181
+}
182
+
183
+#endif
184
+
185
+DO_LDFF1(bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
186
+DO_LDFF1(bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2)
187
+DO_LDFF1(bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2)
188
+DO_LDFF1(bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4)
189
+DO_LDFF1(bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4)
190
+DO_LDFF1(bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
191
+DO_LDFF1(bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
192
+
193
+DO_LDFF1(hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
194
+DO_LDFF1(hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
195
+DO_LDFF1(hss_r, cpu_ldsw_data_ra, uint32_t, int16_t, H1_4)
196
+DO_LDFF1(hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
197
+DO_LDFF1(hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
198
+
199
+DO_LDFF1(ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
200
+DO_LDFF1(sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, )
201
+DO_LDFF1(sds_r, cpu_ldl_data_ra, uint64_t, int32_t, )
202
+
203
+DO_LDFF1(dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
204
+
205
+#undef DO_LDFF1
206
+
207
+DO_LDNF1(bb_r)
208
+DO_LDNF1(bhu_r)
209
+DO_LDNF1(bhs_r)
210
+DO_LDNF1(bsu_r)
211
+DO_LDNF1(bss_r)
212
+DO_LDNF1(bdu_r)
213
+DO_LDNF1(bds_r)
214
+
215
+DO_LDNF1(hh_r)
216
+DO_LDNF1(hsu_r)
217
+DO_LDNF1(hss_r)
218
+DO_LDNF1(hdu_r)
219
+DO_LDNF1(hds_r)
220
+
221
+DO_LDNF1(ss_r)
222
+DO_LDNF1(sdu_r)
223
+DO_LDNF1(sds_r)
224
+
225
+DO_LDNF1(dd_r)
226
+
227
+#undef DO_LDNF1
228
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
229
index XXXXXXX..XXXXXXX 100644
230
--- a/target/arm/translate-sve.c
231
+++ b/target/arm/translate-sve.c
232
@@ -XXX,XX +XXX,XX @@ static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
233
}
234
return true;
235
}
236
+
237
+static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
238
+{
239
+ static gen_helper_gvec_mem * const fns[16] = {
240
+ gen_helper_sve_ldff1bb_r,
241
+ gen_helper_sve_ldff1bhu_r,
242
+ gen_helper_sve_ldff1bsu_r,
243
+ gen_helper_sve_ldff1bdu_r,
244
+
245
+ gen_helper_sve_ldff1sds_r,
246
+ gen_helper_sve_ldff1hh_r,
247
+ gen_helper_sve_ldff1hsu_r,
248
+ gen_helper_sve_ldff1hdu_r,
249
+
250
+ gen_helper_sve_ldff1hds_r,
251
+ gen_helper_sve_ldff1hss_r,
252
+ gen_helper_sve_ldff1ss_r,
253
+ gen_helper_sve_ldff1sdu_r,
254
+
255
+ gen_helper_sve_ldff1bds_r,
256
+ gen_helper_sve_ldff1bss_r,
257
+ gen_helper_sve_ldff1bhs_r,
258
+ gen_helper_sve_ldff1dd_r,
259
+ };
260
+
261
+ if (sve_access_check(s)) {
262
+ TCGv_i64 addr = new_tmp_a64(s);
263
+ tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
264
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
265
+ do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]);
266
+ }
267
+ return true;
268
+}
269
+
270
+static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
271
+{
272
+ static gen_helper_gvec_mem * const fns[16] = {
273
+ gen_helper_sve_ldnf1bb_r,
274
+ gen_helper_sve_ldnf1bhu_r,
275
+ gen_helper_sve_ldnf1bsu_r,
276
+ gen_helper_sve_ldnf1bdu_r,
277
+
278
+ gen_helper_sve_ldnf1sds_r,
279
+ gen_helper_sve_ldnf1hh_r,
280
+ gen_helper_sve_ldnf1hsu_r,
281
+ gen_helper_sve_ldnf1hdu_r,
282
+
283
+ gen_helper_sve_ldnf1hds_r,
284
+ gen_helper_sve_ldnf1hss_r,
285
+ gen_helper_sve_ldnf1ss_r,
286
+ gen_helper_sve_ldnf1sdu_r,
287
+
288
+ gen_helper_sve_ldnf1bds_r,
289
+ gen_helper_sve_ldnf1bss_r,
290
+ gen_helper_sve_ldnf1bhs_r,
291
+ gen_helper_sve_ldnf1dd_r,
292
+ };
293
+
294
+ if (sve_access_check(s)) {
295
+ int vsz = vec_full_reg_size(s);
296
+ int elements = vsz >> dtype_esz[a->dtype];
297
+ int off = (a->imm * elements) << dtype_msz(a->dtype);
298
+ TCGv_i64 addr = new_tmp_a64(s);
299
+
300
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off);
301
+ do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]);
302
+ }
303
+ return true;
304
+}
305
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
306
index XXXXXXX..XXXXXXX 100644
307
--- a/target/arm/sve.decode
308
+++ b/target/arm/sve.decode
309
@@ -XXX,XX +XXX,XX @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
310
# SVE contiguous load (scalar plus scalar)
311
LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0
312
313
+# SVE contiguous first-fault load (scalar plus scalar)
314
+LDFF1_zprr 1010010 .... ..... 011 ... ..... ..... @rprr_load_dt nreg=0
315
+
316
# SVE contiguous load (scalar plus immediate)
317
LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0
318
319
+# SVE contiguous non-fault load (scalar plus immediate)
320
+LDNF1_zpri 1010010 .... 1.... 101 ... ..... ..... @rpri_load_dt nreg=0
321
+
322
# SVE contiguous non-temporal load (scalar plus scalar)
323
# LDNT1B, LDNT1H, LDNT1W, LDNT1D
324
# SVE load multiple structures (scalar plus scalar)
325
--
29
--
326
2.17.1
30
2.34.1
327
31
328
32
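
The FFR update described in the record_fault comment of the patch above ("All bits in FFR from I are cleared") can be sketched in isolation. This is an illustrative sketch under assumptions, not the helper itself: `clear_from` is a hypothetical name, the predicate is modelled as a bare array of 64-bit words, and the masking and rounding are spelled out instead of using QEMU's MAKE_64BIT_MASK and ROUND_UP macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Clear bits [i, oprsz) of a bit array stored in 64-bit words,
 * leaving bits below i untouched. Assumes oprsz does not exceed
 * the number of bits in the array.
 */
static void clear_from(uint64_t *ffr, size_t i, size_t oprsz)
{
    if (i & 63) {
        /* Keep only the low (i % 64) bits of the word holding bit i. */
        ffr[i / 64] &= (UINT64_C(1) << (i & 63)) - 1;
        /* Advance to the next word boundary. */
        i = (i + 63) & ~(size_t)63;
    }
    for (; i < oprsz; i += 64) {
        ffr[i / 64] = 0;
    }
}
```

For example, starting from all-ones words and calling `clear_from(f, 5, 128)` leaves only the low five bits of the first word set and zeroes the second word.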
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
3
This function is part of the public interface and
4
is not "specialized" to any target in any way.
2
5
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Message-id: 20241203203949.483774-7-richard.henderson@linaro.org
6
Message-id: 20180627043328.11531-34-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
10
---
9
target/arm/helper.h | 5 ++
11
fpu/softfloat.c | 52 ++++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 18 ++++++
12
fpu/softfloat-specialize.c.inc | 52 ----------------------------------
11
target/arm/vec_helper.c | 124 +++++++++++++++++++++++++++++++++++++
13
2 files changed, 52 insertions(+), 52 deletions(-)
12
target/arm/sve.decode | 6 ++
13
4 files changed, 153 insertions(+)
14
14
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
16
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.h
17
--- a/fpu/softfloat.c
18
+++ b/target/arm/helper.h
18
+++ b/fpu/softfloat.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
20
DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
*zExpPtr = 1 - shiftCount;
21
DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
}
22
22
23
+DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+/*----------------------------------------------------------------------------
24
+DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+| Takes two extended double-precision floating-point values `a' and `b', one
25
+DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+| of which is a NaN, and returns the appropriate NaN result. If either `a' or
26
+DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+| `b' is a signaling NaN, the invalid exception is raised.
27
+*----------------------------------------------------------------------------*/
27
+
28
+
28
DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
29
+floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
29
void, ptr, ptr, ptr, ptr, i32)
30
DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
31
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
32
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/translate-sve.c
34
+++ b/target/arm/translate-sve.c
35
@@ -XXX,XX +XXX,XX @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn)
36
return true;
37
}
38
39
+static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a, uint32_t insn)
40
+{
30
+{
41
+ static gen_helper_gvec_3 * const fns[2][2] = {
31
+ bool aIsLargerSignificand;
42
+ { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h },
32
+ FloatClass a_cls, b_cls;
43
+ { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h }
44
+ };
45
+
33
+
46
+ if (sve_access_check(s)) {
34
+ /* This is not complete, but is good enough for pickNaN. */
47
+ unsigned vsz = vec_full_reg_size(s);
35
+ a_cls = (!floatx80_is_any_nan(a)
48
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
36
+ ? float_class_normal
49
+ vec_full_reg_offset(s, a->rn),
37
+ : floatx80_is_signaling_nan(a, status)
50
+ vec_full_reg_offset(s, a->rm),
38
+ ? float_class_snan
51
+ vsz, vsz, a->index, fns[a->u][a->sz]);
39
+ : float_class_qnan);
40
+ b_cls = (!floatx80_is_any_nan(b)
41
+ ? float_class_normal
42
+ : floatx80_is_signaling_nan(b, status)
43
+ ? float_class_snan
44
+ : float_class_qnan);
45
+
46
+ if (is_snan(a_cls) || is_snan(b_cls)) {
47
+ float_raise(float_flag_invalid, status);
52
+ }
48
+ }
53
+ return true;
49
+
50
+ if (status->default_nan_mode) {
51
+ return floatx80_default_nan(status);
52
+ }
53
+
54
+ if (a.low < b.low) {
55
+ aIsLargerSignificand = 0;
56
+ } else if (b.low < a.low) {
57
+ aIsLargerSignificand = 1;
58
+ } else {
59
+ aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
60
+ }
61
+
62
+ if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
63
+ if (is_snan(b_cls)) {
64
+ return floatx80_silence_nan(b, status);
65
+ }
66
+ return b;
67
+ } else {
68
+ if (is_snan(a_cls)) {
69
+ return floatx80_silence_nan(a, status);
70
+ }
71
+ return a;
72
+ }
54
+}
73
+}
55
+
74
+
56
+
75
/*----------------------------------------------------------------------------
57
/*
76
| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
58
*** SVE Floating Point Multiply-Add Indexed Group
77
| and extended significand formed by the concatenation of `zSig0' and `zSig1',
59
*/
78
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
60
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
61
index XXXXXXX..XXXXXXX 100644
79
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/vec_helper.c
80
--- a/fpu/softfloat-specialize.c.inc
63
+++ b/target/arm/vec_helper.c
81
+++ b/fpu/softfloat-specialize.c.inc
64
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc)
82
@@ -XXX,XX +XXX,XX @@ floatx80 floatx80_silence_nan(floatx80 a, float_status *status)
65
clear_tail(d, opr_sz, simd_maxsz(desc));
83
return a;
66
}
84
}
67
85
68
+void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
86
-/*----------------------------------------------------------------------------
69
+{
87
-| Takes two extended double-precision floating-point values `a' and `b', one
70
+ intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
88
-| of which is a NaN, and returns the appropriate NaN result. If either `a' or
71
+ intptr_t index = simd_data(desc);
89
-| `b' is a signaling NaN, the invalid exception is raised.
72
+ uint32_t *d = vd;
90
-*----------------------------------------------------------------------------*/
73
+ int8_t *n = vn;
91
-
74
+ int8_t *m_indexed = (int8_t *)vm + index * 4;
92
-floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
75
+
93
-{
76
+ /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
94
- bool aIsLargerSignificand;
77
+ * Otherwise opr_sz is a multiple of 16.
95
- FloatClass a_cls, b_cls;
78
+ */
96
-
79
+ segend = MIN(4, opr_sz_4);
97
- /* This is not complete, but is good enough for pickNaN. */
80
+ i = 0;
98
- a_cls = (!floatx80_is_any_nan(a)
81
+ do {
99
- ? float_class_normal
82
+ int8_t m0 = m_indexed[i * 4 + 0];
100
- : floatx80_is_signaling_nan(a, status)
83
+ int8_t m1 = m_indexed[i * 4 + 1];
101
- ? float_class_snan
84
+ int8_t m2 = m_indexed[i * 4 + 2];
102
- : float_class_qnan);
85
+ int8_t m3 = m_indexed[i * 4 + 3];
103
- b_cls = (!floatx80_is_any_nan(b)
86
+
104
- ? float_class_normal
87
+ do {
105
- : floatx80_is_signaling_nan(b, status)
88
+ d[i] += n[i * 4 + 0] * m0
106
- ? float_class_snan
89
+ + n[i * 4 + 1] * m1
107
- : float_class_qnan);
90
+ + n[i * 4 + 2] * m2
108
-
91
+ + n[i * 4 + 3] * m3;
109
- if (is_snan(a_cls) || is_snan(b_cls)) {
92
+ } while (++i < segend);
110
- float_raise(float_flag_invalid, status);
93
+ segend = i + 4;
111
- }
94
+ } while (i < opr_sz_4);
112
-
95
+
113
- if (status->default_nan_mode) {
96
+ clear_tail(d, opr_sz, simd_maxsz(desc));
114
- return floatx80_default_nan(status);
97
+}
115
- }
98
+
116
-
99
+void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
117
- if (a.low < b.low) {
100
+{
118
- aIsLargerSignificand = 0;
101
+ intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
119
- } else if (b.low < a.low) {
102
+ intptr_t index = simd_data(desc);
120
- aIsLargerSignificand = 1;
103
+ uint32_t *d = vd;
121
- } else {
104
+ uint8_t *n = vn;
122
- aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
105
+ uint8_t *m_indexed = (uint8_t *)vm + index * 4;
123
- }
106
+
124
-
107
+ /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
125
- if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
108
+ * Otherwise opr_sz is a multiple of 16.
126
- if (is_snan(b_cls)) {
109
+ */
127
- return floatx80_silence_nan(b, status);
110
+ segend = MIN(4, opr_sz_4);
128
- }
111
+ i = 0;
129
- return b;
112
+ do {
130
- } else {
113
+ uint8_t m0 = m_indexed[i * 4 + 0];
131
- if (is_snan(a_cls)) {
114
+ uint8_t m1 = m_indexed[i * 4 + 1];
132
- return floatx80_silence_nan(a, status);
115
+ uint8_t m2 = m_indexed[i * 4 + 2];
133
- }
116
+ uint8_t m3 = m_indexed[i * 4 + 3];
134
- return a;
117
+
135
- }
118
+ do {
136
-}
119
+ d[i] += n[i * 4 + 0] * m0
137
-
120
+ + n[i * 4 + 1] * m1
138
/*----------------------------------------------------------------------------
121
+ + n[i * 4 + 2] * m2
139
| Returns 1 if the quadruple-precision floating-point value `a' is a quiet
122
+ + n[i * 4 + 3] * m3;
140
| NaN; otherwise returns 0.
123
+ } while (++i < segend);
124
+ segend = i + 4;
125
+ } while (i < opr_sz_4);
126
+
127
+ clear_tail(d, opr_sz, simd_maxsz(desc));
128
+}
129
+
130
+void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc)
131
+{
132
+ intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8;
133
+ intptr_t index = simd_data(desc);
134
+ uint64_t *d = vd;
135
+ int16_t *n = vn;
136
+ int16_t *m_indexed = (int16_t *)vm + index * 4;
137
+
138
+ /* This is supported by SVE only, so opr_sz is always a multiple of 16.
139
+ * Process the entire segment all at once, writing back the results
140
+ * only after we've consumed all of the inputs.
141
+ */
142
+ for (i = 0; i < opr_sz_8 ; i += 2) {
143
+ uint64_t d0, d1;
144
+
145
+ d0 = n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0];
146
+ d0 += n[i * 4 + 1] * (int64_t)m_indexed[i * 4 + 1];
147
+ d0 += n[i * 4 + 2] * (int64_t)m_indexed[i * 4 + 2];
148
+ d0 += n[i * 4 + 3] * (int64_t)m_indexed[i * 4 + 3];
149
+ d1 = n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0];
150
+ d1 += n[i * 4 + 5] * (int64_t)m_indexed[i * 4 + 1];
151
+ d1 += n[i * 4 + 6] * (int64_t)m_indexed[i * 4 + 2];
152
+ d1 += n[i * 4 + 7] * (int64_t)m_indexed[i * 4 + 3];
153
+
154
+ d[i + 0] += d0;
155
+ d[i + 1] += d1;
156
+ }
157
+
158
+ clear_tail(d, opr_sz, simd_maxsz(desc));
159
+}
160
+
161
+void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc)
162
+{
163
+ intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8;
164
+ intptr_t index = simd_data(desc);
165
+ uint64_t *d = vd;
166
+ uint16_t *n = vn;
167
+ uint16_t *m_indexed = (uint16_t *)vm + index * 4;
168
+
169
+ /* This is supported by SVE only, so opr_sz is always a multiple of 16.
170
+ * Process the entire segment all at once, writing back the results
171
+ * only after we've consumed all of the inputs.
172
+ */
173
+ for (i = 0; i < opr_sz_8 ; i += 2) {
174
+ uint64_t d0, d1;
175
+
176
+ d0 = n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0];
177
+ d0 += n[i * 4 + 1] * (uint64_t)m_indexed[i * 4 + 1];
178
+ d0 += n[i * 4 + 2] * (uint64_t)m_indexed[i * 4 + 2];
179
+ d0 += n[i * 4 + 3] * (uint64_t)m_indexed[i * 4 + 3];
180
+ d1 = n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0];
181
+ d1 += n[i * 4 + 5] * (uint64_t)m_indexed[i * 4 + 1];
182
+ d1 += n[i * 4 + 6] * (uint64_t)m_indexed[i * 4 + 2];
183
+ d1 += n[i * 4 + 7] * (uint64_t)m_indexed[i * 4 + 3];
184
+
185
+ d[i + 0] += d0;
186
+ d[i + 1] += d1;
187
+ }
188
+
189
+ clear_tail(d, opr_sz, simd_maxsz(desc));
190
+}
191
+
192
void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
193
void *vfpst, uint32_t desc)
194
{
195
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
196
index XXXXXXX..XXXXXXX 100644
197
--- a/target/arm/sve.decode
198
+++ b/target/arm/sve.decode
199
@@ -XXX,XX +XXX,XX @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
200
# SVE integer dot product (unpredicated)
201
DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx
202
203
+# SVE integer dot product (indexed)
204
+DOT_zzx 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \
205
+ sz=0 ra=%reg_movprfx
206
+DOT_zzx 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \
207
+ sz=1 ra=%reg_movprfx
208
+
209
# SVE floating-point complex add (predicated)
210
FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
211
rn=%reg_movprfx
212
--
141
--
213
2.17.1
142
2.34.1
214
215
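
The indexed byte dot product in the patch above reuses one group of four multiplier bytes, chosen by the index, for every 32-bit lane of a 128-bit segment. A simplified sketch of that access pattern follows; it assumes whole 16-byte segments only (the helper's opr_sz == 8 special case and the tail clearing are omitted), and `sdot_idx_b_sketch` and its parameters are hypothetical names, not QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * d: nsegs * 4 accumulators; n, m: nsegs * 16 signed bytes.
 * index (0..3) picks which 4-byte group of each m segment is used.
 */
static void sdot_idx_b_sketch(uint32_t *d, const int8_t *n, const int8_t *m,
                              size_t nsegs, unsigned index)
{
    for (size_t seg = 0; seg < nsegs; seg++) {
        /* The same multiplier group serves all four lanes of the segment. */
        const int8_t *mg = m + seg * 16 + index * 4;

        for (size_t lane = 0; lane < 4; lane++) {
            size_t e = seg * 4 + lane;
            int32_t sum = 0;

            for (size_t k = 0; k < 4; k++) {
                sum += n[e * 4 + k] * mg[k];
            }
            /* Accumulate with unsigned wraparound, as the helper's uint32_t d does. */
            d[e] += (uint32_t)sum;
        }
    }
}
```

Reading the four multiplier bytes before writing any lane of the segment matters when the destination register overlaps the multiplier register, which is why the helper loads m0..m3 up front.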
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Unpacking and repacking the parts may be slightly more work
4
than we did before, but we get to reuse more code. For a
5
code path handling exceptional values, this is an improvement.
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241203203949.483774-8-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
[PMM: fixed typo]
6
Message-id: 20180627043328.11531-6-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
11
---
9
target/arm/helper-sve.h | 30 +++++++++++++
12
fpu/softfloat.c | 43 +++++--------------------------------------
10
target/arm/sve_helper.c | 38 ++++++++++++++++
13
1 file changed, 5 insertions(+), 38 deletions(-)
11
target/arm/translate-sve.c | 90 ++++++++++++++++++++++++++++++++++++++
12
target/arm/sve.decode | 22 ++++++++++
13
4 files changed, 180 insertions(+)
14
14
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
16
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-sve.h
17
--- a/fpu/softfloat.c
18
+++ b/target/arm/helper-sve.h
18
+++ b/fpu/softfloat.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
19
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
20
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
20
21
void, ptr, ptr, ptr, ptr, i32)
21
floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
22
22
{
23
+DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
23
- bool aIsLargerSignificand;
24
+ void, ptr, ptr, ptr, ptr, i32)
24
- FloatClass a_cls, b_cls;
25
+DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
25
+ FloatParts128 pa, pb, *pr;
26
+ void, ptr, ptr, ptr, ptr, i32)
26
27
+DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG,
27
- /* This is not complete, but is good enough for pickNaN. */
28
+ void, ptr, ptr, ptr, ptr, i32)
28
- a_cls = (!floatx80_is_any_nan(a)
29
+DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG,
29
- ? float_class_normal
30
+ void, ptr, ptr, ptr, ptr, i32)
30
- : floatx80_is_signaling_nan(a, status)
31
+DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG,
31
- ? float_class_snan
32
+ void, ptr, ptr, ptr, ptr, i32)
32
- : float_class_qnan);
33
+DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG,
33
- b_cls = (!floatx80_is_any_nan(b)
34
+ void, ptr, ptr, ptr, ptr, i32)
34
- ? float_class_normal
35
+DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG,
35
- : floatx80_is_signaling_nan(b, status)
36
+ void, ptr, ptr, ptr, ptr, i32)
36
- ? float_class_snan
37
+
37
- : float_class_qnan);
38
+DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG,
38
-
39
+ void, ptr, ptr, ptr, ptr, i32)
39
- if (is_snan(a_cls) || is_snan(b_cls)) {
40
+DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG,
40
- float_raise(float_flag_invalid, status);
41
+ void, ptr, ptr, ptr, ptr, i32)
41
- }
42
+DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG,
42
-
43
+ void, ptr, ptr, ptr, ptr, i32)
43
- if (status->default_nan_mode) {
44
+DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG,
44
+ if (!floatx80_unpack_canonical(&pa, a, status) ||
45
+ void, ptr, ptr, ptr, ptr, i32)
45
+ !floatx80_unpack_canonical(&pb, b, status)) {
46
+DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG,
46
return floatx80_default_nan(status);
47
+ void, ptr, ptr, ptr, ptr, i32)
47
}
48
+DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
48
49
+ void, ptr, ptr, ptr, ptr, i32)
49
- if (a.low < b.low) {
50
+DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
50
- aIsLargerSignificand = 0;
51
+ void, ptr, ptr, ptr, ptr, i32)
51
- } else if (b.low < a.low) {
52
+
52
- aIsLargerSignificand = 1;
53
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
53
- } else {
54
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
54
- aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
55
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
55
- }
56
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
56
-
57
index XXXXXXX..XXXXXXX 100644
57
- if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
58
--- a/target/arm/sve_helper.c
58
- if (is_snan(b_cls)) {
59
+++ b/target/arm/sve_helper.c
59
- return floatx80_silence_nan(b, status);
60
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
60
- }
61
return predtest_ones(d, oprsz, esz_mask);
61
- return b;
62
- } else {
63
- if (is_snan(a_cls)) {
64
- return floatx80_silence_nan(a, status);
65
- }
66
- return a;
67
- }
68
+ pr = parts_pick_nan(&pa, &pb, status);
69
+ return floatx80_round_pack_canonical(pr, status);
62
}
70
}
63
71
64
+/* Fully general two-operand expander, controlled by a predicate,
72
/*----------------------------------------------------------------------------
65
+ * with the extra float_status parameter.
66
+ */
67
+#define DO_ZPZ_FP(NAME, TYPE, H, OP) \
68
+void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
69
+{ \
70
+ intptr_t i = simd_oprsz(desc); \
71
+ uint64_t *g = vg; \
72
+ do { \
73
+ uint64_t pg = g[(i - 1) >> 6]; \
74
+ do { \
75
+ i -= sizeof(TYPE); \
76
+ if (likely((pg >> (i & 63)) & 1)) { \
77
+ TYPE nn = *(TYPE *)(vn + H(i)); \
78
+ *(TYPE *)(vd + H(i)) = OP(nn, status); \
79
+ } \
80
+ } while (i & 63); \
81
+ } while (i != 0); \
82
+}
83
+
84
+DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
85
+DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
86
+DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
87
+DO_ZPZ_FP(sve_scvt_sd, uint64_t, , int32_to_float64)
88
+DO_ZPZ_FP(sve_scvt_dh, uint64_t, , int64_to_float16)
89
+DO_ZPZ_FP(sve_scvt_ds, uint64_t, , int64_to_float32)
90
+DO_ZPZ_FP(sve_scvt_dd, uint64_t, , int64_to_float64)
91
+
92
+DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16)
93
+DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16)
94
+DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32)
95
+DO_ZPZ_FP(sve_ucvt_sd, uint64_t, , uint32_to_float64)
96
+DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16)
97
+DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32)
98
+DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64)
99
+
100
+#undef DO_ZPZ_FP
101
+
102
/*
103
* Load contiguous data, protected by a governing predicate.
104
*/
105
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
106
index XXXXXXX..XXXXXXX 100644
107
--- a/target/arm/translate-sve.c
108
+++ b/target/arm/translate-sve.c
109
@@ -XXX,XX +XXX,XX @@ DO_FP3(FRSQRTS, rsqrts)
110
111
#undef DO_FP3
112
113
+
114
+/*
115
+ *** SVE Floating Point Unary Operations Predicated Group
116
+ */
117
+
118
+static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg,
119
+ bool is_fp16, gen_helper_gvec_3_ptr *fn)
120
+{
121
+ if (sve_access_check(s)) {
122
+ unsigned vsz = vec_full_reg_size(s);
123
+ TCGv_ptr status = get_fpstatus_ptr(is_fp16);
124
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
125
+ vec_full_reg_offset(s, rn),
126
+ pred_full_reg_offset(s, pg),
127
+ status, vsz, vsz, 0, fn);
128
+ tcg_temp_free_ptr(status);
129
+ }
130
+ return true;
131
+}
132
+
133
+static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
134
+{
135
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
136
+}
137
+
138
+static bool trans_SCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
139
+{
140
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_sh);
141
+}
142
+
143
+static bool trans_SCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
144
+{
145
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_dh);
146
+}
147
+
148
+static bool trans_SCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
149
+{
150
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ss);
151
+}
152
+
153
+static bool trans_SCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
154
+{
155
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ds);
156
+}
157
+
158
+static bool trans_SCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
159
+{
160
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_sd);
161
+}
162
+
163
+static bool trans_SCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
164
+{
165
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_dd);
166
+}
167
+
168
+static bool trans_UCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
169
+{
170
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_hh);
171
+}
172
+
173
+static bool trans_UCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
174
+{
175
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_sh);
176
+}
177
+
178
+static bool trans_UCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
179
+{
180
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_dh);
181
+}
182
+
183
+static bool trans_UCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
184
+{
185
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ss);
186
+}
187
+
188
+static bool trans_UCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
189
+{
190
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ds);
191
+}
192
+
193
+static bool trans_UCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
194
+{
195
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_sd);
196
+}
197
+
198
+static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
199
+{
200
+ return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_dd);
201
+}
202
+
203
/*
204
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
205
*/
206
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
207
index XXXXXXX..XXXXXXX 100644
208
--- a/target/arm/sve.decode
209
+++ b/target/arm/sve.decode
210
@@ -XXX,XX +XXX,XX @@
211
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
212
@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz
213
214
+# One register operand, with governing predicate, no vector element size
215
+@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0
216
+
217
# Two register operands with a 6-bit signed immediate.
218
@rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri
219
220
@@ -XXX,XX +XXX,XX @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm
221
FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm
222
FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm
223
224
+### SVE FP Unary Operations Predicated Group
225
+
226
+# SVE integer convert to floating-point
227
+SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0
228
+SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
229
+SCVTF_dh 01100101 01 010 11 0 101 ... ..... ..... @rd_pg_rn_e0
230
+SCVTF_ss 01100101 10 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
231
+SCVTF_sd 01100101 11 010 00 0 101 ... ..... ..... @rd_pg_rn_e0
232
+SCVTF_ds 01100101 11 010 10 0 101 ... ..... ..... @rd_pg_rn_e0
233
+SCVTF_dd 01100101 11 010 11 0 101 ... ..... ..... @rd_pg_rn_e0
234
+
235
+UCVTF_hh 01100101 01 010 01 1 101 ... ..... ..... @rd_pg_rn_e0
236
+UCVTF_sh 01100101 01 010 10 1 101 ... ..... ..... @rd_pg_rn_e0
237
+UCVTF_dh 01100101 01 010 11 1 101 ... ..... ..... @rd_pg_rn_e0
238
+UCVTF_ss 01100101 10 010 10 1 101 ... ..... ..... @rd_pg_rn_e0
239
+UCVTF_sd 01100101 11 010 00 1 101 ... ..... ..... @rd_pg_rn_e0
240
+UCVTF_ds 01100101 11 010 10 1 101 ... ..... ..... @rd_pg_rn_e0
241
+UCVTF_dd 01100101 11 010 11 1 101 ... ..... ..... @rd_pg_rn_e0
242
+
243
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
244
245
# SVE load predicate register
246
--
73
--
247
2.17.1
74
2.34.1
248
249
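The DO_ZPZ_FP expander in the 2018 sve_helper.c hunk above walks the vector from the top down, consulting one 64-bit predicate word per 64 bytes of vector. Below is a minimal stand-alone sketch of that loop shape, not the QEMU code itself. It assumes a little-endian host (so the H() address-adjustment macros are left out), an operand size that is a non-zero multiple of 4 bytes, and one predicate bit per vector byte, where only the bit for an element's lowest byte is consulted. The names zpz_apply32 and add_one are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element operation; stands in for e.g. int32_to_float32. */
typedef uint32_t unary_op32(uint32_t);

uint32_t add_one(uint32_t x)
{
    return x + 1;
}

/*
 * Apply OP to every active 32-bit element of VN, storing into VD.
 * OPRSZ is the vector size in bytes.  Inactive elements of VD are
 * left untouched, as in the helper in the patch.
 */
void zpz_apply32(void *vd, const void *vn, const uint64_t *pred,
                 size_t oprsz, unary_op32 *op)
{
    size_t i = oprsz;

    do {
        /* One predicate word covers 64 bytes of vector data. */
        uint64_t pg = pred[(i - 1) >> 6];
        do {
            i -= sizeof(uint32_t);
            if ((pg >> (i & 63)) & 1) {
                uint32_t nn;
                memcpy(&nn, (const char *)vn + i, sizeof(nn));
                nn = op(nn);
                memcpy((char *)vd + i, &nn, sizeof(nn));
            }
        } while (i & 63);
    } while (i != 0);
}
```

With a 16-byte vector and predicate bits 0 and 8 set, only elements 0 and 2 are processed; the other destination elements keep their previous contents.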
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Inline pickNaN into its only caller. This makes one assert
4
redundant with the immediately preceding IF.
5
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-29-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-9-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
target/arm/helper-sve.h | 7 +++
11
fpu/softfloat-parts.c.inc | 82 +++++++++++++++++++++++++----
9
target/arm/sve_helper.c | 100 +++++++++++++++++++++++++++++++++++++
12
fpu/softfloat-specialize.c.inc | 96 ----------------------------------
10
target/arm/translate-sve.c | 24 +++++++++
13
2 files changed, 73 insertions(+), 105 deletions(-)
11
target/arm/sve.decode | 4 ++
14
12
4 files changed, 135 insertions(+)
15
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
17
--- a/fpu/softfloat-parts.c.inc
17
+++ b/target/arm/helper-sve.h
18
+++ b/fpu/softfloat-parts.c.inc
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
19
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
19
DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
20
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
20
void, ptr, ptr, ptr, ptr, ptr, i32)
21
float_status *s)
21
22
{
22
+DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG,
23
+ int cmp, which;
23
+ void, ptr, ptr, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, ptr, i32)
28
+
24
+
29
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
25
if (is_snan(a->cls) || is_snan(b->cls)) {
30
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
26
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
31
DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
27
}
32
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
28
29
if (s->default_nan_mode) {
30
parts_default_nan(a, s);
31
- } else {
32
- int cmp = frac_cmp(a, b);
33
- if (cmp == 0) {
34
- cmp = a->sign < b->sign;
35
- }
36
+ return a;
37
+ }
38
39
- if (pickNaN(a->cls, b->cls, cmp > 0, s)) {
40
- a = b;
41
- }
42
+ cmp = frac_cmp(a, b);
43
+ if (cmp == 0) {
44
+ cmp = a->sign < b->sign;
45
+ }
46
+
47
+ switch (s->float_2nan_prop_rule) {
48
+ case float_2nan_prop_s_ab:
49
if (is_snan(a->cls)) {
50
- parts_silence_nan(a, s);
51
+ which = 0;
52
+ } else if (is_snan(b->cls)) {
53
+ which = 1;
54
+ } else if (is_qnan(a->cls)) {
55
+ which = 0;
56
+ } else {
57
+ which = 1;
58
}
59
+ break;
60
+ case float_2nan_prop_s_ba:
61
+ if (is_snan(b->cls)) {
62
+ which = 1;
63
+ } else if (is_snan(a->cls)) {
64
+ which = 0;
65
+ } else if (is_qnan(b->cls)) {
66
+ which = 1;
67
+ } else {
68
+ which = 0;
69
+ }
70
+ break;
71
+ case float_2nan_prop_ab:
72
+ which = is_nan(a->cls) ? 0 : 1;
73
+ break;
74
+ case float_2nan_prop_ba:
75
+ which = is_nan(b->cls) ? 1 : 0;
76
+ break;
77
+ case float_2nan_prop_x87:
78
+ /*
79
+ * This implements x87 NaN propagation rules:
80
+ * SNaN + QNaN => return the QNaN
81
+ * two SNaNs => return the one with the larger significand, silenced
82
+ * two QNaNs => return the one with the larger significand
83
+ * SNaN and a non-NaN => return the SNaN, silenced
84
+ * QNaN and a non-NaN => return the QNaN
85
+ *
86
+ * If we get down to comparing significands and they are the same,
87
+ * return the NaN with the positive sign bit (if any).
88
+ */
89
+ if (is_snan(a->cls)) {
90
+ if (is_snan(b->cls)) {
91
+ which = cmp > 0 ? 0 : 1;
92
+ } else {
93
+ which = is_qnan(b->cls) ? 1 : 0;
94
+ }
95
+ } else if (is_qnan(a->cls)) {
96
+ if (is_snan(b->cls) || !is_qnan(b->cls)) {
97
+ which = 0;
98
+ } else {
99
+ which = cmp > 0 ? 0 : 1;
100
+ }
101
+ } else {
102
+ which = 1;
103
+ }
104
+ break;
105
+ default:
106
+ g_assert_not_reached();
107
+ }
108
+
109
+ if (which) {
110
+ a = b;
111
+ }
112
+ if (is_snan(a->cls)) {
113
+ parts_silence_nan(a, s);
114
}
115
return a;
116
}
117
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
index XXXXXXX..XXXXXXX 100644
118
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/sve_helper.c
119
--- a/fpu/softfloat-specialize.c.inc
35
+++ b/target/arm/sve_helper.c
120
+++ b/fpu/softfloat-specialize.c.inc
36
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
121
@@ -XXX,XX +XXX,XX @@ bool float32_is_signaling_nan(float32 a_, float_status *status)
37
}
122
}
38
}
123
}
39
124
40
+/*
125
-/*----------------------------------------------------------------------------
41
+ * FP Complex Add
126
-| Select which NaN to propagate for a two-input operation.
42
+ */
127
-| IEEE754 doesn't specify all the details of this, so the
43
+
128
-| algorithm is target-specific.
44
+void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
129
-| The routine is passed various bits of information about the
45
+ void *vs, uint32_t desc)
130
-| two NaNs and should return 0 to select NaN a and 1 for NaN b.
46
+{
131
-| Note that signalling NaNs are always squashed to quiet NaNs
47
+ intptr_t j, i = simd_oprsz(desc);
132
-| by the caller, by calling floatXX_silence_nan() before
48
+ uint64_t *g = vg;
133
-| returning them.
49
+ float16 neg_imag = float16_set_sign(0, simd_data(desc));
134
-|
50
+ float16 neg_real = float16_chs(neg_imag);
135
-| aIsLargerSignificand is only valid if both a and b are NaNs
51
+
136
-| of some kind, and is true if a has the larger significand,
52
+ do {
137
-| or if both a and b have the same significand but a is
53
+ uint64_t pg = g[(i - 1) >> 6];
138
-| positive but b is negative. It is only needed for the x87
54
+ do {
139
-| tie-break rule.
55
+ float16 e0, e1, e2, e3;
140
-*----------------------------------------------------------------------------*/
56
+
141
-
57
+ /* I holds the real index; J holds the imag index. */
142
-static int pickNaN(FloatClass a_cls, FloatClass b_cls,
58
+ j = i - sizeof(float16);
143
- bool aIsLargerSignificand, float_status *status)
59
+ i -= 2 * sizeof(float16);
144
-{
60
+
145
- /*
61
+ e0 = *(float16 *)(vn + H1_2(i));
146
- * We guarantee not to require the target to tell us how to
62
+ e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real;
147
- * pick a NaN if we're always returning the default NaN.
63
+ e2 = *(float16 *)(vn + H1_2(j));
148
- * But if we're not in default-NaN mode then the target must
64
+ e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag;
149
- * specify via set_float_2nan_prop_rule().
65
+
150
- */
66
+ if (likely((pg >> (i & 63)) & 1)) {
151
- assert(!status->default_nan_mode);
67
+ *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs);
152
-
68
+ }
153
- switch (status->float_2nan_prop_rule) {
69
+ if (likely((pg >> (j & 63)) & 1)) {
154
- case float_2nan_prop_s_ab:
70
+ *(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs);
155
- if (is_snan(a_cls)) {
71
+ }
156
- return 0;
72
+ } while (i & 63);
157
- } else if (is_snan(b_cls)) {
73
+ } while (i != 0);
158
- return 1;
74
+}
159
- } else if (is_qnan(a_cls)) {
75
+
160
- return 0;
76
+void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
161
- } else {
77
+ void *vs, uint32_t desc)
162
- return 1;
78
+{
163
- }
79
+ intptr_t j, i = simd_oprsz(desc);
164
- break;
80
+ uint64_t *g = vg;
165
- case float_2nan_prop_s_ba:
81
+ float32 neg_imag = float32_set_sign(0, simd_data(desc));
166
- if (is_snan(b_cls)) {
82
+ float32 neg_real = float32_chs(neg_imag);
167
- return 1;
83
+
168
- } else if (is_snan(a_cls)) {
84
+ do {
169
- return 0;
85
+ uint64_t pg = g[(i - 1) >> 6];
170
- } else if (is_qnan(b_cls)) {
86
+ do {
171
- return 1;
87
+ float32 e0, e1, e2, e3;
172
- } else {
88
+
173
- return 0;
89
+ /* I holds the real index; J holds the imag index. */
174
- }
90
+ j = i - sizeof(float32);
175
- break;
91
+ i -= 2 * sizeof(float32);
176
- case float_2nan_prop_ab:
92
+
177
- if (is_nan(a_cls)) {
93
+ e0 = *(float32 *)(vn + H1_2(i));
178
- return 0;
94
+ e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real;
179
- } else {
95
+ e2 = *(float32 *)(vn + H1_2(j));
180
- return 1;
96
+ e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag;
181
- }
97
+
182
- break;
98
+ if (likely((pg >> (i & 63)) & 1)) {
183
- case float_2nan_prop_ba:
99
+ *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs);
184
- if (is_nan(b_cls)) {
100
+ }
185
- return 1;
101
+ if (likely((pg >> (j & 63)) & 1)) {
186
- } else {
102
+ *(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs);
187
- return 0;
103
+ }
188
- }
104
+ } while (i & 63);
189
- break;
105
+ } while (i != 0);
190
- case float_2nan_prop_x87:
106
+}
191
- /*
107
+
192
- * This implements x87 NaN propagation rules:
108
+void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
193
- * SNaN + QNaN => return the QNaN
109
+ void *vs, uint32_t desc)
194
- * two SNaNs => return the one with the larger significand, silenced
110
+{
195
- * two QNaNs => return the one with the larger significand
111
+ intptr_t j, i = simd_oprsz(desc);
196
- * SNaN and a non-NaN => return the SNaN, silenced
112
+ uint64_t *g = vg;
197
- * QNaN and a non-NaN => return the QNaN
113
+ float64 neg_imag = float64_set_sign(0, simd_data(desc));
198
- *
114
+ float64 neg_real = float64_chs(neg_imag);
199
- * If we get down to comparing significands and they are the same,
115
+
200
- * return the NaN with the positive sign bit (if any).
116
+ do {
201
- */
117
+ uint64_t pg = g[(i - 1) >> 6];
202
- if (is_snan(a_cls)) {
118
+ do {
203
- if (is_snan(b_cls)) {
119
+ float64 e0, e1, e2, e3;
204
- return aIsLargerSignificand ? 0 : 1;
120
+
205
- }
121
+ /* I holds the real index; J holds the imag index. */
206
- return is_qnan(b_cls) ? 1 : 0;
122
+ j = i - sizeof(float64);
207
- } else if (is_qnan(a_cls)) {
123
+ i -= 2 * sizeof(float64);
208
- if (is_snan(b_cls) || !is_qnan(b_cls)) {
124
+
209
- return 0;
125
+ e0 = *(float64 *)(vn + H1_2(i));
210
- } else {
126
+ e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real;
211
- return aIsLargerSignificand ? 0 : 1;
127
+ e2 = *(float64 *)(vn + H1_2(j));
212
- }
128
+ e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag;
213
- } else {
129
+
214
- return 1;
130
+ if (likely((pg >> (i & 63)) & 1)) {
215
- }
131
+ *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs);
216
- default:
132
+ }
217
- g_assert_not_reached();
133
+ if (likely((pg >> (j & 63)) & 1)) {
218
- }
134
+ *(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs);
219
-}
135
+ }
220
-
136
+ } while (i & 63);
221
/*----------------------------------------------------------------------------
137
+ } while (i != 0);
222
| Returns 1 if the double-precision floating-point value `a' is a quiet
138
+}
223
| NaN; otherwise returns 0.
139
+
140
/*
141
* Load contiguous data, protected by a governing predicate.
142
*/
143
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/target/arm/translate-sve.c
146
+++ b/target/arm/translate-sve.c
147
@@ -XXX,XX +XXX,XX @@ DO_FPCMP(FACGT, facgt)
148
149
#undef DO_FPCMP
150
151
+static bool trans_FCADD(DisasContext *s, arg_FCADD *a, uint32_t insn)
152
+{
153
+ static gen_helper_gvec_4_ptr * const fns[3] = {
154
+ gen_helper_sve_fcadd_h,
155
+ gen_helper_sve_fcadd_s,
156
+ gen_helper_sve_fcadd_d
157
+ };
158
+
159
+ if (a->esz == 0) {
160
+ return false;
161
+ }
162
+ if (sve_access_check(s)) {
163
+ unsigned vsz = vec_full_reg_size(s);
164
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
165
+ tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
166
+ vec_full_reg_offset(s, a->rn),
167
+ vec_full_reg_offset(s, a->rm),
168
+ pred_full_reg_offset(s, a->pg),
169
+ status, vsz, vsz, a->rot, fns[a->esz - 1]);
170
+ tcg_temp_free_ptr(status);
171
+ }
172
+ return true;
173
+}
174
+
175
typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
176
177
static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
178
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
179
index XXXXXXX..XXXXXXX 100644
180
--- a/target/arm/sve.decode
181
+++ b/target/arm/sve.decode
182
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
183
# SVE integer multiply immediate (unpredicated)
184
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
185
186
+# SVE floating-point complex add (predicated)
187
+FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
188
+ rn=%reg_movprfx
189
+
190
### SVE FP Multiply-Add Indexed Group
191
192
# SVE floating-point multiply-add (indexed)
193
--
224
--
194
2.17.1
225
2.34.1
195
226
196
227
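The x87 rules listed in the comment in the softfloat patch above can be condensed into a small selector. The following is a hedged, self-contained sketch rather than the softfloat implementation: the types x87_cls and x87_operand and the function pick_nan_x87 are hypothetical stand-ins for FloatClass, FloatPartsN and the float_2nan_prop_x87 case, and the fraction is modelled as a single 64-bit integer. As in the patch, the function returns 0 to select operand a and 1 to select operand b; silencing a selected SNaN is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification; softfloat uses FloatClass. */
typedef enum { X87_NOT_NAN, X87_QNAN, X87_SNAN } x87_cls;

typedef struct {
    x87_cls cls;
    bool sign;      /* true if the sign bit is set (negative) */
    uint64_t frac;  /* significand, modelled as one integer */
} x87_operand;

/* Return 0 to propagate A, 1 to propagate B. */
int pick_nan_x87(const x87_operand *a, const x87_operand *b)
{
    int cmp;

    if (a->cls == X87_SNAN) {
        if (b->cls != X87_SNAN) {
            /* SNaN + QNaN => the QNaN; SNaN + non-NaN => the SNaN. */
            return b->cls == X87_QNAN ? 1 : 0;
        }
    } else if (a->cls == X87_QNAN) {
        if (b->cls != X87_QNAN) {
            /* QNaN + SNaN or QNaN + non-NaN => the QNaN. */
            return 0;
        }
    } else {
        /* A is not a NaN, so B must be the NaN. */
        return 1;
    }

    /* Two NaNs of the same kind: larger significand wins. */
    cmp = (a->frac > b->frac) - (a->frac < b->frac);
    if (cmp == 0) {
        /* Equal significands: prefer the NaN with the positive sign. */
        cmp = a->sign < b->sign;
    }
    return cmp > 0 ? 0 : 1;
}
```

Only the last step needs the significand comparison, so it is computed after the early returns rather than up front.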
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Enhance the existing helpers to support SVE, which takes the
3
Remember if there was an SNaN, and use that to simplify
4
index from each 128-bit segment. The change has no effect
4
float_2nan_prop_s_{ab,ba} to only the snan component.
5
for AdvSIMD, since there is only one such segment.
5
Then, fall through to the corresponding
6
float_2nan_prop_{ab,ba} case to handle any remaining
7
nans, which must be quiet.
6
8
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
11
Message-id: 20241203203949.483774-10-richard.henderson@linaro.org
10
Message-id: 20180627043328.11531-32-richard.henderson@linaro.org
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
---
13
target/arm/translate-sve.c | 23 ++++++++++++++++++
14
fpu/softfloat-parts.c.inc | 32 ++++++++++++--------------------
14
target/arm/vec_helper.c | 50 +++++++++++++++++++++++---------------
15
1 file changed, 12 insertions(+), 20 deletions(-)
15
target/arm/sve.decode | 6 +++++
16
3 files changed, 59 insertions(+), 20 deletions(-)
17
16
18
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
19
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate-sve.c
19
--- a/fpu/softfloat-parts.c.inc
21
+++ b/target/arm/translate-sve.c
20
+++ b/fpu/softfloat-parts.c.inc
22
@@ -XXX,XX +XXX,XX @@ static bool trans_FCMLA_zpzzz(DisasContext *s,
21
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
23
return true;
22
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
24
}
23
float_status *s)
25
24
{
26
+static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn)
25
+ bool have_snan = false;
27
+{
26
int cmp, which;
28
+ static gen_helper_gvec_3_ptr * const fns[2] = {
27
29
+ gen_helper_gvec_fcmlah_idx,
28
if (is_snan(a->cls) || is_snan(b->cls)) {
30
+ gen_helper_gvec_fcmlas_idx,
29
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
31
+ };
30
+ have_snan = true;
32
+
31
}
33
+ tcg_debug_assert(a->esz == 1 || a->esz == 2);
32
34
+ tcg_debug_assert(a->rd == a->ra);
33
if (s->default_nan_mode) {
35
+ if (sve_access_check(s)) {
34
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
36
+ unsigned vsz = vec_full_reg_size(s);
35
37
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
36
switch (s->float_2nan_prop_rule) {
38
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
37
case float_2nan_prop_s_ab:
39
+ vec_full_reg_offset(s, a->rn),
38
- if (is_snan(a->cls)) {
40
+ vec_full_reg_offset(s, a->rm),
39
- which = 0;
41
+ status, vsz, vsz,
40
- } else if (is_snan(b->cls)) {
42
+ a->index * 4 + a->rot,
41
- which = 1;
43
+ fns[a->esz - 1]);
42
- } else if (is_qnan(a->cls)) {
44
+ tcg_temp_free_ptr(status);
43
- which = 0;
45
+ }
44
- } else {
46
+ return true;
45
- which = 1;
47
+}
46
+ if (have_snan) {
48
+
47
+ which = is_snan(a->cls) ? 0 : 1;
49
/*
48
+ break;
50
*** SVE Floating Point Unary Operations Predicated Group
49
}
51
*/
50
- break;
52
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
51
- case float_2nan_prop_s_ba:
53
index XXXXXXX..XXXXXXX 100644
52
- if (is_snan(b->cls)) {
54
--- a/target/arm/vec_helper.c
53
- which = 1;
55
+++ b/target/arm/vec_helper.c
54
- } else if (is_snan(a->cls)) {
56
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
55
- which = 0;
57
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
56
- } else if (is_qnan(b->cls)) {
58
intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
57
- which = 1;
59
uint32_t neg_real = flip ^ neg_imag;
58
- } else {
60
- uintptr_t i;
59
- which = 0;
61
- float16 e1 = m[H2(2 * index + flip)];
60
- }
62
- float16 e3 = m[H2(2 * index + 1 - flip)];
61
- break;
63
+ intptr_t elements = opr_sz / sizeof(float16);
62
+ /* fall through */
64
+ intptr_t eltspersegment = 16 / sizeof(float16);
63
case float_2nan_prop_ab:
65
+ intptr_t i, j;
64
which = is_nan(a->cls) ? 0 : 1;
66
65
break;
67
/* Shift boolean to the sign bit so we can xor to negate. */
66
+ case float_2nan_prop_s_ba:
68
neg_real <<= 15;
67
+ if (have_snan) {
69
neg_imag <<= 15;
68
+ which = is_snan(b->cls) ? 1 : 0;
70
- e1 ^= neg_real;
69
+ break;
71
- e3 ^= neg_imag;
72
73
- for (i = 0; i < opr_sz / 2; i += 2) {
74
- float16 e2 = n[H2(i + flip)];
75
- float16 e4 = e2;
76
+ for (i = 0; i < elements; i += eltspersegment) {
77
+ float16 mr = m[H2(i + 2 * index + 0)];
78
+ float16 mi = m[H2(i + 2 * index + 1)];
79
+ float16 e1 = neg_real ^ (flip ? mi : mr);
80
+ float16 e3 = neg_imag ^ (flip ? mr : mi);
81
82
- d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst);
83
- d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst);
84
+ for (j = i; j < i + eltspersegment; j += 2) {
85
+ float16 e2 = n[H2(j + flip)];
86
+ float16 e4 = e2;
87
+
88
+ d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst);
89
+ d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst);
90
+ }
70
+ }
91
}
71
+ /* fall through */
92
clear_tail(d, opr_sz, simd_maxsz(desc));
72
case float_2nan_prop_ba:
93
}
73
which = is_nan(b->cls) ? 1 : 0;
94
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
74
break;
95
uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
96
intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
97
uint32_t neg_real = flip ^ neg_imag;
98
- uintptr_t i;
99
- float32 e1 = m[H4(2 * index + flip)];
100
- float32 e3 = m[H4(2 * index + 1 - flip)];
101
+ intptr_t elements = opr_sz / sizeof(float32);
102
+ intptr_t eltspersegment = 16 / sizeof(float32);
103
+ intptr_t i, j;
104
105
/* Shift boolean to the sign bit so we can xor to negate. */
106
neg_real <<= 31;
107
neg_imag <<= 31;
108
- e1 ^= neg_real;
109
- e3 ^= neg_imag;
110
111
- for (i = 0; i < opr_sz / 4; i += 2) {
112
- float32 e2 = n[H4(i + flip)];
113
- float32 e4 = e2;
114
+ for (i = 0; i < elements; i += eltspersegment) {
115
+ float32 mr = m[H4(i + 2 * index + 0)];
116
+ float32 mi = m[H4(i + 2 * index + 1)];
117
+ float32 e1 = neg_real ^ (flip ? mi : mr);
118
+ float32 e3 = neg_imag ^ (flip ? mr : mi);
119
120
- d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst);
121
- d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst);
122
+ for (j = i; j < i + eltspersegment; j += 2) {
123
+ float32 e2 = n[H4(j + flip)];
124
+ float32 e4 = e2;
125
+
126
+ d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst);
127
+ d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst);
128
+ }
129
}
130
clear_tail(d, opr_sz, simd_maxsz(desc));
131
}
132
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
133
index XXXXXXX..XXXXXXX 100644
134
--- a/target/arm/sve.decode
135
+++ b/target/arm/sve.decode
136
@@ -XXX,XX +XXX,XX @@ FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
137
FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \
138
ra=%reg_movprfx
139
140
+# SVE floating-point complex multiply-add (indexed)
141
+FCMLA_zzxz 01100100 10 1 index:2 rm:3 0001 rot:2 rn:5 rd:5 \
142
+ ra=%reg_movprfx esz=1
143
+FCMLA_zzxz 01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \
144
+ ra=%reg_movprfx esz=2
145
+
146
### SVE FP Multiply-Add Indexed Group
147
148
# SVE floating-point multiply-add (indexed)
149
--
75
--
150
2.17.1
76
2.34.1
151
152
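The softfloat patch above records whether either input is an SNaN and then lets the float_2nan_prop_s_{ab,ba} cases fall through to the plain {ab,ba} cases. Here is a minimal sketch of that control flow, assuming simplified types: nan_cls, nan_rule and pick_nan_simple are hypothetical names, not softfloat identifiers, and the x87 rule is deliberately left out. The return value is 0 for operand a and 1 for operand b, and at least one operand is expected to be a NaN.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for FloatClass and the 2-NaN rule enum. */
typedef enum { NAN_CLS_NONE, NAN_CLS_QUIET, NAN_CLS_SIGNALING } nan_cls;
typedef enum { RULE_S_AB, RULE_S_BA, RULE_AB, RULE_BA } nan_rule;

/* Return 0 to propagate A, 1 to propagate B. */
int pick_nan_simple(nan_rule rule, nan_cls a, nan_cls b)
{
    bool have_snan = (a == NAN_CLS_SIGNALING) || (b == NAN_CLS_SIGNALING);

    switch (rule) {
    case RULE_S_AB:
        if (have_snan) {
            /* Prefer a signaling NaN, checking A first. */
            return a == NAN_CLS_SIGNALING ? 0 : 1;
        }
        /* No SNaN present, so any remaining NaN is quiet. */
        /* fall through */
    case RULE_AB:
        return a != NAN_CLS_NONE ? 0 : 1;
    case RULE_S_BA:
        if (have_snan) {
            /* Prefer a signaling NaN, checking B first. */
            return b == NAN_CLS_SIGNALING ? 1 : 0;
        }
        /* fall through */
    case RULE_BA:
        return b != NAN_CLS_NONE ? 1 : 0;
    }
    return 0; /* not reached for valid rules */
}
```

The fall-through is what keeps the two "s_" cases short: once the SNaN case is handled, the remaining decision is the same as for the rule without the SNaN preference.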
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
3
Move the fractional comparison to the end of the
4
float_2nan_prop_x87 case. This is not required for
5
any other 2nan propagation rule. Reorganize the
6
x87 case itself to break out of the switch when the
7
fractional comparison is not required.
2
8
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
11
Message-id: 20241203203949.483774-11-richard.henderson@linaro.org
6
Message-id: 20180627043328.11531-8-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
13
---
9
target/arm/helper-sve.h | 16 ++++
14
fpu/softfloat-parts.c.inc | 19 +++++++++----------
10
target/arm/sve_helper.c | 158 +++++++++++++++++++++++++++++++++++++
15
1 file changed, 9 insertions(+), 10 deletions(-)
11
target/arm/translate-sve.c | 49 ++++++++++++
12
target/arm/sve.decode | 18 +++++
13
4 files changed, 241 insertions(+)
14
16
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-sve.h
19
--- a/fpu/softfloat-parts.c.inc
18
+++ b/target/arm/helper-sve.h
20
+++ b/fpu/softfloat-parts.c.inc
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
21
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
20
DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
22
return a;
21
void, ptr, ptr, ptr, ptr, i32)
23
}
22
24
23
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
25
- cmp = frac_cmp(a, b);
24
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
26
- if (cmp == 0) {
25
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
27
- cmp = a->sign < b->sign;
26
+
28
- }
27
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
29
-
28
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
30
switch (s->float_2nan_prop_rule) {
29
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
31
case float_2nan_prop_s_ab:
30
+
32
if (have_snan) {
31
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
33
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
32
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
34
* return the NaN with the positive sign bit (if any).
33
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
35
*/
34
+
36
if (is_snan(a->cls)) {
35
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
37
- if (is_snan(b->cls)) {
36
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
38
- which = cmp > 0 ? 0 : 1;
37
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
39
- } else {
38
+
40
+ if (!is_snan(b->cls)) {
39
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
41
which = is_qnan(b->cls) ? 1 : 0;
40
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
42
+ break;
41
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
43
}
42
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
44
} else if (is_qnan(a->cls)) {
43
index XXXXXXX..XXXXXXX 100644
45
if (is_snan(b->cls) || !is_qnan(b->cls)) {
44
--- a/target/arm/sve_helper.c
46
which = 0;
45
+++ b/target/arm/sve_helper.c
47
- } else {
46
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64)
48
- which = cmp > 0 ? 0 : 1;
47
49
+ break;
48
#undef DO_ZPZ_FP
50
}
49
51
} else {
50
+/* 4-operand predicated multiply-add. This requires 7 operands to pass
52
which = 1;
51
+ * "properly", so we need to encode some of the registers into DESC.
53
+ break;
52
+ */
54
}
53
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32);
55
+ cmp = frac_cmp(a, b);
54
+
56
+ if (cmp == 0) {
55
+static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc,
57
+ cmp = a->sign < b->sign;
56
+ uint16_t neg1, uint16_t neg3)
58
+ }
57
+{
59
+ which = cmp > 0 ? 0 : 1;
58
+ intptr_t i = simd_oprsz(desc);
60
break;
59
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
61
default:
60
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
62
g_assert_not_reached();
61
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
62
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
63
+ void *vd = &env->vfp.zregs[rd];
64
+ void *vn = &env->vfp.zregs[rn];
65
+ void *vm = &env->vfp.zregs[rm];
66
+ void *va = &env->vfp.zregs[ra];
67
+ uint64_t *g = vg;
68
+
69
+ do {
70
+ uint64_t pg = g[(i - 1) >> 6];
71
+ do {
72
+ i -= 2;
73
+ if (likely((pg >> (i & 63)) & 1)) {
74
+ float16 e1, e2, e3, r;
75
+
76
+ e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1;
77
+ e2 = *(uint16_t *)(vm + H1_2(i));
78
+ e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3;
79
+ r = float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
80
+ *(uint16_t *)(vd + H1_2(i)) = r;
81
+ }
82
+ } while (i & 63);
83
+ } while (i != 0);
84
+}
85
+
86
+void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
87
+{
88
+ do_fmla_zpzzz_h(env, vg, desc, 0, 0);
89
+}
90
+
91
+void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
92
+{
93
+ do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0);
94
+}
95
+
96
+void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
97
+{
98
+ do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000);
99
+}
100
+
101
+void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
102
+{
103
+ do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000);
104
+}
105
+
106
+static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc,
107
+ uint32_t neg1, uint32_t neg3)
108
+{
109
+ intptr_t i = simd_oprsz(desc);
110
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
111
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
112
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
113
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
114
+ void *vd = &env->vfp.zregs[rd];
115
+ void *vn = &env->vfp.zregs[rn];
116
+ void *vm = &env->vfp.zregs[rm];
117
+ void *va = &env->vfp.zregs[ra];
118
+ uint64_t *g = vg;
119
+
120
+ do {
121
+ uint64_t pg = g[(i - 1) >> 6];
122
+ do {
123
+ i -= 4;
124
+ if (likely((pg >> (i & 63)) & 1)) {
125
+ float32 e1, e2, e3, r;
126
+
127
+ e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1;
128
+ e2 = *(uint32_t *)(vm + H1_4(i));
129
+ e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3;
130
+ r = float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
131
+ *(uint32_t *)(vd + H1_4(i)) = r;
132
+ }
133
+ } while (i & 63);
134
+ } while (i != 0);
135
+}
136
+
137
+void HELPER(sve_fmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
138
+{
139
+ do_fmla_zpzzz_s(env, vg, desc, 0, 0);
140
+}
141
+
142
+void HELPER(sve_fmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
143
+{
144
+ do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0);
145
+}
146
+
147
+void HELPER(sve_fnmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
148
+{
149
+ do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0x80000000);
150
+}
151
+
152
+void HELPER(sve_fnmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
153
+{
154
+ do_fmla_zpzzz_s(env, vg, desc, 0, 0x80000000);
155
+}
156
+
157
+static void do_fmla_zpzzz_d(CPUARMState *env, void *vg, uint32_t desc,
158
+ uint64_t neg1, uint64_t neg3)
159
+{
160
+ intptr_t i = simd_oprsz(desc);
161
+ unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
162
+ unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
163
+ unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
164
+ unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
165
+ void *vd = &env->vfp.zregs[rd];
166
+ void *vn = &env->vfp.zregs[rn];
167
+ void *vm = &env->vfp.zregs[rm];
168
+ void *va = &env->vfp.zregs[ra];
169
+ uint64_t *g = vg;
170
+
171
+ do {
172
+ uint64_t pg = g[(i - 1) >> 6];
173
+ do {
174
+ i -= 8;
175
+ if (likely((pg >> (i & 63)) & 1)) {
176
+ float64 e1, e2, e3, r;
177
+
178
+ e1 = *(uint64_t *)(vn + i) ^ neg1;
179
+ e2 = *(uint64_t *)(vm + i);
180
+ e3 = *(uint64_t *)(va + i) ^ neg3;
181
+ r = float64_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
182
+ *(uint64_t *)(vd + i) = r;
183
+ }
184
+ } while (i & 63);
185
+ } while (i != 0);
186
+}
187
+
188
+void HELPER(sve_fmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
189
+{
190
+ do_fmla_zpzzz_d(env, vg, desc, 0, 0);
191
+}
192
+
193
+void HELPER(sve_fmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
194
+{
195
+ do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, 0);
196
+}
197
+
198
+void HELPER(sve_fnmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
199
+{
200
+ do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, INT64_MIN);
201
+}
202
+
203
+void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
204
+{
205
+ do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN);
206
+}
207
+
208
/*
209
* Load contiguous data, protected by a governing predicate.
210
*/
211
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
212
index XXXXXXX..XXXXXXX 100644
213
--- a/target/arm/translate-sve.c
214
+++ b/target/arm/translate-sve.c
215
@@ -XXX,XX +XXX,XX @@ DO_FP3(FMULX, fmulx)
216
217
#undef DO_FP3
218
219
+typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
220
+
221
+static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
222
+{
223
+ if (fn == NULL) {
224
+ return false;
225
+ }
226
+ if (!sve_access_check(s)) {
227
+ return true;
228
+ }
229
+
230
+ unsigned vsz = vec_full_reg_size(s);
231
+ unsigned desc;
232
+ TCGv_i32 t_desc;
233
+ TCGv_ptr pg = tcg_temp_new_ptr();
234
+
235
+ /* We would need 7 operands to pass these arguments "properly".
236
+ * So we encode all the register numbers into the descriptor.
237
+ */
238
+ desc = deposit32(a->rd, 5, 5, a->rn);
239
+ desc = deposit32(desc, 10, 5, a->rm);
240
+ desc = deposit32(desc, 15, 5, a->ra);
241
+ desc = simd_desc(vsz, vsz, desc);
242
+
243
+ t_desc = tcg_const_i32(desc);
244
+ tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
245
+ fn(cpu_env, pg, t_desc);
246
+ tcg_temp_free_i32(t_desc);
247
+ tcg_temp_free_ptr(pg);
248
+ return true;
249
+}
250
+
251
+#define DO_FMLA(NAME, name) \
252
+static bool trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \
253
+{ \
254
+ static gen_helper_sve_fmla * const fns[4] = { \
255
+ NULL, gen_helper_sve_##name##_h, \
256
+ gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
257
+ }; \
258
+ return do_fmla(s, a, fns[a->esz]); \
259
+}
260
+
261
+DO_FMLA(FMLA_zpzzz, fmla_zpzzz)
262
+DO_FMLA(FMLS_zpzzz, fmls_zpzzz)
263
+DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz)
264
+DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz)
265
+
266
+#undef DO_FMLA
267
+
268
/*
269
*** SVE Floating Point Unary Operations Predicated Group
270
*/
271
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
272
index XXXXXXX..XXXXXXX 100644
273
--- a/target/arm/sve.decode
274
+++ b/target/arm/sve.decode
275
@@ -XXX,XX +XXX,XX @@
276
&rprrr_esz ra=%reg_movprfx
277
@rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \
278
&rprrr_esz rn=%reg_movprfx
279
+@rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \
280
+ &rprrr_esz rn=%reg_movprfx
281
282
# One register operand, with governing predicate, vector element size
283
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
284
@@ -XXX,XX +XXX,XX @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
285
FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR
286
FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm
287
288
+### SVE FP Multiply-Add Group
289
+
290
+# SVE floating-point multiply-accumulate writing addend
291
+FMLA_zpzzz 01100101 .. 1 ..... 000 ... ..... ..... @rda_pg_rn_rm
292
+FMLS_zpzzz 01100101 .. 1 ..... 001 ... ..... ..... @rda_pg_rn_rm
293
+FNMLA_zpzzz 01100101 .. 1 ..... 010 ... ..... ..... @rda_pg_rn_rm
294
+FNMLS_zpzzz 01100101 .. 1 ..... 011 ... ..... ..... @rda_pg_rn_rm
295
+
296
+# SVE floating-point multiply-accumulate writing multiplicand
297
+# Alter the operand extraction order and reuse the helpers from above.
298
+# FMAD, FMSB, FNMAD, FNMSB
299
+FMLA_zpzzz 01100101 .. 1 ..... 100 ... ..... ..... @rdn_pg_rm_ra
300
+FMLS_zpzzz 01100101 .. 1 ..... 101 ... ..... ..... @rdn_pg_rm_ra
301
+FNMLA_zpzzz 01100101 .. 1 ..... 110 ... ..... ..... @rdn_pg_rm_ra
302
+FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra
303
+
304
### SVE FP Unary Operations Predicated Group
305
306
# SVE integer convert to floating-point
307
--
63
--
308
2.17.1
64
2.34.1
309
310
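For readers following the multiply-add patch above: the translator packs four 5-bit register numbers into the descriptor's data field, the helper unpacks them, and the FMLS/FNMLA/FNMLS variants reuse one loop by XOR-ing a sign-bit mask into an operand. The following is only a minimal standalone sketch of those two ideas. The names demo_pack_regs, demo_field5 and demo_flip_sign16 are hypothetical stand-ins invented for illustration; they are not QEMU's deposit32()/extract32(), and the SIMD_DATA_SHIFT offset is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: pack four 5-bit register numbers into 20 bits,
 * in the same rd/rn/rm/ra order the patch uses. */
static uint32_t demo_pack_regs(unsigned rd, unsigned rn,
                               unsigned rm, unsigned ra)
{
    return (rd & 31u) | ((rn & 31u) << 5) |
           ((rm & 31u) << 10) | ((ra & 31u) << 15);
}

/* Hypothetical stand-in: extract the 5-bit field starting at bit 'pos'. */
static unsigned demo_field5(uint32_t data, unsigned pos)
{
    assert(pos <= 27);          /* the field must fit in 32 bits */
    return (data >> pos) & 31u;
}

/* Negate an IEEE half-precision bit pattern by toggling its sign bit.
 * A mask of 0 leaves the operand unchanged, which is how one loop can
 * serve all four multiply-add variants. */
static uint16_t demo_flip_sign16(uint16_t bits, uint16_t neg_mask)
{
    return (uint16_t)(bits ^ neg_mask);
}
```

Passing 0x8000 as the mask turns the half-precision pattern for 1.0 (0x3C00) into the pattern for -1.0 (0xBC00); passing 0 is a no-op.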
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Replace the "index" selecting between A and B with a result variable
4
of the proper type. This improves clarity within the function.
5
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180627043328.11531-16-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-12-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
target/arm/translate-sve.c | 85 ++++++++++++++++++++++++++------------
11
fpu/softfloat-parts.c.inc | 28 +++++++++++++---------------
9
target/arm/sve.decode | 11 +++++
12
1 file changed, 13 insertions(+), 15 deletions(-)
10
2 files changed, 70 insertions(+), 26 deletions(-)
11
13
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
13
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
16
--- a/fpu/softfloat-parts.c.inc
15
+++ b/target/arm/translate-sve.c
17
+++ b/fpu/softfloat-parts.c.inc
16
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
17
return true;
19
float_status *s)
18
}
19
20
+/* Indexed by [xs][msz]. */
21
+static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = {
22
+ { gen_helper_sve_stbs_zsu,
23
+ gen_helper_sve_sths_zsu,
24
+ gen_helper_sve_stss_zsu, },
25
+ { gen_helper_sve_stbs_zss,
26
+ gen_helper_sve_sths_zss,
27
+ gen_helper_sve_stss_zss, },
28
+};
29
+
30
+/* Note that we overload xs=2 to indicate 64-bit offset. */
31
+static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = {
32
+ { gen_helper_sve_stbd_zsu,
33
+ gen_helper_sve_sthd_zsu,
34
+ gen_helper_sve_stsd_zsu,
35
+ gen_helper_sve_stdd_zsu, },
36
+ { gen_helper_sve_stbd_zss,
37
+ gen_helper_sve_sthd_zss,
38
+ gen_helper_sve_stsd_zss,
39
+ gen_helper_sve_stdd_zss, },
40
+ { gen_helper_sve_stbd_zd,
41
+ gen_helper_sve_sthd_zd,
42
+ gen_helper_sve_stsd_zd,
43
+ gen_helper_sve_stdd_zd, },
44
+};
45
+
46
static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
47
{
20
{
48
- /* Indexed by [xs][msz]. */
21
bool have_snan = false;
49
- static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
22
- int cmp, which;
50
- { gen_helper_sve_stbs_zsu,
23
+ FloatPartsN *ret;
51
- gen_helper_sve_sths_zsu,
24
+ int cmp;
52
- gen_helper_sve_stss_zsu, },
25
53
- { gen_helper_sve_stbs_zss,
26
if (is_snan(a->cls) || is_snan(b->cls)) {
54
- gen_helper_sve_sths_zss,
27
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
55
- gen_helper_sve_stss_zss, },
28
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
56
- };
29
switch (s->float_2nan_prop_rule) {
57
- /* Note that we overload xs=2 to indicate 64-bit offset. */
30
case float_2nan_prop_s_ab:
58
- static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
31
if (have_snan) {
59
- { gen_helper_sve_stbd_zsu,
32
- which = is_snan(a->cls) ? 0 : 1;
60
- gen_helper_sve_sthd_zsu,
33
+ ret = is_snan(a->cls) ? a : b;
61
- gen_helper_sve_stsd_zsu,
34
break;
62
- gen_helper_sve_stdd_zsu, },
35
}
63
- { gen_helper_sve_stbd_zss,
36
/* fall through */
64
- gen_helper_sve_sthd_zss,
37
case float_2nan_prop_ab:
65
- gen_helper_sve_stsd_zss,
38
- which = is_nan(a->cls) ? 0 : 1;
66
- gen_helper_sve_stdd_zss, },
39
+ ret = is_nan(a->cls) ? a : b;
67
- { gen_helper_sve_stbd_zd,
68
- gen_helper_sve_sthd_zd,
69
- gen_helper_sve_stsd_zd,
70
- gen_helper_sve_stdd_zd, },
71
- };
72
gen_helper_gvec_mem_scatter *fn;
73
74
if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
75
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
76
}
77
switch (a->esz) {
78
case MO_32:
79
- fn = fn32[a->xs][a->msz];
80
+ fn = scatter_store_fn32[a->xs][a->msz];
81
break;
40
break;
82
case MO_64:
41
case float_2nan_prop_s_ba:
83
- fn = fn64[a->xs][a->msz];
42
if (have_snan) {
84
+ fn = scatter_store_fn64[a->xs][a->msz];
43
- which = is_snan(b->cls) ? 1 : 0;
44
+ ret = is_snan(b->cls) ? b : a;
45
break;
46
}
47
/* fall through */
48
case float_2nan_prop_ba:
49
- which = is_nan(b->cls) ? 1 : 0;
50
+ ret = is_nan(b->cls) ? b : a;
51
break;
52
case float_2nan_prop_x87:
53
/*
54
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
55
*/
56
if (is_snan(a->cls)) {
57
if (!is_snan(b->cls)) {
58
- which = is_qnan(b->cls) ? 1 : 0;
59
+ ret = is_qnan(b->cls) ? b : a;
60
break;
61
}
62
} else if (is_qnan(a->cls)) {
63
if (is_snan(b->cls) || !is_qnan(b->cls)) {
64
- which = 0;
65
+ ret = a;
66
break;
67
}
68
} else {
69
- which = 1;
70
+ ret = b;
71
break;
72
}
73
cmp = frac_cmp(a, b);
74
if (cmp == 0) {
75
cmp = a->sign < b->sign;
76
}
77
- which = cmp > 0 ? 0 : 1;
78
+ ret = cmp > 0 ? a : b;
85
break;
79
break;
86
default:
80
default:
87
g_assert_not_reached();
81
g_assert_not_reached();
88
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
82
}
89
return true;
83
84
- if (which) {
85
- a = b;
86
+ if (is_snan(ret->cls)) {
87
+ parts_silence_nan(ret, s);
88
}
89
- if (is_snan(a->cls)) {
90
- parts_silence_nan(a, s);
91
- }
92
- return a;
93
+ return ret;
90
}
94
}
91
95
92
+static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
96
static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
93
+{
94
+ gen_helper_gvec_mem_scatter *fn = NULL;
95
+ TCGv_i64 imm;
96
+
97
+ if (a->esz < a->msz) {
98
+ return false;
99
+ }
100
+ if (!sve_access_check(s)) {
101
+ return true;
102
+ }
103
+
104
+ switch (a->esz) {
105
+ case MO_32:
106
+ fn = scatter_store_fn32[0][a->msz];
107
+ break;
108
+ case MO_64:
109
+ fn = scatter_store_fn64[2][a->msz];
110
+ break;
111
+ }
112
+ assert(fn != NULL);
113
+
114
+ /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x])
115
+ * by loading the immediate into the scalar parameter.
116
+ */
117
+ imm = tcg_const_i64(a->imm << a->msz);
118
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
119
+ tcg_temp_free_i64(imm);
120
+ return true;
121
+}
122
+
123
/*
124
* Prefetches
125
*/
126
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
127
index XXXXXXX..XXXXXXX 100644
128
--- a/target/arm/sve.decode
129
+++ b/target/arm/sve.decode
130
@@ -XXX,XX +XXX,XX @@
131
&rprr_gather_load rd pg rn rm esz msz u ff xs scale
132
&rpri_gather_load rd pg rn imm esz msz u ff
133
&rprr_scatter_store rd pg rn rm esz msz xs scale
134
+&rpri_scatter_store rd pg rn imm esz msz
135
136
###########################################################################
137
# Named instruction formats. These are generally used to
138
@@ -XXX,XX +XXX,XX @@
139
&rprr_store nreg=0
140
@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
141
&rprr_scatter_store
142
+@rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \
143
+ &rpri_scatter_store
144
145
###########################################################################
146
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
147
@@ -XXX,XX +XXX,XX @@ ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \
148
ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \
149
@rprr_scatter_store xs=2 esz=3 scale=0
150
151
+# SVE 64-bit scatter store (vector plus immediate)
152
+ST1_zpiz 1110010 .. 10 ..... 101 ... ..... ..... \
153
+ @rpri_scatter_store esz=3
154
+
155
+# SVE 32-bit scatter store (vector plus immediate)
156
+ST1_zpiz 1110010 .. 11 ..... 101 ... ..... ..... \
157
+ @rpri_scatter_store esz=2
158
+
159
# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset)
160
# Require msz > 0
161
ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \
162
--
97
--
163
2.17.1
98
2.34.1
164
99
165
100
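The pick_nan change above replaces an integer index ("which") with a pointer to the chosen operand ("ret"). A rough illustration of that shape, restricted to the "s_ab" rule, might look like the sketch below. DemoParts, DemoClass and the demo_* functions are hypothetical simplifications made up for this note; they are not the FloatPartsN type or the softfloat API, and silencing is reduced to changing the class field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operand: only the NaN class matters here. */
typedef enum { DEMO_NORMAL, DEMO_QNAN, DEMO_SNAN } DemoClass;
typedef struct { DemoClass cls; unsigned payload; } DemoParts;

static bool demo_is_snan(const DemoParts *p)
{
    return p->cls == DEMO_SNAN;
}

static bool demo_is_nan(const DemoParts *p)
{
    return p->cls == DEMO_QNAN || p->cls == DEMO_SNAN;
}

/* "s_ab"-style rule: prefer a signaling NaN (a before b); otherwise take
 * the first operand that is a NaN (a before b). The result is carried as
 * a pointer rather than as a 0/1 index. */
static DemoParts *demo_pick_nan_s_ab(DemoParts *a, DemoParts *b)
{
    DemoParts *ret;

    assert(a != NULL && b != NULL);
    if (demo_is_snan(a) || demo_is_snan(b)) {
        ret = demo_is_snan(a) ? a : b;
    } else {
        ret = demo_is_nan(a) ? a : b;
    }
    /* Silence a signaling NaN before returning it (sketch: class only). */
    if (demo_is_snan(ret)) {
        ret->cls = DEMO_QNAN;
    }
    return ret;
}
```

With the pointer form, the final silence-and-return step operates on the selected operand directly, so no "if (which) a = b;" fixup is needed.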
1
From: Jean-Christophe Dubois <jcd@tribudubois.net>
1
From: Leif Lindholm <quic_llindhol@quicinc.com>
2
2
3
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
3
I'm migrating to Qualcomm's new open source email infrastructure, so
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
update my email address, and update the mailmap to match.
5
6
Signed-off-by: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
7
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
8
Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Message-id: 20241205114047.1125842-1-leif.lindholm@oss.qualcomm.com
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
---
13
---
7
hw/arm/mcimx7d-sabre.c | 2 --
14
MAINTAINERS | 2 +-
8
1 file changed, 2 deletions(-)
15
.mailmap | 5 +++--
16
2 files changed, 4 insertions(+), 3 deletions(-)
9
17
10
diff --git a/hw/arm/mcimx7d-sabre.c b/hw/arm/mcimx7d-sabre.c
18
diff --git a/MAINTAINERS b/MAINTAINERS
11
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
12
--- a/hw/arm/mcimx7d-sabre.c
20
--- a/MAINTAINERS
13
+++ b/hw/arm/mcimx7d-sabre.c
21
+++ b/MAINTAINERS
14
@@ -XXX,XX +XXX,XX @@
22
@@ -XXX,XX +XXX,XX @@ F: include/hw/ssi/imx_spi.h
15
#include "hw/arm/fsl-imx7.h"
23
SBSA-REF
16
#include "hw/boards.h"
24
M: Radoslaw Biernacki <rad@semihalf.com>
17
#include "sysemu/sysemu.h"
25
M: Peter Maydell <peter.maydell@linaro.org>
18
-#include "sysemu/device_tree.h"
26
-R: Leif Lindholm <quic_llindhol@quicinc.com>
19
#include "qemu/error-report.h"
27
+R: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
20
#include "sysemu/qtest.h"
28
R: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
21
-#include "net/net.h"
29
L: qemu-arm@nongnu.org
22
30
S: Maintained
23
typedef struct {
31
diff --git a/.mailmap b/.mailmap
24
FslIMX7State soc;
32
index XXXXXXX..XXXXXXX 100644
33
--- a/.mailmap
34
+++ b/.mailmap
35
@@ -XXX,XX +XXX,XX @@ Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
36
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
37
James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
38
Juan Quintela <quintela@trasno.org> <quintela@redhat.com>
39
-Leif Lindholm <quic_llindhol@quicinc.com> <leif.lindholm@linaro.org>
40
-Leif Lindholm <quic_llindhol@quicinc.com> <leif@nuviainc.com>
41
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <quic_llindhol@quicinc.com>
42
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif.lindholm@linaro.org>
43
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif@nuviainc.com>
44
Luc Michel <luc@lmichel.fr> <luc.michel@git.antfield.fr>
45
Luc Michel <luc@lmichel.fr> <luc.michel@greensocs.com>
46
Luc Michel <luc@lmichel.fr> <lmichel@kalray.eu>
25
--
47
--
26
2.17.1
48
2.34.1
27
49
28
50
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Vikram Garhwal <vikram.garhwal@bytedance.com>
2
2
3
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
3
Previously, the maintainer role was paused due to an inactive email ID. Commit ID:
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
c009d715721861984c4987bcc78b7ee183e86d75.
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
6
Message-id: 20180627043328.11531-2-richard.henderson@linaro.org
6
Signed-off-by: Vikram Garhwal <vikram.garhwal@bytedance.com>
7
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
8
Message-id: 20241204184205.12952-1-vikram.garhwal@bytedance.com
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
10
---
9
target/arm/helper-sve.h | 35 +++++++++
11
MAINTAINERS | 2 ++
10
target/arm/sve_helper.c | 153 +++++++++++++++++++++++++++++++++++++
12
1 file changed, 2 insertions(+)
11
target/arm/translate-sve.c | 121 +++++++++++++++++++++++++++++
12
target/arm/sve.decode | 34 +++++++++
13
4 files changed, 343 insertions(+)
14
13
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
14
diff --git a/MAINTAINERS b/MAINTAINERS
16
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-sve.h
16
--- a/MAINTAINERS
18
+++ b/target/arm/helper-sve.h
17
+++ b/MAINTAINERS
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
18
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/fuzz-sb16-test.c
20
void, ptr, ptr, ptr, ptr, i32)
19
21
DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
20
Xilinx CAN
22
void, ptr, ptr, ptr, ptr, i32)
21
M: Francisco Iglesias <francisco.iglesias@amd.com>
23
+
22
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
24
+DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
23
S: Maintained
25
+DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
24
F: hw/net/can/xlnx-*
26
+DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
25
F: include/hw/net/xlnx-*
27
+DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
26
@@ -XXX,XX +XXX,XX @@ F: include/hw/rx/
28
+
27
CAN bus subsystem and hardware
29
+DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
28
M: Pavel Pisa <pisa@cmp.felk.cvut.cz>
30
+DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
29
M: Francisco Iglesias <francisco.iglesias@amd.com>
31
+DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
30
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
32
+DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
31
S: Maintained
33
+
32
W: https://canbus.pages.fel.cvut.cz/
34
+DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
33
F: net/can/*
35
+DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
37
+DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
38
+
39
+DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
40
+DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
41
+DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
42
+DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
43
+
44
+DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
45
+DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
46
+DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
49
+DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
50
+
51
+DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
52
+DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
54
+DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
55
+
56
+DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
57
+DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
58
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/sve_helper.c
61
+++ b/target/arm/sve_helper.c
62
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
63
64
return predtest_ones(d, oprsz, esz_mask);
65
}
66
+
67
+/*
68
+ * Load contiguous data, protected by a governing predicate.
69
+ */
70
+#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \
71
+static void do_##NAME(CPUARMState *env, void *vd, void *vg, \
72
+ target_ulong addr, intptr_t oprsz, \
73
+ uintptr_t ra) \
74
+{ \
75
+ intptr_t i = 0; \
76
+ do { \
77
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
78
+ do { \
79
+ TYPEM m = 0; \
80
+ if (pg & 1) { \
81
+ m = FN(env, addr, ra); \
82
+ } \
83
+ *(TYPEE *)(vd + H(i)) = m; \
84
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
85
+ addr += sizeof(TYPEM); \
86
+ } while (i & 15); \
87
+ } while (i < oprsz); \
88
+} \
89
+void HELPER(NAME)(CPUARMState *env, void *vg, \
90
+ target_ulong addr, uint32_t desc) \
91
+{ \
92
+ do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg, \
93
+ addr, simd_oprsz(desc), GETPC()); \
94
+}
95
+
96
+#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \
97
+void HELPER(NAME)(CPUARMState *env, void *vg, \
98
+ target_ulong addr, uint32_t desc) \
99
+{ \
100
+ intptr_t i, oprsz = simd_oprsz(desc); \
101
+ intptr_t ra = GETPC(); \
102
+ unsigned rd = simd_data(desc); \
103
+ void *d1 = &env->vfp.zregs[rd]; \
104
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
105
+ for (i = 0; i < oprsz; ) { \
106
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
107
+ do { \
108
+ TYPEM m1 = 0, m2 = 0; \
109
+ if (pg & 1) { \
110
+ m1 = FN(env, addr, ra); \
111
+ m2 = FN(env, addr + sizeof(TYPEM), ra); \
112
+ } \
113
+ *(TYPEE *)(d1 + H(i)) = m1; \
114
+ *(TYPEE *)(d2 + H(i)) = m2; \
115
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
116
+ addr += 2 * sizeof(TYPEM); \
117
+ } while (i & 15); \
118
+ } \
119
+}
120
+
121
+#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \
122
+void HELPER(NAME)(CPUARMState *env, void *vg, \
123
+ target_ulong addr, uint32_t desc) \
124
+{ \
125
+ intptr_t i, oprsz = simd_oprsz(desc); \
126
+ intptr_t ra = GETPC(); \
127
+ unsigned rd = simd_data(desc); \
128
+ void *d1 = &env->vfp.zregs[rd]; \
129
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
130
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
131
+ for (i = 0; i < oprsz; ) { \
132
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
133
+ do { \
134
+ TYPEM m1 = 0, m2 = 0, m3 = 0; \
135
+ if (pg & 1) { \
136
+ m1 = FN(env, addr, ra); \
137
+ m2 = FN(env, addr + sizeof(TYPEM), ra); \
138
+ m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
139
+ } \
140
+ *(TYPEE *)(d1 + H(i)) = m1; \
141
+ *(TYPEE *)(d2 + H(i)) = m2; \
142
+ *(TYPEE *)(d3 + H(i)) = m3; \
143
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
144
+ addr += 3 * sizeof(TYPEM); \
145
+ } while (i & 15); \
146
+ } \
147
+}
148
+
149
+#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \
150
+void HELPER(NAME)(CPUARMState *env, void *vg, \
151
+ target_ulong addr, uint32_t desc) \
152
+{ \
153
+ intptr_t i, oprsz = simd_oprsz(desc); \
154
+ intptr_t ra = GETPC(); \
155
+ unsigned rd = simd_data(desc); \
156
+ void *d1 = &env->vfp.zregs[rd]; \
157
+ void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
158
+ void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
159
+ void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \
160
+ for (i = 0; i < oprsz; ) { \
161
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
162
+ do { \
163
+ TYPEM m1 = 0, m2 = 0, m3 = 0, m4 = 0; \
164
+ if (pg & 1) { \
165
+ m1 = FN(env, addr, ra); \
166
+ m2 = FN(env, addr + sizeof(TYPEM), ra); \
167
+ m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
168
+ m4 = FN(env, addr + 3 * sizeof(TYPEM), ra); \
169
+ } \
170
+ *(TYPEE *)(d1 + H(i)) = m1; \
171
+ *(TYPEE *)(d2 + H(i)) = m2; \
172
+ *(TYPEE *)(d3 + H(i)) = m3; \
173
+ *(TYPEE *)(d4 + H(i)) = m4; \
174
+ i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \
175
+ addr += 4 * sizeof(TYPEM); \
176
+ } while (i & 15); \
177
+ } \
178
+}
179
+
180
+DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2)
181
+DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2)
182
+DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4)
183
+DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4)
184
+DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
185
+DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
186
+
187
+DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
188
+DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int16_t, H1_4)
189
+DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
190
+DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
191
+
192
+DO_LD1(sve_ld1sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, )
193
+DO_LD1(sve_ld1sds_r, cpu_ldl_data_ra, uint64_t, int32_t, )
194
+
195
+DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
196
+DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
197
+DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
198
+DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1)
199
+
200
+DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
201
+DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
202
+DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
203
+DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2)
204
+
205
+DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
206
+DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
207
+DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
208
+DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4)
209
+
210
+DO_LD1(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
211
+DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
212
+DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
213
+DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
214
+
215
+#undef DO_LD1
216
+#undef DO_LD2
217
+#undef DO_LD3
218
+#undef DO_LD4
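As a reading aid for the DO_LD1 macro above: the governing predicate carries one bit per byte of the vector, the bit for an element's lowest byte decides whether that element is loaded or zeroed, and the memory address advances for every element whether or not it is active. The sketch below shows that behaviour for one case (bytes zero-extended into 16-bit elements). It is an assumption-laden simplification: demo_ld1bhu is a hypothetical name, memory is a plain host array rather than guest memory, and the host-endian H1_2() adjustment is omitted, so it assumes a little-endian element layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of a predicated contiguous load.
 *   vd    - destination vector, oprsz bytes of 16-bit elements
 *   pg    - predicate, one bit per byte of the vector
 *   mem   - source bytes (stands in for guest memory)
 *   oprsz - vector size in bytes
 */
static void demo_ld1bhu(uint16_t *vd, const uint8_t *pg,
                        const uint8_t *mem, size_t oprsz)
{
    assert(oprsz % sizeof(uint16_t) == 0);
    for (size_t i = 0; i < oprsz; i += sizeof(uint16_t)) {
        /* The bit for the element's lowest byte governs the element. */
        int active = (pg[i / 8] >> (i % 8)) & 1;
        uint8_t m = active ? *mem : 0;

        vd[i / sizeof(uint16_t)] = m;   /* inactive elements become zero */
        mem += sizeof(uint8_t);         /* address advances regardless */
    }
}
```

With an 8-byte vector and predicate byte 0x41 (bits 0 and 6 set), elements 0 and 3 are loaded from source bytes 0 and 3, and elements 1 and 2 are zeroed.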
219
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
220
index XXXXXXX..XXXXXXX 100644
221
--- a/target/arm/translate-sve.c
222
+++ b/target/arm/translate-sve.c
223
@@ -XXX,XX +XXX,XX @@ typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr,
224
typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
225
TCGv_ptr, TCGv_ptr, TCGv_i32);
226
227
+typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32);
228
+
229
/*
230
* Helpers for extracting complex instruction fields.
231
*/
232
@@ -XXX,XX +XXX,XX @@ static inline int expand_imm_sh8u(int x)
233
return (uint8_t)x << (x & 0x100 ? 8 : 0);
234
}
235
236
+/* Convert a 2-bit memory size (msz) to a 4-bit data type (dtype)
237
+ * with unsigned data. Cf. SVE Memory Contiguous Load Group.
238
+ */
239
+static inline int msz_dtype(int msz)
240
+{
241
+ static const uint8_t dtype[4] = { 0, 5, 10, 15 };
242
+ return dtype[msz];
243
+}
244
+
245
/*
246
* Include the generated decoder.
247
*/
248
@@ -XXX,XX +XXX,XX @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
249
}
250
return true;
251
}
252
+
253
+/*
254
+ *** SVE Memory - Contiguous Load Group
255
+ */
256
+
257
+/* The memory mode of the dtype. */
258
+static const TCGMemOp dtype_mop[16] = {
259
+ MO_UB, MO_UB, MO_UB, MO_UB,
260
+ MO_SL, MO_UW, MO_UW, MO_UW,
261
+ MO_SW, MO_SW, MO_UL, MO_UL,
262
+ MO_SB, MO_SB, MO_SB, MO_Q
263
+};
264
+
265
+#define dtype_msz(x) (dtype_mop[x] & MO_SIZE)
266
+
267
+/* The vector element size of dtype. */
268
+static const uint8_t dtype_esz[16] = {
269
+ 0, 1, 2, 3,
270
+ 3, 1, 2, 3,
271
+ 3, 2, 2, 3,
272
+ 3, 2, 1, 3
273
+};
274
+
275
+static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
276
+ gen_helper_gvec_mem *fn)
277
+{
278
+ unsigned vsz = vec_full_reg_size(s);
279
+ TCGv_ptr t_pg;
280
+ TCGv_i32 desc;
281
+
282
+ /* For e.g. LD4, there are not enough arguments to pass all 4
283
+ * registers as pointers, so encode the regno into the data field.
284
+ * For consistency, do this even for LD1.
285
+ */
286
+ desc = tcg_const_i32(simd_desc(vsz, vsz, zt));
287
+ t_pg = tcg_temp_new_ptr();
288
+
289
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
290
+ fn(cpu_env, t_pg, addr, desc);
291
+
292
+ tcg_temp_free_ptr(t_pg);
293
+ tcg_temp_free_i32(desc);
294
+}
295
+
296
+static void do_ld_zpa(DisasContext *s, int zt, int pg,
297
+ TCGv_i64 addr, int dtype, int nreg)
298
+{
299
+ static gen_helper_gvec_mem * const fns[16][4] = {
300
+ { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
301
+ gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
302
+ { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
303
+ { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
304
+ { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
305
+
306
+ { gen_helper_sve_ld1sds_r, NULL, NULL, NULL },
307
+ { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r,
308
+ gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r },
309
+ { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL },
310
+ { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL },
311
+
312
+ { gen_helper_sve_ld1hds_r, NULL, NULL, NULL },
313
+ { gen_helper_sve_ld1hss_r, NULL, NULL, NULL },
314
+ { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r,
315
+ gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r },
316
+ { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL },
317
+
318
+ { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
319
+ { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
320
+ { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
321
+ { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r,
322
+ gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r },
323
+ };
324
+ gen_helper_gvec_mem *fn = fns[dtype][nreg];
325
+
326
+ /* While there are holes in the table, they are not
327
+ * accessible via the instruction encoding.
328
+ */
329
+ assert(fn != NULL);
330
+ do_mem_zpa(s, zt, pg, addr, fn);
331
+}
332
+
333
+static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
334
+{
335
+ if (a->rm == 31) {
336
+ return false;
337
+ }
338
+ if (sve_access_check(s)) {
339
+ TCGv_i64 addr = new_tmp_a64(s);
340
+ tcg_gen_muli_i64(addr, cpu_reg(s, a->rm),
341
+ (a->nreg + 1) << dtype_msz(a->dtype));
342
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
343
+ do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg);
344
+ }
345
+ return true;
346
+}
347
+
348
+static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
349
+{
350
+ if (sve_access_check(s)) {
351
+ int vsz = vec_full_reg_size(s);
352
+ int elements = vsz >> dtype_esz[a->dtype];
353
+ TCGv_i64 addr = new_tmp_a64(s);
354
+
355
+ tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn),
356
+ (a->imm * elements * (a->nreg + 1))
357
+ << dtype_msz(a->dtype));
358
+ do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg);
359
+ }
360
+ return true;
361
+}
362
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
363
index XXXXXXX..XXXXXXX 100644
364
--- a/target/arm/sve.decode
365
+++ b/target/arm/sve.decode
366
@@ -XXX,XX +XXX,XX @@
367
# Unsigned 8-bit immediate, optionally shifted left by 8.
368
%sh8_i8u 5:9 !function=expand_imm_sh8u
369
370
+# Unsigned load of msz into esz=2, represented as a dtype.
371
+%msz_dtype 23:2 !function=msz_dtype
372
+
373
# Either a copy of rd (at bit 0), or a different source
374
# as propagated via the MOVPRFX instruction.
375
%reg_movprfx 0:5
376
@@ -XXX,XX +XXX,XX @@
377
&incdec2_cnt rd rn pat esz imm d u
378
&incdec_pred rd pg esz d u
379
&incdec2_pred rd rn pg esz d u
380
+&rprr_load rd pg rn rm dtype nreg
381
+&rpri_load rd pg rn imm dtype nreg
382
383
###########################################################################
384
# Named instruction formats. These are generally used to
385
@@ -XXX,XX +XXX,XX @@
386
@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \
387
&incdec2_pred rn=%reg_movprfx
388
389
+# Loads; user must fill in NREG.
390
+@rprr_load_dt ....... dtype:4 rm:5 ... pg:3 rn:5 rd:5 &rprr_load
391
+@rpri_load_dt ....... dtype:4 . imm:s4 ... pg:3 rn:5 rd:5 &rpri_load
392
+
393
+@rprr_load_msz ....... .... rm:5 ... pg:3 rn:5 rd:5 \
394
+ &rprr_load dtype=%msz_dtype
395
+@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
396
+ &rpri_load dtype=%msz_dtype
397
+
 ###########################################################################
 # Instruction patterns.  Grouped according to the SVE encodingindex.xhtml.

@@ -XXX,XX +XXX,XX @@ LDR_pri         10000101 10 ...... 000 ... ..... 0 ....     @pd_rn_i9

 # SVE load vector register
 LDR_zri         10000101 10 ...... 010 ... ..... .....      @rd_rn_i9
+
+### SVE Memory Contiguous Load Group
+
+# SVE contiguous load (scalar plus scalar)
+LD_zprr         1010010 .... ..... 010 ... ..... .....    @rprr_load_dt nreg=0
+
+# SVE contiguous load (scalar plus immediate)
+LD_zpri         1010010 .... 0.... 101 ... ..... .....    @rpri_load_dt nreg=0
+
+# SVE contiguous non-temporal load (scalar plus scalar)
+# LDNT1B, LDNT1H, LDNT1W, LDNT1D
+# SVE load multiple structures (scalar plus scalar)
+# LD2B, LD2H, LD2W, LD2D; etc.
+LD_zprr         1010010 .. nreg:2 ..... 110 ... ..... .....     @rprr_load_msz
+
+# SVE contiguous non-temporal load (scalar plus immediate)
+# LDNT1B, LDNT1H, LDNT1W, LDNT1D
+# SVE load multiple structures (scalar plus immediate)
+# LD2B, LD2H, LD2W, LD2D; etc.
+LD_zpri         1010010 .. nreg:2 0.... 111 ... ..... .....     @rpri_load_msz
--
2.17.1
--
2.34.1