1
The following changes since commit 53f306f316549d20c76886903181413d20842423:
First arm pullreq of the cycle; this is mostly my softfloat NaN
handling series. (Lots more in my to-review queue, but I don't
like pullreqs growing too close to a hundred patches at a time :-))
2
4
3
Merge remote-tracking branch 'remotes/ehabkost-gl/tags/x86-next-pull-request' into staging (2021-06-21 11:26:04 +0100)
thanks
-- PMM

8
The following changes since commit 97f2796a3736ed37a1b85dc1c76a6c45b829dd17:
9
10
Open 10.0 development tree (2024-12-10 17:41:17 +0000)
4
11
5
are available in the Git repository at:
12
are available in the Git repository at:
6
13
7
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20210621
14
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20241211
8
15
9
for you to fetch changes up to a83f1d9263d281f938a3984cda7104d55affd43a:
16
for you to fetch changes up to 1abe28d519239eea5cf9620bb13149423e5665f8:
10
17
11
docs/system: arm: Add nRF boards description (2021-06-21 17:24:33 +0100)
18
MAINTAINERS: Add correct email address for Vikram Garhwal (2024-12-11 15:31:09 +0000)
12
19
13
----------------------------------------------------------------
20
----------------------------------------------------------------
14
target-arm queue:
21
target-arm queue:
15
* Don't require 'virt' board to be compiled in for ACPI GHES code
22
* hw/net/lan9118: Extract PHY model, reuse with imx_fec, fix bugs
16
* docs: Document which architecture extensions we emulate
23
* fpu: Make muladd NaN handling runtime-selected, not compile-time
17
* Fix bugs in M-profile FPCXT_NS accesses
24
* fpu: Make default NaN pattern runtime-selected, not compile-time
18
* First slice of MVE patches
25
* fpu: Minor NaN-related cleanups
19
* Implement MTE3
26
* MAINTAINERS: email address updates
20
* docs/system: arm: Add nRF boards description
21
27
22
----------------------------------------------------------------
28
----------------------------------------------------------------
23
Alexandre Iooss (1):
29
Bernhard Beschow (5):
24
docs/system: arm: Add nRF boards description
30
hw/net/lan9118: Extract lan9118_phy
31
hw/net/lan9118_phy: Reuse in imx_fec and consolidate implementations
32
hw/net/lan9118_phy: Fix off-by-one error in MII_ANLPAR register
33
hw/net/lan9118_phy: Reuse MII constants
34
hw/net/lan9118_phy: Add missing 100 mbps full duplex advertisement
25
35
26
Peter Collingbourne (1):
36
Leif Lindholm (1):
27
target/arm: Implement MTE3
37
MAINTAINERS: update email address for Leif Lindholm
28
38
29
Peter Maydell (55):
39
Peter Maydell (54):
30
hw/acpi: Provide stub version of acpi_ghes_record_errors()
40
fpu: handle raising Invalid for infzero in pick_nan_muladd
31
hw/acpi: Provide function acpi_ghes_present()
41
fpu: Check for default_nan_mode before calling pickNaNMulAdd
32
target/arm: Use acpi_ghes_present() to see if we report ACPI memory errors
42
softfloat: Allow runtime choice of inf * 0 + NaN result
33
docs/system/arm: Document which architecture extensions we emulate
43
tests/fp: Explicitly set inf-zero-nan rule
34
target/arm/translate-vfp.c: Whitespace fixes
44
target/arm: Set FloatInfZeroNaNRule explicitly
35
target/arm: Handle FPU being disabled in FPCXT_NS accesses
45
target/s390: Set FloatInfZeroNaNRule explicitly
36
target/arm: Don't NOCP fault for FPCXT_NS accesses
46
target/ppc: Set FloatInfZeroNaNRule explicitly
37
target/arm: Handle writeback in VLDR/VSTR sysreg with no memory access
47
target/mips: Set FloatInfZeroNaNRule explicitly
38
target/arm: Factor FP context update code out into helper function
48
target/sparc: Set FloatInfZeroNaNRule explicitly
39
target/arm: Split vfp_access_check() into A and M versions
49
target/xtensa: Set FloatInfZeroNaNRule explicitly
40
target/arm: Handle FPU check for FPCXT_NS insns via vfp_access_check_m()
50
target/x86: Set FloatInfZeroNaNRule explicitly
41
target/arm: Implement MVE VLDR/VSTR (non-widening forms)
51
target/loongarch: Set FloatInfZeroNaNRule explicitly
42
target/arm: Implement widening/narrowing MVE VLDR/VSTR insns
52
target/hppa: Set FloatInfZeroNaNRule explicitly
43
target/arm: Implement MVE VCLZ
53
softfloat: Pass have_snan to pickNaNMulAdd
44
target/arm: Implement MVE VCLS
54
softfloat: Allow runtime choice of NaN propagation for muladd
45
target/arm: Implement MVE VREV16, VREV32, VREV64
55
tests/fp: Explicitly set 3-NaN propagation rule
46
target/arm: Implement MVE VMVN (register)
56
target/arm: Set Float3NaNPropRule explicitly
47
target/arm: Implement MVE VABS
57
target/loongarch: Set Float3NaNPropRule explicitly
48
target/arm: Implement MVE VNEG
58
target/ppc: Set Float3NaNPropRule explicitly
49
tcg: Make gen_dup_i32/i64() public as tcg_gen_dup_i32/i64
59
target/s390x: Set Float3NaNPropRule explicitly
50
target/arm: Implement MVE VDUP
60
target/sparc: Set Float3NaNPropRule explicitly
51
target/arm: Implement MVE VAND, VBIC, VORR, VORN, VEOR
61
target/mips: Set Float3NaNPropRule explicitly
52
target/arm: Implement MVE VADD, VSUB, VMUL
62
target/xtensa: Set Float3NaNPropRule explicitly
53
target/arm: Implement MVE VMULH
63
target/i386: Set Float3NaNPropRule explicitly
54
target/arm: Implement MVE VRMULH
64
target/hppa: Set Float3NaNPropRule explicitly
55
target/arm: Implement MVE VMAX, VMIN
65
fpu: Remove use_first_nan field from float_status
56
target/arm: Implement MVE VABD
66
target/m68k: Don't pass NULL float_status to floatx80_default_nan()
57
target/arm: Implement MVE VHADD, VHSUB
67
softfloat: Create floatx80 default NaN from parts64_default_nan
58
target/arm: Implement MVE VMULL
68
target/loongarch: Use normal float_status in fclass_s and fclass_d helpers
59
target/arm: Implement MVE VMLALDAV
69
target/m68k: In frem helper, initialize local float_status from env->fp_status
60
target/arm: Implement MVE VMLSLDAV
70
target/m68k: Init local float_status from env fp_status in gdb get/set reg
61
target/arm: Implement MVE VRMLALDAVH, VRMLSLDAVH
71
target/sparc: Initialize local scratch float_status from env->fp_status
62
target/arm: Implement MVE VADD (scalar)
72
target/ppc: Use env->fp_status in helper_compute_fprf functions
63
target/arm: Implement MVE VSUB, VMUL (scalar)
73
fpu: Allow runtime choice of default NaN value
64
target/arm: Implement MVE VHADD, VHSUB (scalar)
74
tests/fp: Set default NaN pattern explicitly
65
target/arm: Implement MVE VBRSR
75
target/microblaze: Set default NaN pattern explicitly
66
target/arm: Implement MVE VPST
76
target/i386: Set default NaN pattern explicitly
67
target/arm: Implement MVE VQADD and VQSUB
77
target/hppa: Set default NaN pattern explicitly
68
target/arm: Implement MVE VQDMULH and VQRDMULH (scalar)
78
target/alpha: Set default NaN pattern explicitly
69
target/arm: Implement MVE VQDMULL scalar
79
target/arm: Set default NaN pattern explicitly
70
target/arm: Implement MVE VQDMULH, VQRDMULH (vector)
80
target/loongarch: Set default NaN pattern explicitly
71
target/arm: Implement MVE VQADD, VQSUB (vector)
81
target/m68k: Set default NaN pattern explicitly
72
target/arm: Implement MVE VQSHL (vector)
82
target/mips: Set default NaN pattern explicitly
73
target/arm: Implement MVE VQRSHL
83
target/openrisc: Set default NaN pattern explicitly
74
target/arm: Implement MVE VSHL insn
84
target/ppc: Set default NaN pattern explicitly
75
target/arm: Implement MVE VRSHL
85
target/sh4: Set default NaN pattern explicitly
76
target/arm: Implement MVE VQDMLADH and VQRDMLADH
86
target/rx: Set default NaN pattern explicitly
77
target/arm: Implement MVE VQDMLSDH and VQRDMLSDH
87
target/s390x: Set default NaN pattern explicitly
78
target/arm: Implement MVE VQDMULL (vector)
88
target/sparc: Set default NaN pattern explicitly
79
target/arm: Implement MVE VRHADD
89
target/xtensa: Set default NaN pattern explicitly
80
target/arm: Implement MVE VADC, VSBC
90
target/hexagon: Set default NaN pattern explicitly
81
target/arm: Implement MVE VCADD
91
target/riscv: Set default NaN pattern explicitly
82
target/arm: Implement MVE VHCADD
92
target/tricore: Set default NaN pattern explicitly
83
target/arm: Implement MVE VADDV
93
fpu: Remove default handling for dnan_pattern
84
target/arm: Make VMOV scalar <-> gpreg beatwise for MVE
85
94
86
docs/system/arm/emulation.rst | 103 ++++
95
Richard Henderson (11):
87
docs/system/arm/nrf.rst | 51 ++
96
target/arm: Copy entire float_status in is_ebf
88
docs/system/target-arm.rst | 7 +
97
softfloat: Inline pickNaNMulAdd
89
include/hw/acpi/ghes.h | 9 +
98
softfloat: Use goto for default nan case in pick_nan_muladd
90
include/tcg/tcg-op.h | 8 +
99
softfloat: Remove which from parts_pick_nan_muladd
91
include/tcg/tcg.h | 1 -
100
softfloat: Pad array size in pick_nan_muladd
92
target/arm/helper-mve.h | 357 +++++++++++++
101
softfloat: Move propagateFloatx80NaN to softfloat.c
93
target/arm/helper.h | 2 +
102
softfloat: Use parts_pick_nan in propagateFloatx80NaN
94
target/arm/internals.h | 11 +
103
softfloat: Inline pickNaN
95
target/arm/translate-a32.h | 3 +
104
softfloat: Share code between parts_pick_nan cases
96
target/arm/translate.h | 10 +
105
softfloat: Sink frac_cmp in parts_pick_nan until needed
97
target/arm/m-nocp.decode | 24 +
106
softfloat: Replace WHICH with RET in parts_pick_nan
98
target/arm/mve.decode | 240 +++++++++
99
target/arm/vfp.decode | 14 -
100
hw/acpi/ghes-stub.c | 22 +
101
hw/acpi/ghes.c | 17 +
102
target/arm/cpu64.c | 2 +-
103
target/arm/kvm64.c | 6 +-
104
target/arm/mte_helper.c | 82 +--
105
target/arm/mve_helper.c | 1160 +++++++++++++++++++++++++++++++++++++++++
106
target/arm/translate-m-nocp.c | 550 +++++++++++++++++++
107
target/arm/translate-mve.c | 759 +++++++++++++++++++++++++++
108
target/arm/translate-vfp.c | 741 +++++++-------------------
109
tcg/tcg-op-gvec.c | 20 +-
110
MAINTAINERS | 1 +
111
hw/acpi/meson.build | 6 +-
112
target/arm/meson.build | 1 +
113
27 files changed, 3578 insertions(+), 629 deletions(-)
114
create mode 100644 docs/system/arm/emulation.rst
115
create mode 100644 docs/system/arm/nrf.rst
116
create mode 100644 target/arm/helper-mve.h
117
create mode 100644 hw/acpi/ghes-stub.c
118
create mode 100644 target/arm/mve_helper.c
119
107
108
Vikram Garhwal (1):
109
MAINTAINERS: Add correct email address for Vikram Garhwal
110
111
MAINTAINERS | 4 +-
112
include/fpu/softfloat-helpers.h | 38 +++-
113
include/fpu/softfloat-types.h | 89 +++++++-
114
include/hw/net/imx_fec.h | 9 +-
115
include/hw/net/lan9118_phy.h | 37 ++++
116
include/hw/net/mii.h | 6 +
117
target/mips/fpu_helper.h | 20 ++
118
target/sparc/helper.h | 4 +-
119
fpu/softfloat.c | 19 ++
120
hw/net/imx_fec.c | 146 ++------------
121
hw/net/lan9118.c | 137 ++-----------
122
hw/net/lan9118_phy.c | 222 ++++++++++++++++++++
123
linux-user/arm/nwfpe/fpa11.c | 5 +
124
target/alpha/cpu.c | 2 +
125
target/arm/cpu.c | 10 +
126
target/arm/tcg/vec_helper.c | 20 +-
127
target/hexagon/cpu.c | 2 +
128
target/hppa/fpu_helper.c | 12 ++
129
target/i386/tcg/fpu_helper.c | 12 ++
130
target/loongarch/tcg/fpu_helper.c | 14 +-
131
target/m68k/cpu.c | 14 +-
132
target/m68k/fpu_helper.c | 6 +-
133
target/m68k/helper.c | 6 +-
134
target/microblaze/cpu.c | 2 +
135
target/mips/msa.c | 10 +
136
target/openrisc/cpu.c | 2 +
137
target/ppc/cpu_init.c | 19 ++
138
target/ppc/fpu_helper.c | 3 +-
139
target/riscv/cpu.c | 2 +
140
target/rx/cpu.c | 2 +
141
target/s390x/cpu.c | 5 +
142
target/sh4/cpu.c | 2 +
143
target/sparc/cpu.c | 6 +
144
target/sparc/fop_helper.c | 8 +-
145
target/sparc/translate.c | 4 +-
146
target/tricore/helper.c | 2 +
147
target/xtensa/cpu.c | 4 +
148
target/xtensa/fpu_helper.c | 3 +-
149
tests/fp/fp-bench.c | 7 +
150
tests/fp/fp-test-log2.c | 1 +
151
tests/fp/fp-test.c | 7 +
152
fpu/softfloat-parts.c.inc | 152 +++++++++++---
153
fpu/softfloat-specialize.c.inc | 412 ++------------------------------------
154
.mailmap | 5 +-
155
hw/net/Kconfig | 5 +
156
hw/net/meson.build | 1 +
157
hw/net/trace-events | 10 +-
158
47 files changed, 778 insertions(+), 730 deletions(-)
159
create mode 100644 include/hw/net/lan9118_phy.h
160
create mode 100644 hw/net/lan9118_phy.c
1
Generic code in target/arm wants to call acpi_ghes_record_errors();
1
From: Bernhard Beschow <shentey@gmail.com>
2
provide a stub version so that we don't fail to link when
3
CONFIG_ACPI_APEI is not set. This requires us to add a new
4
ghes-stub.c file to contain it and the meson.build mechanics
5
to use it when appropriate.
6

A very similar implementation of the same device exists in imx_fec. Prepare for
a common implementation by extracting a device model into its own files.

Some migration state has been moved into the new device model, which breaks
migration compatibility for the following machines:
* smdkc210
* realview-*
* vexpress-*
* kzm
* mps2-*

While breaking migration ABI, fix the size of the MII registers to be 16 bit,
as defined by IEEE 802.3u.
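
As a rough illustration of the register behaviour the extracted PHY model keeps, the sketch below models only the two interrupt registers with 16-bit storage. The type PhySketch and the phy_sketch_* functions are hypothetical names used for this illustration and are not part of QEMU; the register numbers (29 for the read-to-clear interrupt source, 30 for the interrupt mask) and the 0xff mask follow the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PHY state (not the QEMU type). */
typedef struct {
    uint16_t ints;      /* register 29: interrupt source, cleared on read */
    uint16_t int_mask;  /* register 30: interrupt mask */
} PhySketch;

/* The interrupt line is high while any unmasked source is pending. */
static bool phy_sketch_irq_level(const PhySketch *s)
{
    assert(s != NULL);
    return (s->ints & s->int_mask) != 0;
}

/* Reading register 29 returns the pending sources and clears them. */
static uint16_t phy_sketch_read(PhySketch *s, int reg)
{
    uint16_t val;

    assert(s != NULL);
    switch (reg) {
    case 29:
        val = s->ints;
        s->ints = 0;
        return val;
    case 30:
        return s->int_mask;
    default:
        return 0; /* other registers are outside this sketch */
    }
}

/* Only the low 8 bits of the interrupt mask are writable. */
static void phy_sketch_write(PhySketch *s, int reg, uint16_t val)
{
    assert(s != NULL);
    if (reg == 30) {
        s->int_mask = val & 0xff;
    }
}
```

With the fields declared as uint16_t, a value written through the MII data path cannot carry stray upper bits into the saved state, which is presumably why the size fix is bundled with the migration break.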

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-2-shentey@gmail.com
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Dongjiu Geng <gengdongjiu1@gmail.com>
10
Message-id: 20210603171259.27962-2-peter.maydell@linaro.org
11
---
22
---
12
hw/acpi/ghes-stub.c | 17 +++++++++++++++++
23
include/hw/net/lan9118_phy.h | 37 ++++++++
13
hw/acpi/meson.build | 6 +++---
24
hw/net/lan9118.c | 137 +++++-----------------------
14
2 files changed, 20 insertions(+), 3 deletions(-)
25
hw/net/lan9118_phy.c | 169 +++++++++++++++++++++++++++++++++++
15
create mode 100644 hw/acpi/ghes-stub.c
26
hw/net/Kconfig | 4 +
27
hw/net/meson.build | 1 +
28
5 files changed, 233 insertions(+), 115 deletions(-)
29
create mode 100644 include/hw/net/lan9118_phy.h
30
create mode 100644 hw/net/lan9118_phy.c
16
31
17
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
32
diff --git a/include/hw/net/lan9118_phy.h b/include/hw/net/lan9118_phy.h
18
new file mode 100644
33
new file mode 100644
19
index XXXXXXX..XXXXXXX
34
index XXXXXXX..XXXXXXX
20
--- /dev/null
35
--- /dev/null
21
+++ b/hw/acpi/ghes-stub.c
36
+++ b/include/hw/net/lan9118_phy.h
22
@@ -XXX,XX +XXX,XX @@
37
@@ -XXX,XX +XXX,XX @@
23
+/*
38
+/*
24
+ * Support for generating APEI tables and recording CPER for Guests:
39
+ * SMSC LAN9118 PHY emulation
25
+ * stub functions.
26
+ *
40
+ *
27
+ * Copyright (c) 2021 Linaro, Ltd
41
+ * Copyright (c) 2009 CodeSourcery, LLC.
42
+ * Written by Paul Brook
28
+ *
43
+ *
29
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
30
+ * See the COPYING file in the top-level directory.
45
+ * See the COPYING file in the top-level directory.
31
+ */
46
+ */
32
+
47
+
48
+#ifndef HW_NET_LAN9118_PHY_H
49
+#define HW_NET_LAN9118_PHY_H
50
+
51
+#include "qom/object.h"
52
+#include "hw/sysbus.h"
53
+
54
+#define TYPE_LAN9118_PHY "lan9118-phy"
55
+OBJECT_DECLARE_SIMPLE_TYPE(Lan9118PhyState, LAN9118_PHY)
56
+
57
+typedef struct Lan9118PhyState {
58
+ SysBusDevice parent_obj;
59
+
60
+ uint16_t status;
61
+ uint16_t control;
62
+ uint16_t advertise;
63
+ uint16_t ints;
64
+ uint16_t int_mask;
65
+ qemu_irq irq;
66
+ bool link_down;
67
+} Lan9118PhyState;
68
+
69
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down);
70
+void lan9118_phy_reset(Lan9118PhyState *s);
71
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg);
72
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val);
73
+
74
+#endif
75
diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
76
index XXXXXXX..XXXXXXX 100644
77
--- a/hw/net/lan9118.c
78
+++ b/hw/net/lan9118.c
79
@@ -XXX,XX +XXX,XX @@
80
#include "net/net.h"
81
#include "net/eth.h"
82
#include "hw/irq.h"
83
+#include "hw/net/lan9118_phy.h"
84
#include "hw/net/lan9118.h"
85
#include "hw/ptimer.h"
86
#include "hw/qdev-properties.h"
87
@@ -XXX,XX +XXX,XX @@ do { printf("lan9118: " fmt , ## __VA_ARGS__); } while (0)
88
#define MAC_CR_RXEN 0x00000004
89
#define MAC_CR_RESERVED 0x7f404213
90
91
-#define PHY_INT_ENERGYON 0x80
92
-#define PHY_INT_AUTONEG_COMPLETE 0x40
93
-#define PHY_INT_FAULT 0x20
94
-#define PHY_INT_DOWN 0x10
95
-#define PHY_INT_AUTONEG_LP 0x08
96
-#define PHY_INT_PARFAULT 0x04
97
-#define PHY_INT_AUTONEG_PAGE 0x02
98
-
99
#define GPT_TIMER_EN 0x20000000
100
101
/*
102
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
103
uint32_t mac_mii_data;
104
uint32_t mac_flow;
105
106
- uint32_t phy_status;
107
- uint32_t phy_control;
108
- uint32_t phy_advertise;
109
- uint32_t phy_int;
110
- uint32_t phy_int_mask;
111
+ Lan9118PhyState mii;
112
+ IRQState mii_irq;
113
114
int32_t eeprom_writable;
115
uint8_t eeprom[128];
116
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
117
118
static const VMStateDescription vmstate_lan9118 = {
119
.name = "lan9118",
120
- .version_id = 2,
121
- .minimum_version_id = 1,
122
+ .version_id = 3,
123
+ .minimum_version_id = 3,
124
.fields = (const VMStateField[]) {
125
VMSTATE_PTIMER(timer, lan9118_state),
126
VMSTATE_UINT32(irq_cfg, lan9118_state),
127
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118 = {
128
VMSTATE_UINT32(mac_mii_acc, lan9118_state),
129
VMSTATE_UINT32(mac_mii_data, lan9118_state),
130
VMSTATE_UINT32(mac_flow, lan9118_state),
131
- VMSTATE_UINT32(phy_status, lan9118_state),
132
- VMSTATE_UINT32(phy_control, lan9118_state),
133
- VMSTATE_UINT32(phy_advertise, lan9118_state),
134
- VMSTATE_UINT32(phy_int, lan9118_state),
135
- VMSTATE_UINT32(phy_int_mask, lan9118_state),
136
VMSTATE_INT32(eeprom_writable, lan9118_state),
137
VMSTATE_UINT8_ARRAY(eeprom, lan9118_state, 128),
138
VMSTATE_INT32(tx_fifo_size, lan9118_state),
139
@@ -XXX,XX +XXX,XX @@ static void lan9118_reload_eeprom(lan9118_state *s)
140
lan9118_mac_changed(s);
141
}
142
143
-static void phy_update_irq(lan9118_state *s)
144
+static void lan9118_update_irq(void *opaque, int n, int level)
145
{
146
- if (s->phy_int & s->phy_int_mask) {
147
+ lan9118_state *s = opaque;
148
+
149
+ if (level) {
150
s->int_sts |= PHY_INT;
151
} else {
152
s->int_sts &= ~PHY_INT;
153
@@ -XXX,XX +XXX,XX @@ static void phy_update_irq(lan9118_state *s)
154
lan9118_update(s);
155
}
156
157
-static void phy_update_link(lan9118_state *s)
158
-{
159
- /* Autonegotiation status mirrors link status. */
160
- if (qemu_get_queue(s->nic)->link_down) {
161
- s->phy_status &= ~0x0024;
162
- s->phy_int |= PHY_INT_DOWN;
163
- } else {
164
- s->phy_status |= 0x0024;
165
- s->phy_int |= PHY_INT_ENERGYON;
166
- s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
167
- }
168
- phy_update_irq(s);
169
-}
170
-
171
static void lan9118_set_link(NetClientState *nc)
172
{
173
- phy_update_link(qemu_get_nic_opaque(nc));
174
-}
175
-
176
-static void phy_reset(lan9118_state *s)
177
-{
178
- s->phy_status = 0x7809;
179
- s->phy_control = 0x3000;
180
- s->phy_advertise = 0x01e1;
181
- s->phy_int_mask = 0;
182
- s->phy_int = 0;
183
- phy_update_link(s);
184
+ lan9118_phy_update_link(&LAN9118(qemu_get_nic_opaque(nc))->mii,
185
+ nc->link_down);
186
}
187
188
static void lan9118_reset(DeviceState *d)
189
@@ -XXX,XX +XXX,XX @@ static void lan9118_reset(DeviceState *d)
190
s->read_word_n = 0;
191
s->write_word_n = 0;
192
193
- phy_reset(s);
194
-
195
s->eeprom_writable = 0;
196
lan9118_reload_eeprom(s);
197
}
198
@@ -XXX,XX +XXX,XX @@ static void do_tx_packet(lan9118_state *s)
199
uint32_t status;
200
201
/* FIXME: Honor TX disable, and allow queueing of packets. */
202
- if (s->phy_control & 0x4000) {
203
+ if (s->mii.control & 0x4000) {
204
/* This assumes the receive routine doesn't touch the VLANClient. */
205
qemu_receive_packet(qemu_get_queue(s->nic), s->txp->data, s->txp->len);
206
} else {
207
@@ -XXX,XX +XXX,XX @@ static void tx_fifo_push(lan9118_state *s, uint32_t val)
208
}
209
}
210
211
-static uint32_t do_phy_read(lan9118_state *s, int reg)
212
-{
213
- uint32_t val;
214
-
215
- switch (reg) {
216
- case 0: /* Basic Control */
217
- return s->phy_control;
218
- case 1: /* Basic Status */
219
- return s->phy_status;
220
- case 2: /* ID1 */
221
- return 0x0007;
222
- case 3: /* ID2 */
223
- return 0xc0d1;
224
- case 4: /* Auto-neg advertisement */
225
- return s->phy_advertise;
226
- case 5: /* Auto-neg Link Partner Ability */
227
- return 0x0f71;
228
- case 6: /* Auto-neg Expansion */
229
- return 1;
230
- /* TODO 17, 18, 27, 29, 30, 31 */
231
- case 29: /* Interrupt source. */
232
- val = s->phy_int;
233
- s->phy_int = 0;
234
- phy_update_irq(s);
235
- return val;
236
- case 30: /* Interrupt mask */
237
- return s->phy_int_mask;
238
- default:
239
- qemu_log_mask(LOG_GUEST_ERROR,
240
- "do_phy_read: PHY read reg %d\n", reg);
241
- return 0;
242
- }
243
-}
244
-
245
-static void do_phy_write(lan9118_state *s, int reg, uint32_t val)
246
-{
247
- switch (reg) {
248
- case 0: /* Basic Control */
249
- if (val & 0x8000) {
250
- phy_reset(s);
251
- break;
252
- }
253
- s->phy_control = val & 0x7980;
254
- /* Complete autonegotiation immediately. */
255
- if (val & 0x1000) {
256
- s->phy_status |= 0x0020;
257
- }
258
- break;
259
- case 4: /* Auto-neg advertisement */
260
- s->phy_advertise = (val & 0x2d7f) | 0x80;
261
- break;
262
- /* TODO 17, 18, 27, 31 */
263
- case 30: /* Interrupt mask */
264
- s->phy_int_mask = val & 0xff;
265
- phy_update_irq(s);
266
- break;
267
- default:
268
- qemu_log_mask(LOG_GUEST_ERROR,
269
- "do_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
270
- }
271
-}
272
-
273
static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
274
{
275
switch (reg) {
276
@@ -XXX,XX +XXX,XX @@ static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
277
if (val & 2) {
278
DPRINTF("PHY write %d = 0x%04x\n",
279
(val >> 6) & 0x1f, s->mac_mii_data);
280
- do_phy_write(s, (val >> 6) & 0x1f, s->mac_mii_data);
281
+ lan9118_phy_write(&s->mii, (val >> 6) & 0x1f, s->mac_mii_data);
282
} else {
283
- s->mac_mii_data = do_phy_read(s, (val >> 6) & 0x1f);
284
+ s->mac_mii_data = lan9118_phy_read(&s->mii, (val >> 6) & 0x1f);
285
DPRINTF("PHY read %d = 0x%04x\n",
286
(val >> 6) & 0x1f, s->mac_mii_data);
287
}
288
@@ -XXX,XX +XXX,XX @@ static void lan9118_writel(void *opaque, hwaddr offset,
289
break;
290
case CSR_PMT_CTRL:
291
if (val & 0x400) {
292
- phy_reset(s);
293
+ lan9118_phy_reset(&s->mii);
294
}
295
s->pmt_ctrl &= ~0x34e;
296
s->pmt_ctrl |= (val & 0x34e);
297
@@ -XXX,XX +XXX,XX @@ static void lan9118_realize(DeviceState *dev, Error **errp)
298
const MemoryRegionOps *mem_ops =
299
s->mode_16bit ? &lan9118_16bit_mem_ops : &lan9118_mem_ops;
300
301
+ qemu_init_irq(&s->mii_irq, lan9118_update_irq, s, 0);
302
+ object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
303
+ if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
304
+ return;
305
+ }
306
+ qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
307
+
308
memory_region_init_io(&s->mmio, OBJECT(dev), mem_ops, s,
309
"lan9118-mmio", 0x100);
310
sysbus_init_mmio(sbd, &s->mmio);
311
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
312
new file mode 100644
313
index XXXXXXX..XXXXXXX
314
--- /dev/null
315
+++ b/hw/net/lan9118_phy.c
316
@@ -XXX,XX +XXX,XX @@
317
+/*
318
+ * SMSC LAN9118 PHY emulation
319
+ *
320
+ * Copyright (c) 2009 CodeSourcery, LLC.
321
+ * Written by Paul Brook
322
+ *
323
+ * This code is licensed under the GNU GPL v2
324
+ *
325
+ * Contributions after 2012-01-13 are licensed under the terms of the
326
+ * GNU GPL, version 2 or (at your option) any later version.
327
+ */
328
+
33
+#include "qemu/osdep.h"
329
+#include "qemu/osdep.h"
34
+#include "hw/acpi/ghes.h"
330
+#include "hw/net/lan9118_phy.h"
35
+
331
+#include "hw/irq.h"
36
+int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
332
+#include "hw/resettable.h"
37
+{
333
+#include "migration/vmstate.h"
38
+ return -1;
334
+#include "qemu/log.h"
39
+}
335
+
40
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
336
+#define PHY_INT_ENERGYON (1 << 7)
337
+#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
338
+#define PHY_INT_FAULT (1 << 5)
339
+#define PHY_INT_DOWN (1 << 4)
340
+#define PHY_INT_AUTONEG_LP (1 << 3)
341
+#define PHY_INT_PARFAULT (1 << 2)
342
+#define PHY_INT_AUTONEG_PAGE (1 << 1)
343
+
344
+static void lan9118_phy_update_irq(Lan9118PhyState *s)
345
+{
346
+ qemu_set_irq(s->irq, !!(s->ints & s->int_mask));
347
+}
348
+
349
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
350
+{
351
+ uint16_t val;
352
+
353
+ switch (reg) {
354
+ case 0: /* Basic Control */
355
+ return s->control;
356
+ case 1: /* Basic Status */
357
+ return s->status;
358
+ case 2: /* ID1 */
359
+ return 0x0007;
360
+ case 3: /* ID2 */
361
+ return 0xc0d1;
362
+ case 4: /* Auto-neg advertisement */
363
+ return s->advertise;
364
+ case 5: /* Auto-neg Link Partner Ability */
365
+ return 0x0f71;
366
+ case 6: /* Auto-neg Expansion */
367
+ return 1;
368
+ /* TODO 17, 18, 27, 29, 30, 31 */
369
+ case 29: /* Interrupt source. */
370
+ val = s->ints;
371
+ s->ints = 0;
372
+ lan9118_phy_update_irq(s);
373
+ return val;
374
+ case 30: /* Interrupt mask */
375
+ return s->int_mask;
376
+ default:
377
+ qemu_log_mask(LOG_GUEST_ERROR,
378
+ "lan9118_phy_read: PHY read reg %d\n", reg);
379
+ return 0;
380
+ }
381
+}
382
+
383
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
384
+{
385
+ switch (reg) {
386
+ case 0: /* Basic Control */
387
+ if (val & 0x8000) {
388
+ lan9118_phy_reset(s);
389
+ break;
390
+ }
391
+ s->control = val & 0x7980;
392
+ /* Complete autonegotiation immediately. */
393
+ if (val & 0x1000) {
394
+ s->status |= 0x0020;
395
+ }
396
+ break;
397
+ case 4: /* Auto-neg advertisement */
398
+ s->advertise = (val & 0x2d7f) | 0x80;
399
+ break;
400
+ /* TODO 17, 18, 27, 31 */
401
+ case 30: /* Interrupt mask */
402
+ s->int_mask = val & 0xff;
403
+ lan9118_phy_update_irq(s);
404
+ break;
405
+ default:
406
+ qemu_log_mask(LOG_GUEST_ERROR,
407
+ "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
408
+ }
409
+}
410
+
411
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
412
+{
413
+ s->link_down = link_down;
414
+
415
+ /* Autonegotiation status mirrors link status. */
416
+ if (link_down) {
417
+ s->status &= ~0x0024;
418
+ s->ints |= PHY_INT_DOWN;
419
+ } else {
420
+ s->status |= 0x0024;
421
+ s->ints |= PHY_INT_ENERGYON;
422
+ s->ints |= PHY_INT_AUTONEG_COMPLETE;
423
+ }
424
+ lan9118_phy_update_irq(s);
425
+}
426
+
427
+void lan9118_phy_reset(Lan9118PhyState *s)
428
+{
429
+ s->control = 0x3000;
430
+ s->status = 0x7809;
431
+ s->advertise = 0x01e1;
432
+ s->int_mask = 0;
433
+ s->ints = 0;
434
+ lan9118_phy_update_link(s, s->link_down);
435
+}
436
+
437
+static void lan9118_phy_reset_hold(Object *obj, ResetType type)
438
+{
439
+ Lan9118PhyState *s = LAN9118_PHY(obj);
440
+
441
+ lan9118_phy_reset(s);
442
+}
443
+
444
+static void lan9118_phy_init(Object *obj)
445
+{
446
+ Lan9118PhyState *s = LAN9118_PHY(obj);
447
+
448
+ qdev_init_gpio_out(DEVICE(s), &s->irq, 1);
449
+}
450
+
451
+static const VMStateDescription vmstate_lan9118_phy = {
452
+ .name = "lan9118-phy",
453
+ .version_id = 1,
454
+ .minimum_version_id = 1,
455
+ .fields = (const VMStateField[]) {
456
+ VMSTATE_UINT16(control, Lan9118PhyState),
457
+ VMSTATE_UINT16(status, Lan9118PhyState),
458
+ VMSTATE_UINT16(advertise, Lan9118PhyState),
459
+ VMSTATE_UINT16(ints, Lan9118PhyState),
460
+ VMSTATE_UINT16(int_mask, Lan9118PhyState),
461
+ VMSTATE_BOOL(link_down, Lan9118PhyState),
462
+ VMSTATE_END_OF_LIST()
463
+ }
464
+};
465
+
466
+static void lan9118_phy_class_init(ObjectClass *klass, void *data)
467
+{
468
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
469
+ DeviceClass *dc = DEVICE_CLASS(klass);
470
+
471
+ rc->phases.hold = lan9118_phy_reset_hold;
472
+ dc->vmsd = &vmstate_lan9118_phy;
473
+}
474
+
475
+static const TypeInfo types[] = {
476
+ {
477
+ .name = TYPE_LAN9118_PHY,
478
+ .parent = TYPE_SYS_BUS_DEVICE,
479
+ .instance_size = sizeof(Lan9118PhyState),
480
+ .instance_init = lan9118_phy_init,
481
+ .class_init = lan9118_phy_class_init,
482
+ }
483
+};
484
+
485
+DEFINE_TYPES(types)
486
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
41
index XXXXXXX..XXXXXXX 100644
487
index XXXXXXX..XXXXXXX 100644
42
--- a/hw/acpi/meson.build
488
--- a/hw/net/Kconfig
43
+++ b/hw/acpi/meson.build
489
+++ b/hw/net/Kconfig
44
@@ -XXX,XX +XXX,XX @@ acpi_ss.add(when: 'CONFIG_ACPI_PCI', if_true: files('pci.c'))
490
@@ -XXX,XX +XXX,XX @@ config VMXNET3_PCI
45
acpi_ss.add(when: 'CONFIG_ACPI_VMGENID', if_true: files('vmgenid.c'))
491
config SMC91C111
46
acpi_ss.add(when: 'CONFIG_ACPI_HW_REDUCED', if_true: files('generic_event_device.c'))
492
bool
47
acpi_ss.add(when: 'CONFIG_ACPI_HMAT', if_true: files('hmat.c'))
493
48
-acpi_ss.add(when: 'CONFIG_ACPI_APEI', if_true: files('ghes.c'))
494
+config LAN9118_PHY
49
+acpi_ss.add(when: 'CONFIG_ACPI_APEI', if_true: files('ghes.c'), if_false: files('ghes-stub.c'))
495
+ bool
50
acpi_ss.add(when: 'CONFIG_ACPI_X86', if_true: files('core.c', 'piix4.c', 'pcihp.c'), if_false: files('acpi-stub.c'))
496
+
51
acpi_ss.add(when: 'CONFIG_ACPI_X86_ICH', if_true: files('ich9.c', 'tco.c'))
497
config LAN9118
52
acpi_ss.add(when: 'CONFIG_IPMI', if_true: files('ipmi.c'), if_false: files('ipmi-stub.c'))
498
bool
53
acpi_ss.add(when: 'CONFIG_PC', if_false: files('acpi-x86-stub.c'))
499
+ select LAN9118_PHY
54
acpi_ss.add(when: 'CONFIG_TPM', if_true: files('tpm.c'))
500
select PTIMER
55
-softmmu_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c'))
501
56
+softmmu_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c', 'ghes-stub.c'))
502
config NE2000_ISA
57
softmmu_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
503
diff --git a/hw/net/meson.build b/hw/net/meson.build
58
softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('acpi-stub.c', 'aml-build-stub.c',
504
index XXXXXXX..XXXXXXX 100644
59
- 'acpi-x86-stub.c', 'ipmi-stub.c'))
505
--- a/hw/net/meson.build
60
+ 'acpi-x86-stub.c', 'ipmi-stub.c', 'ghes-stub.c'))
506
+++ b/hw/net/meson.build
507
@@ -XXX,XX +XXX,XX @@ system_ss.add(when: 'CONFIG_VMXNET3_PCI', if_true: files('vmxnet3.c'))
508
509
system_ss.add(when: 'CONFIG_SMC91C111', if_true: files('smc91c111.c'))
510
system_ss.add(when: 'CONFIG_LAN9118', if_true: files('lan9118.c'))
511
+system_ss.add(when: 'CONFIG_LAN9118_PHY', if_true: files('lan9118_phy.c'))
512
system_ss.add(when: 'CONFIG_NE2000_ISA', if_true: files('ne2000-isa.c'))
513
system_ss.add(when: 'CONFIG_OPENCORES_ETH', if_true: files('opencores_eth.c'))
514
system_ss.add(when: 'CONFIG_XGMAC', if_true: files('xgmac.c'))
61
--
515
--
62
2.20.1
516
2.34.1
63
64
From: Bernhard Beschow <shentey@gmail.com>

imx_fec models the same PHY as lan9118_phy. The code is almost the same, with
imx_fec having more logging and tracing. Merge these improvements into
lan9118_phy and reuse it in imx_fec to remove the code duplication.

Some migration state now resides in the new device model, which breaks migration
compatibility for the following machines:
* imx25-pdk
* sabrelite
* mcimx7d-sabre
* mcimx6ul-evk
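
The compatibility break corresponds to raising both version_id and minimum_version_id from 2 to 3 in vmstate_imx_eth. The sketch below shows the kind of range check that makes an older stream unloadable. VersionSketch and stream_version_accepted() are hypothetical names for this illustration, and the check is a simplification of what a migration loader does, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a versioned state description. */
typedef struct {
    int version_id;          /* newest layout this build can produce */
    int minimum_version_id;  /* oldest layout this build can still load */
} VersionSketch;

/* A saved stream loads only if its version lies inside the accepted range. */
static bool stream_version_accepted(const VersionSketch *d, int stream_version)
{
    assert(d != NULL);
    return stream_version >= d->minimum_version_id &&
           stream_version <= d->version_id;
}
```

Under this model, a description of {3, 3} rejects a version 2 stream, whereas the previous {2, 2} accepted it. Keeping minimum_version_id at 2 would only be safe if the loader could still interpret the removed PHY fields.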

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-3-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/imx_fec.h |   9 ++-
 hw/net/imx_fec.c         | 146 ++++-----------------------------------
 hw/net/lan9118_phy.c     |  82 ++++++++++++++++------
 hw/net/Kconfig           |   1 +
 hw/net/trace-events      |  10 +--
 5 files changed, 85 insertions(+), 163 deletions(-)

27
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
28
index XXXXXXX..XXXXXXX 100644
29
--- a/include/hw/net/imx_fec.h
30
+++ b/include/hw/net/imx_fec.h
31
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(IMXFECState, IMX_FEC)
32
#define TYPE_IMX_ENET "imx.enet"
33
34
#include "hw/sysbus.h"
35
+#include "hw/net/lan9118_phy.h"
36
+#include "hw/irq.h"
37
#include "net/net.h"
38
39
#define ENET_EIR 1
40
@@ -XXX,XX +XXX,XX @@ struct IMXFECState {
41
uint32_t tx_descriptor[ENET_TX_RING_NUM];
42
uint32_t tx_ring_num;
43
44
- uint32_t phy_status;
45
- uint32_t phy_control;
46
- uint32_t phy_advertise;
47
- uint32_t phy_int;
48
- uint32_t phy_int_mask;
49
+ Lan9118PhyState mii;
50
+ IRQState mii_irq;
51
uint32_t phy_num;
52
bool phy_connected;
53
struct IMXFECState *phy_consumer;
54
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
55
index XXXXXXX..XXXXXXX 100644
56
--- a/hw/net/imx_fec.c
57
+++ b/hw/net/imx_fec.c
58
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth_txdescs = {
59
60
static const VMStateDescription vmstate_imx_eth = {
61
.name = TYPE_IMX_FEC,
62
- .version_id = 2,
63
- .minimum_version_id = 2,
64
+ .version_id = 3,
65
+ .minimum_version_id = 3,
66
.fields = (const VMStateField[]) {
67
VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
68
VMSTATE_UINT32(rx_descriptor, IMXFECState),
69
VMSTATE_UINT32(tx_descriptor[0], IMXFECState),
70
- VMSTATE_UINT32(phy_status, IMXFECState),
71
- VMSTATE_UINT32(phy_control, IMXFECState),
72
- VMSTATE_UINT32(phy_advertise, IMXFECState),
73
- VMSTATE_UINT32(phy_int, IMXFECState),
74
- VMSTATE_UINT32(phy_int_mask, IMXFECState),
75
VMSTATE_END_OF_LIST()
76
},
77
.subsections = (const VMStateDescription * const []) {
78
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth = {
79
},
80
};
81
82
-#define PHY_INT_ENERGYON (1 << 7)
83
-#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
84
-#define PHY_INT_FAULT (1 << 5)
85
-#define PHY_INT_DOWN (1 << 4)
86
-#define PHY_INT_AUTONEG_LP (1 << 3)
87
-#define PHY_INT_PARFAULT (1 << 2)
88
-#define PHY_INT_AUTONEG_PAGE (1 << 1)
89
-
90
static void imx_eth_update(IMXFECState *s);
91
92
/*
93
@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
94
* For now we don't handle any GPIO/interrupt line, so the OS will
95
* have to poll for the PHY status.
96
*/
97
-static void imx_phy_update_irq(IMXFECState *s)
98
+static void imx_phy_update_irq(void *opaque, int n, int level)
99
{
100
- imx_eth_update(s);
101
-}
102
-
103
-static void imx_phy_update_link(IMXFECState *s)
104
-{
105
- /* Autonegotiation status mirrors link status. */
106
- if (qemu_get_queue(s->nic)->link_down) {
107
- trace_imx_phy_update_link("down");
108
- s->phy_status &= ~0x0024;
109
- s->phy_int |= PHY_INT_DOWN;
110
- } else {
111
- trace_imx_phy_update_link("up");
112
- s->phy_status |= 0x0024;
113
- s->phy_int |= PHY_INT_ENERGYON;
114
- s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
115
- }
116
- imx_phy_update_irq(s);
117
+ imx_eth_update(opaque);
118
}
119
120
static void imx_eth_set_link(NetClientState *nc)
121
{
122
- imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
123
-}
124
-
125
-static void imx_phy_reset(IMXFECState *s)
126
-{
127
- trace_imx_phy_reset();
128
-
129
- s->phy_status = 0x7809;
130
- s->phy_control = 0x3000;
131
- s->phy_advertise = 0x01e1;
132
- s->phy_int_mask = 0;
133
- s->phy_int = 0;
134
- imx_phy_update_link(s);
135
+ lan9118_phy_update_link(&IMX_FEC(qemu_get_nic_opaque(nc))->mii,
136
+ nc->link_down);
137
}
138
139
static uint32_t imx_phy_read(IMXFECState *s, int reg)
140
{
141
- uint32_t val;
142
uint32_t phy = reg / 32;
143
144
if (!s->phy_connected) {
145
@@ -XXX,XX +XXX,XX @@ static uint32_t imx_phy_read(IMXFECState *s, int reg)
146
147
reg %= 32;
148
149
- switch (reg) {
150
- case 0: /* Basic Control */
151
- val = s->phy_control;
152
- break;
153
- case 1: /* Basic Status */
154
- val = s->phy_status;
155
- break;
156
- case 2: /* ID1 */
157
- val = 0x0007;
158
- break;
159
- case 3: /* ID2 */
160
- val = 0xc0d1;
161
- break;
162
- case 4: /* Auto-neg advertisement */
163
- val = s->phy_advertise;
164
- break;
165
- case 5: /* Auto-neg Link Partner Ability */
166
- val = 0x0f71;
167
- break;
168
- case 6: /* Auto-neg Expansion */
169
- val = 1;
170
- break;
171
- case 29: /* Interrupt source. */
172
- val = s->phy_int;
173
- s->phy_int = 0;
174
- imx_phy_update_irq(s);
175
- break;
176
- case 30: /* Interrupt mask */
177
- val = s->phy_int_mask;
178
- break;
179
- case 17:
180
- case 18:
181
- case 27:
182
- case 31:
183
- qemu_log_mask(LOG_UNIMP, "[%s.phy]%s: reg %d not implemented\n",
184
- TYPE_IMX_FEC, __func__, reg);
185
- val = 0;
186
- break;
187
- default:
188
- qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
189
- TYPE_IMX_FEC, __func__, reg);
190
- val = 0;
191
- break;
192
- }
193
-
194
- trace_imx_phy_read(val, phy, reg);
195
-
196
- return val;
197
+ return lan9118_phy_read(&s->mii, reg);
198
}
199
200
static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
201
@@ -XXX,XX +XXX,XX @@ static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
202
203
reg %= 32;
204
205
- trace_imx_phy_write(val, phy, reg);
206
-
207
- switch (reg) {
208
- case 0: /* Basic Control */
209
- if (val & 0x8000) {
210
- imx_phy_reset(s);
211
- } else {
212
- s->phy_control = val & 0x7980;
213
- /* Complete autonegotiation immediately. */
214
- if (val & 0x1000) {
215
- s->phy_status |= 0x0020;
216
- }
217
- }
218
- break;
219
- case 4: /* Auto-neg advertisement */
220
- s->phy_advertise = (val & 0x2d7f) | 0x80;
221
- break;
222
- case 30: /* Interrupt mask */
223
- s->phy_int_mask = val & 0xff;
224
- imx_phy_update_irq(s);
225
- break;
226
- case 17:
227
- case 18:
228
- case 27:
229
- case 31:
230
- qemu_log_mask(LOG_UNIMP, "[%s.phy)%s: reg %d not implemented\n",
231
- TYPE_IMX_FEC, __func__, reg);
232
- break;
233
- default:
234
- qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
235
- TYPE_IMX_FEC, __func__, reg);
236
- break;
237
- }
238
+ lan9118_phy_write(&s->mii, reg, val);
239
}
240
241
static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
242
@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
243
244
s->rx_descriptor = 0;
245
memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
246
-
247
- /* We also reset the PHY */
248
- imx_phy_reset(s);
249
}
250
251
static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
252
@@ -XXX,XX +XXX,XX @@ static void imx_eth_realize(DeviceState *dev, Error **errp)
253
sysbus_init_irq(sbd, &s->irq[0]);
254
sysbus_init_irq(sbd, &s->irq[1]);
255
256
+ qemu_init_irq(&s->mii_irq, imx_phy_update_irq, s, 0);
257
+ object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
258
+ if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
259
+ return;
260
+ }
261
+ qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
262
+
263
qemu_macaddr_default_if_unset(&s->conf.macaddr);
264
265
s->nic = qemu_new_nic(&imx_eth_net_info, &s->conf,
266
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
267
index XXXXXXX..XXXXXXX 100644
268
--- a/hw/net/lan9118_phy.c
269
+++ b/hw/net/lan9118_phy.c
270
@@ -XXX,XX +XXX,XX @@
271
* Copyright (c) 2009 CodeSourcery, LLC.
272
* Written by Paul Brook
273
*
274
+ * Copyright (c) 2013 Jean-Christophe Dubois. <jcd@tribudubois.net>
275
+ *
276
* This code is licensed under the GNU GPL v2
277
*
278
* Contributions after 2012-01-13 are licensed under the terms of the
279
@@ -XXX,XX +XXX,XX @@
280
#include "hw/resettable.h"
281
#include "migration/vmstate.h"
282
#include "qemu/log.h"
283
+#include "trace.h"
284
285
#define PHY_INT_ENERGYON (1 << 7)
286
#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
287
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
288
289
switch (reg) {
290
case 0: /* Basic Control */
291
- return s->control;
292
+ val = s->control;
293
+ break;
294
case 1: /* Basic Status */
295
- return s->status;
296
+ val = s->status;
297
+ break;
298
case 2: /* ID1 */
299
- return 0x0007;
300
+ val = 0x0007;
301
+ break;
302
case 3: /* ID2 */
303
- return 0xc0d1;
304
+ val = 0xc0d1;
305
+ break;
306
case 4: /* Auto-neg advertisement */
307
- return s->advertise;
308
+ val = s->advertise;
309
+ break;
310
case 5: /* Auto-neg Link Partner Ability */
311
- return 0x0f71;
312
+ val = 0x0f71;
313
+ break;
314
case 6: /* Auto-neg Expansion */
315
- return 1;
316
- /* TODO 17, 18, 27, 29, 30, 31 */
317
+ val = 1;
318
+ break;
319
case 29: /* Interrupt source. */
320
val = s->ints;
321
s->ints = 0;
322
lan9118_phy_update_irq(s);
323
- return val;
324
+ break;
325
case 30: /* Interrupt mask */
326
- return s->int_mask;
327
+ val = s->int_mask;
328
+ break;
329
+ case 17:
330
+ case 18:
331
+ case 27:
332
+ case 31:
333
+ qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
334
+ __func__, reg);
335
+ val = 0;
336
+ break;
337
default:
338
- qemu_log_mask(LOG_GUEST_ERROR,
339
- "lan9118_phy_read: PHY read reg %d\n", reg);
340
- return 0;
341
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
342
+ __func__, reg);
343
+ val = 0;
344
+ break;
345
}
346
+
347
+ trace_lan9118_phy_read(val, reg);
348
+
349
+ return val;
350
}
351
352
void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
353
{
354
+ trace_lan9118_phy_write(val, reg);
355
+
356
switch (reg) {
357
case 0: /* Basic Control */
358
if (val & 0x8000) {
359
lan9118_phy_reset(s);
360
- break;
361
- }
362
- s->control = val & 0x7980;
363
- /* Complete autonegotiation immediately. */
364
- if (val & 0x1000) {
365
- s->status |= 0x0020;
366
+ } else {
367
+ s->control = val & 0x7980;
368
+ /* Complete autonegotiation immediately. */
369
+ if (val & 0x1000) {
370
+ s->status |= 0x0020;
371
+ }
372
}
373
break;
374
case 4: /* Auto-neg advertisement */
375
s->advertise = (val & 0x2d7f) | 0x80;
376
break;
377
- /* TODO 17, 18, 27, 31 */
378
case 30: /* Interrupt mask */
379
s->int_mask = val & 0xff;
380
lan9118_phy_update_irq(s);
381
break;
382
+ case 17:
383
+ case 18:
384
+ case 27:
385
+ case 31:
386
+ qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
387
+ __func__, reg);
388
+ break;
389
default:
390
- qemu_log_mask(LOG_GUEST_ERROR,
391
- "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
392
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
393
+ __func__, reg);
394
+ break;
395
}
396
}
397
398
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
399
400
/* Autonegotiation status mirrors link status. */
401
if (link_down) {
402
+ trace_lan9118_phy_update_link("down");
403
s->status &= ~0x0024;
404
s->ints |= PHY_INT_DOWN;
405
} else {
406
+ trace_lan9118_phy_update_link("up");
407
s->status |= 0x0024;
408
s->ints |= PHY_INT_ENERGYON;
409
s->ints |= PHY_INT_AUTONEG_COMPLETE;
410
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
411
412
void lan9118_phy_reset(Lan9118PhyState *s)
413
{
414
+ trace_lan9118_phy_reset();
415
+
416
s->control = 0x3000;
417
s->status = 0x7809;
418
s->advertise = 0x01e1;
419
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_phy = {
420
.version_id = 1,
421
.minimum_version_id = 1,
422
.fields = (const VMStateField[]) {
423
- VMSTATE_UINT16(control, Lan9118PhyState),
424
VMSTATE_UINT16(status, Lan9118PhyState),
425
+ VMSTATE_UINT16(control, Lan9118PhyState),
426
VMSTATE_UINT16(advertise, Lan9118PhyState),
427
VMSTATE_UINT16(ints, Lan9118PhyState),
428
VMSTATE_UINT16(int_mask, Lan9118PhyState),
429
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
430
index XXXXXXX..XXXXXXX 100644
431
--- a/hw/net/Kconfig
432
+++ b/hw/net/Kconfig
433
@@ -XXX,XX +XXX,XX @@ config ALLWINNER_SUN8I_EMAC
434
435
config IMX_FEC
436
bool
437
+ select LAN9118_PHY
438
439
config CADENCE
440
bool
441
diff --git a/hw/net/trace-events b/hw/net/trace-events
442
index XXXXXXX..XXXXXXX 100644
443
--- a/hw/net/trace-events
444
+++ b/hw/net/trace-events
445
@@ -XXX,XX +XXX,XX @@ allwinner_sun8i_emac_set_link(bool active) "Set link: active=%u"
446
allwinner_sun8i_emac_read(uint64_t offset, uint64_t val) "MMIO read: offset=0x%" PRIx64 " value=0x%" PRIx64
447
allwinner_sun8i_emac_write(uint64_t offset, uint64_t val) "MMIO write: offset=0x%" PRIx64 " value=0x%" PRIx64
448
449
+# lan9118_phy.c
450
+lan9118_phy_read(uint16_t val, int reg) "[0x%02x] -> 0x%04" PRIx16
451
+lan9118_phy_write(uint16_t val, int reg) "[0x%02x] <- 0x%04" PRIx16
452
+lan9118_phy_update_link(const char *s) "%s"
453
+lan9118_phy_reset(void) ""
454
+
455
# lance.c
456
lance_mem_readw(uint64_t addr, uint32_t ret) "addr=0x%"PRIx64"val=0x%04x"
457
lance_mem_writew(uint64_t addr, uint32_t val) "addr=0x%"PRIx64"val=0x%04x"
458
@@ -XXX,XX +XXX,XX @@ i82596_set_multicast(uint16_t count) "Added %d multicast entries"
459
i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
460
461
# imx_fec.c
462
-imx_phy_read(uint32_t val, int phy, int reg) "0x%04"PRIx32" <= phy[%d].reg[%d]"
463
imx_phy_read_num(int phy, int configured) "read request from unconfigured phy %d (configured %d)"
464
-imx_phy_write(uint32_t val, int phy, int reg) "0x%04"PRIx32" => phy[%d].reg[%d]"
465
imx_phy_write_num(int phy, int configured) "write request to unconfigured phy %d (configured %d)"
466
-imx_phy_update_link(const char *s) "%s"
467
-imx_phy_reset(void) ""
468
imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
469
imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
470
imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
471
--
472
2.34.1
1
From: Bernhard Beschow <shentey@gmail.com>
1
2
3
Turns 0x70 into 0xe0 (== 0x70 << 1) which adds the missing MII_ANLPAR_TX and
4
fixes the MSB of selector field to be zero, as specified in the datasheet.
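
The arithmetic above can be checked with a small standalone sketch. The EX_ANLPAR_* names are local to this example; their bit positions are assumed to follow the standard MII link partner ability layout (selector in bits 4:0, 10 Mbps half duplex at bit 5, 100 Mbps half duplex at bit 7).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local illustrative definitions (assumed standard MII ANLPAR layout). */
#define EX_ANLPAR_TX      (1u << 7)   /* 100 Mbps half duplex */
#define EX_ANLPAR_SELECT  0x001fu     /* selector field, bits 4:0 */
#define EX_ANLPAR_CSMACD  0x0001u     /* selector value for IEEE 802.3 */

/* Extract the selector field. */
static unsigned ex_anlpar_selector(uint16_t v)
{
    return v & EX_ANLPAR_SELECT;
}

/* Does the link partner advertise 100 Mbps half duplex? */
static bool ex_anlpar_has_100tx_hd(uint16_t v)
{
    return (v & EX_ANLPAR_TX) != 0;
}

/* Shift the 0x70 bit group left by one, leaving all other bits alone. */
static uint16_t ex_fix_anlpar(uint16_t old)
{
    return (uint16_t)((old & ~0x00f0u) | ((old & 0x0070u) << 1));
}
```

With the old value 0x0f71 the selector reads as 0x11 and the 100 Mbps half duplex bit is clear; after the shift the value is 0x0fe1, the selector is 0x01 and bit 7 is set.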
5
6
Fixes: 2a424990170b "LAN9118 emulation"
7
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
8
Tested-by: Guenter Roeck <linux@roeck-us.net>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Message-id: 20241102125724.532843-4-shentey@gmail.com
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
hw/net/lan9118_phy.c | 2 +-
14
1 file changed, 1 insertion(+), 1 deletion(-)
15
16
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/net/lan9118_phy.c
19
+++ b/hw/net/lan9118_phy.c
20
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
21
val = s->advertise;
22
break;
23
case 5: /* Auto-neg Link Partner Ability */
24
- val = 0x0f71;
25
+ val = 0x0fe1;
26
break;
27
case 6: /* Auto-neg Expansion */
28
val = 1;
29
--
30
2.34.1
1
From: Bernhard Beschow <shentey@gmail.com>
1
2
3
Prefer named constants over magic values for better readability.
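
As a sanity check of such a conversion, the named forms of the reset values can be compared with the magic numbers they replace at compile time. This is a standalone sketch: the EX_* macros are local copies written for this example, with bit positions assumed to match the standard MII BMCR, BMSR and ANAR layouts.

```c
#include <stdint.h>

/* Local illustrative copies of the MII bit definitions (assumed layout). */
#define EX_BMCR_SPEED100  (1u << 13)
#define EX_BMCR_AUTOEN    (1u << 12)

#define EX_BMSR_100TX_FD  (1u << 14)
#define EX_BMSR_100TX_HD  (1u << 13)
#define EX_BMSR_10T_FD    (1u << 12)
#define EX_BMSR_10T_HD    (1u << 11)
#define EX_BMSR_AN_COMP   (1u << 5)
#define EX_BMSR_AUTONEG   (1u << 3)
#define EX_BMSR_LINK_ST   (1u << 2)
#define EX_BMSR_EXTCAP    (1u << 0)

#define EX_ANAR_TXFD      (1u << 8)
#define EX_ANAR_TX        (1u << 7)
#define EX_ANAR_10FD      (1u << 6)
#define EX_ANAR_10        (1u << 5)
#define EX_ANAR_CSMACD    (1u << 0)

/* Named forms of the reset values and of the link status mask. */
#define EX_RESET_CONTROL   (EX_BMCR_AUTOEN | EX_BMCR_SPEED100)
#define EX_RESET_STATUS    (EX_BMSR_100TX_FD | EX_BMSR_100TX_HD | \
                            EX_BMSR_10T_FD | EX_BMSR_10T_HD | \
                            EX_BMSR_AUTONEG | EX_BMSR_EXTCAP)
#define EX_RESET_ADVERTISE (EX_ANAR_TXFD | EX_ANAR_TX | EX_ANAR_10FD | \
                            EX_ANAR_10 | EX_ANAR_CSMACD)
#define EX_LINK_UP_BITS    (EX_BMSR_AN_COMP | EX_BMSR_LINK_ST)

/* The build fails if a named expression drifts from the old magic value. */
_Static_assert(EX_RESET_CONTROL == 0x3000, "control reset value changed");
_Static_assert(EX_RESET_STATUS == 0x7809, "status reset value changed");
_Static_assert(EX_RESET_ADVERTISE == 0x01e1, "advertise reset value changed");
_Static_assert(EX_LINK_UP_BITS == 0x0024, "link status mask changed");
```

A check like this is only useful while converting; once the constants are in place the named expressions are the source of truth.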
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
7
Tested-by: Guenter Roeck <linux@roeck-us.net>
8
Message-id: 20241102125724.532843-5-shentey@gmail.com
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
include/hw/net/mii.h | 6 +++++
12
hw/net/lan9118_phy.c | 63 ++++++++++++++++++++++++++++----------------
13
2 files changed, 46 insertions(+), 23 deletions(-)
14
15
diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/include/hw/net/mii.h
18
+++ b/include/hw/net/mii.h
19
@@ -XXX,XX +XXX,XX @@
20
#define MII_BMSR_JABBER (1 << 1) /* Jabber detected */
21
#define MII_BMSR_EXTCAP (1 << 0) /* Ext-reg capability */
22
23
+#define MII_ANAR_RFAULT (1 << 13) /* Say we can detect faults */
24
#define MII_ANAR_PAUSE_ASYM (1 << 11) /* Try for asymmetric pause */
25
#define MII_ANAR_PAUSE (1 << 10) /* Try for pause */
26
#define MII_ANAR_TXFD (1 << 8)
27
@@ -XXX,XX +XXX,XX @@
28
#define MII_ANAR_10FD (1 << 6)
29
#define MII_ANAR_10 (1 << 5)
30
#define MII_ANAR_CSMACD (1 << 0)
31
+#define MII_ANAR_SELECT (0x001f) /* Selector bits */
32
33
#define MII_ANLPAR_ACK (1 << 14)
34
#define MII_ANLPAR_PAUSEASY (1 << 11) /* can pause asymmetrically */
35
@@ -XXX,XX +XXX,XX @@
36
#define RTL8201CP_PHYID1 0x0000
37
#define RTL8201CP_PHYID2 0x8201
38
39
+/* SMSC LAN9118 */
40
+#define SMSCLAN9118_PHYID1 0x0007
41
+#define SMSCLAN9118_PHYID2 0xc0d1
42
+
43
/* RealTek 8211E */
44
#define RTL8211E_PHYID1 0x001c
45
#define RTL8211E_PHYID2 0xc915
46
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/hw/net/lan9118_phy.c
49
+++ b/hw/net/lan9118_phy.c
50
@@ -XXX,XX +XXX,XX @@
51
52
#include "qemu/osdep.h"
53
#include "hw/net/lan9118_phy.h"
54
+#include "hw/net/mii.h"
55
#include "hw/irq.h"
56
#include "hw/resettable.h"
57
#include "migration/vmstate.h"
58
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
59
uint16_t val;
60
61
switch (reg) {
62
- case 0: /* Basic Control */
63
+ case MII_BMCR:
64
val = s->control;
65
break;
66
- case 1: /* Basic Status */
67
+ case MII_BMSR:
68
val = s->status;
69
break;
70
- case 2: /* ID1 */
71
- val = 0x0007;
72
+ case MII_PHYID1:
73
+ val = SMSCLAN9118_PHYID1;
74
break;
75
- case 3: /* ID2 */
76
- val = 0xc0d1;
77
+ case MII_PHYID2:
78
+ val = SMSCLAN9118_PHYID2;
79
break;
80
- case 4: /* Auto-neg advertisement */
81
+ case MII_ANAR:
82
val = s->advertise;
83
break;
84
- case 5: /* Auto-neg Link Partner Ability */
85
- val = 0x0fe1;
86
+ case MII_ANLPAR:
87
+ val = MII_ANLPAR_PAUSEASY | MII_ANLPAR_PAUSE | MII_ANLPAR_T4 |
88
+ MII_ANLPAR_TXFD | MII_ANLPAR_TX | MII_ANLPAR_10FD |
89
+ MII_ANLPAR_10 | MII_ANLPAR_CSMACD;
90
break;
91
- case 6: /* Auto-neg Expansion */
92
- val = 1;
93
+ case MII_ANER:
94
+ val = MII_ANER_NWAY;
95
break;
96
case 29: /* Interrupt source. */
97
val = s->ints;
98
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
99
trace_lan9118_phy_write(val, reg);
100
101
switch (reg) {
102
- case 0: /* Basic Control */
103
- if (val & 0x8000) {
104
+ case MII_BMCR:
105
+ if (val & MII_BMCR_RESET) {
106
lan9118_phy_reset(s);
107
} else {
108
- s->control = val & 0x7980;
109
+ s->control = val & (MII_BMCR_LOOPBACK | MII_BMCR_SPEED100 |
110
+ MII_BMCR_AUTOEN | MII_BMCR_PDOWN | MII_BMCR_FD |
111
+ MII_BMCR_CTST);
112
/* Complete autonegotiation immediately. */
113
- if (val & 0x1000) {
114
- s->status |= 0x0020;
115
+ if (val & MII_BMCR_AUTOEN) {
116
+ s->status |= MII_BMSR_AN_COMP;
117
}
118
}
119
break;
120
- case 4: /* Auto-neg advertisement */
121
- s->advertise = (val & 0x2d7f) | 0x80;
122
+ case MII_ANAR:
123
+ s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
124
+ MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
125
+ MII_ANAR_SELECT))
126
+ | MII_ANAR_TX;
127
break;
128
case 30: /* Interrupt mask */
129
s->int_mask = val & 0xff;
130
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
131
/* Autonegotiation status mirrors link status. */
132
if (link_down) {
133
trace_lan9118_phy_update_link("down");
134
- s->status &= ~0x0024;
135
+ s->status &= ~(MII_BMSR_AN_COMP | MII_BMSR_LINK_ST);
136
s->ints |= PHY_INT_DOWN;
137
} else {
138
trace_lan9118_phy_update_link("up");
139
- s->status |= 0x0024;
140
+ s->status |= MII_BMSR_AN_COMP | MII_BMSR_LINK_ST;
141
s->ints |= PHY_INT_ENERGYON;
142
s->ints |= PHY_INT_AUTONEG_COMPLETE;
143
}
144
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_reset(Lan9118PhyState *s)
145
{
146
trace_lan9118_phy_reset();
147
148
- s->control = 0x3000;
149
- s->status = 0x7809;
150
- s->advertise = 0x01e1;
151
+ s->control = MII_BMCR_AUTOEN | MII_BMCR_SPEED100;
152
+ s->status = MII_BMSR_100TX_FD
153
+ | MII_BMSR_100TX_HD
154
+ | MII_BMSR_10T_FD
155
+ | MII_BMSR_10T_HD
156
+ | MII_BMSR_AUTONEG
157
+ | MII_BMSR_EXTCAP;
158
+ s->advertise = MII_ANAR_TXFD
159
+ | MII_ANAR_TX
160
+ | MII_ANAR_10FD
161
+ | MII_ANAR_10
162
+ | MII_ANAR_CSMACD;
163
s->int_mask = 0;
164
s->ints = 0;
165
lan9118_phy_update_link(s, s->link_down);
166
--
167
2.34.1
1
From: Bernhard Beschow <shentey@gmail.com>
1
2
3
The real device advertises 100 Mbps full duplex (MII_ANAR_TXFD) and the device model already advertises
4
100 Mbps half duplex and 10 Mbps full+half duplex. So advertise this mode to
5
make the model more realistic.
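
The effect of the writable mask can be sketched in isolation. The DEMO_* names and demo_anar_write() are hypothetical and exist only for this example; the bit positions are assumed to follow the standard MII ANAR layout, and the two masks mirror the before and after forms in the hunk below.

```c
#include <stdint.h>

/* Local illustrative ANAR bits (assumed standard MII layout). */
#define DEMO_ANAR_RFAULT      (1u << 13)
#define DEMO_ANAR_PAUSE_ASYM  (1u << 11)
#define DEMO_ANAR_PAUSE       (1u << 10)
#define DEMO_ANAR_TXFD        (1u << 8)   /* 100 Mbps full duplex */
#define DEMO_ANAR_TX          (1u << 7)   /* 100 Mbps half duplex */
#define DEMO_ANAR_10FD        (1u << 6)
#define DEMO_ANAR_10          (1u << 5)
#define DEMO_ANAR_SELECT      0x001fu

/* Writable bits before the change: TXFD is not guest-writable. */
#define DEMO_MASK_OLD (DEMO_ANAR_RFAULT | DEMO_ANAR_PAUSE_ASYM | \
                       DEMO_ANAR_PAUSE | DEMO_ANAR_10FD | \
                       DEMO_ANAR_10 | DEMO_ANAR_SELECT)
/* Writable bits after the change: TXFD is kept. */
#define DEMO_MASK_NEW (DEMO_MASK_OLD | DEMO_ANAR_TXFD)

/* Model of the register write: keep writable bits, force the TX bit on. */
static uint16_t demo_anar_write(uint16_t val, uint16_t writable_mask)
{
    return (uint16_t)((val & writable_mask) | DEMO_ANAR_TX);
}
```

If a guest writes back the reset value 0x01e1, the old mask stores 0x00e1 (100 Mbps full duplex silently dropped) while the new mask stores 0x01e1 unchanged.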
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
9
Tested-by: Guenter Roeck <linux@roeck-us.net>
10
Message-id: 20241102125724.532843-6-shentey@gmail.com
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
hw/net/lan9118_phy.c | 4 ++--
14
1 file changed, 2 insertions(+), 2 deletions(-)
15
16
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/net/lan9118_phy.c
19
+++ b/hw/net/lan9118_phy.c
20
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
21
break;
22
case MII_ANAR:
23
s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
24
- MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
25
- MII_ANAR_SELECT))
26
+ MII_ANAR_PAUSE | MII_ANAR_TXFD | MII_ANAR_10FD |
27
+ MII_ANAR_10 | MII_ANAR_SELECT))
28
| MII_ANAR_TX;
29
break;
30
case 30: /* Interrupt mask */
31
--
32
2.34.1
1
Implement the MVE VADDV insn, which performs an addition
1
For IEEE fused multiply-add, the (0 * inf) + NaN case should raise
2
across vector lanes.
2
Invalid for the multiplication of 0 by infinity. Currently we handle
3
this in the per-architecture ifdef ladder in pickNaNMulAdd().
4
However, since this isn't really architecture specific we can hoist
5
it up to the generic code.
6
7
For the cases where the infzero test in pickNaNMulAdd was
8
returning 2, we can delete the check entirely and allow the
9
code to fall into the normal pick-a-NaN handling, because this
10
will return 2 anyway (input 'c' being the only NaN in this case).
11
For the cases where infzero was returning 3 to indicate "return
12
the default NaN", we must retain that "return 3".
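
The split between generic flag raising and per-target selection can be modelled with a toy sketch. This is not the softfloat code: the ExClass enum, the EX_FLAG_* values and both rule functions are invented for this example. It assumes, as in the real caller, that at least one input is a NaN, so in the infzero case the NaN must be 'c'.

```c
#include <stdbool.h>

typedef enum { EX_ZERO, EX_NORMAL, EX_INF, EX_QNAN, EX_SNAN } ExClass;

enum {
    EX_FLAG_INVALID      = 1 << 0,
    EX_FLAG_INVALID_IMZ  = 1 << 1,
    EX_FLAG_INVALID_SNAN = 1 << 2,
};

/* A rule returns 0, 1 or 2 to pick a, b or c, or 3 for the default NaN. */
typedef int (*ExPickRule)(ExClass a, ExClass b, ExClass c, bool infzero);

static bool ex_is_nan(ExClass c)
{
    return c == EX_QNAN || c == EX_SNAN;
}

/* Toy rule that needs no infzero special case: prefer c, then a, then b. */
static int ex_rule_prefer_c(ExClass a, ExClass b, ExClass c, bool infzero)
{
    (void)b;
    (void)infzero;
    if (ex_is_nan(c)) {
        return 2;
    }
    return ex_is_nan(a) ? 0 : 1;
}

/* Toy rule that must keep its "return 3" for the infzero case. */
static int ex_rule_default_on_infzero(ExClass a, ExClass b, ExClass c,
                                      bool infzero)
{
    if (infzero) {
        return 3;
    }
    return ex_rule_prefer_c(a, b, c, false);
}

/* Generic part: raise the flags once, then ask the rule which NaN to use. */
static int ex_pick_nan_muladd(ExClass a, ExClass b, ExClass c,
                              ExPickRule rule, int *flags)
{
    bool infzero = (a == EX_ZERO && b == EX_INF) ||
                   (a == EX_INF && b == EX_ZERO);

    if (a == EX_SNAN || b == EX_SNAN || c == EX_SNAN) {
        *flags |= EX_FLAG_INVALID | EX_FLAG_INVALID_SNAN;
    }
    if (infzero) {
        /* (0 * inf) + NaN or (inf * 0) + NaN */
        *flags |= EX_FLAG_INVALID | EX_FLAG_INVALID_IMZ;
    }
    return rule(a, b, c, infzero);
}
```

Both rules see the same flags for (0 * inf) + QNaN; they differ only in whether the result is input 'c' (2) or the default NaN (3).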
13
14
For Arm, this looks like it might be a behaviour change because we
15
used to set float_flag_invalid | float_flag_invalid_imz only if C is
16
a quiet NaN. However, it is not, because Arm target code never looks
17
at float_flag_invalid_imz, and for the (0 * inf) + SNaN case we
18
already raised float_flag_invalid via the "abc_mask &
19
float_cmask_snan" check in pick_nan_muladd.
20
21
For any target architecture using the "default implementation" at the
22
bottom of the ifdef, this is a behaviour change but will be fixing a
23
bug (where we failed to raise the Invalid exception for (0 * inf) +
24
QNaN). The architectures using the default case are:
25
* hppa
26
* i386
27
* sh4
28
* tricore
29
30
The x86, Tricore and SH4 CPU architecture manuals are clear that this
31
should have raised Invalid; HPPA is a bit vaguer but still seems
32
clear enough.
3
33
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
34
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
35
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-44-peter.maydell@linaro.org
36
Message-id: 20241202131347.498124-2-peter.maydell@linaro.org
7
---
37
---
8
target/arm/helper-mve.h | 7 +++++++
38
fpu/softfloat-parts.c.inc | 13 +++++++------
9
target/arm/mve.decode | 2 ++
39
fpu/softfloat-specialize.c.inc | 29 +----------------------------
10
target/arm/mve_helper.c | 24 +++++++++++++++++++++
40
2 files changed, 8 insertions(+), 34 deletions(-)
11
target/arm/translate-mve.c | 43 ++++++++++++++++++++++++++++++++++++++
12
4 files changed, 76 insertions(+)
13
41
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
42
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
15
index XXXXXXX..XXXXXXX 100644
43
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
44
--- a/fpu/softfloat-parts.c.inc
17
+++ b/target/arm/helper-mve.h
45
+++ b/fpu/softfloat-parts.c.inc
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrmlaldavhuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
46
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
19
47
int ab_mask, int abc_mask)
20
DEF_HELPER_FLAGS_4(mve_vrmlsldavhsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
48
{
21
DEF_HELPER_FLAGS_4(mve_vrmlsldavhxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
49
int which;
22
+
50
+ bool infzero = (ab_mask == float_cmask_infzero);
23
+DEF_HELPER_FLAGS_3(mve_vaddvsb, TCG_CALL_NO_WG, i32, env, ptr, i32)
51
24
+DEF_HELPER_FLAGS_3(mve_vaddvub, TCG_CALL_NO_WG, i32, env, ptr, i32)
52
if (unlikely(abc_mask & float_cmask_snan)) {
25
+DEF_HELPER_FLAGS_3(mve_vaddvsh, TCG_CALL_NO_WG, i32, env, ptr, i32)
53
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
26
+DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
54
}
27
+DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
55
28
+DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
56
- which = pickNaNMulAdd(a->cls, b->cls, c->cls,
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
57
- ab_mask == float_cmask_infzero, s);
30
index XXXXXXX..XXXXXXX 100644
58
+ if (infzero) {
31
--- a/target/arm/mve.decode
59
+ /* This is (0 * inf) + NaN or (inf * 0) + NaN */
32
+++ b/target/arm/mve.decode
60
+ float_raise(float_flag_invalid | float_flag_invalid_imz, s);
33
@@ -XXX,XX +XXX,XX @@ VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
34
VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
35
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
36
37
+# Vector add across vector
38
+VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
39
40
# Predicate operations
41
%mask_22_13 22:1 13:3
42
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/mve_helper.c
45
+++ b/target/arm/mve_helper.c
46
@@ -XXX,XX +XXX,XX @@ DO_LDAVH(vrmlaldavhuw, 4, uint32_t, false, int128_add, int128_add, int128_make64
47
48
DO_LDAVH(vrmlsldavhsw, 4, int32_t, false, int128_add, int128_sub, int128_makes64)
49
DO_LDAVH(vrmlsldavhxsw, 4, int32_t, true, int128_add, int128_sub, int128_makes64)
50
+
51
+/* Vector add across vector */
52
+#define DO_VADDV(OP, ESIZE, TYPE) \
53
+ uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
54
+ uint32_t ra) \
55
+ { \
56
+ uint16_t mask = mve_element_mask(env); \
57
+ unsigned e; \
58
+ TYPE *m = vm; \
59
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
60
+ if (mask & 1) { \
61
+ ra += m[H##ESIZE(e)]; \
62
+ } \
63
+ } \
64
+ mve_advance_vpt(env); \
65
+ return ra; \
66
+ } \
67
+
68
+DO_VADDV(vaddvsb, 1, uint8_t)
69
+DO_VADDV(vaddvsh, 2, uint16_t)
70
+DO_VADDV(vaddvsw, 4, uint32_t)
71
+DO_VADDV(vaddvub, 1, uint8_t)
72
+DO_VADDV(vaddvuh, 2, uint16_t)
73
+DO_VADDV(vaddvuw, 4, uint32_t)
74
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
75
index XXXXXXX..XXXXXXX 100644
76
--- a/target/arm/translate-mve.c
77
+++ b/target/arm/translate-mve.c
78
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
79
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
80
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
81
typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
82
+typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
83
84
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
85
static inline long mve_qreg_offset(unsigned reg)
86
@@ -XXX,XX +XXX,XX @@ static bool trans_VPST(DisasContext *s, arg_VPST *a)
87
mve_update_and_store_eci(s);
88
return true;
89
}
90
+
91
+static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
92
+{
93
+ /* VADDV: vector add across vector */
94
+ static MVEGenVADDVFn * const fns[4][2] = {
95
+ { gen_helper_mve_vaddvsb, gen_helper_mve_vaddvub },
96
+ { gen_helper_mve_vaddvsh, gen_helper_mve_vaddvuh },
97
+ { gen_helper_mve_vaddvsw, gen_helper_mve_vaddvuw },
98
+ { NULL, NULL }
99
+ };
100
+ TCGv_ptr qm;
101
+ TCGv_i32 rda;
102
+
103
+ if (!dc_isar_feature(aa32_mve, s) ||
104
+ a->size == 3) {
105
+ return false;
106
+ }
107
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
108
+ return true;
109
+ }
61
+ }
110
+
62
+
111
+ /*
63
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
112
+ * This insn is subject to beat-wise execution. Partial execution
64
113
+ * of an A=0 (no-accumulate) insn which does not execute the first
65
if (s->default_nan_mode || which == 3) {
114
+ * beat must start with the current value of Rda, not zero.
66
- /*
115
+ */
67
- * Note that this check is after pickNaNMulAdd so that function
116
+ if (a->a || mve_skip_first_beat(s)) {
68
- * has an opportunity to set the Invalid flag for infzero.
117
+ /* Accumulate input from Rda */
69
- */
118
+ rda = load_reg(s, a->rda);
70
parts_default_nan(a, s);
119
+ } else {
71
return a;
120
+ /* Accumulate starting at zero */
72
}
121
+ rda = tcg_const_i32(0);
73
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
122
+ }
74
index XXXXXXX..XXXXXXX 100644
75
--- a/fpu/softfloat-specialize.c.inc
76
+++ b/fpu/softfloat-specialize.c.inc
77
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
78
* the default NaN
79
*/
80
if (infzero && is_qnan(c_cls)) {
81
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
82
return 3;
83
}
84
85
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
86
* case sets InvalidOp and returns the default NaN
87
*/
88
if (infzero) {
89
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
90
return 3;
91
}
92
/* Prefer sNaN over qNaN, in the a, b, c order. */
93
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
94
* For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
95
* case sets InvalidOp and returns the input value 'c'
96
*/
97
- if (infzero) {
98
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
99
- return 2;
100
- }
101
/* Prefer sNaN over qNaN, in the c, a, b order. */
102
if (is_snan(c_cls)) {
103
return 2;
104
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
105
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
106
* case sets InvalidOp and returns the input value 'c'
107
*/
108
- if (infzero) {
109
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
110
- return 2;
111
- }
123
+
112
+
124
+ qm = mve_qreg_ptr(a->qm);
113
/* Prefer sNaN over qNaN, in the c, a, b order. */
125
+ fns[a->size][a->u](rda, cpu_env, qm, rda);
114
if (is_snan(c_cls)) {
126
+ store_reg(s, a->rda, rda);
115
return 2;
127
+ tcg_temp_free_ptr(qm);
116
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
128
+
117
* to return an input NaN if we have one (ie c) rather than generating
129
+ mve_update_eci(s);
118
* a default NaN
130
+ return true;
119
*/
131
+}
120
- if (infzero) {
121
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
122
- return 2;
123
- }
124
125
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
126
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
127
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
128
return 1;
129
}
130
#elif defined(TARGET_RISCV)
131
- /* For RISC-V, InvalidOp is set when multiplicands are Inf and zero */
132
- if (infzero) {
133
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
134
- }
135
return 3; /* default NaN */
136
#elif defined(TARGET_S390X)
137
if (infzero) {
138
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
139
return 3;
140
}
141
142
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
143
return 2;
144
}
145
#elif defined(TARGET_SPARC)
146
- /* For (inf,0,nan) return c. */
147
- if (infzero) {
148
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
149
- return 2;
150
- }
151
/* Prefer SNaN over QNaN, order C, B, A. */
152
if (is_snan(c_cls)) {
153
return 2;
154
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
155
* For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
156
* an input NaN if we have one (ie c).
157
*/
158
- if (infzero) {
159
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
160
- return 2;
161
- }
162
if (status->use_first_nan) {
163
if (is_nan(a_cls)) {
164
return 0;
132
--
165
--
133
2.20.1
166
2.34.1
134
135
New patch
1
If the target sets default_nan_mode then we're always going to return
2
the default NaN, and pickNaNMulAdd() no longer has any side effects.
3
For consistency with pickNaN(), check for default_nan_mode before
4
calling pickNaNMulAdd().
1
5
6
When we convert pickNaNMulAdd() to allow runtime selection of the NaN
7
propagation rule, this means we won't have to make the targets which
8
use default_nan_mode also set a propagation rule.
9
10
Since RISC-V always uses default_nan_mode, this allows us to remove
11
its ifdef case from pickNaNMulAdd().
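
A minimal sketch of the control flow this patch sets up, for illustration
only. The names sketch_status, sketch_pick_nan_muladd() and sketch_select()
are hypothetical stand-ins, not QEMU's real types or functions, and the
"prefer c, then a, then b" order is just a placeholder target rule. The
return values follow the patch: 0, 1 or 2 selects input a, b or c, and 3
means the default NaN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for float_status; only the field used here. */
typedef struct {
    bool default_nan_mode;
} sketch_status;

/* 0, 1, 2 select input a, b, c; 3 selects the default NaN. */
enum { PICK_A = 0, PICK_B = 1, PICK_C = 2, PICK_DEFAULT = 3 };

/*
 * Stand-in for pickNaNMulAdd(). It must never be reached in
 * default-NaN mode, so it asserts that, as the patch does.
 */
static int sketch_pick_nan_muladd(bool a_nan, bool b_nan, bool c_nan,
                                  const sketch_status *s)
{
    assert(!s->default_nan_mode);
    (void)b_nan;
    /* Placeholder rule (an assumption): prefer c, then a, then b. */
    if (c_nan) {
        return PICK_C;
    }
    if (a_nan) {
        return PICK_A;
    }
    return PICK_B;
}

/* Caller side: test default_nan_mode first, then ask for a rule. */
static int sketch_select(bool a_nan, bool b_nan, bool c_nan,
                         const sketch_status *s)
{
    if (s->default_nan_mode) {
        return PICK_DEFAULT;
    }
    return sketch_pick_nan_muladd(a_nan, b_nan, c_nan, s);
}
```

With the check hoisted into the caller, a target that always runs in
default-NaN mode never reaches the picker and so never needs to supply
a propagation rule.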
12
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Message-id: 20241202131347.498124-3-peter.maydell@linaro.org
16
---
17
fpu/softfloat-parts.c.inc | 8 ++++++--
18
fpu/softfloat-specialize.c.inc | 9 +++++++--
19
2 files changed, 13 insertions(+), 4 deletions(-)
20
21
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
22
index XXXXXXX..XXXXXXX 100644
23
--- a/fpu/softfloat-parts.c.inc
24
+++ b/fpu/softfloat-parts.c.inc
25
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
26
float_raise(float_flag_invalid | float_flag_invalid_imz, s);
27
}
28
29
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
30
+ if (s->default_nan_mode) {
31
+ which = 3;
32
+ } else {
33
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
34
+ }
35
36
- if (s->default_nan_mode || which == 3) {
37
+ if (which == 3) {
38
parts_default_nan(a, s);
39
return a;
40
}
41
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
42
index XXXXXXX..XXXXXXX 100644
43
--- a/fpu/softfloat-specialize.c.inc
44
+++ b/fpu/softfloat-specialize.c.inc
45
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
46
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
47
bool infzero, float_status *status)
48
{
49
+ /*
50
+ * We guarantee not to require the target to tell us how to
51
+ * pick a NaN if we're always returning the default NaN.
52
+ * But if we're not in default-NaN mode then the target must
53
+ * specify.
54
+ */
55
+ assert(!status->default_nan_mode);
56
#if defined(TARGET_ARM)
57
/* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
58
* the default NaN
59
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
60
} else {
61
return 1;
62
}
63
-#elif defined(TARGET_RISCV)
64
- return 3; /* default NaN */
65
#elif defined(TARGET_S390X)
66
if (infzero) {
67
return 3;
68
--
69
2.34.1
1
Implement the forms of the MVE VLDR and VSTR insns which perform
1
IEEE 754 does not define a fixed rule for what NaN to return in
2
non-widening loads of bytes, halfwords or words from memory into
2
the case of a fused multiply-add of inf * 0 + NaN. Different
3
vector elements of the same width (encodings T5, T6, T7).
3
architectures thus do different things:
4
4
* some return the default NaN
5
(At the moment we know for MVE and M-profile in general that
5
* some return the input NaN
6
vfp_access_check() can never return false, but we include the
6
* Arm returns the default NaN if the input NaN is quiet,
7
conventional return-true-on-failure check for consistency
7
and the input NaN if it is signalling
8
with non-M-profile translation code.)
8
9
We want to make this logic be runtime selected rather than
10
hardcoded into the binary, because:
11
* this will let us have multiple targets in one QEMU binary
12
* the Arm FEAT_AFP architectural feature includes letting
13
the guest select a NaN propagation rule at runtime
14
15
In this commit we add an enum for the propagation rule, the field in
16
float_status, and the corresponding getters and setters. We change
17
pickNaNMulAdd to honour this, but because all targets still leave
18
this field at its default 0 value, the fallback logic will pick the
19
rule type with the old ifdef ladder.
20
21
Note that four architectures both use the muladd softfloat functions
22
and did not have a branch of the ifdef ladder to specify their
23
behaviour (and so were ending up with the "default" case, probably
24
wrongly): i386, HPPA, SH4 and Tricore. SH4 and Tricore both set
25
default_nan_mode, and so will never get into pickNaNMulAdd(). For
26
HPPA and i386 we retain the same behaviour as the old default-case,
27
which is to not ever return the default NaN. This might not be
28
correct but it is not a behaviour change.
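
As a rough illustration of the three behaviours listed above, the
following sketch maps a rule value to the pickNaNMulAdd()-style result
for the inf * 0 + NaN case (2 for the input NaN 'c', 3 for the default
NaN). The enum and function names here are hypothetical and only mirror
the shape of the enum added by this patch; this is not the QEMU code
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the rule enum; 0 means "no rule chosen". */
typedef enum {
    SKETCH_INFZERONAN_NONE = 0,
    SKETCH_INFZERONAN_DNAN_NEVER,   /* always the input NaN */
    SKETCH_INFZERONAN_DNAN_ALWAYS,  /* always the default NaN */
    SKETCH_INFZERONAN_DNAN_IF_QNAN, /* default NaN only for a quiet input */
} SketchInfZeroNaNRule;

/*
 * Result for inf * 0 + NaN: 2 selects input 'c', 3 selects the
 * default NaN. An unset rule is treated as a programming error.
 */
static int sketch_infzeronan_result(SketchInfZeroNaNRule rule, bool c_is_qnan)
{
    switch (rule) {
    case SKETCH_INFZERONAN_DNAN_NEVER:
        return 2;
    case SKETCH_INFZERONAN_DNAN_ALWAYS:
        return 3;
    case SKETCH_INFZERONAN_DNAN_IF_QNAN:
        return c_is_qnan ? 3 : 2;
    default:
        assert(0 && "no inf-zero-NaN rule selected");
        return -1;
    }
}
```

Keeping 0 as the "none" value means a zero-initialised status is
detectably unconfigured rather than silently picking one behaviour.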
9
29
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
30
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
31
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20210617121628.20116-2-peter.maydell@linaro.org
32
Message-id: 20241202131347.498124-4-peter.maydell@linaro.org
13
---
33
---
14
target/arm/{translate-mve.c => helper-mve.h} | 19 +-
34
include/fpu/softfloat-helpers.h | 11 ++++
15
target/arm/helper.h | 2 +
35
include/fpu/softfloat-types.h | 23 +++++++++
16
target/arm/internals.h | 11 ++
36
fpu/softfloat-specialize.c.inc | 91 ++++++++++++++++++++++-----------
17
target/arm/mve.decode | 22 +++
37
3 files changed, 95 insertions(+), 30 deletions(-)
18
target/arm/mve_helper.c | 172 +++++++++++++++++++
38
19
target/arm/translate-mve.c | 119 +++++++++++++
39
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
20
target/arm/meson.build | 1 +
21
7 files changed, 334 insertions(+), 12 deletions(-)
22
copy target/arm/{translate-mve.c => helper-mve.h} (61%)
23
create mode 100644 target/arm/mve_helper.c
24
25
diff --git a/target/arm/translate-mve.c b/target/arm/helper-mve.h
26
similarity index 61%
27
copy from target/arm/translate-mve.c
28
copy to target/arm/helper-mve.h
29
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-mve.c
41
--- a/include/fpu/softfloat-helpers.h
31
+++ b/target/arm/helper-mve.h
42
+++ b/include/fpu/softfloat-helpers.h
32
@@ -XXX,XX +XXX,XX @@
43
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
44
status->float_2nan_prop_rule = rule;
45
}
46
47
+static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
48
+ float_status *status)
49
+{
50
+ status->float_infzeronan_rule = rule;
51
+}
52
+
53
static inline void set_flush_to_zero(bool val, float_status *status)
54
{
55
status->flush_to_zero = val;
56
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
57
return status->float_2nan_prop_rule;
58
}
59
60
+static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
61
+{
62
+ return status->float_infzeronan_rule;
63
+}
64
+
65
static inline bool get_flush_to_zero(float_status *status)
66
{
67
return status->flush_to_zero;
68
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
69
index XXXXXXX..XXXXXXX 100644
70
--- a/include/fpu/softfloat-types.h
71
+++ b/include/fpu/softfloat-types.h
72
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
73
float_2nan_prop_x87,
74
} Float2NaNPropRule;
75
76
+/*
77
+ * Rule for result of fused multiply-add 0 * Inf + NaN.
78
+ * This must be a NaN, but implementations differ on whether this
79
+ * is the input NaN or the default NaN.
80
+ *
81
+ * You don't need to set this if default_nan_mode is enabled.
82
+ * When not in default-NaN mode, it is an error for the target
83
+ * not to set the rule in float_status if it uses muladd, and we
84
+ * will assert if we need to handle an input NaN and no rule was
85
+ * selected.
86
+ */
87
+typedef enum __attribute__((__packed__)) {
88
+ /* No propagation rule specified */
89
+ float_infzeronan_none = 0,
90
+ /* Result is never the default NaN (so always the input NaN) */
91
+ float_infzeronan_dnan_never,
92
+ /* Result is always the default NaN */
93
+ float_infzeronan_dnan_always,
94
+ /* Result is the default NaN if the input NaN is quiet */
95
+ float_infzeronan_dnan_if_qnan,
96
+} FloatInfZeroNaNRule;
97
+
33
/*
98
/*
34
- * ARM translation: M-profile MVE instructions
99
* Floating Point Status. Individual architectures may maintain
35
+ * M-profile MVE specific helper definitions
100
* several versions of float_status for different functions. The
36
*
101
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
37
* Copyright (c) 2021 Linaro, Ltd.
102
FloatRoundMode float_rounding_mode;
38
*
103
FloatX80RoundPrec floatx80_rounding_precision;
39
@@ -XXX,XX +XXX,XX @@
104
Float2NaNPropRule float_2nan_prop_rule;
40
* You should have received a copy of the GNU Lesser General Public
105
+ FloatInfZeroNaNRule float_infzeronan_rule;
41
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
106
bool tininess_before_rounding;
42
*/
107
/* should denormalised results go to zero and set the inexact flag? */
43
-
108
bool flush_to_zero;
44
-#include "qemu/osdep.h"
109
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
45
-#include "tcg/tcg-op.h"
46
-#include "tcg/tcg-op-gvec.h"
47
-#include "exec/exec-all.h"
48
-#include "exec/gen-icount.h"
49
-#include "translate.h"
50
-#include "translate-a32.h"
51
-
52
-/* Include the generated decoder */
53
-#include "decode-mve.c.inc"
54
+DEF_HELPER_FLAGS_3(mve_vldrb, TCG_CALL_NO_WG, void, env, ptr, i32)
55
+DEF_HELPER_FLAGS_3(mve_vldrh, TCG_CALL_NO_WG, void, env, ptr, i32)
56
+DEF_HELPER_FLAGS_3(mve_vldrw, TCG_CALL_NO_WG, void, env, ptr, i32)
57
+DEF_HELPER_FLAGS_3(mve_vstrb, TCG_CALL_NO_WG, void, env, ptr, i32)
58
+DEF_HELPER_FLAGS_3(mve_vstrh, TCG_CALL_NO_WG, void, env, ptr, i32)
59
+DEF_HELPER_FLAGS_3(mve_vstrw, TCG_CALL_NO_WG, void, env, ptr, i32)
60
diff --git a/target/arm/helper.h b/target/arm/helper.h
61
index XXXXXXX..XXXXXXX 100644
110
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/helper.h
111
--- a/fpu/softfloat-specialize.c.inc
63
+++ b/target/arm/helper.h
112
+++ b/fpu/softfloat-specialize.c.inc
64
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_bfmlal_idx, TCG_CALL_NO_RWG,
113
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
65
#include "helper-a64.h"
114
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
66
#include "helper-sve.h"
115
bool infzero, float_status *status)
67
#endif
116
{
68
+
117
+ FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
69
+#include "helper-mve.h"
118
+
70
diff --git a/target/arm/internals.h b/target/arm/internals.h
119
/*
71
index XXXXXXX..XXXXXXX 100644
120
* We guarantee not to require the target to tell us how to
72
--- a/target/arm/internals.h
121
* pick a NaN if we're always returning the default NaN.
73
+++ b/target/arm/internals.h
122
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
74
@@ -XXX,XX +XXX,XX @@ static inline uint64_t useronly_maybe_clean_ptr(uint32_t desc, uint64_t ptr)
123
* specify.
75
return ptr;
124
*/
76
}
125
assert(!status->default_nan_mode);
77
126
+
78
+/* Values for M-profile PSR.ECI for MVE insns */
127
+ if (rule == float_infzeronan_none) {
79
+enum MVEECIState {
128
+ /*
80
+ ECI_NONE = 0, /* No completed beats */
129
+ * Temporarily fall back to ifdef ladder
81
+ ECI_A0 = 1, /* Completed: A0 */
130
+ */
82
+ ECI_A0A1 = 2, /* Completed: A0, A1 */
131
#if defined(TARGET_ARM)
83
+ /* 3 is reserved */
132
- /* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
84
+ ECI_A0A1A2 = 4, /* Completed: A0, A1, A2 */
133
- * the default NaN
85
+ ECI_A0A1A2B0 = 5, /* Completed: A0, A1, A2, B0 */
134
- */
86
+ /* All other values reserved */
135
- if (infzero && is_qnan(c_cls)) {
87
+};
136
- return 3;
88
+
137
+ /*
89
#endif
138
+ * For ARM, the (inf,zero,qnan) case returns the default NaN,
90
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
139
+ * but (inf,zero,snan) returns the input NaN.
91
index XXXXXXX..XXXXXXX 100644
140
+ */
92
--- a/target/arm/mve.decode
141
+ rule = float_infzeronan_dnan_if_qnan;
93
+++ b/target/arm/mve.decode
142
+#elif defined(TARGET_MIPS)
94
@@ -XXX,XX +XXX,XX @@
143
+ if (snan_bit_is_one(status)) {
95
#
144
+ /*
96
# This file is processed by scripts/decodetree.py
145
+ * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
97
#
146
+ * case sets InvalidOp and returns the default NaN
98
+
147
+ */
99
+%qd 22:1 13:3
148
+ rule = float_infzeronan_dnan_always;
100
+
149
+ } else {
101
+&vldr_vstr rn qd imm p a w size l
150
+ /*
102
+
151
+ * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
103
+@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd
152
+ * case sets InvalidOp and returns the input value 'c'
104
+
153
+ */
105
+# Vector loads and stores
154
+ rule = float_infzeronan_dnan_never;
106
+
155
+ }
107
+# Non-widening loads/stores (P=0 W=0 is 'related encoding')
156
+#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
108
+VLDR_VSTR 1110110 0 a:1 . 1 . .... ... 111100 ....... @vldr_vstr \
157
+ defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
109
+ size=0 p=0 w=1
158
+ defined(TARGET_I386) || defined(TARGET_LOONGARCH)
110
+VLDR_VSTR 1110110 0 a:1 . 1 . .... ... 111101 ....... @vldr_vstr \
159
+ /*
111
+ size=1 p=0 w=1
160
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
112
+VLDR_VSTR 1110110 0 a:1 . 1 . .... ... 111110 ....... @vldr_vstr \
161
+ * case sets InvalidOp and returns the input value 'c'
113
+ size=2 p=0 w=1
162
+ */
114
+VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111100 ....... @vldr_vstr \
163
+ /*
115
+ size=0 p=1
164
+ * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
116
+VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \
165
+ * to return an input NaN if we have one (ie c) rather than generating
117
+ size=1 p=1
166
+ * a default NaN
118
+VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
167
+ */
119
+ size=2 p=1
168
+ rule = float_infzeronan_dnan_never;
120
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
169
+#elif defined(TARGET_S390X)
121
new file mode 100644
170
+ rule = float_infzeronan_dnan_always;
122
index XXXXXXX..XXXXXXX
171
+#endif
123
--- /dev/null
172
}
124
+++ b/target/arm/mve_helper.c
173
125
@@ -XXX,XX +XXX,XX @@
174
+ if (infzero) {
126
+/*
175
+ /*
127
+ * M-profile MVE Operations
176
+ * Inf * 0 + NaN -- some implementations return the default NaN here,
128
+ *
177
+ * and some return the input NaN.
129
+ * Copyright (c) 2021 Linaro, Ltd.
178
+ */
130
+ *
179
+ switch (rule) {
131
+ * This library is free software; you can redistribute it and/or
180
+ case float_infzeronan_dnan_never:
132
+ * modify it under the terms of the GNU Lesser General Public
181
+ return 2;
133
+ * License as published by the Free Software Foundation; either
182
+ case float_infzeronan_dnan_always:
134
+ * version 2.1 of the License, or (at your option) any later version.
183
+ return 3;
135
+ *
184
+ case float_infzeronan_dnan_if_qnan:
136
+ * This library is distributed in the hope that it will be useful,
185
+ return is_qnan(c_cls) ? 3 : 2;
137
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
138
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
139
+ * Lesser General Public License for more details.
140
+ *
141
+ * You should have received a copy of the GNU Lesser General Public
142
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
143
+ */
144
+
145
+#include "qemu/osdep.h"
146
+#include "cpu.h"
147
+#include "internals.h"
148
+#include "vec_internal.h"
149
+#include "exec/helper-proto.h"
150
+#include "exec/cpu_ldst.h"
151
+#include "exec/exec-all.h"
152
+
153
+static uint16_t mve_element_mask(CPUARMState *env)
154
+{
155
+ /*
156
+ * Return the mask of which elements in the MVE vector should be
157
+ * updated. This is a combination of multiple things:
158
+ * (1) by default, we update every lane in the vector
159
+ * (2) VPT predication stores its state in the VPR register;
160
+ * (3) low-overhead-branch tail predication will mask out part
161
+ * the vector on the final iteration of the loop
162
+ * (4) if EPSR.ECI is set then we must execute only some beats
163
+ * of the insn
164
+ * We combine all these into a 16-bit result with the same semantics
165
+ * as VPR.P0: 0 to mask the lane, 1 if it is active.
166
+ * 8-bit vector ops will look at all bits of the result;
167
+ * 16-bit ops will look at bits 0, 2, 4, ...;
168
+ * 32-bit ops will look at bits 0, 4, 8 and 12.
169
+ * Compare pseudocode GetCurInstrBeat(), though that only returns
170
+ * the 4-bit slice of the mask corresponding to a single beat.
171
+ */
172
+ uint16_t mask = FIELD_EX32(env->v7m.vpr, V7M_VPR, P0);
173
+
174
+ if (!(env->v7m.vpr & R_V7M_VPR_MASK01_MASK)) {
175
+ mask |= 0xff;
176
+ }
177
+ if (!(env->v7m.vpr & R_V7M_VPR_MASK23_MASK)) {
178
+ mask |= 0xff00;
179
+ }
180
+
181
+ if (env->v7m.ltpsize < 4 &&
182
+ env->regs[14] <= (1 << (4 - env->v7m.ltpsize))) {
183
+ /*
184
+ * Tail predication active, and this is the last loop iteration.
185
+ * The element size is (1 << ltpsize), and we only want to process
186
+ * loopcount elements, so we want to retain the least significant
187
+ * (loopcount * esize) predicate bits and zero out bits above that.
188
+ */
189
+ int masklen = env->regs[14] << env->v7m.ltpsize;
190
+ assert(masklen <= 16);
191
+ mask &= MAKE_64BIT_MASK(0, masklen);
192
+ }
193
+
194
+ if ((env->condexec_bits & 0xf) == 0) {
195
+ /*
196
+ * ECI bits indicate which beats are already executed;
197
+ * we handle this by effectively predicating them out.
198
+ */
199
+ int eci = env->condexec_bits >> 4;
200
+ switch (eci) {
201
+ case ECI_NONE:
202
+ break;
203
+ case ECI_A0:
204
+ mask &= 0xfff0;
205
+ break;
206
+ case ECI_A0A1:
207
+ mask &= 0xff00;
208
+ break;
209
+ case ECI_A0A1A2:
210
+ case ECI_A0A1A2B0:
211
+ mask &= 0xf000;
212
+ break;
213
+ default:
186
+ default:
214
+ g_assert_not_reached();
187
+ g_assert_not_reached();
215
+ }
188
+ }
216
+ }
189
+ }
217
+
190
+
218
+ return mask;
191
+#if defined(TARGET_ARM)
219
+}
192
+
220
+
193
/* This looks different from the ARM ARM pseudocode, because the ARM ARM
221
+static void mve_advance_vpt(CPUARMState *env)
194
* puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
222
+{
195
*/
223
+ /* Advance the VPT and ECI state if necessary */
196
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
224
+ uint32_t vpr = env->v7m.vpr;
197
}
225
+ unsigned mask01, mask23;
198
#elif defined(TARGET_MIPS)
226
+
199
if (snan_bit_is_one(status)) {
227
+ if ((env->condexec_bits & 0xf) == 0) {
200
- /*
228
+ env->condexec_bits = (env->condexec_bits == (ECI_A0A1A2B0 << 4)) ?
201
- * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
229
+ (ECI_A0 << 4) : (ECI_NONE << 4);
202
- * case sets InvalidOp and returns the default NaN
230
+ }
203
- */
231
+
204
- if (infzero) {
232
+ if (!(vpr & (R_V7M_VPR_MASK01_MASK | R_V7M_VPR_MASK23_MASK))) {
205
- return 3;
233
+ /* VPT not enabled, nothing to do */
206
- }
234
+ return;
207
/* Prefer sNaN over qNaN, in the a, b, c order. */
235
+ }
208
if (is_snan(a_cls)) {
236
+
209
return 0;
237
+ mask01 = FIELD_EX32(vpr, V7M_VPR, MASK01);
210
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
238
+ mask23 = FIELD_EX32(vpr, V7M_VPR, MASK23);
211
return 2;
239
+ if (mask01 > 8) {
212
}
240
+ /* high bit set, but not 0b1000: invert the relevant half of P0 */
213
} else {
241
+ vpr ^= 0xff;
214
- /*
242
+ }
215
- * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
243
+ if (mask23 > 8) {
216
- * case sets InvalidOp and returns the input value 'c'
244
+ /* high bit set, but not 0b1000: invert the relevant half of P0 */
217
- */
245
+ vpr ^= 0xff00;
218
/* Prefer sNaN over qNaN, in the c, a, b order. */
246
+ }
219
if (is_snan(c_cls)) {
247
+ vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1);
220
return 2;
248
+ vpr = FIELD_DP32(vpr, V7M_VPR, MASK23, mask23 << 1);
221
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
249
+ env->v7m.vpr = vpr;
222
}
250
+}
223
}
251
+
224
#elif defined(TARGET_LOONGARCH64)
252
+
225
- /*
253
+#define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \
226
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
254
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
227
- * case sets InvalidOp and returns the input value 'c'
255
+ { \
228
- */
256
+ TYPE *d = vd; \
229
-
257
+ uint16_t mask = mve_element_mask(env); \
230
/* Prefer sNaN over qNaN, in the c, a, b order. */
258
+ unsigned b, e; \
231
if (is_snan(c_cls)) {
259
+ /* \
232
return 2;
260
+ * R_SXTM allows the dest reg to become UNKNOWN for abandoned \
233
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
261
+ * beats so we don't care if we update part of the dest and \
234
return 1;
262
+ * then take an exception. \
235
}
263
+ */ \
236
#elif defined(TARGET_PPC)
264
+ for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
237
- /* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
265
+ if (mask & (1 << b)) { \
238
- * to return an input NaN if we have one (ie c) rather than generating
266
+ d[H##ESIZE(e)] = cpu_##LDTYPE##_data_ra(env, addr, GETPC()); \
239
- * a default NaN
267
+ } \
240
- */
268
+ addr += MSIZE; \
241
-
269
+ } \
242
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
270
+ mve_advance_vpt(env); \
243
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
271
+ }
244
*/
272
+
245
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
273
+#define DO_VSTR(OP, MSIZE, STTYPE, ESIZE, TYPE) \
246
return 1;
274
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
247
}
275
+ { \
248
#elif defined(TARGET_S390X)
276
+ TYPE *d = vd; \
249
- if (infzero) {
277
+ uint16_t mask = mve_element_mask(env); \
250
- return 3;
278
+ unsigned b, e; \
251
- }
279
+ for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
252
-
280
+ if (mask & (1 << b)) { \
253
if (is_snan(a_cls)) {
281
+ cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \
254
return 0;
282
+ } \
255
} else if (is_snan(b_cls)) {
283
+ addr += MSIZE; \
284
+ } \
285
+ mve_advance_vpt(env); \
286
+ }
287
+
288
+DO_VLDR(vldrb, 1, ldub, 1, uint8_t)
289
+DO_VLDR(vldrh, 2, lduw, 2, uint16_t)
290
+DO_VLDR(vldrw, 4, ldl, 4, uint32_t)
291
+
292
+DO_VSTR(vstrb, 1, stb, 1, uint8_t)
293
+DO_VSTR(vstrh, 2, stw, 2, uint16_t)
294
+DO_VSTR(vstrw, 4, stl, 4, uint32_t)
295
+
296
+#undef DO_VLDR
297
+#undef DO_VSTR
298
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
299
index XXXXXXX..XXXXXXX 100644
300
--- a/target/arm/translate-mve.c
301
+++ b/target/arm/translate-mve.c
302
@@ -XXX,XX +XXX,XX @@
303
304
/* Include the generated decoder */
305
#include "decode-mve.c.inc"
306
+
307
+typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
308
+
309
+/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
310
+static inline long mve_qreg_offset(unsigned reg)
311
+{
312
+ return offsetof(CPUARMState, vfp.zregs[reg].d[0]);
313
+}
314
+
315
+static TCGv_ptr mve_qreg_ptr(unsigned reg)
316
+{
317
+ TCGv_ptr ret = tcg_temp_new_ptr();
318
+ tcg_gen_addi_ptr(ret, cpu_env, mve_qreg_offset(reg));
319
+ return ret;
320
+}
321
+
322
+static bool mve_check_qreg_bank(DisasContext *s, int qmask)
323
+{
324
+ /*
325
+ * Check whether Qregs are in range. For v8.1M only Q0..Q7
326
+ * are supported, see VFPSmallRegisterBank().
327
+ */
328
+ return qmask < 8;
329
+}
330
+
331
+static bool mve_eci_check(DisasContext *s)
332
+{
333
+ /*
334
+ * This is a beatwise insn: check that ECI is valid (not a
335
+ * reserved value) and note that we are handling it.
336
+ * Return true if OK, false if we generated an exception.
337
+ */
338
+ s->eci_handled = true;
339
+ switch (s->eci) {
340
+ case ECI_NONE:
341
+ case ECI_A0:
342
+ case ECI_A0A1:
343
+ case ECI_A0A1A2:
344
+ case ECI_A0A1A2B0:
345
+ return true;
346
+ default:
347
+ /* Reserved value: INVSTATE UsageFault */
348
+ gen_exception_insn(s, s->pc_curr, EXCP_INVSTATE, syn_uncategorized(),
349
+ default_exception_el(s));
350
+ return false;
351
+ }
352
+}
353
+
354
+static void mve_update_eci(DisasContext *s)
355
+{
356
+ /*
357
+ * The helper function will always update the CPUState field,
358
+ * so we only need to update the DisasContext field.
359
+ */
360
+ if (s->eci) {
361
+ s->eci = (s->eci == ECI_A0A1A2B0) ? ECI_A0 : ECI_NONE;
362
+ }
363
+}
364
+
365
+static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
366
+{
367
+ TCGv_i32 addr;
368
+ uint32_t offset;
369
+ TCGv_ptr qreg;
370
+
371
+ if (!dc_isar_feature(aa32_mve, s) ||
372
+ !mve_check_qreg_bank(s, a->qd) ||
373
+ !fn) {
374
+ return false;
375
+ }
376
+
377
+ /* CONSTRAINED UNPREDICTABLE: we choose to UNDEF */
378
+ if (a->rn == 15 || (a->rn == 13 && a->w)) {
379
+ return false;
380
+ }
381
+
382
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
383
+ return true;
384
+ }
385
+
386
+ offset = a->imm << a->size;
387
+ if (!a->a) {
388
+ offset = -offset;
389
+ }
390
+ addr = load_reg(s, a->rn);
391
+ if (a->p) {
392
+ tcg_gen_addi_i32(addr, addr, offset);
393
+ }
394
+
395
+ qreg = mve_qreg_ptr(a->qd);
396
+ fn(cpu_env, qreg, addr);
397
+ tcg_temp_free_ptr(qreg);
398
+
399
+ /*
400
+ * Writeback always happens after the last beat of the insn,
401
+ * regardless of predication
402
+ */
403
+ if (a->w) {
404
+ if (!a->p) {
405
+ tcg_gen_addi_i32(addr, addr, offset);
406
+ }
407
+ store_reg(s, a->rn, addr);
408
+ } else {
409
+ tcg_temp_free_i32(addr);
410
+ }
411
+ mve_update_eci(s);
412
+ return true;
413
+}
414
+
415
+static bool trans_VLDR_VSTR(DisasContext *s, arg_VLDR_VSTR *a)
416
+{
417
+ static MVEGenLdStFn * const ldstfns[4][2] = {
418
+ { gen_helper_mve_vstrb, gen_helper_mve_vldrb },
419
+ { gen_helper_mve_vstrh, gen_helper_mve_vldrh },
420
+ { gen_helper_mve_vstrw, gen_helper_mve_vldrw },
421
+ { NULL, NULL }
422
+ };
423
+ return do_ldst(s, a, ldstfns[a->size][a->l]);
424
+}
425
diff --git a/target/arm/meson.build b/target/arm/meson.build
426
index XXXXXXX..XXXXXXX 100644
427
--- a/target/arm/meson.build
428
+++ b/target/arm/meson.build
429
@@ -XXX,XX +XXX,XX @@ arm_ss.add(files(
430
'helper.c',
431
'iwmmxt_helper.c',
432
'm_helper.c',
433
+ 'mve_helper.c',
434
'neon_helper.c',
435
'op_helper.c',
436
'tlb_helper.c',
437
--
256
--
438
2.20.1
257
2.34.1
439
440
New patch
1
Explicitly set a rule in the softfloat tests for the inf-zero-nan
2
muladd special case. In meson.build we put -DTARGET_ARM in fpcflags,
3
and so we should select here the Arm rule of
4
float_infzeronan_dnan_if_qnan.
1
5
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20241202131347.498124-5-peter.maydell@linaro.org
9
---
10
tests/fp/fp-bench.c | 5 +++++
11
tests/fp/fp-test.c | 5 +++++
12
2 files changed, 10 insertions(+)
13
14
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/tests/fp/fp-bench.c
17
+++ b/tests/fp/fp-bench.c
18
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
19
{
20
bench_func_t f;
21
22
+ /*
23
+ * These implementation-defined choices for various things IEEE
24
+ * doesn't specify match those used by the Arm architecture.
25
+ */
26
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
27
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
28
29
f = bench_funcs[operation][precision];
30
g_assert(f);
31
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
32
index XXXXXXX..XXXXXXX 100644
33
--- a/tests/fp/fp-test.c
34
+++ b/tests/fp/fp-test.c
35
@@ -XXX,XX +XXX,XX @@ void run_test(void)
36
{
37
unsigned int i;
38
39
+ /*
40
+ * These implementation-defined choices for various things IEEE
41
+ * doesn't specify match those used by the Arm architecture.
42
+ */
43
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
44
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
45
46
genCases_setLevel(test_level);
47
verCases_maxErrorCount = n_max_errors;
48
--
49
2.34.1
New patch
1
Set the FloatInfZeroNaNRule explicitly for the Arm target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-6-peter.maydell@linaro.org
7
---
8
target/arm/cpu.c | 3 +++
9
fpu/softfloat-specialize.c.inc | 8 +-------
10
2 files changed, 4 insertions(+), 7 deletions(-)
11
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
16
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
17
* * tininess-before-rounding
18
* * 2-input NaN propagation prefers SNaN over QNaN, and then
19
* operand A over operand B (see FPProcessNaNs() pseudocode)
20
+ * * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
21
+ * and the input NaN if it is signalling
22
*/
23
static void arm_set_default_fp_behaviours(float_status *s)
24
{
25
set_float_detect_tininess(float_tininess_before_rounding, s);
26
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
27
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
28
}
29
30
static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
31
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
32
index XXXXXXX..XXXXXXX 100644
33
--- a/fpu/softfloat-specialize.c.inc
34
+++ b/fpu/softfloat-specialize.c.inc
35
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
36
/*
37
* Temporarily fall back to ifdef ladder
38
*/
39
-#if defined(TARGET_ARM)
40
- /*
41
- * For ARM, the (inf,zero,qnan) case returns the default NaN,
42
- * but (inf,zero,snan) returns the input NaN.
43
- */
44
- rule = float_infzeronan_dnan_if_qnan;
45
-#elif defined(TARGET_MIPS)
46
+#if defined(TARGET_MIPS)
47
if (snan_bit_is_one(status)) {
48
/*
49
* For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
50
--
51
2.34.1
New patch
1
Set the FloatInfZeroNaNRule explicitly for s390, so we
2
can remove the ifdef from pickNaNMulAdd().
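
To illustrate why moving the rule into per-status state helps, here is
a small sketch in which two status objects in the same program carry
different rules, one set the way this patch sets s390's and one the
way the Arm patch in this series does. All names (SketchRule,
SketchStatus, sketch_set_rule() and the reset helpers) are hypothetical
and stand in for the real float_status accessors.

```c
#include <assert.h>

/* Hypothetical rule values mirroring the series' enum. */
typedef enum {
    SKETCH_RULE_NONE = 0,
    SKETCH_RULE_DNAN_NEVER,
    SKETCH_RULE_DNAN_ALWAYS,
    SKETCH_RULE_DNAN_IF_QNAN,
} SketchRule;

/* Hypothetical stand-in for float_status; one field only. */
typedef struct {
    SketchRule infzeronan_rule;
} SketchStatus;

static inline void sketch_set_rule(SketchRule rule, SketchStatus *s)
{
    s->infzeronan_rule = rule;
}

static inline SketchRule sketch_get_rule(const SketchStatus *s)
{
    return s->infzeronan_rule;
}

/* Reset-time setup, analogous to a CPU reset hook for each target. */
static void sketch_reset_s390_like(SketchStatus *s)
{
    sketch_set_rule(SKETCH_RULE_DNAN_ALWAYS, s);
}

static void sketch_reset_arm_like(SketchStatus *s)
{
    sketch_set_rule(SKETCH_RULE_DNAN_IF_QNAN, s);
}
```

Because the rule lives in the status object rather than behind a
TARGET_* ifdef, nothing in the shared code needs to know which target
it was compiled for.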
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-7-peter.maydell@linaro.org
7
---
8
target/s390x/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 2 insertions(+), 2 deletions(-)
11
12
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/s390x/cpu.c
15
+++ b/target/s390x/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
17
set_float_detect_tininess(float_tininess_before_rounding,
18
&env->fpu_status);
19
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
20
+ set_float_infzeronan_rule(float_infzeronan_dnan_always,
21
+ &env->fpu_status);
22
/* fall through */
23
case RESET_TYPE_S390_CPU_NORMAL:
24
env->psw.mask &= ~PSW_MASK_RI;
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
* a default NaN
31
*/
32
rule = float_infzeronan_dnan_never;
33
-#elif defined(TARGET_S390X)
34
- rule = float_infzeronan_dnan_always;
35
#endif
36
}
37
38
--
39
2.34.1
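
The pattern these conversion patches share (each target stores its rule in
its own float_status at reset or realize time, and common code reads that
field instead of testing TARGET_* macros) might be sketched as follows. All
names below are hypothetical stand-ins invented for illustration; only the
idea of a per-status rule field set through a setter comes from the patches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of the rule enum. */
typedef enum {
    DEMO_RULE_NONE = 0,       /* zero-initialised status: nothing chosen yet */
    DEMO_RULE_DNAN_NEVER,
    DEMO_RULE_DNAN_ALWAYS,
} DemoRule;

/* Hypothetical stand-in for the per-CPU floating point status. */
typedef struct {
    DemoRule infzeronan_rule;
} DemoFloatStatus;

/* Setter in the style of set_float_infzeronan_rule(rule, status). */
void demo_set_infzeronan_rule(DemoRule rule, DemoFloatStatus *s)
{
    assert(s != NULL);
    s->infzeronan_rule = rule;
}

/* Each target picks its behaviour in its own reset hook... */
void demo_s390_reset(DemoFloatStatus *s)
{
    demo_set_infzeronan_rule(DEMO_RULE_DNAN_ALWAYS, s);
}

void demo_ppc_reset(DemoFloatStatus *s)
{
    demo_set_infzeronan_rule(DEMO_RULE_DNAN_NEVER, s);
}

/* ...and shared code needs no per-target #ifdef, it just reads the field. */
DemoRule demo_effective_rule(const DemoFloatStatus *s)
{
    assert(s != NULL && s->infzeronan_rule != DEMO_RULE_NONE);
    return s->infzeronan_rule;
}
```

The assertion in demo_effective_rule() stands for the situation once every
target has been converted: a status that still holds the "none" value is then
a programming error rather than a cue to fall back to an ifdef ladder.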
New patch
Set the FloatInfZeroNaNRule explicitly for the PPC target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-8-peter.maydell@linaro.org
---
 target/ppc/cpu_init.c          | 7 +++++++
 fpu/softfloat-specialize.c.inc | 7 +------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index XXXXXXX..XXXXXXX 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
      */
     set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
     set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
+    /*
+     * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
+     * to return an input NaN if we have one (ie c) rather than generating
+     * a default NaN
+     */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);

     for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
         ppc_spr_t *spr = &env->spr_cb[i];
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
              */
             rule = float_infzeronan_dnan_never;
         }
-#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
+#elif defined(TARGET_SPARC) || \
     defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
          * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
          * case sets InvalidOp and returns the input value 'c'
          */
-        /*
-         * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
-         * to return an input NaN if we have one (ie c) rather than generating
-         * a default NaN
-         */
         rule = float_infzeronan_dnan_never;
 #endif
     }
--
2.34.1
New patch
Set the FloatInfZeroNaNRule explicitly for the MIPS target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-9-peter.maydell@linaro.org
---
 target/mips/fpu_helper.h       |  9 +++++++++
 target/mips/msa.c              |  4 ++++
 fpu/softfloat-specialize.c.inc | 16 +---------------
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/fpu_helper.h
+++ b/target/mips/fpu_helper.h
@@ -XXX,XX +XXX,XX @@ static inline void restore_flush_mode(CPUMIPSState *env)
 static inline void restore_snan_bit_mode(CPUMIPSState *env)
 {
     bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
+    FloatInfZeroNaNRule izn_rule;

     /*
      * With nan2008, SNaNs are silenced in the usual way.
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
      */
     set_snan_bit_is_one(!nan2008, &env->active_fpu.fp_status);
     set_default_nan_mode(!nan2008, &env->active_fpu.fp_status);
+    /*
+     * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
+     * case sets InvalidOp and returns the default NaN.
+     * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
+     * case sets InvalidOp and returns the input value 'c'.
+     */
+    izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
+    set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
 }

 static inline void restore_fp_status(CPUMIPSState *env)
diff --git a/target/mips/msa.c b/target/mips/msa.c
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/msa.c
+++ b/target/mips/msa.c
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)

     /* set proper signanling bit meaning ("1" means "quiet") */
     set_snan_bit_is_one(0, &env->active_tc.msa_fp_status);
+
+    /* Inf * 0 + NaN returns the input NaN */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never,
+                              &env->active_tc.msa_fp_status);
 }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_MIPS)
-        if (snan_bit_is_one(status)) {
-            /*
-             * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
-             * case sets InvalidOp and returns the default NaN
-             */
-            rule = float_infzeronan_dnan_always;
-        } else {
-            /*
-             * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
-             * case sets InvalidOp and returns the input value 'c'
-             */
-            rule = float_infzeronan_dnan_never;
-        }
-#elif defined(TARGET_SPARC) || \
+#if defined(TARGET_SPARC) || \
     defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
--
2.34.1
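
MIPS differs from the other targets in that its rule depends on a guest
writable control bit, which is presumably why the patch sets it in
restore_snan_bit_mode() rather than once at reset. A minimal sketch of that
dependency, assuming a hypothetical DemoMipsFpu structure and an illustrative
bit position (DEMO_FCR31_NAN2008 is a made-up constant, not taken from the
patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit position only; an assumption for this sketch. */
#define DEMO_FCR31_NAN2008 18

typedef enum {
    DEMO_MIPS_DNAN_NEVER,   /* IEEE754-2008 mode: return input 'c' */
    DEMO_MIPS_DNAN_ALWAYS,  /* IEEE754-1985 mode: return default NaN */
} DemoMipsRule;

/* Hypothetical cut-down FPU state: control register plus derived rule. */
typedef struct {
    uint32_t fcr31;
    DemoMipsRule rule;
} DemoMipsFpu;

/* Recompute the rule from the current control register value. */
void demo_restore_snan_bit_mode(DemoMipsFpu *fpu)
{
    assert(fpu != NULL);
    bool nan2008 = (fpu->fcr31 >> DEMO_FCR31_NAN2008) & 1;
    fpu->rule = nan2008 ? DEMO_MIPS_DNAN_NEVER : DEMO_MIPS_DNAN_ALWAYS;
}

/* Any write to the control register must refresh the derived rule. */
void demo_write_fcr31(DemoMipsFpu *fpu, uint32_t value)
{
    assert(fpu != NULL);
    fpu->fcr31 = value;
    demo_restore_snan_bit_mode(fpu);
}
```

The point of routing every write through one refresh function is that the
stored rule can never go stale relative to the NAN2008 bit.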
New patch
Set the FloatInfZeroNaNRule explicitly for the SPARC target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-10-peter.maydell@linaro.org
---
 target/sparc/cpu.c             | 2 ++
 fpu/softfloat-specialize.c.inc | 3 +--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
      * the CPU state struct so it won't get zeroed on reset.
      */
     set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
+    /* For inf * 0 + NaN, return the input NaN */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);

     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_SPARC) || \
-    defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
+#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
          * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
--
2.34.1
1
Implement the MVE VHCADD insn, which is similar to VCADD
1
Set the FloatInfZeroNaNRule explicitly for the xtensa target,
2
but performs a halving step. This one overlaps with VADC.
2
so we can remove the ifdef from pickNaNMulAdd().
3
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-43-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-11-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-mve.h | 8 ++++++++
8
target/xtensa/cpu.c | 2 ++
9
target/arm/mve.decode | 8 ++++++--
9
fpu/softfloat-specialize.c.inc | 2 +-
10
target/arm/mve_helper.c | 2 ++
10
2 files changed, 3 insertions(+), 1 deletion(-)
11
target/arm/translate-mve.c | 4 +++-
12
4 files changed, 19 insertions(+), 3 deletions(-)
13
11
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
14
--- a/target/xtensa/cpu.c
17
+++ b/target/arm/helper-mve.h
15
+++ b/target/xtensa/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vcadd270b, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_4(mve_vcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
reset_mmu(env);
20
DEF_HELPER_FLAGS_4(mve_vcadd270w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
cs->halted = env->runstall;
21
19
#endif
22
+DEF_HELPER_FLAGS_4(mve_vhcadd90b, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
+ /* For inf * 0 + NaN, return the input NaN */
23
+DEF_HELPER_FLAGS_4(mve_vhcadd90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
+DEF_HELPER_FLAGS_4(mve_vhcadd90w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
set_no_signaling_nans(!dfpu, &env->fp_status);
25
+
23
xtensa_use_first_nan(env, !dfpu);
26
+DEF_HELPER_FLAGS_4(mve_vhcadd270b, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
}
27
+DEF_HELPER_FLAGS_4(mve_vhcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
28
+DEF_HELPER_FLAGS_4(mve_vhcadd270w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+
30
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
27
--- a/fpu/softfloat-specialize.c.inc
36
+++ b/target/arm/mve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
37
@@ -XXX,XX +XXX,XX @@ VQDMULLT 111 . 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 1 @2op_sz28
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
38
VRHADD_S 111 0 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
30
/*
39
VRHADD_U 111 1 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
31
* Temporarily fall back to ifdef ladder
40
32
*/
41
-VADC 1110 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
33
-#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
42
-VADCI 1110 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
34
+#if defined(TARGET_HPPA) || \
43
+{
35
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
44
+ VADC 1110 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
36
/*
45
+ VADCI 1110 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
37
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
46
+ VHCADD90 1110 1110 0 . .. ... 0 ... 0 1111 . 0 . 0 ... 0 @2op
47
+ VHCADD270 1110 1110 0 . .. ... 0 ... 1 1111 . 0 . 0 ... 0 @2op
48
+}
49
50
{
51
VSBC 1111 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/mve_helper.c
55
+++ b/target/arm/mve_helper.c
56
@@ -XXX,XX +XXX,XX @@ void HELPER(mve_vsbci)(CPUARMState *env, void *vd, void *vn, void *vm)
57
58
DO_VCADD_ALL(vcadd90, DO_SUB, DO_ADD)
59
DO_VCADD_ALL(vcadd270, DO_ADD, DO_SUB)
60
+DO_VCADD_ALL(vhcadd90, do_vhsub_s, do_vhadd_s)
61
+DO_VCADD_ALL(vhcadd270, do_vhadd_s, do_vhsub_s)
62
63
static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
64
{
65
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
66
index XXXXXXX..XXXXXXX 100644
67
--- a/target/arm/translate-mve.c
68
+++ b/target/arm/translate-mve.c
69
@@ -XXX,XX +XXX,XX @@ DO_2OP(VRHADD_U, vrhaddu)
70
/*
71
* VCADD Qd == Qm at size MO_32 is UNPREDICTABLE; we choose not to diagnose
72
* so we can reuse the DO_2OP macro. (Our implementation calculates the
73
- * "expected" results in this case.)
74
+ * "expected" results in this case.) Similarly for VHCADD.
75
*/
76
DO_2OP(VCADD90, vcadd90)
77
DO_2OP(VCADD270, vcadd270)
78
+DO_2OP(VHCADD90, vhcadd90)
79
+DO_2OP(VHCADD270, vhcadd270)
80
81
static bool trans_VQDMULLB(DisasContext *s, arg_2op *a)
82
{
83
--
38
--
84
2.20.1
39
2.34.1
1
Implement the MVE VRHADD insn, which performs a rounded halving
1
Set the FloatInfZeroNaNRule explicitly for the x86 target.
2
addition.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-40-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-12-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-mve.h | 8 ++++++++
7
target/i386/tcg/fpu_helper.c | 7 +++++++
9
target/arm/mve.decode | 3 +++
8
fpu/softfloat-specialize.c.inc | 2 +-
10
target/arm/mve_helper.c | 6 ++++++
9
2 files changed, 8 insertions(+), 1 deletion(-)
11
target/arm/translate-mve.c | 2 ++
12
4 files changed, 19 insertions(+)
13
10
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
11
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
13
--- a/target/i386/tcg/fpu_helper.c
17
+++ b/target/arm/helper-mve.h
14
+++ b/target/i386/tcg/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqdmullbw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
15
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
19
DEF_HELPER_FLAGS_4(mve_vqdmullth, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
*/
20
DEF_HELPER_FLAGS_4(mve_vqdmulltw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->mmx_status);
21
18
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->sse_status);
22
+DEF_HELPER_FLAGS_4(mve_vrhaddsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
+ /*
23
+DEF_HELPER_FLAGS_4(mve_vrhaddsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
+ * Only SSE has multiply-add instructions. In the SDM Section 14.5.2
24
+DEF_HELPER_FLAGS_4(mve_vrhaddsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ * "Fused-Multiply-ADD (FMA) Numeric Behavior" the NaN handling is
25
+
22
+ * specified -- for 0 * inf + NaN the input NaN is selected, and if
26
+DEF_HELPER_FLAGS_4(mve_vrhaddub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
+ * there are multiple input NaNs they are selected in the order a, b, c.
27
+DEF_HELPER_FLAGS_4(mve_vrhadduh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
+ */
28
+DEF_HELPER_FLAGS_4(mve_vrhadduw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
29
+
26
}
30
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
31
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
static inline uint8_t save_exception_flags(CPUX86State *env)
32
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
31
--- a/fpu/softfloat-specialize.c.inc
36
+++ b/target/arm/mve.decode
32
+++ b/fpu/softfloat-specialize.c.inc
37
@@ -XXX,XX +XXX,XX @@ VQRDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op
33
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
38
VQDMULLB 111 . 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 1 @2op_sz28
34
* Temporarily fall back to ifdef ladder
39
VQDMULLT 111 . 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 1 @2op_sz28
35
*/
40
36
#if defined(TARGET_HPPA) || \
41
+VRHADD_S 111 0 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
37
- defined(TARGET_I386) || defined(TARGET_LOONGARCH)
42
+VRHADD_U 111 1 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
38
+ defined(TARGET_LOONGARCH)
43
+
39
/*
44
# Vector miscellaneous
40
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
45
41
* case sets InvalidOp and returns the input value 'c'
46
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
47
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/mve_helper.c
50
+++ b/target/arm/mve_helper.c
51
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vshlu, DO_VSHLU)
52
DO_2OP_S(vrshls, DO_VRSHLS)
53
DO_2OP_U(vrshlu, DO_VRSHLU)
54
55
+#define DO_RHADD_S(N, M) (((int64_t)(N) + (M) + 1) >> 1)
56
+#define DO_RHADD_U(N, M) (((uint64_t)(N) + (M) + 1) >> 1)
57
+
58
+DO_2OP_S(vrhadds, DO_RHADD_S)
59
+DO_2OP_U(vrhaddu, DO_RHADD_U)
60
+
61
static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
62
{
63
if (val > max) {
64
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
65
index XXXXXXX..XXXXXXX 100644
66
--- a/target/arm/translate-mve.c
67
+++ b/target/arm/translate-mve.c
68
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQDMLSDH, vqdmlsdh)
69
DO_2OP(VQDMLSDHX, vqdmlsdhx)
70
DO_2OP(VQRDMLSDH, vqrdmlsdh)
71
DO_2OP(VQRDMLSDHX, vqrdmlsdhx)
72
+DO_2OP(VRHADD_S, vrhadds)
73
+DO_2OP(VRHADD_U, vrhaddu)
74
75
static bool trans_VQDMULLB(DisasContext *s, arg_2op *a)
76
{
77
--
42
--
78
2.20.1
43
2.34.1
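
The rounded halving addition that the VRHADD patch implements with
DO_RHADD_S and DO_RHADD_U can be tried out in isolation. The sketch below
covers the 8-bit lane case only and ignores predication; the demo_* names
are hypothetical. As in the macros, the sum is formed in a 64-bit type so
that the "+ 1" cannot overflow before the shift. The signed variant assumes
that right shift of a negative value is arithmetic, which is
implementation-defined in ISO C but holds on the usual compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed rounded halving add: (n + m + 1) >> 1, computed in 64 bits. */
int8_t demo_rhadd_s8(int8_t n, int8_t m)
{
    int64_t wide = (int64_t)n + m + 1;
    return (int8_t)(wide >> 1);
}

/* Unsigned rounded halving add: same formula on unsigned operands. */
uint8_t demo_rhadd_u8(uint8_t n, uint8_t m)
{
    uint64_t wide = (uint64_t)n + m + 1;
    return (uint8_t)(wide >> 1);
}

/* Apply the signed operation lane by lane (no predication in this sketch). */
void demo_vrhadd_s8(int8_t *d, const int8_t *n, const int8_t *m, size_t lanes)
{
    assert(d != NULL && n != NULL && m != NULL);
    for (size_t i = 0; i < lanes; i++) {
        d[i] = demo_rhadd_s8(n[i], m[i]);
    }
}
```

Because the intermediate is wider than the lane, the extreme inputs
(127 + 127 signed, 255 + 255 unsigned) still produce in-range results without
any saturation step.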
1
Implement the vector form of the MVE VQDMULL insn.
1
Set the FloatInfZeroNaNRule explicitly for the loongarch target.
2
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-39-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-13-peter.maydell@linaro.org
6
---
6
---
7
target/arm/helper-mve.h | 5 +++++
7
target/loongarch/tcg/fpu_helper.c | 5 +++++
8
target/arm/mve.decode | 5 +++++
8
fpu/softfloat-specialize.c.inc | 7 +------
9
target/arm/mve_helper.c | 30 ++++++++++++++++++++++++++++++
9
2 files changed, 6 insertions(+), 6 deletions(-)
10
target/arm/translate-mve.c | 30 ++++++++++++++++++++++++++++++
11
4 files changed, 70 insertions(+)
12
10
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
11
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
13
--- a/target/loongarch/tcg/fpu_helper.c
16
+++ b/target/arm/helper-mve.h
14
+++ b/target/loongarch/tcg/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrdmlsdhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
15
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
18
DEF_HELPER_FLAGS_4(mve_vqrdmlsdhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
&env->fp_status);
19
DEF_HELPER_FLAGS_4(mve_vqrdmlsdhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
set_flush_to_zero(0, &env->fp_status);
20
18
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
21
+DEF_HELPER_FLAGS_4(mve_vqdmullbh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
+ /*
22
+DEF_HELPER_FLAGS_4(mve_vqdmullbw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
23
+DEF_HELPER_FLAGS_4(mve_vqdmullth, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ * case sets InvalidOp and returns the input value 'c'
24
+DEF_HELPER_FLAGS_4(mve_vqdmulltw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
+ */
25
+
23
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
26
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
}
27
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
28
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
int ieee_ex_to_loongarch(int xcpt)
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
27
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
30
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/mve.decode
29
--- a/fpu/softfloat-specialize.c.inc
32
+++ b/target/arm/mve.decode
30
+++ b/fpu/softfloat-specialize.c.inc
33
@@ -XXX,XX +XXX,XX @@
31
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
34
@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
32
/*
35
@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
33
* Temporarily fall back to ifdef ladder
36
@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
34
*/
37
+@2op_sz28 .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn \
35
-#if defined(TARGET_HPPA) || \
38
+ size=%size_28
36
- defined(TARGET_LOONGARCH)
39
37
- /*
40
# The _rev suffix indicates that Vn and Vm are reversed. This is
38
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
41
# the case for shifts. In the Arm ARM these insns are documented
39
- * case sets InvalidOp and returns the input value 'c'
42
@@ -XXX,XX +XXX,XX @@ VQDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op
40
- */
43
VQRDMLSDH 1111 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op
41
+#if defined(TARGET_HPPA)
44
VQRDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op
42
rule = float_infzeronan_dnan_never;
45
43
#endif
46
+VQDMULLB 111 . 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 1 @2op_sz28
44
}
47
+VQDMULLT 111 . 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 1 @2op_sz28
48
+
49
# Vector miscellaneous
50
51
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/mve_helper.c
55
+++ b/target/arm/mve_helper.c
56
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_SCALAR_L(vqdmullt_scalarh, 1, 2, int16_t, 4, int32_t, \
57
DO_2OP_SAT_SCALAR_L(vqdmullt_scalarw, 1, 4, int32_t, 8, int64_t, \
58
do_qdmullw, SATMASK32)
59
60
+/*
61
+ * Long saturating ops
62
+ */
63
+#define DO_2OP_SAT_L(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN, SATMASK) \
64
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
65
+ void *vm) \
66
+ { \
67
+ LTYPE *d = vd; \
68
+ TYPE *n = vn, *m = vm; \
69
+ uint16_t mask = mve_element_mask(env); \
70
+ unsigned le; \
71
+ bool qc = false; \
72
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
73
+ bool sat = false; \
74
+ LTYPE op1 = n[H##ESIZE(le * 2 + TOP)]; \
75
+ LTYPE op2 = m[H##ESIZE(le * 2 + TOP)]; \
76
+ mergemask(&d[H##LESIZE(le)], FN(op1, op2, &sat), mask); \
77
+ qc |= sat && (mask & SATMASK); \
78
+ } \
79
+ if (qc) { \
80
+ env->vfp.qc[0] = qc; \
81
+ } \
82
+ mve_advance_vpt(env); \
83
+ }
84
+
85
+DO_2OP_SAT_L(vqdmullbh, 0, 2, int16_t, 4, int32_t, do_qdmullh, SATMASK16B)
86
+DO_2OP_SAT_L(vqdmullbw, 0, 4, int32_t, 8, int64_t, do_qdmullw, SATMASK32)
87
+DO_2OP_SAT_L(vqdmullth, 1, 2, int16_t, 4, int32_t, do_qdmullh, SATMASK16T)
88
+DO_2OP_SAT_L(vqdmulltw, 1, 4, int32_t, 8, int64_t, do_qdmullw, SATMASK32)
89
+
90
static inline uint32_t do_vbrsrb(uint32_t n, uint32_t m)
91
{
92
m &= 0xff;
93
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/translate-mve.c
96
+++ b/target/arm/translate-mve.c
97
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQDMLSDHX, vqdmlsdhx)
98
DO_2OP(VQRDMLSDH, vqrdmlsdh)
99
DO_2OP(VQRDMLSDHX, vqrdmlsdhx)
100
101
+static bool trans_VQDMULLB(DisasContext *s, arg_2op *a)
102
+{
103
+ static MVEGenTwoOpFn * const fns[] = {
104
+ NULL,
105
+ gen_helper_mve_vqdmullbh,
106
+ gen_helper_mve_vqdmullbw,
107
+ NULL,
108
+ };
109
+ if (a->size == MO_32 && (a->qd == a->qm || a->qd == a->qn)) {
110
+ /* UNPREDICTABLE; we choose to undef */
111
+ return false;
112
+ }
113
+ return do_2op(s, a, fns[a->size]);
114
+}
115
+
116
+static bool trans_VQDMULLT(DisasContext *s, arg_2op *a)
117
+{
118
+ static MVEGenTwoOpFn * const fns[] = {
119
+ NULL,
120
+ gen_helper_mve_vqdmullth,
121
+ gen_helper_mve_vqdmulltw,
122
+ NULL,
123
+ };
124
+ if (a->size == MO_32 && (a->qd == a->qm || a->qd == a->qn)) {
125
+ /* UNPREDICTABLE; we choose to undef */
126
+ return false;
127
+ }
128
+ return do_2op(s, a, fns[a->size]);
129
+}
130
+
131
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
132
MVEGenTwoOpScalarFn fn)
133
{
134
--
45
--
135
2.20.1
46
2.34.1
1
Implement the MVE VQDMLADH and VQRDMLADH insns. These multiply
1
Set the FloatInfZeroNaNRule explicitly for the HPPA target,
2
elements, and then add pairs of products, double, possibly round,
2
so we can remove the ifdef from pickNaNMulAdd().
3
saturate and return the high half of the result.
3
4
As this is the last target to be converted to explicitly setting
5
the rule, we can remove the fallback code in pickNaNMulAdd()
6
entirely.
4
7
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210617121628.20116-37-peter.maydell@linaro.org
10
Message-id: 20241202131347.498124-14-peter.maydell@linaro.org
8
---
11
---
9
target/arm/helper-mve.h | 16 +++++++
12
target/hppa/fpu_helper.c | 2 ++
10
target/arm/mve.decode | 5 +++
13
fpu/softfloat-specialize.c.inc | 13 +------------
11
target/arm/mve_helper.c | 89 ++++++++++++++++++++++++++++++++++++++
14
2 files changed, 3 insertions(+), 12 deletions(-)
12
target/arm/translate-mve.c | 4 ++
13
4 files changed, 114 insertions(+)
14
15
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
16
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
18
--- a/target/hppa/fpu_helper.c
18
+++ b/target/arm/helper-mve.h
19
+++ b/target/hppa/fpu_helper.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
20
DEF_HELPER_FLAGS_4(mve_vqrshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
* HPPA does note implement a CPU reset method at all...
21
DEF_HELPER_FLAGS_4(mve_vqrshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
*/
22
23
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
23
+DEF_HELPER_FLAGS_4(mve_vqdmladhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
+ /* For inf * 0 + NaN, return the input NaN */
24
+DEF_HELPER_FLAGS_4(mve_vqdmladhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
25
+DEF_HELPER_FLAGS_4(mve_vqdmladhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
}
26
+
27
27
+DEF_HELPER_FLAGS_4(mve_vqdmladhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
void cpu_hppa_loaded_fr0(CPUHPPAState *env)
28
+DEF_HELPER_FLAGS_4(mve_vqdmladhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
+DEF_HELPER_FLAGS_4(mve_vqdmladhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
+
31
+DEF_HELPER_FLAGS_4(mve_vqrdmladhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
+DEF_HELPER_FLAGS_4(mve_vqrdmladhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
+DEF_HELPER_FLAGS_4(mve_vqrdmladhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+
35
+DEF_HELPER_FLAGS_4(mve_vqrdmladhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
36
+DEF_HELPER_FLAGS_4(mve_vqrdmladhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
37
+DEF_HELPER_FLAGS_4(mve_vqrdmladhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
38
+
39
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
42
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
43
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/mve.decode
31
--- a/fpu/softfloat-specialize.c.inc
45
+++ b/target/arm/mve.decode
32
+++ b/fpu/softfloat-specialize.c.inc
46
@@ -XXX,XX +XXX,XX @@ VQSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
33
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
47
VQRSHL_S 111 0 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev
34
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
48
VQRSHL_U 111 1 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev
35
bool infzero, float_status *status)
49
36
{
50
+VQDMLADH 1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op
37
- FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
51
+VQDMLADHX 1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op
38
-
52
+VQRDMLADH 1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op
39
/*
53
+VQRDMLADHX 1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op
40
* We guarantee not to require the target to tell us how to
54
+
41
* pick a NaN if we're always returning the default NaN.
55
# Vector miscellaneous
42
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
56
43
*/
57
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
44
assert(!status->default_nan_mode);
58
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
45
59
index XXXXXXX..XXXXXXX 100644
46
- if (rule == float_infzeronan_none) {
60
--- a/target/arm/mve_helper.c
47
- /*
61
+++ b/target/arm/mve_helper.c
48
- * Temporarily fall back to ifdef ladder
62
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_U(vqshlu, DO_UQSHL_OP)
49
- */
63
DO_2OP_SAT_S(vqrshls, DO_SQRSHL_OP)
50
-#if defined(TARGET_HPPA)
64
DO_2OP_SAT_U(vqrshlu, DO_UQRSHL_OP)
51
- rule = float_infzeronan_dnan_never;
65
52
-#endif
66
+/*
53
- }
67
+ * Multiply add dual returning high half
54
-
68
+ * The 'FN' here takes four inputs A, B, C, D, a 0/1 indicator of
55
if (infzero) {
69
+ * whether to add the rounding constant, and the pointer to the
56
/*
70
+ * saturation flag, and should do "(A * B + C * D) * 2 + rounding constant",
57
* Inf * 0 + NaN -- some implementations return the default NaN here,
71
+ * saturate to twice the input size and return the high half; or
58
* and some return the input NaN.
72
+ * (A * B - C * D) etc for VQDMLSDH.
59
*/
73
+ */
60
- switch (rule) {
74
+#define DO_VQDMLADH_OP(OP, ESIZE, TYPE, XCHG, ROUND, FN) \
61
+ switch (status->float_infzeronan_rule) {
75
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
62
case float_infzeronan_dnan_never:
76
+ void *vm) \
63
return 2;
77
+ { \
64
case float_infzeronan_dnan_always:
78
+ TYPE *d = vd, *n = vn, *m = vm; \
79
+ uint16_t mask = mve_element_mask(env); \
80
+ unsigned e; \
81
+ bool qc = false; \
82
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
83
+ bool sat = false; \
84
+ if ((e & 1) == XCHG) { \
85
+ TYPE r = FN(n[H##ESIZE(e)], \
86
+ m[H##ESIZE(e - XCHG)], \
87
+ n[H##ESIZE(e + (1 - 2 * XCHG))], \
88
+ m[H##ESIZE(e + (1 - XCHG))], \
89
+ ROUND, &sat); \
90
+ mergemask(&d[H##ESIZE(e)], r, mask); \
91
+ qc |= sat & mask & 1; \
92
+ } \
93
+ } \
94
+ if (qc) { \
95
+ env->vfp.qc[0] = qc; \
96
+ } \
97
+ mve_advance_vpt(env); \
98
+ }
99
+
100
+static int8_t do_vqdmladh_b(int8_t a, int8_t b, int8_t c, int8_t d,
101
+ int round, bool *sat)
102
+{
103
+ int64_t r = ((int64_t)a * b + (int64_t)c * d) * 2 + (round << 7);
104
+ return do_sat_bhw(r, INT16_MIN, INT16_MAX, sat) >> 8;
105
+}
106
+
107
+static int16_t do_vqdmladh_h(int16_t a, int16_t b, int16_t c, int16_t d,
108
+ int round, bool *sat)
109
+{
110
+ int64_t r = ((int64_t)a * b + (int64_t)c * d) * 2 + (round << 15);
111
+ return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat) >> 16;
112
+}
113
+
114
+static int32_t do_vqdmladh_w(int32_t a, int32_t b, int32_t c, int32_t d,
115
+ int round, bool *sat)
116
+{
117
+ int64_t m1 = (int64_t)a * b;
118
+ int64_t m2 = (int64_t)c * d;
119
+ int64_t r;
120
+ /*
121
+ * Architecturally we should do the entire add, double, round
122
+ * and then check for saturation. We do three saturating adds,
123
+ * but we need to be careful about the order. If the first
124
+ * m1 + m2 saturates then it's impossible for the *2+rc to
125
+ * bring it back into the non-saturated range. However, if
126
+ * m1 + m2 is negative then it's possible that doing the doubling
127
+ * would take the intermediate result below INT64_MIN and the
128
+ * addition of the rounding constant then brings it back in range.
129
+ * So we add half the rounding constant before doubling rather
130
+ * than adding the rounding constant after the doubling.
131
+ */
132
+ if (sadd64_overflow(m1, m2, &r) ||
133
+ sadd64_overflow(r, (round << 30), &r) ||
134
+ sadd64_overflow(r, r, &r)) {
135
+ *sat = true;
136
+ return r < 0 ? INT32_MAX : INT32_MIN;
137
+ }
138
+ return r >> 32;
139
+}
140
+
141
+DO_VQDMLADH_OP(vqdmladhb, 1, int8_t, 0, 0, do_vqdmladh_b)
142
+DO_VQDMLADH_OP(vqdmladhh, 2, int16_t, 0, 0, do_vqdmladh_h)
143
+DO_VQDMLADH_OP(vqdmladhw, 4, int32_t, 0, 0, do_vqdmladh_w)
144
+DO_VQDMLADH_OP(vqdmladhxb, 1, int8_t, 1, 0, do_vqdmladh_b)
145
+DO_VQDMLADH_OP(vqdmladhxh, 2, int16_t, 1, 0, do_vqdmladh_h)
146
+DO_VQDMLADH_OP(vqdmladhxw, 4, int32_t, 1, 0, do_vqdmladh_w)
147
+
148
+DO_VQDMLADH_OP(vqrdmladhb, 1, int8_t, 0, 1, do_vqdmladh_b)
149
+DO_VQDMLADH_OP(vqrdmladhh, 2, int16_t, 0, 1, do_vqdmladh_h)
150
+DO_VQDMLADH_OP(vqrdmladhw, 4, int32_t, 0, 1, do_vqdmladh_w)
151
+DO_VQDMLADH_OP(vqrdmladhxb, 1, int8_t, 1, 1, do_vqdmladh_b)
152
+DO_VQDMLADH_OP(vqrdmladhxh, 2, int16_t, 1, 1, do_vqdmladh_h)
153
+DO_VQDMLADH_OP(vqrdmladhxw, 4, int32_t, 1, 1, do_vqdmladh_w)
154
+
155
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
156
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
157
uint32_t rm) \
158
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
159
index XXXXXXX..XXXXXXX 100644
160
--- a/target/arm/translate-mve.c
161
+++ b/target/arm/translate-mve.c
162
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQSHL_S, vqshls)
163
DO_2OP(VQSHL_U, vqshlu)
164
DO_2OP(VQRSHL_S, vqrshls)
165
DO_2OP(VQRSHL_U, vqrshlu)
166
+DO_2OP(VQDMLADH, vqdmladh)
167
+DO_2OP(VQDMLADHX, vqdmladhx)
168
+DO_2OP(VQRDMLADH, vqrdmladh)
169
+DO_2OP(VQRDMLADHX, vqrdmladhx)
170
171
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
172
MVEGenTwoOpScalarFn fn)
173
--
65
--
174
2.20.1
66
2.34.1
175
176
diff view generated by jsdifflib
1
Implement the MVE VRSHL insn (vector form).
1
The new implementation of pickNaNMulAdd() will find it convenient
2
to know whether at least one of the three arguments to the muladd
3
was a signaling NaN. We already calculate that in the caller,
4
so pass it in as a new bool have_snan.
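
For illustration only, the caller-side calculation can be sketched as
below: each operand's class contributes one bit to a combined mask, so
"is any input a signaling NaN" is a single AND. The names Cls, CMASK
and any_snan are hypothetical stand-ins, not the identifiers used in
softfloat-parts.c.inc.

```c
#include <stdbool.h>

/* Hypothetical operand classes; the real code uses FloatClass. */
typedef enum { CLS_ZERO, CLS_NORMAL, CLS_INF, CLS_QNAN, CLS_SNAN } Cls;

/* One bit per class, so several operands can be summarised in one word. */
#define CMASK(c) (1u << (c))

/* True if at least one of the three muladd inputs is a signaling NaN. */
static bool any_snan(Cls a, Cls b, Cls c)
{
    unsigned abc_mask = CMASK(a) | CMASK(b) | CMASK(c);

    return (abc_mask & CMASK(CLS_SNAN)) != 0;
}
```

Computing the flag once and passing it down avoids re-deriving it from
the three classes inside the NaN-picking function.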
2
5
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-36-peter.maydell@linaro.org
8
Message-id: 20241202131347.498124-15-peter.maydell@linaro.org
6
---
9
---
7
target/arm/helper-mve.h | 8 ++++++++
10
fpu/softfloat-parts.c.inc | 5 +++--
8
target/arm/mve.decode | 3 +++
11
fpu/softfloat-specialize.c.inc | 2 +-
9
target/arm/mve_helper.c | 4 ++++
12
2 files changed, 4 insertions(+), 3 deletions(-)
10
target/arm/translate-mve.c | 2 ++
11
4 files changed, 17 insertions(+)
12
13
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
14
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
16
--- a/fpu/softfloat-parts.c.inc
16
+++ b/target/arm/helper-mve.h
17
+++ b/fpu/softfloat-parts.c.inc
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
18
DEF_HELPER_FLAGS_4(mve_vshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
{
19
DEF_HELPER_FLAGS_4(mve_vshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
int which;
20
21
bool infzero = (ab_mask == float_cmask_infzero);
21
+DEF_HELPER_FLAGS_4(mve_vrshlsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
+ bool have_snan = (abc_mask & float_cmask_snan);
22
+DEF_HELPER_FLAGS_4(mve_vrshlsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
23
+DEF_HELPER_FLAGS_4(mve_vrshlsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
- if (unlikely(abc_mask & float_cmask_snan)) {
24
+
25
+ if (unlikely(have_snan)) {
25
+DEF_HELPER_FLAGS_4(mve_vrshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
26
+DEF_HELPER_FLAGS_4(mve_vrshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
}
27
+DEF_HELPER_FLAGS_4(mve_vrshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
28
+
29
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
29
DEF_HELPER_FLAGS_4(mve_vqshlsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
if (s->default_nan_mode) {
30
DEF_HELPER_FLAGS_4(mve_vqshlsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
which = 3;
31
DEF_HELPER_FLAGS_4(mve_vqshlsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
} else {
32
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
33
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
34
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
35
}
36
37
if (which == 3) {
38
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
index XXXXXXX..XXXXXXX 100644
39
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/mve.decode
40
--- a/fpu/softfloat-specialize.c.inc
35
+++ b/target/arm/mve.decode
41
+++ b/fpu/softfloat-specialize.c.inc
36
@@ -XXX,XX +XXX,XX @@ VQSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
42
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
37
VSHL_S 111 0 1111 0 . .. ... 0 ... 0 0100 . 1 . 0 ... 0 @2op_rev
43
| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
38
VSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 0 ... 0 @2op_rev
44
*----------------------------------------------------------------------------*/
39
45
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
40
+VRSHL_S 111 0 1111 0 . .. ... 0 ... 0 0101 . 1 . 0 ... 0 @2op_rev
46
- bool infzero, float_status *status)
41
+VRSHL_U 111 1 1111 0 . .. ... 0 ... 0 0101 . 1 . 0 ... 0 @2op_rev
47
+ bool infzero, bool have_snan, float_status *status)
42
+
43
VQSHL_S 111 0 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
44
VQSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
45
46
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/mve_helper.c
49
+++ b/target/arm/mve_helper.c
50
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vhsubu, do_vhsub_u)
51
52
#define DO_VSHLS(N, M) do_sqrshl_bhs(N, (int8_t)(M), sizeof(N) * 8, false, NULL)
53
#define DO_VSHLU(N, M) do_uqrshl_bhs(N, (int8_t)(M), sizeof(N) * 8, false, NULL)
54
+#define DO_VRSHLS(N, M) do_sqrshl_bhs(N, (int8_t)(M), sizeof(N) * 8, true, NULL)
55
+#define DO_VRSHLU(N, M) do_uqrshl_bhs(N, (int8_t)(M), sizeof(N) * 8, true, NULL)
56
57
DO_2OP_S(vshls, DO_VSHLS)
58
DO_2OP_U(vshlu, DO_VSHLU)
59
+DO_2OP_S(vrshls, DO_VRSHLS)
60
+DO_2OP_U(vrshlu, DO_VRSHLU)
61
62
static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
63
{
48
{
64
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
49
/*
65
index XXXXXXX..XXXXXXX 100644
50
* We guarantee not to require the target to tell us how to
66
--- a/target/arm/translate-mve.c
67
+++ b/target/arm/translate-mve.c
68
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQSUB_S, vqsubs)
69
DO_2OP(VQSUB_U, vqsubu)
70
DO_2OP(VSHL_S, vshls)
71
DO_2OP(VSHL_U, vshlu)
72
+DO_2OP(VRSHL_S, vrshls)
73
+DO_2OP(VRSHL_U, vrshlu)
74
DO_2OP(VQSHL_S, vqshls)
75
DO_2OP(VQSHL_U, vqshlu)
76
DO_2OP(VQRSHL_S, vqrshls)
77
--
51
--
78
2.20.1
52
2.34.1
79
80
diff view generated by jsdifflib
1
Implement the vector forms of the MVE VQDMULH and VQRDMULH insns.
1
IEEE 754 does not define a fixed rule for which NaN to pick as the
2
result if more than one of the operands of a 3-operand fused multiply-add operation
3
are NaNs. As a result different architectures have ended up with
4
different rules for propagating NaNs.
5
6
QEMU currently hardcodes the NaN propagation logic into the binary
7
because pickNaNMulAdd() has an ifdef ladder for different targets.
8
We want to make the propagation rule instead be selectable at
9
runtime, because:
10
* this will let us have multiple targets in one QEMU binary
11
* the Arm FEAT_AFP architectural feature includes letting
12
the guest select a NaN propagation rule at runtime
13
14
In this commit we add an enum for the propagation rule, the field in
15
float_status, and the corresponding getters and setters. We change
16
pickNaNMulAdd to honour this, but because all targets still leave
17
this field at its default 0 value, the fallback logic will pick the
18
rule type with the old ifdef ladder.
19
20
It's valid not to set a propagation rule if default_nan_mode is
21
enabled, because in that case there's no need to pick a NaN; all the
22
callers of pickNaNMulAdd() catch this case and skip calling it.
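
As a rough standalone sketch of the data-driven selection described
above (not the code from this patch): the rule packs three 2-bit
operand indexes plus a "prefer SNaN" bit into one integer, and the
picker walks the fields in preference order until it finds a suitable
NaN. All names here (Cls, RULE, pick_nan_3 and the RULE_* constants)
are hypothetical; only the bit layout follows the FIELD() definitions
in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand classes; the real code uses FloatClass. */
typedef enum { CLS_NORMAL, CLS_QNAN, CLS_SNAN } Cls;

/* Bit layout: 1st/2nd/3rd preferred operand (2 bits each), then SNaN flag. */
#define FIELD_LEN   2
#define FIELD_MASK  0x3
#define SNAN_MASK   (1 << 6)
#define RULE(x, y, z) ((x) | ((y) << 2) | ((z) << 4))

enum {
    RULE_ABC   = RULE(0, 1, 2),             /* A over B over C */
    RULE_ACB   = RULE(0, 2, 1),             /* A over C over B */
    RULE_S_CAB = RULE(2, 0, 1) | SNAN_MASK, /* SNaN first, then C, A, B */
};

static bool is_snan(Cls c) { return c == CLS_SNAN; }
static bool is_nan(Cls c)  { return c == CLS_QNAN || c == CLS_SNAN; }

/*
 * Return 0, 1 or 2 to select operand a, b or c.
 * The caller must pass at least one NaN, otherwise the loop
 * would run past the three encoded fields.
 */
static int pick_nan_3(Cls a, Cls b, Cls c, int rule)
{
    Cls cls[3] = { a, b, c };
    bool have_snan = is_snan(a) || is_snan(b) || is_snan(c);
    bool want_snan = have_snan && (rule & SNAN_MASK);
    int which;

    assert(is_nan(a) || is_nan(b) || is_nan(c));
    do {
        which = rule & FIELD_MASK;  /* current most-preferred operand */
        rule >>= FIELD_LEN;         /* move on to the next preference */
    } while (want_snan ? !is_snan(cls[which]) : !is_nan(cls[which]));
    return which;
}
```

For example, with RULE_S_CAB and inputs (QNaN, SNaN, QNaN) the sketch
selects operand b, because an SNaN is preferred over any QNaN; with
no SNaN present the same rule falls back to plain c, a, b order.
Because every real rule has distinct operand indexes in its three
fields, no valid rule encodes to 0, which leaves 0 free to mean "no
rule set".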
2
23
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
25
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-31-peter.maydell@linaro.org
26
Message-id: 20241202131347.498124-16-peter.maydell@linaro.org
6
---
27
---
7
target/arm/helper-mve.h | 8 ++++++++
28
include/fpu/softfloat-helpers.h | 11 +++
8
target/arm/mve.decode | 3 +++
29
include/fpu/softfloat-types.h | 55 +++++++++++
9
target/arm/mve_helper.c | 27 +++++++++++++++++++++++++++
30
fpu/softfloat-specialize.c.inc | 167 ++++++++------------------------
10
target/arm/translate-mve.c | 2 ++
31
3 files changed, 107 insertions(+), 126 deletions(-)
11
4 files changed, 40 insertions(+)
32
12
33
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
35
--- a/include/fpu/softfloat-helpers.h
16
+++ b/target/arm/helper-mve.h
36
+++ b/include/fpu/softfloat-helpers.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
37
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
18
DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
38
status->float_2nan_prop_rule = rule;
19
DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
39
}
20
40
21
+DEF_HELPER_FLAGS_4(mve_vqdmulhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
41
+static inline void set_float_3nan_prop_rule(Float3NaNPropRule rule,
22
+DEF_HELPER_FLAGS_4(mve_vqdmulhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
42
+ float_status *status)
23
+DEF_HELPER_FLAGS_4(mve_vqdmulhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
43
+{
24
+
44
+ status->float_3nan_prop_rule = rule;
25
+DEF_HELPER_FLAGS_4(mve_vqrdmulhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
45
+}
26
+DEF_HELPER_FLAGS_4(mve_vqrdmulhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
46
+
27
+DEF_HELPER_FLAGS_4(mve_vqrdmulhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
47
static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
28
+
48
float_status *status)
29
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
49
{
30
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
50
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
31
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
51
return status->float_2nan_prop_rule;
32
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
52
}
53
54
+static inline Float3NaNPropRule get_float_3nan_prop_rule(float_status *status)
55
+{
56
+ return status->float_3nan_prop_rule;
57
+}
58
+
59
static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
60
{
61
return status->float_infzeronan_rule;
62
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
33
index XXXXXXX..XXXXXXX 100644
63
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/mve.decode
64
--- a/include/fpu/softfloat-types.h
35
+++ b/target/arm/mve.decode
65
+++ b/include/fpu/softfloat-types.h
36
@@ -XXX,XX +XXX,XX @@ VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
66
@@ -XXX,XX +XXX,XX @@ this code that are retained.
37
VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
67
#ifndef SOFTFLOAT_TYPES_H
38
VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
68
#define SOFTFLOAT_TYPES_H
39
69
40
+VQDMULH 1110 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op
70
+#include "hw/registerfields.h"
41
+VQRDMULH 1111 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op
71
+
42
+
72
/*
43
# Vector miscellaneous
73
* Software IEC/IEEE floating-point types.
44
74
*/
45
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
75
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
46
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
76
float_2nan_prop_x87,
77
} Float2NaNPropRule;
78
79
+/*
80
+ * 3-input NaN propagation rule, for fused multiply-add. Individual
81
+ * architectures have different rules for which input NaN is
82
+ * propagated to the output when there is more than one NaN on the
83
+ * input.
84
+ *
85
+ * If default_nan_mode is enabled then it is valid not to set a NaN
86
+ * propagation rule, because the softfloat code guarantees not to try
87
+ * to pick a NaN to propagate in default NaN mode. When not in
88
+ * default-NaN mode, it is an error for the target not to set the rule
89
+ * in float_status if it uses a muladd, and we will assert if we need
90
+ * to handle an input NaN and no rule was selected.
91
+ *
92
+ * The naming scheme for Float3NaNPropRule values is:
93
+ * float_3nan_prop_s_abc:
94
+ * = "Prefer SNaN over QNaN, then operand A over B over C"
95
+ * float_3nan_prop_abc:
96
+ * = "Prefer A over B over C regardless of SNaN vs QNaN"
97
+ *
98
+ * For QEMU, the multiply-add operation is A * B + C.
99
+ */
100
+
101
+/*
102
+ * We set the Float3NaNPropRule enum values up so we can select the
103
+ * right value in pickNaNMulAdd in a data driven way.
104
+ */
105
+FIELD(3NAN, 1ST, 0, 2) /* which operand is most preferred ? */
106
+FIELD(3NAN, 2ND, 2, 2) /* which operand is next most preferred ? */
107
+FIELD(3NAN, 3RD, 4, 2) /* which operand is least preferred ? */
108
+FIELD(3NAN, SNAN, 6, 1) /* do we prefer SNaN over QNaN ? */
109
+
110
+#define PROPRULE(X, Y, Z) \
111
+ ((X << R_3NAN_1ST_SHIFT) | (Y << R_3NAN_2ND_SHIFT) | (Z << R_3NAN_3RD_SHIFT))
112
+
113
+typedef enum __attribute__((__packed__)) {
114
+ float_3nan_prop_none = 0, /* No propagation rule specified */
115
+ float_3nan_prop_abc = PROPRULE(0, 1, 2),
116
+ float_3nan_prop_acb = PROPRULE(0, 2, 1),
117
+ float_3nan_prop_bac = PROPRULE(1, 0, 2),
118
+ float_3nan_prop_bca = PROPRULE(1, 2, 0),
119
+ float_3nan_prop_cab = PROPRULE(2, 0, 1),
120
+ float_3nan_prop_cba = PROPRULE(2, 1, 0),
121
+ float_3nan_prop_s_abc = float_3nan_prop_abc | R_3NAN_SNAN_MASK,
122
+ float_3nan_prop_s_acb = float_3nan_prop_acb | R_3NAN_SNAN_MASK,
123
+ float_3nan_prop_s_bac = float_3nan_prop_bac | R_3NAN_SNAN_MASK,
124
+ float_3nan_prop_s_bca = float_3nan_prop_bca | R_3NAN_SNAN_MASK,
125
+ float_3nan_prop_s_cab = float_3nan_prop_cab | R_3NAN_SNAN_MASK,
126
+ float_3nan_prop_s_cba = float_3nan_prop_cba | R_3NAN_SNAN_MASK,
127
+} Float3NaNPropRule;
128
+
129
+#undef PROPRULE
130
+
131
/*
132
* Rule for result of fused multiply-add 0 * Inf + NaN.
133
* This must be a NaN, but implementations differ on whether this
134
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
135
FloatRoundMode float_rounding_mode;
136
FloatX80RoundPrec floatx80_rounding_precision;
137
Float2NaNPropRule float_2nan_prop_rule;
138
+ Float3NaNPropRule float_3nan_prop_rule;
139
FloatInfZeroNaNRule float_infzeronan_rule;
140
bool tininess_before_rounding;
141
/* should denormalised results go to zero and set the inexact flag? */
142
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
47
index XXXXXXX..XXXXXXX 100644
143
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/mve_helper.c
144
--- a/fpu/softfloat-specialize.c.inc
49
+++ b/target/arm/mve_helper.c
145
+++ b/fpu/softfloat-specialize.c.inc
50
@@ -XXX,XX +XXX,XX @@ DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
146
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
51
mve_advance_vpt(env); \
147
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
148
bool infzero, bool have_snan, float_status *status)
149
{
150
+ FloatClass cls[3] = { a_cls, b_cls, c_cls };
151
+ Float3NaNPropRule rule = status->float_3nan_prop_rule;
152
+ int which;
153
+
154
/*
155
* We guarantee not to require the target to tell us how to
156
* pick a NaN if we're always returning the default NaN.
157
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
158
}
52
}
159
}
53
160
54
+#define DO_2OP_SAT(OP, ESIZE, TYPE, FN) \
161
+ if (rule == float_3nan_prop_none) {
55
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, void *vm) \
162
#if defined(TARGET_ARM)
56
+ { \
163
-
57
+ TYPE *d = vd, *n = vn, *m = vm; \
164
- /* This looks different from the ARM ARM pseudocode, because the ARM ARM
58
+ uint16_t mask = mve_element_mask(env); \
165
- * puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
59
+ unsigned e; \
166
- */
60
+ bool qc = false; \
167
- if (is_snan(c_cls)) {
61
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
168
- return 2;
62
+ bool sat = false; \
169
- } else if (is_snan(a_cls)) {
63
+ TYPE r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)], &sat); \
170
- return 0;
64
+ mergemask(&d[H##ESIZE(e)], r, mask); \
171
- } else if (is_snan(b_cls)) {
65
+ qc |= sat & mask & 1; \
172
- return 1;
66
+ } \
173
- } else if (is_qnan(c_cls)) {
67
+ if (qc) { \
174
- return 2;
68
+ env->vfp.qc[0] = qc; \
175
- } else if (is_qnan(a_cls)) {
69
+ } \
176
- return 0;
70
+ mve_advance_vpt(env); \
177
- } else {
178
- return 1;
179
- }
180
+ /*
181
+ * This looks different from the ARM ARM pseudocode, because the ARM ARM
182
+ * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
183
+ */
184
+ rule = float_3nan_prop_s_cab;
185
#elif defined(TARGET_MIPS)
186
- if (snan_bit_is_one(status)) {
187
- /* Prefer sNaN over qNaN, in the a, b, c order. */
188
- if (is_snan(a_cls)) {
189
- return 0;
190
- } else if (is_snan(b_cls)) {
191
- return 1;
192
- } else if (is_snan(c_cls)) {
193
- return 2;
194
- } else if (is_qnan(a_cls)) {
195
- return 0;
196
- } else if (is_qnan(b_cls)) {
197
- return 1;
198
+ if (snan_bit_is_one(status)) {
199
+ rule = float_3nan_prop_s_abc;
200
} else {
201
- return 2;
202
+ rule = float_3nan_prop_s_cab;
203
}
204
- } else {
205
- /* Prefer sNaN over qNaN, in the c, a, b order. */
206
- if (is_snan(c_cls)) {
207
- return 2;
208
- } else if (is_snan(a_cls)) {
209
- return 0;
210
- } else if (is_snan(b_cls)) {
211
- return 1;
212
- } else if (is_qnan(c_cls)) {
213
- return 2;
214
- } else if (is_qnan(a_cls)) {
215
- return 0;
216
- } else {
217
- return 1;
218
- }
219
- }
220
#elif defined(TARGET_LOONGARCH64)
221
- /* Prefer sNaN over qNaN, in the c, a, b order. */
222
- if (is_snan(c_cls)) {
223
- return 2;
224
- } else if (is_snan(a_cls)) {
225
- return 0;
226
- } else if (is_snan(b_cls)) {
227
- return 1;
228
- } else if (is_qnan(c_cls)) {
229
- return 2;
230
- } else if (is_qnan(a_cls)) {
231
- return 0;
232
- } else {
233
- return 1;
234
- }
235
+ rule = float_3nan_prop_s_cab;
236
#elif defined(TARGET_PPC)
237
- /* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
238
- * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
239
- */
240
- if (is_nan(a_cls)) {
241
- return 0;
242
- } else if (is_nan(c_cls)) {
243
- return 2;
244
- } else {
245
- return 1;
246
- }
247
+ /*
248
+ * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
249
+ * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB
250
+ */
251
+ rule = float_3nan_prop_acb;
252
#elif defined(TARGET_S390X)
253
- if (is_snan(a_cls)) {
254
- return 0;
255
- } else if (is_snan(b_cls)) {
256
- return 1;
257
- } else if (is_snan(c_cls)) {
258
- return 2;
259
- } else if (is_qnan(a_cls)) {
260
- return 0;
261
- } else if (is_qnan(b_cls)) {
262
- return 1;
263
- } else {
264
- return 2;
265
- }
266
+ rule = float_3nan_prop_s_abc;
267
#elif defined(TARGET_SPARC)
268
- /* Prefer SNaN over QNaN, order C, B, A. */
269
- if (is_snan(c_cls)) {
270
- return 2;
271
- } else if (is_snan(b_cls)) {
272
- return 1;
273
- } else if (is_snan(a_cls)) {
274
- return 0;
275
- } else if (is_qnan(c_cls)) {
276
- return 2;
277
- } else if (is_qnan(b_cls)) {
278
- return 1;
279
- } else {
280
- return 0;
281
- }
282
+ rule = float_3nan_prop_s_cba;
283
#elif defined(TARGET_XTENSA)
284
- /*
285
- * For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
286
- * an input NaN if we have one (ie c).
287
- */
288
- if (status->use_first_nan) {
289
- if (is_nan(a_cls)) {
290
- return 0;
291
- } else if (is_nan(b_cls)) {
292
- return 1;
293
+ if (status->use_first_nan) {
294
+ rule = float_3nan_prop_abc;
295
} else {
296
- return 2;
297
+ rule = float_3nan_prop_cba;
298
}
299
- } else {
300
- if (is_nan(c_cls)) {
301
- return 2;
302
- } else if (is_nan(b_cls)) {
303
- return 1;
304
- } else {
305
- return 0;
306
- }
307
- }
308
#else
309
- /* A default implementation: prefer a to b to c.
310
- * This is unlikely to actually match any real implementation.
311
- */
312
- if (is_nan(a_cls)) {
313
- return 0;
314
- } else if (is_nan(b_cls)) {
315
- return 1;
316
- } else {
317
- return 2;
318
- }
319
+ rule = float_3nan_prop_abc;
320
#endif
71
+ }
321
+ }
72
+
322
+
73
#define DO_AND(N, M) ((N) & (M))
323
+ assert(rule != float_3nan_prop_none);
74
#define DO_BIC(N, M) ((N) & ~(M))
324
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
75
#define DO_ORR(N, M) ((N) | (M))
325
+ /* We have at least one SNaN input and should prefer it */
76
@@ -XXX,XX +XXX,XX @@ static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
326
+ do {
77
#define DO_QRDMULH_W(n, m, s) do_sat_bhw(((int64_t)n * m + (1 << 30)) >> 31, \
327
+ which = rule & R_3NAN_1ST_MASK;
78
INT32_MIN, INT32_MAX, s)
328
+ rule >>= R_3NAN_1ST_LENGTH;
79
329
+ } while (!is_snan(cls[which]));
80
+DO_2OP_SAT(vqdmulhb, 1, int8_t, DO_QDMULH_B)
330
+ } else {
81
+DO_2OP_SAT(vqdmulhh, 2, int16_t, DO_QDMULH_H)
331
+ do {
82
+DO_2OP_SAT(vqdmulhw, 4, int32_t, DO_QDMULH_W)
332
+ which = rule & R_3NAN_1ST_MASK;
83
+
333
+ rule >>= R_3NAN_1ST_LENGTH;
84
+DO_2OP_SAT(vqrdmulhb, 1, int8_t, DO_QRDMULH_B)
334
+ } while (!is_nan(cls[which]));
85
+DO_2OP_SAT(vqrdmulhh, 2, int16_t, DO_QRDMULH_H)
335
+ }
86
+DO_2OP_SAT(vqrdmulhw, 4, int32_t, DO_QRDMULH_W)
336
+ return which;
87
+
337
}
88
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
338
89
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
339
/*----------------------------------------------------------------------------
90
uint32_t rm) \
91
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
92
index XXXXXXX..XXXXXXX 100644
93
--- a/target/arm/translate-mve.c
94
+++ b/target/arm/translate-mve.c
95
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMULL_BS, vmullbs)
96
DO_2OP(VMULL_BU, vmullbu)
97
DO_2OP(VMULL_TS, vmullts)
98
DO_2OP(VMULL_TU, vmulltu)
99
+DO_2OP(VQDMULH, vqdmulh)
100
+DO_2OP(VQRDMULH, vqrdmulh)
101
102
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
103
MVEGenTwoOpScalarFn fn)
104
--
340
--
105
2.20.1
341
2.34.1
106
107
diff view generated by jsdifflib
1
Implement the MVE VSHL insn (vector form).
1
Explicitly set a rule in the softfloat tests for propagating NaNs in
2
the muladd case. In meson.build we put -DTARGET_ARM in fpcflags, and
3
so we should select here the Arm rule of float_3nan_prop_s_cab.
2
4
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-35-peter.maydell@linaro.org
7
Message-id: 20241202131347.498124-17-peter.maydell@linaro.org
6
---
8
---
7
target/arm/helper-mve.h | 8 ++++++++
9
tests/fp/fp-bench.c | 1 +
8
target/arm/mve.decode | 3 +++
10
tests/fp/fp-test.c | 1 +
9
target/arm/mve_helper.c | 6 ++++++
11
2 files changed, 2 insertions(+)
10
target/arm/translate-mve.c | 2 ++
11
4 files changed, 19 insertions(+)
12
12
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
13
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
14
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
15
--- a/tests/fp/fp-bench.c
16
+++ b/target/arm/helper-mve.h
16
+++ b/tests/fp/fp-bench.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqsubub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
18
DEF_HELPER_FLAGS_4(mve_vqsubuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
* doesn't specify match those used by the Arm architecture.
19
DEF_HELPER_FLAGS_4(mve_vqsubuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
*/
20
20
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
21
+DEF_HELPER_FLAGS_4(mve_vshlsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
22
+DEF_HELPER_FLAGS_4(mve_vshlsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
23
+DEF_HELPER_FLAGS_4(mve_vshlsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
24
+
24
f = bench_funcs[operation][precision];
25
+DEF_HELPER_FLAGS_4(mve_vshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
26
+DEF_HELPER_FLAGS_4(mve_vshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
+DEF_HELPER_FLAGS_4(mve_vshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+
29
DEF_HELPER_FLAGS_4(mve_vqshlsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
DEF_HELPER_FLAGS_4(mve_vqshlsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
DEF_HELPER_FLAGS_4(mve_vqshlsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
33
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/mve.decode
27
--- a/tests/fp/fp-test.c
35
+++ b/target/arm/mve.decode
28
+++ b/tests/fp/fp-test.c
36
@@ -XXX,XX +XXX,XX @@ VQADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 1 ... 0 @2op
29
@@ -XXX,XX +XXX,XX @@ void run_test(void)
37
VQSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
30
* doesn't specify match those used by the Arm architecture.
38
VQSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
31
*/
39
32
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
40
+VSHL_S 111 0 1111 0 . .. ... 0 ... 0 0100 . 1 . 0 ... 0 @2op_rev
33
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
41
+VSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 0 ... 0 @2op_rev
34
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
42
+
35
43
VQSHL_S 111 0 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
36
genCases_setLevel(test_level);
44
VQSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
45
46
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/mve_helper.c
49
+++ b/target/arm/mve_helper.c
50
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vhaddu, do_vhadd_u)
51
DO_2OP_S(vhsubs, do_vhsub_s)
52
DO_2OP_U(vhsubu, do_vhsub_u)
53
54
+#define DO_VSHLS(N, M) do_sqrshl_bhs(N, (int8_t)(M), sizeof(N) * 8, false, NULL)
55
+#define DO_VSHLU(N, M) do_uqrshl_bhs(N, (int8_t)(M), sizeof(N) * 8, false, NULL)
56
+
57
+DO_2OP_S(vshls, DO_VSHLS)
58
+DO_2OP_U(vshlu, DO_VSHLU)
59
+
60
static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
61
{
62
if (val > max) {
63
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/translate-mve.c
66
+++ b/target/arm/translate-mve.c
67
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQADD_S, vqadds)
68
DO_2OP(VQADD_U, vqaddu)
69
DO_2OP(VQSUB_S, vqsubs)
70
DO_2OP(VQSUB_U, vqsubu)
71
+DO_2OP(VSHL_S, vshls)
72
+DO_2OP(VSHL_U, vshlu)
73
DO_2OP(VQSHL_S, vqshls)
74
DO_2OP(VQSHL_U, vqshlu)
75
DO_2OP(VQRSHL_S, vqrshls)
76
--
37
--
77
2.20.1
38
2.34.1
78
79
diff view generated by jsdifflib
1
Implement the MVE VQSHL insn (encoding T4, which is the
1
Set the Float3NaNPropRule explicitly for Arm, and remove the
2
vector-shift-by-vector version).
2
ifdef from pickNaNMulAdd().
3
4
The DO_SQSHL_OP and DO_UQSHL_OP macros here are derived from
5
the neon_helper.c code for qshl_u{8,16,32} and qshl_s{8,16,32}.
6
3
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20210617121628.20116-33-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-18-peter.maydell@linaro.org
10
---
7
---
11
target/arm/helper-mve.h | 8 ++++++++
8
target/arm/cpu.c | 5 +++++
12
target/arm/mve.decode | 12 ++++++++++++
9
fpu/softfloat-specialize.c.inc | 8 +-------
13
target/arm/mve_helper.c | 34 ++++++++++++++++++++++++++++++++++
10
2 files changed, 6 insertions(+), 7 deletions(-)
14
target/arm/translate-mve.c | 2 ++
15
4 files changed, 56 insertions(+)
16
11
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
18
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
14
--- a/target/arm/cpu.c
20
+++ b/target/arm/helper-mve.h
15
+++ b/target/arm/cpu.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqsubub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
22
DEF_HELPER_FLAGS_4(mve_vqsubuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
* * tininess-before-rounding
23
DEF_HELPER_FLAGS_4(mve_vqsubuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
* * 2-input NaN propagation prefers SNaN over QNaN, and then
24
19
* operand A over operand B (see FPProcessNaNs() pseudocode)
25
+DEF_HELPER_FLAGS_4(mve_vqshlsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
+ * * 3-input NaN propagation prefers SNaN over QNaN, and then
26
+DEF_HELPER_FLAGS_4(mve_vqshlsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ * operand C over A over B (see FPProcessNaNs3() pseudocode,
27
+DEF_HELPER_FLAGS_4(mve_vqshlsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
+ * but note that for QEMU muladd is a * b + c, whereas for
28
+
23
+ * the pseudocode function the arguments are in the order c, a, b).
29
+DEF_HELPER_FLAGS_4(mve_vqshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
30
+DEF_HELPER_FLAGS_4(mve_vqshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
* and the input NaN if it is signalling
31
+DEF_HELPER_FLAGS_4(mve_vqshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
*/
32
+
27
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
33
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
{
34
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
set_float_detect_tininess(float_tininess_before_rounding, s);
35
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
36
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
31
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
32
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
33
}
34
35
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
37
index XXXXXXX..XXXXXXX 100644
36
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/mve.decode
37
--- a/fpu/softfloat-specialize.c.inc
39
+++ b/target/arm/mve.decode
38
+++ b/fpu/softfloat-specialize.c.inc
40
@@ -XXX,XX +XXX,XX @@
39
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
41
@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
42
@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
43
44
+# The _rev suffix indicates that Vn and Vm are reversed. This is
45
+# the case for shifts. In the Arm ARM these insns are documented
46
+# with the Vm and Vn fields in their usual places, but in the
47
+# assembly the operands are listed "backwards", ie in the order
48
+# Qd, Qm, Qn where other insns use Qd, Qn, Qm. For QEMU we choose
49
+# to consider Vm and Vn as being in different fields in the insn.
50
+# This gives us consistency with A64 and Neon.
51
+@2op_rev .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qn qn=%qm
52
+
53
@2scalar .... .... .. size:2 .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
54
@2scalar_nosz .... .... .... .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
55
56
@@ -XXX,XX +XXX,XX @@ VQADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 1 ... 0 @2op
57
VQSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
58
VQSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
59
60
+VQSHL_S 111 0 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
61
+VQSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
62
+
63
# Vector miscellaneous
64
65
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
66
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/target/arm/mve_helper.c
69
+++ b/target/arm/mve_helper.c
70
@@ -XXX,XX +XXX,XX @@ DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
71
mve_advance_vpt(env); \
72
}
40
}
73
41
74
+/* provide unsigned 2-op helpers for all sizes */
42
if (rule == float_3nan_prop_none) {
75
+#define DO_2OP_SAT_U(OP, FN) \
43
-#if defined(TARGET_ARM)
76
+ DO_2OP_SAT(OP##b, 1, uint8_t, FN) \
44
- /*
77
+ DO_2OP_SAT(OP##h, 2, uint16_t, FN) \
45
- * This looks different from the ARM ARM pseudocode, because the ARM ARM
78
+ DO_2OP_SAT(OP##w, 4, uint32_t, FN)
46
- * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
79
+
47
- */
80
+/* provide signed 2-op helpers for all sizes */
48
- rule = float_3nan_prop_s_cab;
81
+#define DO_2OP_SAT_S(OP, FN) \
49
-#elif defined(TARGET_MIPS)
82
+ DO_2OP_SAT(OP##b, 1, int8_t, FN) \
50
+#if defined(TARGET_MIPS)
83
+ DO_2OP_SAT(OP##h, 2, int16_t, FN) \
51
if (snan_bit_is_one(status)) {
84
+ DO_2OP_SAT(OP##w, 4, int32_t, FN)
52
rule = float_3nan_prop_s_abc;
85
+
53
} else {
86
#define DO_AND(N, M) ((N) & (M))
87
#define DO_BIC(N, M) ((N) & ~(M))
88
#define DO_ORR(N, M) ((N) | (M))
89
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT(vqsubsb, 1, int8_t, DO_SQSUB_B)
90
DO_2OP_SAT(vqsubsh, 2, int16_t, DO_SQSUB_H)
91
DO_2OP_SAT(vqsubsw, 4, int32_t, DO_SQSUB_W)
92
93
+/*
94
+ * This wrapper fixes up the impedance mismatch between do_sqrshl_bhs()
95
+ * and friends wanting a uint32_t* sat and our needing a bool*.
96
+ */
97
+#define WRAP_QRSHL_HELPER(FN, N, M, ROUND, satp) \
98
+ ({ \
99
+ uint32_t su32 = 0; \
100
+ typeof(N) r = FN(N, (int8_t)(M), sizeof(N) * 8, ROUND, &su32); \
101
+ if (su32) { \
102
+ *satp = true; \
103
+ } \
104
+ r; \
105
+ })
106
+
107
+#define DO_SQSHL_OP(N, M, satp) \
108
+ WRAP_QRSHL_HELPER(do_sqrshl_bhs, N, M, false, satp)
109
+#define DO_UQSHL_OP(N, M, satp) \
110
+ WRAP_QRSHL_HELPER(do_uqrshl_bhs, N, M, false, satp)
111
+
112
+DO_2OP_SAT_S(vqshls, DO_SQSHL_OP)
113
+DO_2OP_SAT_U(vqshlu, DO_UQSHL_OP)
114
+
115
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
116
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
117
uint32_t rm) \
118
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
119
index XXXXXXX..XXXXXXX 100644
120
--- a/target/arm/translate-mve.c
121
+++ b/target/arm/translate-mve.c
122
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQADD_S, vqadds)
123
DO_2OP(VQADD_U, vqaddu)
124
DO_2OP(VQSUB_S, vqsubs)
125
DO_2OP(VQSUB_U, vqsubu)
126
+DO_2OP(VQSHL_S, vqshls)
127
+DO_2OP(VQSHL_U, vqshlu)
128
129
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
130
MVEGenTwoOpScalarFn fn)
131
--
54
--
132
2.20.1
55
2.34.1
133
134
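For readers unfamiliar with what a runtime-selected three-input NaN propagation rule does, the following is a minimal sketch, not QEMU's pickNaNMulAdd(). The names Cls, Nan3Rule, RULE_S_CAB, RULE_ACB and pick_nan3 are hypothetical, and the Inf * 0 + NaN special case is deliberately left out. It only illustrates how "prefer SNaN over QNaN, then operand C over A over B" can be expressed as data chosen per target instead of an ifdef.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand classification (not QEMU's FloatClass). */
typedef enum { CLS_NORMAL, CLS_QNAN, CLS_SNAN } Cls;

/* Hypothetical rule: the order in which operands of a * b + c are
 * considered, and whether any SNaN beats every QNaN. */
typedef struct {
    int order[3];       /* operand indices: 0 = a, 1 = b, 2 = c */
    bool prefer_snan;
} Nan3Rule;

/* "SNaN first, then c, a, b", as described for Arm above. */
static const Nan3Rule RULE_S_CAB = { { 2, 0, 1 }, true };
/* "a, then c, then b", with no SNaN preference, as described for PPC. */
static const Nan3Rule RULE_ACB = { { 0, 2, 1 }, false };

/* Return the index (0 = a, 1 = b, 2 = c) of the NaN operand to
 * propagate, or -1 if no operand is a NaN. */
static int pick_nan3(const Cls cls[3], const Nan3Rule *rule)
{
    assert(rule != NULL);
    if (rule->prefer_snan) {
        /* First pass: only signalling NaNs, in rule order. */
        for (int i = 0; i < 3; i++) {
            int op = rule->order[i];
            if (cls[op] == CLS_SNAN) {
                return op;
            }
        }
    }
    /* Second pass: any NaN, in rule order. */
    for (int i = 0; i < 3; i++) {
        int op = rule->order[i];
        if (cls[op] != CLS_NORMAL) {
            return op;
        }
    }
    return -1;
}
```

With a = QNaN, b = SNaN, c = QNaN, the Arm-style rule selects b (the only SNaN), while the PPC-style rule selects a.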
1
Implement the MVE VQRSHL (vector) insn. Again, the code to perform
1
Set the Float3NaNPropRule explicitly for loongarch, and remove the
2
the actual shifts is borrowed from neon_helper.c.
2
ifdef from pickNaNMulAdd().
3
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-34-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-19-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-mve.h | 8 ++++++++
8
target/loongarch/tcg/fpu_helper.c | 1 +
9
target/arm/mve.decode | 3 +++
9
fpu/softfloat-specialize.c.inc | 2 --
10
target/arm/mve_helper.c | 6 ++++++
10
2 files changed, 1 insertion(+), 2 deletions(-)
11
target/arm/translate-mve.c | 2 ++
12
4 files changed, 19 insertions(+)
13
11
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
14
--- a/target/loongarch/tcg/fpu_helper.c
17
+++ b/target/arm/helper-mve.h
15
+++ b/target/loongarch/tcg/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
19
DEF_HELPER_FLAGS_4(mve_vqshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
* case sets InvalidOp and returns the input value 'c'
20
DEF_HELPER_FLAGS_4(mve_vqshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
*/
21
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
+DEF_HELPER_FLAGS_4(mve_vqrshlsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
23
+DEF_HELPER_FLAGS_4(mve_vqrshlsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
}
24
+DEF_HELPER_FLAGS_4(mve_vqrshlsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
25
+
23
int ieee_ex_to_loongarch(int xcpt)
26
+DEF_HELPER_FLAGS_4(mve_vqrshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
27
+DEF_HELPER_FLAGS_4(mve_vqrshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+DEF_HELPER_FLAGS_4(mve_vqrshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+
30
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
26
--- a/fpu/softfloat-specialize.c.inc
36
+++ b/target/arm/mve.decode
27
+++ b/fpu/softfloat-specialize.c.inc
37
@@ -XXX,XX +XXX,XX @@ VQSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
28
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
38
VQSHL_S 111 0 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
29
} else {
39
VQSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
30
rule = float_3nan_prop_s_cab;
40
31
}
41
+VQRSHL_S 111 0 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev
32
-#elif defined(TARGET_LOONGARCH64)
42
+VQRSHL_U 111 1 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev
33
- rule = float_3nan_prop_s_cab;
43
+
34
#elif defined(TARGET_PPC)
44
# Vector miscellaneous
35
/*
45
36
* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
46
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
47
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/mve_helper.c
50
+++ b/target/arm/mve_helper.c
51
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT(vqsubsw, 4, int32_t, DO_SQSUB_W)
52
WRAP_QRSHL_HELPER(do_sqrshl_bhs, N, M, false, satp)
53
#define DO_UQSHL_OP(N, M, satp) \
54
WRAP_QRSHL_HELPER(do_uqrshl_bhs, N, M, false, satp)
55
+#define DO_SQRSHL_OP(N, M, satp) \
56
+ WRAP_QRSHL_HELPER(do_sqrshl_bhs, N, M, true, satp)
57
+#define DO_UQRSHL_OP(N, M, satp) \
58
+ WRAP_QRSHL_HELPER(do_uqrshl_bhs, N, M, true, satp)
59
60
DO_2OP_SAT_S(vqshls, DO_SQSHL_OP)
61
DO_2OP_SAT_U(vqshlu, DO_UQSHL_OP)
62
+DO_2OP_SAT_S(vqrshls, DO_SQRSHL_OP)
63
+DO_2OP_SAT_U(vqrshlu, DO_UQRSHL_OP)
64
65
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
66
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
67
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
68
index XXXXXXX..XXXXXXX 100644
69
--- a/target/arm/translate-mve.c
70
+++ b/target/arm/translate-mve.c
71
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQSUB_S, vqsubs)
72
DO_2OP(VQSUB_U, vqsubu)
73
DO_2OP(VQSHL_S, vqshls)
74
DO_2OP(VQSHL_U, vqshlu)
75
+DO_2OP(VQRSHL_S, vqrshls)
76
+DO_2OP(VQRSHL_U, vqrshlu)
77
78
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
79
MVEGenTwoOpScalarFn fn)
80
--
37
--
81
2.20.1
38
2.34.1
82
83
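The VQSHL and VQRSHL helpers above differ only in the ROUND argument passed to the shared shift routine. As a rough standalone illustration for 8-bit elements, the sketch below shows a saturating shift where a negative shift count means a right shift, optionally rounded. The function name sqrshl8 is hypothetical; this is not the do_sqrshl_bhs() code from QEMU, only a sketch of the same idea under the assumption that right shifts of negative values are arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: signed saturating shift of an 8-bit value.
 * shift > 0: shift left, saturating and setting *sat on overflow.
 * shift < 0: shift right by -shift, rounding to nearest if 'round'. */
static int8_t sqrshl8(int8_t val, int8_t shift, bool round, bool *sat)
{
    assert(sat != NULL);
    if (shift <= -8) {
        /* All value bits are shifted out: rounding gives 0,
         * truncation leaves only the sign fill. */
        return round ? 0 : (val < 0 ? -1 : 0);
    }
    if (shift < 0) {
        int32_t v = val;
        int n = -shift;
        if (round) {
            v += 1 << (n - 1);      /* add half of the discarded range */
        }
        /* Assumes arithmetic right shift of negative values. */
        return (int8_t)(v >> n);
    }
    if (shift >= 8) {
        if (val == 0) {
            return 0;
        }
        *sat = true;
        return val < 0 ? INT8_MIN : INT8_MAX;
    }
    /* Multiply instead of '<<' to avoid left-shifting a negative value. */
    int32_t wide = (int32_t)val * (1 << shift);
    if (wide > INT8_MAX) {
        *sat = true;
        return INT8_MAX;
    }
    if (wide < INT8_MIN) {
        *sat = true;
        return INT8_MIN;
    }
    return (int8_t)wide;
}
```

For example, shifting 64 left by 1 saturates to INT8_MAX and sets the flag, while shifting 5 right by 1 gives 2 without rounding and 3 with rounding.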
1
Implement the vector forms of the MVE VQADD and VQSUB insns.
1
Set the Float3NaNPropRule explicitly for PPC, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-32-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-20-peter.maydell@linaro.org
6
---
7
---
7
target/arm/helper-mve.h | 16 ++++++++++++++++
8
target/ppc/cpu_init.c | 8 ++++++++
8
target/arm/mve.decode | 5 +++++
9
fpu/softfloat-specialize.c.inc | 6 ------
9
target/arm/mve_helper.c | 14 ++++++++++++++
10
2 files changed, 8 insertions(+), 6 deletions(-)
10
target/arm/translate-mve.c | 4 ++++
11
4 files changed, 39 insertions(+)
12
11
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
14
--- a/target/ppc/cpu_init.c
16
+++ b/target/arm/helper-mve.h
15
+++ b/target/ppc/cpu_init.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrdmulhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
18
DEF_HELPER_FLAGS_4(mve_vqrdmulhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
*/
19
DEF_HELPER_FLAGS_4(mve_vqrdmulhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
20
19
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
21
+DEF_HELPER_FLAGS_4(mve_vqaddsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
+ /*
22
+DEF_HELPER_FLAGS_4(mve_vqaddsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ * NaN propagation for fused multiply-add:
23
+DEF_HELPER_FLAGS_4(mve_vqaddsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
+ * if fRA is a NaN return it; otherwise if fRB is a NaN return it;
24
+
23
+ * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB
25
+DEF_HELPER_FLAGS_4(mve_vqaddub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
+ * whereas QEMU labels the operands as (a * b) + c.
26
+DEF_HELPER_FLAGS_4(mve_vqadduh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
+ */
27
+DEF_HELPER_FLAGS_4(mve_vqadduw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
+ set_float_3nan_prop_rule(float_3nan_prop_acb, &env->fp_status);
28
+
27
+ set_float_3nan_prop_rule(float_3nan_prop_acb, &env->vec_status);
29
+DEF_HELPER_FLAGS_4(mve_vqsubsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
/*
30
+DEF_HELPER_FLAGS_4(mve_vqsubsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
31
+DEF_HELPER_FLAGS_4(mve_vqsubsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
* to return an input NaN if we have one (ie c) rather than generating
32
+
31
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
+DEF_HELPER_FLAGS_4(mve_vqsubub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+DEF_HELPER_FLAGS_4(mve_vqsubuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
35
+DEF_HELPER_FLAGS_4(mve_vqsubuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
36
+
37
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
41
index XXXXXXX..XXXXXXX 100644
32
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/mve.decode
33
--- a/fpu/softfloat-specialize.c.inc
43
+++ b/target/arm/mve.decode
34
+++ b/fpu/softfloat-specialize.c.inc
44
@@ -XXX,XX +XXX,XX @@ VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
35
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
45
VQDMULH 1110 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op
36
} else {
46
VQRDMULH 1111 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op
37
rule = float_3nan_prop_s_cab;
47
38
}
48
+VQADD_S 111 0 1111 0 . .. ... 0 ... 0 0000 . 1 . 1 ... 0 @2op
39
-#elif defined(TARGET_PPC)
49
+VQADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 1 ... 0 @2op
40
- /*
50
+VQSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
41
- * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
51
+VQSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 1 ... 0 @2op
42
- * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
52
+
43
- */
53
# Vector miscellaneous
44
- rule = float_3nan_prop_acb;
54
45
#elif defined(TARGET_S390X)
55
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
46
rule = float_3nan_prop_s_abc;
56
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
47
#elif defined(TARGET_SPARC)
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/mve_helper.c
59
+++ b/target/arm/mve_helper.c
60
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT(vqrdmulhb, 1, int8_t, DO_QRDMULH_B)
61
DO_2OP_SAT(vqrdmulhh, 2, int16_t, DO_QRDMULH_H)
62
DO_2OP_SAT(vqrdmulhw, 4, int32_t, DO_QRDMULH_W)
63
64
+DO_2OP_SAT(vqaddub, 1, uint8_t, DO_UQADD_B)
65
+DO_2OP_SAT(vqadduh, 2, uint16_t, DO_UQADD_H)
66
+DO_2OP_SAT(vqadduw, 4, uint32_t, DO_UQADD_W)
67
+DO_2OP_SAT(vqaddsb, 1, int8_t, DO_SQADD_B)
68
+DO_2OP_SAT(vqaddsh, 2, int16_t, DO_SQADD_H)
69
+DO_2OP_SAT(vqaddsw, 4, int32_t, DO_SQADD_W)
70
+
71
+DO_2OP_SAT(vqsubub, 1, uint8_t, DO_UQSUB_B)
72
+DO_2OP_SAT(vqsubuh, 2, uint16_t, DO_UQSUB_H)
73
+DO_2OP_SAT(vqsubuw, 4, uint32_t, DO_UQSUB_W)
74
+DO_2OP_SAT(vqsubsb, 1, int8_t, DO_SQSUB_B)
75
+DO_2OP_SAT(vqsubsh, 2, int16_t, DO_SQSUB_H)
76
+DO_2OP_SAT(vqsubsw, 4, int32_t, DO_SQSUB_W)
77
+
78
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
79
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
80
uint32_t rm) \
81
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
82
index XXXXXXX..XXXXXXX 100644
83
--- a/target/arm/translate-mve.c
84
+++ b/target/arm/translate-mve.c
85
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMULL_TS, vmullts)
86
DO_2OP(VMULL_TU, vmulltu)
87
DO_2OP(VQDMULH, vqdmulh)
88
DO_2OP(VQRDMULH, vqrdmulh)
89
+DO_2OP(VQADD_S, vqadds)
90
+DO_2OP(VQADD_U, vqaddu)
91
+DO_2OP(VQSUB_S, vqsubs)
92
+DO_2OP(VQSUB_U, vqsubu)
93
94
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
95
MVEGenTwoOpScalarFn fn)
96
--
48
--
97
2.20.1
49
2.34.1
98
99
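The saturating add and subtract helpers above all follow one pattern: compute the exact result in a wider type, then clamp it to the element range and record whether clamping happened. The sketch below illustrates that pattern in isolation. The names sat_clamp, uqadd8 and sqsub16 are hypothetical stand-ins, not the macros from mve_helper.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp val to [min, max]; set *sat if clamping was needed.
 * *sat is never cleared, so it can accumulate across elements. */
static int64_t sat_clamp(int64_t val, int64_t min, int64_t max, bool *sat)
{
    assert(sat != NULL && min <= max);
    if (val > max) {
        *sat = true;
        return max;
    }
    if (val < min) {
        *sat = true;
        return min;
    }
    return val;
}

/* Unsigned saturating 8-bit add: widen, add, clamp to [0, 255]. */
static uint8_t uqadd8(uint8_t n, uint8_t m, bool *sat)
{
    return (uint8_t)sat_clamp((int64_t)n + m, 0, UINT8_MAX, sat);
}

/* Signed saturating 16-bit subtract: widen, subtract, clamp. */
static int16_t sqsub16(int16_t n, int16_t m, bool *sat)
{
    return (int16_t)sat_clamp((int64_t)n - m, INT16_MIN, INT16_MAX, sat);
}
```

Because the flag is only ever set, a caller can pass the same bool for every element of a vector and test it once at the end, which is how a sticky saturation bit such as QC would be fed.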
1
Implement the MVE VQDMULL scalar insn. This multiplies the top or
1
Set the Float3NaNPropRule explicitly for s390x, and remove the
2
bottom half of each element by the scalar, doubles and saturates
2
ifdef from pickNaNMulAdd().
3
to a double-width result.
4
5
Note that this encoding overlaps with VQADD and VQSUB; it uses
6
what in VQADD and VQSUB would be the 'size=0b11' encoding.
7
3
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20210617121628.20116-30-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-21-peter.maydell@linaro.org
11
---
7
---
12
target/arm/helper-mve.h | 5 +++
8
target/s390x/cpu.c | 1 +
13
target/arm/mve.decode | 23 +++++++++++---
9
fpu/softfloat-specialize.c.inc | 2 --
14
target/arm/mve_helper.c | 65 ++++++++++++++++++++++++++++++++++++++
10
2 files changed, 1 insertion(+), 2 deletions(-)
15
target/arm/translate-mve.c | 30 ++++++++++++++++++
16
4 files changed, 119 insertions(+), 4 deletions(-)
17
11
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
19
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-mve.h
14
--- a/target/s390x/cpu.c
21
+++ b/target/arm/helper-mve.h
15
+++ b/target/s390x/cpu.c
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vbrsrb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
23
DEF_HELPER_FLAGS_4(mve_vbrsrh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
set_float_detect_tininess(float_tininess_before_rounding,
24
DEF_HELPER_FLAGS_4(mve_vbrsrw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
&env->fpu_status);
25
19
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
26
+DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
+ set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
27
+DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
set_float_infzeronan_rule(float_infzeronan_dnan_always,
28
+DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
&env->fpu_status);
29
+DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
/* fall through */
30
+
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
31
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
32
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
33
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
34
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
35
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
36
--- a/target/arm/mve.decode
26
--- a/fpu/softfloat-specialize.c.inc
37
+++ b/target/arm/mve.decode
27
+++ b/fpu/softfloat-specialize.c.inc
38
@@ -XXX,XX +XXX,XX @@
28
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
39
%qm 5:1 1:3
29
} else {
40
%qn 7:1 17:3
30
rule = float_3nan_prop_s_cab;
41
31
}
42
+# VQDMULL has size in bit 28: 0 for 16 bit, 1 for 32 bit
32
-#elif defined(TARGET_S390X)
43
+%size_28 28:1 !function=plus_1
33
- rule = float_3nan_prop_s_abc;
44
+
34
#elif defined(TARGET_SPARC)
45
&vldr_vstr rn qd imm p a w size l u
35
rule = float_3nan_prop_s_cba;
46
&1op qd qm size
36
#elif defined(TARGET_XTENSA)
47
&2op qd qm qn size
48
@@ -XXX,XX +XXX,XX @@
49
@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
50
51
@2scalar .... .... .. size:2 .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
52
+@2scalar_nosz .... .... .... .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
53
54
# Vector loads and stores
55
56
@@ -XXX,XX +XXX,XX @@ VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
57
VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
58
VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
59
60
-VQADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar
61
-VQADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar
62
-VQSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
63
-VQSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
64
+{
65
+ VQADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar
66
+ VQADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar
67
+ VQDMULLB_scalar 111 . 1110 0 . 11 ... 0 ... 0 1111 . 110 .... @2scalar_nosz \
68
+ size=%size_28
69
+}
70
+
71
+{
72
+ VQSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
73
+ VQSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
74
+ VQDMULLT_scalar 111 . 1110 0 . 11 ... 0 ... 1 1111 . 110 .... @2scalar_nosz \
75
+ size=%size_28
76
+}
77
+
78
VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
79
80
VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
81
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
82
83
+
84
# Predicate operations
85
%mask_22_13 22:1 13:3
86
VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
87
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
88
index XXXXXXX..XXXXXXX 100644
89
--- a/target/arm/mve_helper.c
90
+++ b/target/arm/mve_helper.c
91
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B)
92
DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H)
93
DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W)
94
95
+/*
96
+ * Long saturating scalar ops. As with DO_2OP_L, TYPE and H are for the
97
+ * input (smaller) type and LESIZE, LTYPE, LH for the output (long) type.
98
+ * SATMASK specifies which bits of the predicate mask matter for determining
99
+ * whether to propagate a saturation indication into FPSCR.QC -- for
100
+ * the 16x16->32 case we must check only the bit corresponding to the T or B
101
+ * half that we used, but for the 32x32->64 case we propagate if the mask
102
+ * bit is set for either half.
103
+ */
104
+#define DO_2OP_SAT_SCALAR_L(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN, SATMASK) \
105
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
106
+ uint32_t rm) \
107
+ { \
108
+ LTYPE *d = vd; \
109
+ TYPE *n = vn; \
110
+ TYPE m = rm; \
111
+ uint16_t mask = mve_element_mask(env); \
112
+ unsigned le; \
113
+ bool qc = false; \
114
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
115
+ bool sat = false; \
116
+ LTYPE r = FN((LTYPE)n[H##ESIZE(le * 2 + TOP)], m, &sat); \
117
+ mergemask(&d[H##LESIZE(le)], r, mask); \
118
+ qc |= sat && (mask & SATMASK); \
119
+ } \
120
+ if (qc) { \
121
+ env->vfp.qc[0] = qc; \
122
+ } \
123
+ mve_advance_vpt(env); \
124
+ }
125
+
126
+static inline int32_t do_qdmullh(int16_t n, int16_t m, bool *sat)
127
+{
128
+ int64_t r = ((int64_t)n * m) * 2;
129
+ return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat);
130
+}
131
+
132
+static inline int64_t do_qdmullw(int32_t n, int32_t m, bool *sat)
133
+{
134
+ /* The multiply can't overflow, but the doubling might */
135
+ int64_t r = (int64_t)n * m;
136
+ if (r > INT64_MAX / 2) {
137
+ *sat = true;
138
+ return INT64_MAX;
139
+ } else if (r < INT64_MIN / 2) {
140
+ *sat = true;
141
+ return INT64_MIN;
142
+ } else {
143
+ return r * 2;
144
+ }
145
+}
146
+
147
+#define SATMASK16B 1
148
+#define SATMASK16T (1 << 2)
149
+#define SATMASK32 ((1 << 4) | 1)
150
+
151
+DO_2OP_SAT_SCALAR_L(vqdmullb_scalarh, 0, 2, int16_t, 4, int32_t, \
152
+ do_qdmullh, SATMASK16B)
153
+DO_2OP_SAT_SCALAR_L(vqdmullb_scalarw, 0, 4, int32_t, 8, int64_t, \
154
+ do_qdmullw, SATMASK32)
155
+DO_2OP_SAT_SCALAR_L(vqdmullt_scalarh, 1, 2, int16_t, 4, int32_t, \
156
+ do_qdmullh, SATMASK16T)
157
+DO_2OP_SAT_SCALAR_L(vqdmullt_scalarw, 1, 4, int32_t, 8, int64_t, \
158
+ do_qdmullw, SATMASK32)
159
+
160
static inline uint32_t do_vbrsrb(uint32_t n, uint32_t m)
161
{
162
m &= 0xff;
163
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
164
index XXXXXXX..XXXXXXX 100644
165
--- a/target/arm/translate-mve.c
166
+++ b/target/arm/translate-mve.c
167
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar)
168
DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar)
169
DO_2OP_SCALAR(VBRSR, vbrsr)
170
171
+static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a)
172
+{
173
+ static MVEGenTwoOpScalarFn * const fns[] = {
174
+ NULL,
175
+ gen_helper_mve_vqdmullb_scalarh,
176
+ gen_helper_mve_vqdmullb_scalarw,
177
+ NULL,
178
+ };
179
+ if (a->qd == a->qn && a->size == MO_32) {
180
+ /* UNPREDICTABLE; we choose to undef */
181
+ return false;
182
+ }
183
+ return do_2op_scalar(s, a, fns[a->size]);
184
+}
185
+
186
+static bool trans_VQDMULLT_scalar(DisasContext *s, arg_2scalar *a)
187
+{
188
+ static MVEGenTwoOpScalarFn * const fns[] = {
189
+ NULL,
190
+ gen_helper_mve_vqdmullt_scalarh,
191
+ gen_helper_mve_vqdmullt_scalarw,
192
+ NULL,
193
+ };
194
+ if (a->qd == a->qn && a->size == MO_32) {
195
+ /* UNPREDICTABLE; we choose to undef */
196
+ return false;
197
+ }
198
+ return do_2op_scalar(s, a, fns[a->size]);
199
+}
200
+
201
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
202
MVEGenDualAccOpFn *fn)
203
{
204
--
37
--
205
2.20.1
38
2.34.1
206
207
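As a worked example of the "multiply, double, saturate to double width" behaviour described for VQDMULL above, the sketch below restates the arithmetic as standalone functions. The names qdmull16 and qdmull32 are hypothetical; the logic mirrors the do_qdmullh/do_qdmullw helpers in the patch but is not a copy of QEMU's code. The point it illustrates is that the product itself always fits, and only the doubling can overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 16 x 16 -> 32: the doubled product needs at most 33 bits,
 * so compute it in 64 bits and clamp to the int32_t range. */
static int32_t qdmull16(int16_t n, int16_t m, bool *sat)
{
    assert(sat != NULL);
    int64_t r = ((int64_t)n * m) * 2;
    if (r > INT32_MAX) {
        *sat = true;
        return INT32_MAX;
    }
    if (r < INT32_MIN) {
        *sat = true;
        return INT32_MIN;
    }
    return (int32_t)r;
}

/* 32 x 32 -> 64: there is no wider type to double into, so check
 * the undoubled product against half the int64_t range first. */
static int64_t qdmull32(int32_t n, int32_t m, bool *sat)
{
    assert(sat != NULL);
    int64_t r = (int64_t)n * m;
    if (r > INT64_MAX / 2) {
        *sat = true;
        return INT64_MAX;
    }
    if (r < INT64_MIN / 2) {
        *sat = true;
        return INT64_MIN;
    }
    return r * 2;
}
```

In both widths the only input pair that saturates is the most negative value times itself: for 32-bit inputs the product is 2^62, and doubling it would give 2^63, one more than INT64_MAX.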
1
Implement the MVE VQDMULH and VQRDMULH scalar insns, which multiply
1
Set the Float3NaNPropRule explicitly for SPARC, and remove the
2
elements by the scalar, double, possibly round, take the high half
2
ifdef from pickNaNMulAdd().
3
and saturate.
4
3
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210617121628.20116-29-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-22-peter.maydell@linaro.org
8
---
7
---
9
target/arm/helper-mve.h | 8 ++++++++
8
target/sparc/cpu.c | 2 ++
10
target/arm/mve.decode | 3 +++
9
fpu/softfloat-specialize.c.inc | 2 --
11
target/arm/mve_helper.c | 25 +++++++++++++++++++++++++
10
2 files changed, 2 insertions(+), 2 deletions(-)
12
target/arm/translate-mve.c | 2 ++
13
4 files changed, 38 insertions(+)
14
11
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
14
--- a/target/sparc/cpu.c
18
+++ b/target/arm/helper-mve.h
15
+++ b/target/sparc/cpu.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqsubu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
20
DEF_HELPER_FLAGS_4(mve_vqsubu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
* the CPU state struct so it won't get zeroed on reset.
21
DEF_HELPER_FLAGS_4(mve_vqsubu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
*/
22
19
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
23
+DEF_HELPER_FLAGS_4(mve_vqdmulh_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
+ /* For fused-multiply add, prefer SNaN over QNaN, then C->B->A */
24
+DEF_HELPER_FLAGS_4(mve_vqdmulh_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
+ set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
25
+DEF_HELPER_FLAGS_4(mve_vqdmulh_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
/* For inf * 0 + NaN, return the input NaN */
26
+
23
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
27
+DEF_HELPER_FLAGS_4(mve_vqrdmulh_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
28
+DEF_HELPER_FLAGS_4(mve_vqrdmulh_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
+DEF_HELPER_FLAGS_4(mve_vqrdmulh_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
+
31
DEF_HELPER_FLAGS_4(mve_vbrsrb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(mve_vbrsrh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
DEF_HELPER_FLAGS_4(mve_vbrsrw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
35
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
36
--- a/target/arm/mve.decode
27
--- a/fpu/softfloat-specialize.c.inc
37
+++ b/target/arm/mve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
38
@@ -XXX,XX +XXX,XX @@ VQSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
39
VQSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
30
} else {
40
VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
31
rule = float_3nan_prop_s_cab;
41
32
}
42
+VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
33
-#elif defined(TARGET_SPARC)
43
+VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
34
- rule = float_3nan_prop_s_cba;
44
+
35
#elif defined(TARGET_XTENSA)
45
# Predicate operations
36
if (status->use_first_nan) {
46
%mask_22_13 22:1 13:3
37
rule = float_3nan_prop_abc;
47
VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
48
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/mve_helper.c
51
+++ b/target/arm/mve_helper.c
52
@@ -XXX,XX +XXX,XX @@ static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
53
#define DO_UQSUB_H(n, m, s) do_sat_bhw((int64_t)n - m, 0, UINT16_MAX, s)
54
#define DO_UQSUB_W(n, m, s) do_sat_bhw((int64_t)n - m, 0, UINT32_MAX, s)
55
56
+/*
57
+ * For QDMULH and QRDMULH we simplify "double and shift by esize" into
58
+ * "shift by esize-1", adjusting the QRDMULH rounding constant to match.
59
+ */
60
+#define DO_QDMULH_B(n, m, s) do_sat_bhw(((int64_t)n * m) >> 7, \
61
+ INT8_MIN, INT8_MAX, s)
62
+#define DO_QDMULH_H(n, m, s) do_sat_bhw(((int64_t)n * m) >> 15, \
63
+ INT16_MIN, INT16_MAX, s)
64
+#define DO_QDMULH_W(n, m, s) do_sat_bhw(((int64_t)n * m) >> 31, \
65
+ INT32_MIN, INT32_MAX, s)
66
+
67
+#define DO_QRDMULH_B(n, m, s) do_sat_bhw(((int64_t)n * m + (1 << 6)) >> 7, \
68
+ INT8_MIN, INT8_MAX, s)
69
+#define DO_QRDMULH_H(n, m, s) do_sat_bhw(((int64_t)n * m + (1 << 14)) >> 15, \
70
+ INT16_MIN, INT16_MAX, s)
71
+#define DO_QRDMULH_W(n, m, s) do_sat_bhw(((int64_t)n * m + (1 << 30)) >> 31, \
72
+ INT32_MIN, INT32_MAX, s)
73
+
74
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
75
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
76
uint32_t rm) \
77
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT_SCALAR(vqsubs_scalarb, 1, int8_t, DO_SQSUB_B)
78
DO_2OP_SAT_SCALAR(vqsubs_scalarh, 2, int16_t, DO_SQSUB_H)
79
DO_2OP_SAT_SCALAR(vqsubs_scalarw, 4, int32_t, DO_SQSUB_W)
80
81
+DO_2OP_SAT_SCALAR(vqdmulh_scalarb, 1, int8_t, DO_QDMULH_B)
82
+DO_2OP_SAT_SCALAR(vqdmulh_scalarh, 2, int16_t, DO_QDMULH_H)
83
+DO_2OP_SAT_SCALAR(vqdmulh_scalarw, 4, int32_t, DO_QDMULH_W)
84
+DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B)
85
+DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H)
86
+DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W)
87
+
88
static inline uint32_t do_vbrsrb(uint32_t n, uint32_t m)
89
{
90
m &= 0xff;
91
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
92
index XXXXXXX..XXXXXXX 100644
93
--- a/target/arm/translate-mve.c
94
+++ b/target/arm/translate-mve.c
95
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VQADD_S_scalar, vqadds_scalar)
96
DO_2OP_SCALAR(VQADD_U_scalar, vqaddu_scalar)
97
DO_2OP_SCALAR(VQSUB_S_scalar, vqsubs_scalar)
98
DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar)
99
+DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar)
100
+DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar)
101
DO_2OP_SCALAR(VBRSR, vbrsr)
102
103
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
104
--
38
--
105
2.20.1
39
2.34.1
106
107
diff view generated by jsdifflib
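The comment in the mve_helper.c hunk above says that "double and shift by esize" is folded into "shift by esize-1", with the QRDMULH rounding constant adjusted to match. The following is a minimal stand-alone sketch of that equivalence for 16-bit elements. The function names (sat_range, qdmulh16_literal and so on) are hypothetical and are not part of the patch; like the patch, the sketch assumes that right-shifting a negative value is an arithmetic shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Clamp val into [min, max]; set *sat if clamping was needed. */
static int64_t sat_range(int64_t val, int64_t min, int64_t max, bool *sat)
{
    assert(min <= max);
    if (val > max) {
        *sat = true;
        return max;
    }
    if (val < min) {
        *sat = true;
        return min;
    }
    return val;
}

/*
 * Literal reading: double the product, then shift right by esize (16).
 * Right-shifting a negative value is implementation-defined in C; this
 * sketch assumes an arithmetic shift.
 */
static int16_t qdmulh16_literal(int16_t n, int16_t m, bool *sat)
{
    int64_t prod = (int64_t)n * m;
    return (int16_t)sat_range((prod * 2) >> 16, INT16_MIN, INT16_MAX, sat);
}

/* Simplified reading: shift the undoubled product by esize - 1 (15). */
static int16_t qdmulh16_simplified(int16_t n, int16_t m, bool *sat)
{
    int64_t prod = (int64_t)n * m;
    return (int16_t)sat_range(prod >> 15, INT16_MIN, INT16_MAX, sat);
}

/* Rounding form, literal: add 1 << (esize - 1) after doubling. */
static int16_t qrdmulh16_literal(int16_t n, int16_t m, bool *sat)
{
    int64_t prod = (int64_t)n * m;
    return (int16_t)sat_range((prod * 2 + (1 << 15)) >> 16,
                              INT16_MIN, INT16_MAX, sat);
}

/* Rounding form, simplified: the rounding constant halves to 1 << 14. */
static int16_t qrdmulh16_simplified(int16_t n, int16_t m, bool *sat)
{
    int64_t prod = (int64_t)n * m;
    return (int16_t)sat_range((prod + (1 << 14)) >> 15,
                              INT16_MIN, INT16_MAX, sat);
}
```

The two forms agree because 2 * prod + 2^15 equals 2 * (prod + 2^14), and shifting a doubled value right by 16 is the same as shifting the original right by 15. The only input pair that saturates is INT16_MIN * INT16_MIN.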
1
Implement the MVE VQADD and VQSUB insns, which perform saturating
1
Set the Float3NaNPropRule explicitly for Arm, and remove the
2
addition of a scalar to each element. Note that individual bytes of
2
ifdef from pickNaNMulAdd().
3
each result element are used or discarded according to the predicate
4
mask, but FPSCR.QC is only set if the predicate mask for the lowest
5
byte of the element is set.
6
3
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20210617121628.20116-28-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-23-peter.maydell@linaro.org
10
---
7
---
11
target/arm/helper-mve.h | 16 ++++++++++
8
target/mips/fpu_helper.h | 4 ++++
12
target/arm/mve.decode | 5 +++
9
target/mips/msa.c | 3 +++
13
target/arm/mve_helper.c | 62 ++++++++++++++++++++++++++++++++++++++
10
fpu/softfloat-specialize.c.inc | 8 +-------
14
target/arm/translate-mve.c | 4 +++
11
3 files changed, 8 insertions(+), 7 deletions(-)
15
4 files changed, 87 insertions(+)
16
12
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
13
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
18
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
15
--- a/target/mips/fpu_helper.h
20
+++ b/target/arm/helper-mve.h
16
+++ b/target/mips/fpu_helper.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vhsubu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
22
DEF_HELPER_FLAGS_4(mve_vhsubu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
{
23
DEF_HELPER_FLAGS_4(mve_vhsubu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
24
20
FloatInfZeroNaNRule izn_rule;
25
+DEF_HELPER_FLAGS_4(mve_vqadds_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
+ Float3NaNPropRule nan3_rule;
26
+DEF_HELPER_FLAGS_4(mve_vqadds_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
27
+DEF_HELPER_FLAGS_4(mve_vqadds_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
/*
24
* With nan2008, SNaNs are silenced in the usual way.
25
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
26
*/
27
izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
28
set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
29
+ nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
30
+ set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
28
+
31
+
29
+DEF_HELPER_FLAGS_4(mve_vqaddu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
}
30
+DEF_HELPER_FLAGS_4(mve_vqaddu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
31
+DEF_HELPER_FLAGS_4(mve_vqaddu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
static inline void restore_fp_status(CPUMIPSState *env)
35
diff --git a/target/mips/msa.c b/target/mips/msa.c
36
index XXXXXXX..XXXXXXX 100644
37
--- a/target/mips/msa.c
38
+++ b/target/mips/msa.c
39
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
40
set_float_2nan_prop_rule(float_2nan_prop_s_ab,
41
&env->active_tc.msa_fp_status);
42
43
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab,
44
+ &env->active_tc.msa_fp_status);
32
+
45
+
33
+DEF_HELPER_FLAGS_4(mve_vqsubs_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
46
/* clear float_status exception flags */
34
+DEF_HELPER_FLAGS_4(mve_vqsubs_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
47
set_float_exception_flags(0, &env->active_tc.msa_fp_status);
35
+DEF_HELPER_FLAGS_4(mve_vqsubs_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
48
36
+
49
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
37
+DEF_HELPER_FLAGS_4(mve_vqsubu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(mve_vqsubu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(mve_vqsubu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
+
41
DEF_HELPER_FLAGS_4(mve_vbrsrb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
42
DEF_HELPER_FLAGS_4(mve_vbrsrh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
43
DEF_HELPER_FLAGS_4(mve_vbrsrw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
44
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
45
index XXXXXXX..XXXXXXX 100644
50
index XXXXXXX..XXXXXXX 100644
46
--- a/target/arm/mve.decode
51
--- a/fpu/softfloat-specialize.c.inc
47
+++ b/target/arm/mve.decode
52
+++ b/fpu/softfloat-specialize.c.inc
48
@@ -XXX,XX +XXX,XX @@ VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
53
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
49
VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
50
VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
51
VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
52
+
53
+VQADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar
54
+VQADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar
55
+VQSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
56
+VQSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 110 .... @2scalar
57
VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
58
59
# Predicate operations
60
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/mve_helper.c
63
+++ b/target/arm/mve_helper.c
64
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vhaddu, do_vhadd_u)
65
DO_2OP_S(vhsubs, do_vhsub_s)
66
DO_2OP_U(vhsubu, do_vhsub_u)
67
68
+static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
69
+{
70
+ if (val > max) {
71
+ *s = true;
72
+ return max;
73
+ } else if (val < min) {
74
+ *s = true;
75
+ return min;
76
+ }
77
+ return val;
78
+}
79
+
80
+#define DO_SQADD_B(n, m, s) do_sat_bhw((int64_t)n + m, INT8_MIN, INT8_MAX, s)
81
+#define DO_SQADD_H(n, m, s) do_sat_bhw((int64_t)n + m, INT16_MIN, INT16_MAX, s)
82
+#define DO_SQADD_W(n, m, s) do_sat_bhw((int64_t)n + m, INT32_MIN, INT32_MAX, s)
83
+
84
+#define DO_UQADD_B(n, m, s) do_sat_bhw((int64_t)n + m, 0, UINT8_MAX, s)
85
+#define DO_UQADD_H(n, m, s) do_sat_bhw((int64_t)n + m, 0, UINT16_MAX, s)
86
+#define DO_UQADD_W(n, m, s) do_sat_bhw((int64_t)n + m, 0, UINT32_MAX, s)
87
+
88
+#define DO_SQSUB_B(n, m, s) do_sat_bhw((int64_t)n - m, INT8_MIN, INT8_MAX, s)
89
+#define DO_SQSUB_H(n, m, s) do_sat_bhw((int64_t)n - m, INT16_MIN, INT16_MAX, s)
90
+#define DO_SQSUB_W(n, m, s) do_sat_bhw((int64_t)n - m, INT32_MIN, INT32_MAX, s)
91
+
92
+#define DO_UQSUB_B(n, m, s) do_sat_bhw((int64_t)n - m, 0, UINT8_MAX, s)
93
+#define DO_UQSUB_H(n, m, s) do_sat_bhw((int64_t)n - m, 0, UINT16_MAX, s)
94
+#define DO_UQSUB_W(n, m, s) do_sat_bhw((int64_t)n - m, 0, UINT32_MAX, s)
95
96
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
97
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
98
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vhsubu, do_vhsub_u)
99
mve_advance_vpt(env); \
100
}
54
}
101
55
102
+#define DO_2OP_SAT_SCALAR(OP, ESIZE, TYPE, FN) \
56
if (rule == float_3nan_prop_none) {
103
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
57
-#if defined(TARGET_MIPS)
104
+ uint32_t rm) \
58
- if (snan_bit_is_one(status)) {
105
+ { \
59
- rule = float_3nan_prop_s_abc;
106
+ TYPE *d = vd, *n = vn; \
60
- } else {
107
+ TYPE m = rm; \
61
- rule = float_3nan_prop_s_cab;
108
+ uint16_t mask = mve_element_mask(env); \
62
- }
109
+ unsigned e; \
63
-#elif defined(TARGET_XTENSA)
110
+ bool qc = false; \
64
+#if defined(TARGET_XTENSA)
111
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
65
if (status->use_first_nan) {
112
+ bool sat = false; \
66
rule = float_3nan_prop_abc;
113
+ mergemask(&d[H##ESIZE(e)], FN(n[H##ESIZE(e)], m, &sat), \
67
} else {
114
+ mask); \
115
+ qc |= sat & mask & 1; \
116
+ } \
117
+ if (qc) { \
118
+ env->vfp.qc[0] = qc; \
119
+ } \
120
+ mve_advance_vpt(env); \
121
+ }
122
+
123
/* provide unsigned 2-op scalar helpers for all sizes */
124
#define DO_2OP_SCALAR_U(OP, FN) \
125
DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \
126
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR_U(vhaddu_scalar, do_vhadd_u)
127
DO_2OP_SCALAR_S(vhsubs_scalar, do_vhsub_s)
128
DO_2OP_SCALAR_U(vhsubu_scalar, do_vhsub_u)
129
130
+DO_2OP_SAT_SCALAR(vqaddu_scalarb, 1, uint8_t, DO_UQADD_B)
131
+DO_2OP_SAT_SCALAR(vqaddu_scalarh, 2, uint16_t, DO_UQADD_H)
132
+DO_2OP_SAT_SCALAR(vqaddu_scalarw, 4, uint32_t, DO_UQADD_W)
133
+DO_2OP_SAT_SCALAR(vqadds_scalarb, 1, int8_t, DO_SQADD_B)
134
+DO_2OP_SAT_SCALAR(vqadds_scalarh, 2, int16_t, DO_SQADD_H)
135
+DO_2OP_SAT_SCALAR(vqadds_scalarw, 4, int32_t, DO_SQADD_W)
136
+
137
+DO_2OP_SAT_SCALAR(vqsubu_scalarb, 1, uint8_t, DO_UQSUB_B)
138
+DO_2OP_SAT_SCALAR(vqsubu_scalarh, 2, uint16_t, DO_UQSUB_H)
139
+DO_2OP_SAT_SCALAR(vqsubu_scalarw, 4, uint32_t, DO_UQSUB_W)
140
+DO_2OP_SAT_SCALAR(vqsubs_scalarb, 1, int8_t, DO_SQSUB_B)
141
+DO_2OP_SAT_SCALAR(vqsubs_scalarh, 2, int16_t, DO_SQSUB_H)
142
+DO_2OP_SAT_SCALAR(vqsubs_scalarw, 4, int32_t, DO_SQSUB_W)
143
+
144
static inline uint32_t do_vbrsrb(uint32_t n, uint32_t m)
145
{
146
m &= 0xff;
147
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
148
index XXXXXXX..XXXXXXX 100644
149
--- a/target/arm/translate-mve.c
150
+++ b/target/arm/translate-mve.c
151
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VHADD_S_scalar, vhadds_scalar)
152
DO_2OP_SCALAR(VHADD_U_scalar, vhaddu_scalar)
153
DO_2OP_SCALAR(VHSUB_S_scalar, vhsubs_scalar)
154
DO_2OP_SCALAR(VHSUB_U_scalar, vhsubu_scalar)
155
+DO_2OP_SCALAR(VQADD_S_scalar, vqadds_scalar)
156
+DO_2OP_SCALAR(VQADD_U_scalar, vqaddu_scalar)
157
+DO_2OP_SCALAR(VQSUB_S_scalar, vqsubs_scalar)
158
+DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar)
159
DO_2OP_SCALAR(VBRSR, vbrsr)
160
161
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
162
--
68
--
163
2.20.1
69
2.34.1
164
165
diff view generated by jsdifflib
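The VQADD/VQSUB commit message above notes that result bytes are written or discarded per predicate bit, while FPSCR.QC is only set when the predicate bit for the lowest byte of a saturating element is set. The sketch below models that rule for signed 16-bit saturating add of a scalar. It is a simplified stand-alone illustration: vqadd_s16_model is a hypothetical name, the eight-element array stands in for a 128-bit vector, and the per-byte merge is written out by hand instead of using QEMU's mergemask().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a predicated saturating add of scalar m to each
 * of eight 16-bit elements. mask carries one predicate bit per byte,
 * lowest byte first. Returns true if the sticky QC flag should be set.
 */
static bool vqadd_s16_model(int16_t d[8], const int16_t n[8], int16_t m,
                            uint16_t mask)
{
    bool qc = false;

    for (unsigned e = 0; e < 8; e++, mask >>= 2) {
        bool sat = false;
        int32_t sum = (int32_t)n[e] + m;

        if (sum > INT16_MAX) {
            sum = INT16_MAX;
            sat = true;
        } else if (sum < INT16_MIN) {
            sum = INT16_MIN;
            sat = true;
        }

        /* Merge only the bytes whose predicate bits are set. */
        uint16_t bytemask = ((mask & 1) ? 0x00ff : 0) |
                            ((mask & 2) ? 0xff00 : 0);
        uint16_t old = (uint16_t)d[e];
        uint16_t res = (uint16_t)sum;
        d[e] = (int16_t)((old & ~bytemask) | (res & bytemask));

        /* QC follows the predicate bit of the element's lowest byte. */
        qc |= sat && (mask & 1);
    }
    return qc;
}
```

With mask 0xfffe the high byte of element 0 is still written, but a saturation in element 0 does not set QC, because the bit for its lowest byte is clear.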
1
Implement the MVE VCLS insn.
1
Set the Float3NaNPropRule explicitly for xtensa, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-5-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-24-peter.maydell@linaro.org
6
---
7
---
7
target/arm/helper-mve.h | 4 ++++
8
target/xtensa/fpu_helper.c | 2 ++
8
target/arm/mve.decode | 1 +
9
fpu/softfloat-specialize.c.inc | 8 --------
9
target/arm/mve_helper.c | 7 +++++++
10
2 files changed, 2 insertions(+), 8 deletions(-)
10
target/arm/translate-mve.c | 1 +
11
4 files changed, 13 insertions(+)
12
11
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
14
--- a/target/xtensa/fpu_helper.c
16
+++ b/target/arm/helper-mve.h
15
+++ b/target/xtensa/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
18
DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32)
17
set_use_first_nan(use_first, &env->fp_status);
19
DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32)
18
set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
20
19
&env->fp_status);
21
+DEF_HELPER_FLAGS_3(mve_vclsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
20
+ set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
22
+DEF_HELPER_FLAGS_3(mve_vclsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
+ &env->fp_status);
23
+DEF_HELPER_FLAGS_3(mve_vclsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
}
24
+
23
25
DEF_HELPER_FLAGS_3(mve_vclzb, TCG_CALL_NO_WG, void, env, ptr, ptr)
24
void HELPER(wur_fpu2k_fcr)(CPUXtensaState *env, uint32_t v)
26
DEF_HELPER_FLAGS_3(mve_vclzh, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
27
DEF_HELPER_FLAGS_3(mve_vclzw, TCG_CALL_NO_WG, void, env, ptr, ptr)
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
29
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
27
--- a/fpu/softfloat-specialize.c.inc
31
+++ b/target/arm/mve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
32
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
33
34
# Vector miscellaneous
35
36
+VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
37
VCLZ 1111 1111 1 . 11 .. 00 ... 0 0100 11 . 0 ... 0 @1op
38
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/mve_helper.c
41
+++ b/target/arm/mve_helper.c
42
@@ -XXX,XX +XXX,XX @@ static void mergemask_sq(int64_t *d, int64_t r, uint16_t mask)
43
mve_advance_vpt(env); \
44
}
30
}
45
31
46
+#define DO_CLS_B(N) (clrsb32(N) - 24)
32
if (rule == float_3nan_prop_none) {
47
+#define DO_CLS_H(N) (clrsb32(N) - 16)
33
-#if defined(TARGET_XTENSA)
48
+
34
- if (status->use_first_nan) {
49
+DO_1OP(vclsb, 1, int8_t, DO_CLS_B)
35
- rule = float_3nan_prop_abc;
50
+DO_1OP(vclsh, 2, int16_t, DO_CLS_H)
36
- } else {
51
+DO_1OP(vclsw, 4, int32_t, clrsb32)
37
- rule = float_3nan_prop_cba;
52
+
38
- }
53
#define DO_CLZ_B(N) (clz32(N) - 24)
39
-#else
54
#define DO_CLZ_H(N) (clz32(N) - 16)
40
rule = float_3nan_prop_abc;
55
41
-#endif
56
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/translate-mve.c
59
+++ b/target/arm/translate-mve.c
60
@@ -XXX,XX +XXX,XX @@ static bool do_1op(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn)
61
}
42
}
62
43
63
DO_1OP(VCLZ, vclz)
44
assert(rule != float_3nan_prop_none);
64
+DO_1OP(VCLS, vcls)
65
--
45
--
66
2.20.1
46
2.34.1
67
68
diff view generated by jsdifflib
1
Implement the MVE VCADD insn, which performs a complex add with
1
Set the Float3NaNPropRule explicitly for i386. We had no
2
rotate. Note that the size=0b11 encoding is VSBC.
2
i386-specific behaviour in the old ifdef ladder, so we were using the
3
3
default "prefer a then b then c" fallback; this is actually the
4
The architecture grants some leeway for the "destination and Vm
4
correct per-the-spec handling for i386.
5
source overlap" case for the size MO_32 case, but we choose not to
6
make use of it, instead always calculating all 16 bytes worth of
7
results before setting the destination register.
8
5
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20210617121628.20116-42-peter.maydell@linaro.org
8
Message-id: 20241202131347.498124-25-peter.maydell@linaro.org
12
---
9
---
13
target/arm/helper-mve.h | 8 ++++++++
10
target/i386/tcg/fpu_helper.c | 1 +
14
target/arm/mve.decode | 9 +++++++--
11
1 file changed, 1 insertion(+)
15
target/arm/mve_helper.c | 29 +++++++++++++++++++++++++++++
16
target/arm/translate-mve.c | 7 +++++++
17
4 files changed, 51 insertions(+), 2 deletions(-)
18
12
19
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
13
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
20
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper-mve.h
15
--- a/target/i386/tcg/fpu_helper.c
22
+++ b/target/arm/helper-mve.h
16
+++ b/target/i386/tcg/fpu_helper.c
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vadci, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
24
DEF_HELPER_FLAGS_4(mve_vsbc, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
* there are multiple input NaNs they are selected in the order a, b, c.
25
DEF_HELPER_FLAGS_4(mve_vsbci, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
*/
26
20
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
27
+DEF_HELPER_FLAGS_4(mve_vcadd90b, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
28
+DEF_HELPER_FLAGS_4(mve_vcadd90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+DEF_HELPER_FLAGS_4(mve_vcadd90w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
+
31
+DEF_HELPER_FLAGS_4(mve_vcadd270b, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
+DEF_HELPER_FLAGS_4(mve_vcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
+DEF_HELPER_FLAGS_4(mve_vcadd270w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+
35
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
38
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/mve.decode
41
+++ b/target/arm/mve.decode
42
@@ -XXX,XX +XXX,XX @@ VRHADD_S 111 0 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
43
VRHADD_U 111 1 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
44
45
VADC 1110 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
46
-VSBC 1111 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
47
VADCI 1110 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
48
-VSBCI 1111 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
49
+
50
+{
51
+ VSBC 1111 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
52
+ VSBCI 1111 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
53
+ VCADD90 1111 1110 0 . .. ... 0 ... 0 1111 . 0 . 0 ... 0 @2op
54
+ VCADD270 1111 1110 0 . .. ... 0 ... 1 1111 . 0 . 0 ... 0 @2op
55
+}
56
57
# Vector miscellaneous
58
59
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
60
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/mve_helper.c
62
+++ b/target/arm/mve_helper.c
63
@@ -XXX,XX +XXX,XX @@ void HELPER(mve_vsbci)(CPUARMState *env, void *vd, void *vn, void *vm)
64
do_vadc(env, vd, vn, vm, -1, 1, true);
65
}
22
}
66
23
67
+#define DO_VCADD(OP, ESIZE, TYPE, FN0, FN1) \
24
static inline uint8_t save_exception_flags(CPUX86State *env)
68
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, void *vm) \
69
+ { \
70
+ TYPE *d = vd, *n = vn, *m = vm; \
71
+ uint16_t mask = mve_element_mask(env); \
72
+ unsigned e; \
73
+ TYPE r[16 / ESIZE]; \
74
+ /* Calculate all results first to avoid overwriting inputs */ \
75
+ for (e = 0; e < 16 / ESIZE; e++) { \
76
+ if (!(e & 1)) { \
77
+ r[e] = FN0(n[H##ESIZE(e)], m[H##ESIZE(e + 1)]); \
78
+ } else { \
79
+ r[e] = FN1(n[H##ESIZE(e)], m[H##ESIZE(e - 1)]); \
80
+ } \
81
+ } \
82
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
83
+ mergemask(&d[H##ESIZE(e)], r[e], mask); \
84
+ } \
85
+ mve_advance_vpt(env); \
86
+ }
87
+
88
+#define DO_VCADD_ALL(OP, FN0, FN1) \
89
+ DO_VCADD(OP##b, 1, int8_t, FN0, FN1) \
90
+ DO_VCADD(OP##h, 2, int16_t, FN0, FN1) \
91
+ DO_VCADD(OP##w, 4, int32_t, FN0, FN1)
92
+
93
+DO_VCADD_ALL(vcadd90, DO_SUB, DO_ADD)
94
+DO_VCADD_ALL(vcadd270, DO_ADD, DO_SUB)
95
+
96
static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
97
{
98
if (val > max) {
99
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
100
index XXXXXXX..XXXXXXX 100644
101
--- a/target/arm/translate-mve.c
102
+++ b/target/arm/translate-mve.c
103
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQRDMLSDH, vqrdmlsdh)
104
DO_2OP(VQRDMLSDHX, vqrdmlsdhx)
105
DO_2OP(VRHADD_S, vrhadds)
106
DO_2OP(VRHADD_U, vrhaddu)
107
+/*
108
+ * VCADD Qd == Qm at size MO_32 is UNPREDICTABLE; we choose not to diagnose
109
+ * so we can reuse the DO_2OP macro. (Our implementation calculates the
110
+ * "expected" results in this case.)
111
+ */
112
+DO_2OP(VCADD90, vcadd90)
113
+DO_2OP(VCADD270, vcadd270)
114
115
static bool trans_VQDMULLB(DisasContext *s, arg_2op *a)
116
{
117
--
25
--
118
2.20.1
26
2.34.1
119
120
diff view generated by jsdifflib
1
Implement the scalar forms of the MVE VSUB and VMUL insns.
1
Set the Float3NaNPropRule explicitly for HPPA, and remove the
2
ifdef from pickNaNMulAdd().
3
4
HPPA is the only target that was using the default branch of the
5
ifdef ladder (other targets either do not use muladd or set
6
default_nan_mode), so we can remove the ifdef fallback entirely now
7
(allowing the "rule not set" case to fall into the default of the
8
switch statement and assert).
9
10
We add a TODO note that the HPPA rule is probably wrong; this is
11
not a behavioural change for this refactoring.
2
12
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-24-peter.maydell@linaro.org
15
Message-id: 20241202131347.498124-26-peter.maydell@linaro.org
6
---
16
---
7
target/arm/helper-mve.h | 8 ++++++++
17
target/hppa/fpu_helper.c | 8 ++++++++
8
target/arm/mve.decode | 2 ++
18
fpu/softfloat-specialize.c.inc | 4 ----
9
target/arm/mve_helper.c | 2 ++
19
2 files changed, 8 insertions(+), 4 deletions(-)
10
target/arm/translate-mve.c | 2 ++
11
4 files changed, 14 insertions(+)
12
20
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
21
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
23
--- a/target/hppa/fpu_helper.c
16
+++ b/target/arm/helper-mve.h
24
+++ b/target/hppa/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
18
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
* HPPA does not implement a CPU reset method at all...
19
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
*/
20
28
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
21
+DEF_HELPER_FLAGS_4(mve_vsub_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+ /*
22
+DEF_HELPER_FLAGS_4(mve_vsub_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
+ * TODO: The HPPA architecture reference only documents its NaN
23
+DEF_HELPER_FLAGS_4(mve_vsub_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
+ * propagation rule for 2-operand operations. Testing on real hardware
24
+
32
+ * might be necessary to confirm whether this order for muladd is correct.
25
+DEF_HELPER_FLAGS_4(mve_vmul_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
+ * Not preferring the SNaN is almost certainly incorrect as it diverges
26
+DEF_HELPER_FLAGS_4(mve_vmul_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
+ * from the documented rules for 2-operand operations.
27
+DEF_HELPER_FLAGS_4(mve_vmul_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
35
+ */
28
+
36
+ set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
29
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
37
/* For inf * 0 + NaN, return the input NaN */
30
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
38
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
31
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
39
}
32
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
40
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
33
index XXXXXXX..XXXXXXX 100644
41
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/mve.decode
42
--- a/fpu/softfloat-specialize.c.inc
35
+++ b/target/arm/mve.decode
43
+++ b/fpu/softfloat-specialize.c.inc
36
@@ -XXX,XX +XXX,XX @@ VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_no
44
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
37
# Scalar operations
45
}
38
39
VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
40
+VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar
41
+VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
42
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/mve_helper.c
45
+++ b/target/arm/mve_helper.c
46
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vhsubu, do_vhsub_u)
47
DO_2OP_SCALAR(OP##w, 4, uint32_t, FN)
48
49
DO_2OP_SCALAR_U(vadd_scalar, DO_ADD)
50
+DO_2OP_SCALAR_U(vsub_scalar, DO_SUB)
51
+DO_2OP_SCALAR_U(vmul_scalar, DO_MUL)
52
53
/*
54
* Multiply add long dual accumulate ops.
55
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/translate-mve.c
58
+++ b/target/arm/translate-mve.c
59
@@ -XXX,XX +XXX,XX @@ static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
60
}
46
}
61
47
62
DO_2OP_SCALAR(VADD_scalar, vadd_scalar)
48
- if (rule == float_3nan_prop_none) {
63
+DO_2OP_SCALAR(VSUB_scalar, vsub_scalar)
49
- rule = float_3nan_prop_abc;
64
+DO_2OP_SCALAR(VMUL_scalar, vmul_scalar)
50
- }
65
51
-
66
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
52
assert(rule != float_3nan_prop_none);
67
MVEGenDualAccOpFn *fn)
53
if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
54
/* We have at least one SNaN input and should prefer it */
68
--
55
--
69
2.20.1
56
2.34.1
70
71
diff view generated by jsdifflib
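The softfloat patches above replace a compile-time ifdef ladder with a rule that each target stores in its float_status at reset, with pickNaNMulAdd() asserting that a rule was set. The following is a much-reduced sketch of that pattern, not QEMU's implementation: Nan3Rule, FpStatus and pick_nan3 are hypothetical names, only three orderings are modelled, and the SNaN-preference variants (the "_s_" rules) and the inf * 0 + NaN handling are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for Float3NaNPropRule. */
typedef enum {
    RULE_NONE = 0,   /* target forgot to configure a rule */
    RULE_ABC,        /* prefer a, then b, then c */
    RULE_CAB,        /* prefer c, then a, then b */
    RULE_CBA,        /* prefer c, then b, then a */
} Nan3Rule;

/* Hypothetical stand-in for the relevant part of float_status. */
typedef struct {
    Nan3Rule nan3_rule;
} FpStatus;

/*
 * Return the index (0 = a, 1 = b, 2 = c) of the NaN operand to
 * propagate, or -1 if no operand is a NaN. The rule is read from the
 * status at runtime; an unset rule is a programming error.
 */
static int pick_nan3(const bool is_nan[3], const FpStatus *st)
{
    static const int order[][3] = {
        [RULE_ABC] = { 0, 1, 2 },
        [RULE_CAB] = { 2, 0, 1 },
        [RULE_CBA] = { 2, 1, 0 },
    };

    assert(st->nan3_rule != RULE_NONE);
    for (int i = 0; i < 3; i++) {
        int which = order[st->nan3_rule][i];
        if (is_nan[which]) {
            return which;
        }
    }
    return -1;
}
```

A target's reset code would store its rule once, and common code would then consult the status on every call, so a single binary can serve targets with different rules or a target whose rule changes with a control register.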
1
Implement the MVE VBRSR insn, which reverses a specified
1
The use_first_nan field in float_status was an xtensa-specific way to
2
number of bits in each element, setting the rest to zero.
2
select at runtime from two different NaN propagation rules. Now that
3
xtensa is using the target-agnostic NaN propagation rule selection
4
that we've just added, we can remove use_first_nan, because there is
5
no longer any code that reads it.
3
6
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-26-peter.maydell@linaro.org
9
Message-id: 20241202131347.498124-27-peter.maydell@linaro.org
7
---
10
---
8
target/arm/helper-mve.h | 4 ++++
11
include/fpu/softfloat-helpers.h | 5 -----
9
target/arm/mve.decode | 1 +
12
include/fpu/softfloat-types.h | 1 -
10
target/arm/mve_helper.c | 43 ++++++++++++++++++++++++++++++++++++++
13
target/xtensa/fpu_helper.c | 1 -
11
target/arm/translate-mve.c | 1 +
14
3 files changed, 7 deletions(-)
12
4 files changed, 49 insertions(+)
13
15
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
16
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
18
--- a/include/fpu/softfloat-helpers.h
17
+++ b/target/arm/helper-mve.h
19
+++ b/include/fpu/softfloat-helpers.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vhsubu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@ static inline void set_snan_bit_is_one(bool val, float_status *status)
19
DEF_HELPER_FLAGS_4(mve_vhsubu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
status->snan_bit_is_one = val;
20
DEF_HELPER_FLAGS_4(mve_vhsubu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
}
21
23
22
+DEF_HELPER_FLAGS_4(mve_vbrsrb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
-static inline void set_use_first_nan(bool val, float_status *status)
23
+DEF_HELPER_FLAGS_4(mve_vbrsrh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
-{
24
+DEF_HELPER_FLAGS_4(mve_vbrsrw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
- status->use_first_nan = val;
25
+
27
-}
26
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
28
-
27
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
29
static inline void set_no_signaling_nans(bool val, float_status *status)
28
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
30
{
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
31
status->no_signaling_nans = val;
32
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
30
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/mve.decode
34
--- a/include/fpu/softfloat-types.h
32
+++ b/target/arm/mve.decode
35
+++ b/include/fpu/softfloat-types.h
33
@@ -XXX,XX +XXX,XX @@ VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
36
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
34
VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
37
* softfloat-specialize.inc.c)
35
VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
38
*/
36
VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
39
bool snan_bit_is_one;
37
+VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
40
- bool use_first_nan;
38
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
41
bool no_signaling_nans;
42
/* should overflowed results subtract re_bias to its exponent? */
43
bool rebias_overflow;
44
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
39
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/mve_helper.c
46
--- a/target/xtensa/fpu_helper.c
41
+++ b/target/arm/mve_helper.c
47
+++ b/target/xtensa/fpu_helper.c
42
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR_U(vhaddu_scalar, do_vhadd_u)
48
@@ -XXX,XX +XXX,XX @@ static const struct {
43
DO_2OP_SCALAR_S(vhsubs_scalar, do_vhsub_s)
49
44
DO_2OP_SCALAR_U(vhsubu_scalar, do_vhsub_u)
50
void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
45
51
{
46
+static inline uint32_t do_vbrsrb(uint32_t n, uint32_t m)
52
- set_use_first_nan(use_first, &env->fp_status);
47
+{
53
set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
48
+ m &= 0xff;
54
&env->fp_status);
49
+ if (m == 0) {
55
set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
50
+ return 0;
51
+ }
52
+ n = revbit8(n);
53
+ if (m < 8) {
54
+ n >>= 8 - m;
55
+ }
56
+ return n;
57
+}
58
+
59
+static inline uint32_t do_vbrsrh(uint32_t n, uint32_t m)
60
+{
61
+ m &= 0xff;
62
+ if (m == 0) {
63
+ return 0;
64
+ }
65
+ n = revbit16(n);
66
+ if (m < 16) {
67
+ n >>= 16 - m;
68
+ }
69
+ return n;
70
+}
71
+
72
+static inline uint32_t do_vbrsrw(uint32_t n, uint32_t m)
73
+{
74
+ m &= 0xff;
75
+ if (m == 0) {
76
+ return 0;
77
+ }
78
+ n = revbit32(n);
79
+ if (m < 32) {
80
+ n >>= 32 - m;
81
+ }
82
+ return n;
83
+}
84
+
85
+DO_2OP_SCALAR(vbrsrb, 1, uint8_t, do_vbrsrb)
86
+DO_2OP_SCALAR(vbrsrh, 2, uint16_t, do_vbrsrh)
87
+DO_2OP_SCALAR(vbrsrw, 4, uint32_t, do_vbrsrw)
88
+
89
/*
90
* Multiply add long dual accumulate ops.
91
*/
92
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
93
index XXXXXXX..XXXXXXX 100644
94
--- a/target/arm/translate-mve.c
95
+++ b/target/arm/translate-mve.c
96
@@ -XXX,XX +XXX,XX @@ DO_2OP_SCALAR(VHADD_S_scalar, vhadds_scalar)
97
DO_2OP_SCALAR(VHADD_U_scalar, vhaddu_scalar)
98
DO_2OP_SCALAR(VHSUB_S_scalar, vhsubs_scalar)
99
DO_2OP_SCALAR(VHSUB_U_scalar, vhsubu_scalar)
100
+DO_2OP_SCALAR(VBRSR, vbrsr)
101
102
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
103
MVEGenDualAccOpFn *fn)
104
--
56
--
105
2.20.1
57
2.34.1
106
107
1
Implement the scalar form of the MVE VADD insn. This takes the
1
Currently m68k_cpu_reset_hold() calls floatx80_default_nan(NULL)
2
scalar operand from a general purpose register.
2
to get the NaN bit pattern to reset the FPU registers. This
3
works because it happens that our implementation of
4
floatx80_default_nan() doesn't actually look at the float_status
5
pointer except for TARGET_MIPS. However, this isn't guaranteed,
6
and to be able to remove the ifdef in floatx80_default_nan()
7
we're going to need a real float_status here.
8
9
Rearrange m68k_cpu_reset_hold() so that we initialize env->fp_status
10
earlier, and thus can pass it to floatx80_default_nan().
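As an illustration of the scalar form of VADD described above, here is a
minimal sketch in plain C. It is not the QEMU helper: the function name
vadd_scalar_u8 and the simplified one-bit-per-byte-lane mask argument are
assumptions made for this example, whereas the real code takes the mask from
the MVE predication state.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the QEMU helper: add a scalar taken from a
 * general purpose register to each of the 16 byte lanes of a vector.
 * Bit e of 'mask' says whether lane e is written (an assumption of
 * this sketch; QEMU computes the mask from the predication state).
 */
static void vadd_scalar_u8(uint8_t d[16], const uint8_t n[16],
                           uint32_t rm, uint16_t mask)
{
    uint8_t m = (uint8_t)rm;    /* scalar is truncated to the element size */

    assert(d && n);
    for (int e = 0; e < 16; e++, mask >>= 1) {
        if (mask & 1) {
            d[e] = (uint8_t)(n[e] + m);  /* wraps modulo 256 */
        }
    }
}
```

Lanes whose mask bit is clear keep their previous destination value, which is
the behaviour mergemask() provides in the real helper.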
3
11
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-23-peter.maydell@linaro.org
14
Message-id: 20241202131347.498124-28-peter.maydell@linaro.org
7
---
15
---
8
target/arm/helper-mve.h | 4 ++++
16
target/m68k/cpu.c | 12 +++++++-----
9
target/arm/mve.decode | 7 ++++++
17
1 file changed, 7 insertions(+), 5 deletions(-)
10
target/arm/mve_helper.c | 22 +++++++++++++++++++
11
target/arm/translate-mve.c | 45 ++++++++++++++++++++++++++++++++++++++
12
4 files changed, 78 insertions(+)
13
18
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
19
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
15
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
21
--- a/target/m68k/cpu.c
17
+++ b/target/arm/helper-mve.h
22
+++ b/target/m68k/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
CPUState *cs = CPU(obj);
20
DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
M68kCPUClass *mcc = M68K_CPU_GET_CLASS(obj);
21
26
CPUM68KState *env = cpu_env(cs);
22
+DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
- floatx80 nan = floatx80_default_nan(NULL);
23
+DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+ floatx80 nan;
24
+DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
int i;
30
31
if (mcc->parent_phases.hold) {
32
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
33
#else
34
cpu_m68k_set_sr(env, SR_S | SR_I);
35
#endif
36
- for (i = 0; i < 8; i++) {
37
- env->fregs[i].d = nan;
38
- }
39
- cpu_m68k_set_fpcr(env, 0);
40
/*
41
* M68000 FAMILY PROGRAMMER'S REFERENCE MANUAL
42
* 3.4 FLOATING-POINT INSTRUCTION DETAILS
43
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
44
* preceding paragraph for nonsignaling NaNs.
45
*/
46
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
25
+
47
+
26
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
48
+ nan = floatx80_default_nan(&env->fp_status);
27
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
49
+ for (i = 0; i < 8; i++) {
28
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
50
+ env->fregs[i].d = nan;
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/mve.decode
32
+++ b/target/arm/mve.decode
33
@@ -XXX,XX +XXX,XX @@
34
&vldr_vstr rn qd imm p a w size l u
35
&1op qd qm size
36
&2op qd qm qn size
37
+&2scalar qd qn rm size
38
39
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
40
# Note that both Rn and Qd are 3 bits only (no D bit)
41
@@ -XXX,XX +XXX,XX @@
42
@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
43
@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
44
45
+@2scalar .... .... .. size:2 .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
46
+
47
# Vector loads and stores
48
49
# Widening loads and narrowing stores:
50
@@ -XXX,XX +XXX,XX @@ VRMLALDAVH_S 1110 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_no
51
VRMLALDAVH_U 1111 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz
52
53
VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_nosz
54
+
55
+# Scalar operations
56
+
57
+VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
58
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/mve_helper.c
61
+++ b/target/arm/mve_helper.c
62
@@ -XXX,XX +XXX,XX @@ DO_2OP_S(vhsubs, do_vhsub_s)
63
DO_2OP_U(vhsubu, do_vhsub_u)
64
65
66
+#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
67
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
68
+ uint32_t rm) \
69
+ { \
70
+ TYPE *d = vd, *n = vn; \
71
+ TYPE m = rm; \
72
+ uint16_t mask = mve_element_mask(env); \
73
+ unsigned e; \
74
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
75
+ mergemask(&d[H##ESIZE(e)], FN(n[H##ESIZE(e)], m), mask); \
76
+ } \
77
+ mve_advance_vpt(env); \
78
+ }
51
+ }
79
+
52
+ cpu_m68k_set_fpcr(env, 0);
80
+/* provide unsigned 2-op scalar helpers for all sizes */
53
env->fpsr = 0;
81
+#define DO_2OP_SCALAR_U(OP, FN) \
54
82
+ DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \
55
/* TODO: We should set PC from the interrupt vector. */
83
+ DO_2OP_SCALAR(OP##h, 2, uint16_t, FN) \
84
+ DO_2OP_SCALAR(OP##w, 4, uint32_t, FN)
85
+
86
+DO_2OP_SCALAR_U(vadd_scalar, DO_ADD)
87
+
88
/*
89
* Multiply add long dual accumulate ops.
90
*/
91
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
92
index XXXXXXX..XXXXXXX 100644
93
--- a/target/arm/translate-mve.c
94
+++ b/target/arm/translate-mve.c
95
@@ -XXX,XX +XXX,XX @@
96
typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
97
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
98
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
99
+typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
100
typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
101
102
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
103
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMULL_BU, vmullbu)
104
DO_2OP(VMULL_TS, vmullts)
105
DO_2OP(VMULL_TU, vmulltu)
106
107
+static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
108
+ MVEGenTwoOpScalarFn fn)
109
+{
110
+ TCGv_ptr qd, qn;
111
+ TCGv_i32 rm;
112
+
113
+ if (!dc_isar_feature(aa32_mve, s) ||
114
+ !mve_check_qreg_bank(s, a->qd | a->qn) ||
115
+ !fn) {
116
+ return false;
117
+ }
118
+ if (a->rm == 13 || a->rm == 15) {
119
+ /* UNPREDICTABLE */
120
+ return false;
121
+ }
122
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
123
+ return true;
124
+ }
125
+
126
+ qd = mve_qreg_ptr(a->qd);
127
+ qn = mve_qreg_ptr(a->qn);
128
+ rm = load_reg(s, a->rm);
129
+ fn(cpu_env, qd, qn, rm);
130
+ tcg_temp_free_i32(rm);
131
+ tcg_temp_free_ptr(qd);
132
+ tcg_temp_free_ptr(qn);
133
+ mve_update_eci(s);
134
+ return true;
135
+}
136
+
137
+#define DO_2OP_SCALAR(INSN, FN) \
138
+ static bool trans_##INSN(DisasContext *s, arg_2scalar *a) \
139
+ { \
140
+ static MVEGenTwoOpScalarFn * const fns[] = { \
141
+ gen_helper_mve_##FN##b, \
142
+ gen_helper_mve_##FN##h, \
143
+ gen_helper_mve_##FN##w, \
144
+ NULL, \
145
+ }; \
146
+ return do_2op_scalar(s, a, fns[a->size]); \
147
+ }
148
+
149
+DO_2OP_SCALAR(VADD_scalar, vadd_scalar)
150
+
151
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
152
MVEGenDualAccOpFn *fn)
153
{
154
--
56
--
155
2.20.1
57
2.34.1
156
157
1
Implement the MVE VADC and VSBC insns. These perform an
1
We create our 128-bit default NaN by calling parts64_default_nan()
2
add-with-carry or subtract-with-carry of the 32-bit elements in each
2
and then adjusting the result. We can do the same trick for creating
3
lane of the input vectors, where the carry-out of each add is the
3
the floatx80 default NaN, which lets us drop a target ifdef.
4
carry-in of the next. The initial carry input is either 1 or is from
4
5
FPSCR.C; the carry out at the end is written back to FPSCR.C.
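The carry chain described for VADC and VSBC can be sketched in plain C as
follows. This is an unpredicated illustration only: the function name
add_with_carry_lanes is hypothetical, and the helper in the patch also
honours the MVE predicate mask and writes the final carry to FPSCR.C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the QEMU helper: add-with-carry across four
 * 32-bit lanes, where the carry-out of each lane is the carry-in of the
 * next. 'inv' is 0 for an add, or 0xffffffff for a subtract, which is
 * computed as n + ~m + carry. Returns the final carry-out.
 */
static uint32_t add_with_carry_lanes(uint32_t d[4], const uint32_t n[4],
                                     const uint32_t m[4], uint32_t inv,
                                     uint32_t carry)
{
    assert(carry <= 1);
    for (int e = 0; e < 4; e++) {
        uint64_t r = carry;
        r += n[e];
        r += m[e] ^ inv;
        d[e] = (uint32_t)r;            /* low 32 bits are the lane result */
        carry = (uint32_t)(r >> 32);   /* bit 32 is the carry-out */
    }
    return carry;
}
```

With inv = 0xffffffff and an initial carry of 1 the four lanes together
behave as a single 128-bit subtract, which is why VSBCI uses a fixed carry
input of 1.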
5
floatx80 is used only by:
6
i386
7
m68k
8
arm nwfpe old floating-point emulation support
9
(which is essentially dead, especially the parts involving floatx80)
10
PPC (only in the xsrqpxp instruction, which just rounds an input
11
value by converting to floatx80 and back, so will never generate
12
the default NaN)
13
14
The floatx80 default NaN as currently implemented is:
15
m68k: sign = 0, exp = 1...1, int = 1, frac = 1....1
16
i386: sign = 1, exp = 1...1, int = 1, frac = 10...0
17
18
These are the same as the parts64_default_nan for these architectures.
19
20
This is technically a possible behaviour change for arm linux-user
21
nwfpe emulation, because the default NaN will now have the
22
sign bit clear. But we were already generating a different floatx80
23
default NaN from the real kernel emulation we are supposedly
24
following, which appears to use an all-bits-1 value:
25
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L267
26
27
This won't affect the only "real" use of the nwfpe emulation, which
28
is ancient binaries that used it as part of the old floating point
29
calling convention; that only uses loads and stores of 32 and 64 bit
30
floats, not any of the floatx80 behaviour the original hardware had.
31
We also get the nwfpe float64 default NaN value wrong:
32
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L166
33
so if we ever cared about this obscure corner the right fix would be
34
to correct that so nwfpe used its own default-NaN setting rather
35
than the Arm VFP one.
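The extrapolation from the 64-bit default NaN parts to the floatx80 layout
can be sketched standalone as below. The type x80_sketch, the function
x80_default_nan_from_parts and the constant BINARY_POINT are hypothetical
stand-ins for this example. The sketch also assumes, following the
description above, that the parts fraction occupies the bits below bit 63
and that the explicit integer bit is always set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption of this sketch: the fraction sits in the bits below bit 63. */
#define BINARY_POINT 63

/* Hypothetical stand-in for a floatx80 value. */
typedef struct {
    uint64_t low;   /* explicit integer bit (bit 63) plus 63-bit fraction */
    uint16_t high;  /* sign (bit 15) plus 15-bit exponent */
} x80_sketch;

/* Build a default NaN from the sign and fraction of the 64-bit parts. */
static x80_sketch x80_default_nan_from_parts(bool sign, uint64_t frac)
{
    x80_sketch r;

    assert((frac >> BINARY_POINT) == 0);  /* fraction must not reach bit 63 */
    r.high = (uint16_t)(0x7FFF | ((unsigned)sign << 15));
    r.low = (UINT64_C(1) << BINARY_POINT) | frac;
    return r;
}
```

Feeding in sign = 0 with an all-ones fraction gives the m68k pattern
(high 0x7FFF, low 0xFFFFFFFFFFFFFFFF), and sign = 1 with only the top
fraction bit set gives the i386 pattern (high 0xFFFF, low
0xC000000000000000), matching the constants in the ifdef this patch removes.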
6
36
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
37
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
38
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20210617121628.20116-41-peter.maydell@linaro.org
39
Message-id: 20241202131347.498124-29-peter.maydell@linaro.org
10
---
40
---
11
target/arm/helper-mve.h | 5 ++++
41
fpu/softfloat-specialize.c.inc | 20 ++++++++++----------
12
target/arm/mve.decode | 5 ++++
42
1 file changed, 10 insertions(+), 10 deletions(-)
13
target/arm/mve_helper.c | 52 ++++++++++++++++++++++++++++++++++++++
14
target/arm/translate-mve.c | 37 +++++++++++++++++++++++++++
15
4 files changed, 99 insertions(+)
16
43
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
44
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
18
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
46
--- a/fpu/softfloat-specialize.c.inc
20
+++ b/target/arm/helper-mve.h
47
+++ b/fpu/softfloat-specialize.c.inc
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrhaddub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
48
@@ -XXX,XX +XXX,XX @@ static void parts128_silence_nan(FloatParts128 *p, float_status *status)
22
DEF_HELPER_FLAGS_4(mve_vrhadduh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
49
floatx80 floatx80_default_nan(float_status *status)
23
DEF_HELPER_FLAGS_4(mve_vrhadduw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
25
+DEF_HELPER_FLAGS_4(mve_vadc, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
+DEF_HELPER_FLAGS_4(mve_vadci, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
+DEF_HELPER_FLAGS_4(mve_vsbc, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+DEF_HELPER_FLAGS_4(mve_vsbci, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+
30
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
36
+++ b/target/arm/mve.decode
37
@@ -XXX,XX +XXX,XX @@ VQDMULLT 111 . 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 1 @2op_sz28
38
VRHADD_S 111 0 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
39
VRHADD_U 111 1 1111 0 . .. ... 0 ... 0 0001 . 1 . 0 ... 0 @2op
40
41
+VADC 1110 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
42
+VSBC 1111 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 0 @2op_nosz
43
+VADCI 1110 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
44
+VSBCI 1111 1110 0 . 11 ... 0 ... 1 1111 . 0 . 0 ... 0 @2op_nosz
45
+
46
# Vector miscellaneous
47
48
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
49
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
50
index XXXXXXX..XXXXXXX 100644
51
--- a/target/arm/mve_helper.c
52
+++ b/target/arm/mve_helper.c
53
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vrshlu, DO_VRSHLU)
54
DO_2OP_S(vrhadds, DO_RHADD_S)
55
DO_2OP_U(vrhaddu, DO_RHADD_U)
56
57
+static void do_vadc(CPUARMState *env, uint32_t *d, uint32_t *n, uint32_t *m,
58
+ uint32_t inv, uint32_t carry_in, bool update_flags)
59
+{
60
+ uint16_t mask = mve_element_mask(env);
61
+ unsigned e;
62
+
63
+ /* If any additions trigger, we will update flags. */
64
+ if (mask & 0x1111) {
65
+ update_flags = true;
66
+ }
67
+
68
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) {
69
+ uint64_t r = carry_in;
70
+ r += n[H4(e)];
71
+ r += m[H4(e)] ^ inv;
72
+ if (mask & 1) {
73
+ carry_in = r >> 32;
74
+ }
75
+ mergemask(&d[H4(e)], r, mask);
76
+ }
77
+
78
+ if (update_flags) {
79
+ /* Store C, clear NZV. */
80
+ env->vfp.xregs[ARM_VFP_FPSCR] &= ~FPCR_NZCV_MASK;
81
+ env->vfp.xregs[ARM_VFP_FPSCR] |= carry_in * FPCR_C;
82
+ }
83
+ mve_advance_vpt(env);
84
+}
85
+
86
+void HELPER(mve_vadc)(CPUARMState *env, void *vd, void *vn, void *vm)
87
+{
88
+ bool carry_in = env->vfp.xregs[ARM_VFP_FPSCR] & FPCR_C;
89
+ do_vadc(env, vd, vn, vm, 0, carry_in, false);
90
+}
91
+
92
+void HELPER(mve_vsbc)(CPUARMState *env, void *vd, void *vn, void *vm)
93
+{
94
+ bool carry_in = env->vfp.xregs[ARM_VFP_FPSCR] & FPCR_C;
95
+ do_vadc(env, vd, vn, vm, -1, carry_in, false);
96
+}
97
+
98
+
99
+void HELPER(mve_vadci)(CPUARMState *env, void *vd, void *vn, void *vm)
100
+{
101
+ do_vadc(env, vd, vn, vm, 0, 0, true);
102
+}
103
+
104
+void HELPER(mve_vsbci)(CPUARMState *env, void *vd, void *vn, void *vm)
105
+{
106
+ do_vadc(env, vd, vn, vm, -1, 1, true);
107
+}
108
+
109
static inline int32_t do_sat_bhw(int64_t val, int64_t min, int64_t max, bool *s)
110
{
50
{
111
if (val > max) {
51
floatx80 r;
112
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
52
+ /*
113
index XXXXXXX..XXXXXXX 100644
53
+ * Extrapolate from the choices made by parts64_default_nan to fill
114
--- a/target/arm/translate-mve.c
54
+ * in the floatx80 format. We assume that floatx80's explicit
115
+++ b/target/arm/translate-mve.c
55
+ * integer bit is always set (this is true for i386 and m68k,
116
@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMULLT(DisasContext *s, arg_2op *a)
56
+ * which are the only real users of this format).
117
return do_2op(s, a, fns[a->size]);
57
+ */
58
+ FloatParts64 p64;
59
+ parts64_default_nan(&p64, status);
60
61
- /* None of the targets that have snan_bit_is_one use floatx80. */
62
- assert(!snan_bit_is_one(status));
63
-#if defined(TARGET_M68K)
64
- r.low = UINT64_C(0xFFFFFFFFFFFFFFFF);
65
- r.high = 0x7FFF;
66
-#else
67
- /* X86 */
68
- r.low = UINT64_C(0xC000000000000000);
69
- r.high = 0xFFFF;
70
-#endif
71
+ r.high = 0x7FFF | (p64.sign << 15);
72
+ r.low = (1ULL << DECOMPOSED_BINARY_POINT) | p64.frac;
73
return r;
118
}
74
}
119
75
120
+/*
121
+ * VADC and VSBC: these perform an add-with-carry or subtract-with-carry
122
+ * of the 32-bit elements in each lane of the input vectors, where the
123
+ * carry-out of each add is the carry-in of the next. The initial carry
124
+ * input is either fixed (0 for VADCI, 1 for VSBCI) or is from FPSCR.C
125
+ * (for VADC and VSBC); the carry out at the end is written back to FPSCR.C.
126
+ * These insns are subject to beat-wise execution. Partial execution
127
+ * of an I=1 (initial carry input fixed) insn which does not
128
+ * execute the first beat must start with the current FPSCR.NZCV
129
+ * value, not the fixed constant input.
130
+ */
131
+static bool trans_VADC(DisasContext *s, arg_2op *a)
132
+{
133
+ return do_2op(s, a, gen_helper_mve_vadc);
134
+}
135
+
136
+static bool trans_VADCI(DisasContext *s, arg_2op *a)
137
+{
138
+ if (mve_skip_first_beat(s)) {
139
+ return trans_VADC(s, a);
140
+ }
141
+ return do_2op(s, a, gen_helper_mve_vadci);
142
+}
143
+
144
+static bool trans_VSBC(DisasContext *s, arg_2op *a)
145
+{
146
+ return do_2op(s, a, gen_helper_mve_vsbc);
147
+}
148
+
149
+static bool trans_VSBCI(DisasContext *s, arg_2op *a)
150
+{
151
+ if (mve_skip_first_beat(s)) {
152
+ return trans_VSBC(s, a);
153
+ }
154
+ return do_2op(s, a, gen_helper_mve_vsbci);
155
+}
156
+
157
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
158
MVEGenTwoOpScalarFn fn)
159
{
160
--
76
--
161
2.20.1
77
2.34.1
162
163
1
Implement the MVE VRMLALDAVH and VRMLSLDAVH insns, which accumulate
1
In target/loongarch's helper_fclass_s() and helper_fclass_d() we pass
2
the results of a rounded multiply of pairs of elements into a 72-bit
2
a zero-initialized float_status struct to float32_is_quiet_nan() and
3
accumulator, returning the top 64 bits in a pair of general purpose
3
float64_is_quiet_nan(), with the cryptic comment "for
4
registers.
4
snan_bit_is_one".
5
6
This pattern appears to have been copied from target/riscv, where it
7
is used because the functions there do not have ready access to the
8
CPU state struct. The comment presumably refers to the fact that the
9
main reason the is_quiet_nan() functions want the float_status is
10
because they want to know about the snan_bit_is_one config.
11
12
In the loongarch helpers, though, we have the CPU state struct
13
to hand. Use the usual env->fp_status here. This avoids our needing
14
to track that we need to update the initializer of the local
15
float_status structs when the core softfloat code adds new
16
options for targets to configure their behaviour.
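To show why the is_quiet_nan() functions need the status struct at all, here
is a minimal standalone sketch of a float32 quiet-NaN test that depends on
the snan_bit_is_one setting. The struct fp_config_sketch and the function
names are hypothetical and only model the single field relevant here; they
are not the softfloat API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in carrying only the setting relevant to this test. */
typedef struct {
    bool snan_bit_is_one;  /* true if a set top fraction bit means signaling */
} fp_config_sketch;

/* A float32 is a NaN if the exponent is all ones and the fraction nonzero. */
static bool is_any_nan32(uint32_t bits)
{
    return (bits & 0x7fffffffu) > 0x7f800000u;
}

/* Quiet vs signaling is decided by the top fraction bit (bit 22). */
static bool is_quiet_nan32(uint32_t bits, const fp_config_sketch *cfg)
{
    assert(cfg);
    if (!is_any_nan32(bits)) {
        return false;
    }
    bool top_frac_bit = (bits >> 22) & 1;
    return cfg->snan_bit_is_one ? !top_frac_bit : top_frac_bit;
}
```

A zero-initialized config happens to select the snan_bit_is_one = false
interpretation, which is why the old code worked. Passing the CPU's real
status avoids relying on that coincidence.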
5
17
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
19
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210617121628.20116-22-peter.maydell@linaro.org
20
Message-id: 20241202131347.498124-30-peter.maydell@linaro.org
9
---
21
---
10
target/arm/helper-mve.h | 8 ++++++++
22
target/loongarch/tcg/fpu_helper.c | 6 ++----
11
target/arm/mve.decode | 7 +++++++
23
1 file changed, 2 insertions(+), 4 deletions(-)
12
target/arm/mve_helper.c | 37 +++++++++++++++++++++++++++++++++++++
13
target/arm/translate-mve.c | 24 ++++++++++++++++++++++++
14
4 files changed, 76 insertions(+)
15
24
16
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
25
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
17
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper-mve.h
27
--- a/target/loongarch/tcg/fpu_helper.c
19
+++ b/target/arm/helper-mve.h
28
+++ b/target/loongarch/tcg/fpu_helper.c
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmlsldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
29
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_s(CPULoongArchState *env, uint64_t fj)
21
DEF_HELPER_FLAGS_4(mve_vmlsldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
30
} else if (float32_is_zero_or_denormal(f)) {
22
DEF_HELPER_FLAGS_4(mve_vmlsldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
31
return sign ? 1 << 4 : 1 << 8;
23
DEF_HELPER_FLAGS_4(mve_vmlsldavxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
32
} else if (float32_is_any_nan(f)) {
24
+
33
- float_status s = { }; /* for snan_bit_is_one */
25
+DEF_HELPER_FLAGS_4(mve_vrmlaldavhsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
34
- return float32_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
26
+DEF_HELPER_FLAGS_4(mve_vrmlaldavhxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
35
+ return float32_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
27
+
36
} else {
28
+DEF_HELPER_FLAGS_4(mve_vrmlaldavhuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
37
return sign ? 1 << 3 : 1 << 7;
29
+
38
}
30
+DEF_HELPER_FLAGS_4(mve_vrmlsldavhsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
39
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_d(CPULoongArchState *env, uint64_t fj)
31
+DEF_HELPER_FLAGS_4(mve_vrmlsldavhxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
40
} else if (float64_is_zero_or_denormal(f)) {
32
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
41
return sign ? 1 << 4 : 1 << 8;
33
index XXXXXXX..XXXXXXX 100644
42
} else if (float64_is_any_nan(f)) {
34
--- a/target/arm/mve.decode
43
- float_status s = { }; /* for snan_bit_is_one */
35
+++ b/target/arm/mve.decode
44
- return float64_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
36
@@ -XXX,XX +XXX,XX @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
45
+ return float64_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
37
46
} else {
38
@vmlaldav .... .... . ... ... . ... . .... .... qm:3 . \
47
return sign ? 1 << 3 : 1 << 7;
39
qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav
48
}
40
+@vmlaldav_nosz .... .... . ... ... . ... . .... .... qm:3 . \
41
+ qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav
42
VMLALDAV_S 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
43
VMLALDAV_U 1111 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
44
45
VMLSLDAV 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav
46
+
47
+VRMLALDAVH_S 1110 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz
48
+VRMLALDAVH_U 1111 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz
49
+
50
+VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_nosz
51
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/target/arm/mve_helper.c
54
+++ b/target/arm/mve_helper.c
55
@@ -XXX,XX +XXX,XX @@
56
*/
57
58
#include "qemu/osdep.h"
59
+#include "qemu/int128.h"
60
#include "cpu.h"
61
#include "internals.h"
62
#include "vec_internal.h"
63
@@ -XXX,XX +XXX,XX @@ DO_LDAV(vmlsldavsh, 2, int16_t, false, +=, -=)
64
DO_LDAV(vmlsldavxsh, 2, int16_t, true, +=, -=)
65
DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=)
66
DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=)
67
+
68
+/*
69
+ * Rounding multiply add long dual accumulate high: we must keep
70
+ * a 72-bit internal accumulator value and return the top 64 bits.
71
+ */
72
+#define DO_LDAVH(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC, TO128) \
73
+ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
74
+ void *vm, uint64_t a) \
75
+ { \
76
+ uint16_t mask = mve_element_mask(env); \
77
+ unsigned e; \
78
+ TYPE *n = vn, *m = vm; \
79
+ Int128 acc = int128_lshift(TO128(a), 8); \
80
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
81
+ if (mask & 1) { \
82
+ if (e & 1) { \
83
+ acc = ODDACC(acc, TO128(n[H##ESIZE(e - 1 * XCHG)] * \
84
+ m[H##ESIZE(e)])); \
85
+ } else { \
86
+ acc = EVENACC(acc, TO128(n[H##ESIZE(e + 1 * XCHG)] * \
87
+ m[H##ESIZE(e)])); \
88
+ } \
89
+ acc = int128_add(acc, 1 << 7); \
90
+ } \
91
+ } \
92
+ mve_advance_vpt(env); \
93
+ return int128_getlo(int128_rshift(acc, 8)); \
94
+ }
95
+
96
+DO_LDAVH(vrmlaldavhsw, 4, int32_t, false, int128_add, int128_add, int128_makes64)
97
+DO_LDAVH(vrmlaldavhxsw, 4, int32_t, true, int128_add, int128_add, int128_makes64)
98
+
99
+DO_LDAVH(vrmlaldavhuw, 4, uint32_t, false, int128_add, int128_add, int128_make64)
100
+
101
+DO_LDAVH(vrmlsldavhsw, 4, int32_t, false, int128_add, int128_sub, int128_makes64)
102
+DO_LDAVH(vrmlsldavhxsw, 4, int32_t, true, int128_add, int128_sub, int128_makes64)
103
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
104
index XXXXXXX..XXXXXXX 100644
105
--- a/target/arm/translate-mve.c
106
+++ b/target/arm/translate-mve.c
107
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a)
108
};
109
return do_long_dual_acc(s, a, fns[a->size][a->x]);
110
}
111
+
112
+static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a)
113
+{
114
+ static MVEGenDualAccOpFn * const fns[] = {
115
+ gen_helper_mve_vrmlaldavhsw, gen_helper_mve_vrmlaldavhxsw,
116
+ };
117
+ return do_long_dual_acc(s, a, fns[a->x]);
118
+}
119
+
120
+static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a)
121
+{
122
+ static MVEGenDualAccOpFn * const fns[] = {
123
+ gen_helper_mve_vrmlaldavhuw, NULL,
124
+ };
125
+ return do_long_dual_acc(s, a, fns[a->x]);
126
+}
127
+
128
+static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a)
129
+{
130
+ static MVEGenDualAccOpFn * const fns[] = {
131
+ gen_helper_mve_vrmlsldavhsw, gen_helper_mve_vrmlsldavhxsw,
132
+ };
133
+ return do_long_dual_acc(s, a, fns[a->x]);
134
+}
135
--
49
--
136
2.20.1
50
2.34.1
137
138
1
Implement the MVE insn VMLSLDAV, which multiplies source elements,
1
In the frem helper, we have a local float_status because we want to
2
alternately adding and subtracting them, and accumulates into a
2
execute the floatx80_div() with a custom rounding mode. Instead of
3
64-bit result in a pair of general purpose registers.
3
zero-initializing the local float_status and then having to set it up
4
with the m68k standard behaviour (including the NaN propagation rule
5
and copying the rounding precision from env->fp_status), initialize
6
it as a complete copy of env->fp_status. This will avoid our having
7
to add new code in this function for every new config knob we add
8
to fp_status.
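The copy-then-override pattern described above can be sketched generically
as below. The struct fp_status_sketch, its fields and the function
scratch_status_nearest are hypothetical and much smaller than the real
float_status. The point is only that a struct copy inherits every
configuration field, so just the one differing setting needs to be written.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a float_status-like struct. */
typedef enum { ROUND_NEAREST_EVEN, ROUND_TO_ZERO } round_mode_sketch;

typedef struct {
    round_mode_sketch rounding_mode;
    int nan_prop_rule;      /* stands in for the NaN propagation rule */
    int x80_precision;      /* stands in for the floatx80 rounding precision */
    unsigned exception_flags;
} fp_status_sketch;

/*
 * Return a scratch status that is a complete copy of the CPU's status
 * with only the rounding mode changed. The CPU's own status is untouched.
 */
static fp_status_sketch scratch_status_nearest(const fp_status_sketch *env_status)
{
    assert(env_status);
    fp_status_sketch local = *env_status;    /* inherit every config knob */
    local.rounding_mode = ROUND_NEAREST_EVEN;
    return local;
}
```

Any field added to the struct later is picked up by the copy automatically,
which is the maintenance benefit the commit message is after.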
4
9
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210617121628.20116-21-peter.maydell@linaro.org
12
Message-id: 20241202131347.498124-31-peter.maydell@linaro.org
8
---
13
---
9
target/arm/helper-mve.h | 5 +++++
14
target/m68k/fpu_helper.c | 6 ++----
10
target/arm/mve.decode | 2 ++
15
1 file changed, 2 insertions(+), 4 deletions(-)
11
target/arm/mve_helper.c | 5 +++++
12
target/arm/translate-mve.c | 11 +++++++++++
13
4 files changed, 23 insertions(+)
14
16
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
17
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
19
--- a/target/m68k/fpu_helper.c
18
+++ b/target/arm/helper-mve.h
20
+++ b/target/m68k/fpu_helper.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmlaldavxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
21
@@ -XXX,XX +XXX,XX @@ void HELPER(frem)(CPUM68KState *env, FPReg *res, FPReg *val0, FPReg *val1)
20
22
21
DEF_HELPER_FLAGS_4(mve_vmlaldavuh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
23
fp_rem = floatx80_rem(val1->d, val0->d, &env->fp_status);
22
DEF_HELPER_FLAGS_4(mve_vmlaldavuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
24
if (!floatx80_is_any_nan(fp_rem)) {
23
+
25
- float_status fp_status = { };
24
+DEF_HELPER_FLAGS_4(mve_vmlsldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
26
+ /* Use local temporary fp_status to set different rounding mode */
25
+DEF_HELPER_FLAGS_4(mve_vmlsldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
27
+ float_status fp_status = env->fp_status;
26
+DEF_HELPER_FLAGS_4(mve_vmlsldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
28
uint32_t quotient;
27
+DEF_HELPER_FLAGS_4(mve_vmlsldavxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
29
int sign;
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
30
29
index XXXXXXX..XXXXXXX 100644
31
/* Calculate quotient directly using round to nearest mode */
30
--- a/target/arm/mve.decode
32
- set_float_2nan_prop_rule(float_2nan_prop_ab, &fp_status);
31
+++ b/target/arm/mve.decode
33
set_float_rounding_mode(float_round_nearest_even, &fp_status);
32
@@ -XXX,XX +XXX,XX @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
34
- set_floatx80_rounding_precision(
33
qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav
35
- get_floatx80_rounding_precision(&env->fp_status), &fp_status);
34
VMLALDAV_S 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
36
fp_quot.d = floatx80_div(val1->d, val0->d, &fp_status);
35
VMLALDAV_U 1111 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
37
36
+
38
sign = extractFloatx80Sign(fp_quot.d);
37
+VMLSLDAV 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav
38
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/mve_helper.c
41
+++ b/target/arm/mve_helper.c
42
@@ -XXX,XX +XXX,XX @@ DO_LDAV(vmlaldavxsw, 4, int32_t, true, +=, +=)
43
44
DO_LDAV(vmlaldavuh, 2, uint16_t, false, +=, +=)
45
DO_LDAV(vmlaldavuw, 4, uint32_t, false, +=, +=)
46
+
47
+DO_LDAV(vmlsldavsh, 2, int16_t, false, +=, -=)
48
+DO_LDAV(vmlsldavxsh, 2, int16_t, true, +=, -=)
49
+DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=)
50
+DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=)
51
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/target/arm/translate-mve.c
54
+++ b/target/arm/translate-mve.c
55
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a)
56
};
57
return do_long_dual_acc(s, a, fns[a->size][a->x]);
58
}
59
+
60
+static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a)
61
+{
62
+ static MVEGenDualAccOpFn * const fns[4][2] = {
63
+ { NULL, NULL },
64
+ { gen_helper_mve_vmlsldavsh, gen_helper_mve_vmlsldavxsh },
65
+ { gen_helper_mve_vmlsldavsw, gen_helper_mve_vmlsldavxsw },
66
+ { NULL, NULL },
67
+ };
68
+ return do_long_dual_acc(s, a, fns[a->size][a->x]);
69
+}
70
--
39
--
71
2.20.1
40
2.34.1
72
73
1
Instead of open-coding the "take NOCP exception if FPU disabled,
1
In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we do the conversion
2
otherwise call gen_preserve_fp_state()" code in the accessors for
2
from float64 to floatx80 using a scratch float_status, because we
3
FPCXT_NS, add an argument to vfp_access_check_m() which tells it to
3
don't want the conversion to affect the CPU's floating point exception
4
skip the gen_update_fp_context() call, so we can use it for the
4
status. Currently we use a zero-initialized float_status. This will
5
FPCXT_NS case.
5
get steadily more awkward as we add config knobs to float_status
6
that the target must initialize. Avoid having to add any of that
7
configuration here by instead initializing our local float_status
8
from the env->fp_status.
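
The scratch-copy idea described above can be sketched in isolation. This is a minimal illustration, not QEMU code: toy_float_status, toy_narrow() and toy_debug_read() are hypothetical stand-ins for float_status, the floatx80/float64 conversion and the gdb register accessor. The point it shows is that a scratch status copied from the CPU's one inherits all configuration, while any exception flags raised during the conversion land only in the copy.

```c
#include <stdint.h>

/* Toy stand-in for float_status; field names are illustrative only. */
typedef struct {
    int rounding_mode;        /* configuration the target sets up once */
    uint8_t exception_flags;  /* accumulated exception status */
} toy_float_status;

enum { TOY_FLAG_INEXACT = 1 };

/* Toy conversion: drops the low 8 bits and flags "inexact" if any were set. */
static uint32_t toy_narrow(uint64_t v, toy_float_status *st)
{
    if (v & 0xff) {
        st->exception_flags |= TOY_FLAG_INEXACT;
    }
    return (uint32_t)(v >> 8);
}

/*
 * Debugger-style read: copy the CPU's status so that all configuration
 * is inherited, then let the conversion update only the scratch copy.
 */
static uint32_t toy_debug_read(const toy_float_status *cpu_status, uint64_t v)
{
    toy_float_status scratch = *cpu_status;
    return toy_narrow(v, &scratch);
}
```

With this shape, adding a new configuration field to the status struct needs no change in the accessor, which is the maintenance benefit the commit message is after.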
6
9
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20210618141019.10671-8-peter.maydell@linaro.org
12
Message-id: 20241202131347.498124-32-peter.maydell@linaro.org
10
---
13
---
11
target/arm/translate-a32.h | 2 +-
14
target/m68k/helper.c | 6 ++++--
12
target/arm/translate-m-nocp.c | 10 ++--------
15
1 file changed, 4 insertions(+), 2 deletions(-)
13
target/arm/translate-vfp.c | 13 ++++++++-----
14
3 files changed, 11 insertions(+), 14 deletions(-)
15
16
16
diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
17
diff --git a/target/m68k/helper.c b/target/m68k/helper.c
17
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/translate-a32.h
19
--- a/target/m68k/helper.c
19
+++ b/target/arm/translate-a32.h
20
+++ b/target/m68k/helper.c
20
@@ -XXX,XX +XXX,XX @@ bool disas_neon_shared(DisasContext *s, uint32_t insn);
21
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_get_reg(CPUState *cs, GByteArray *mem_buf, int n)
21
void load_reg_var(DisasContext *s, TCGv_i32 var, int reg);
22
CPUM68KState *env = &cpu->env;
22
void arm_gen_condlabel(DisasContext *s);
23
23
bool vfp_access_check(DisasContext *s);
24
if (n < 8) {
24
-void gen_preserve_fp_state(DisasContext *s);
25
- float_status s = {};
25
+bool vfp_access_check_m(DisasContext *s, bool skip_context_update);
26
+ /* Use scratch float_status so any exceptions don't change CPU state */
26
void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop);
27
+ float_status s = env->fp_status;
27
void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop);
28
return gdb_get_reg64(mem_buf, floatx80_to_float64(env->fregs[n].d, &s));
28
void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop);
29
diff --git a/target/arm/translate-m-nocp.c b/target/arm/translate-m-nocp.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/translate-m-nocp.c
32
+++ b/target/arm/translate-m-nocp.c
33
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
34
* otherwise PreserveFPState(), and then FPCXT_NS writes
35
* behave the same as FPCXT_S writes.
36
*/
37
- if (s->fp_excp_el) {
38
- gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
39
- syn_uncategorized(), s->fp_excp_el);
40
+ if (!vfp_access_check_m(s, true)) {
41
/*
42
* This was only a conditional exception, so override
43
* gen_exception_insn()'s default to DISAS_NORETURN
44
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
45
s->base.is_jmp = DISAS_NEXT;
46
break;
47
}
48
- gen_preserve_fp_state(s);
49
}
29
}
50
/* fall through */
30
switch (n) {
51
case ARM_VFP_FPCXT_S:
31
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_set_reg(CPUState *cs, uint8_t *mem_buf, int n)
52
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
32
CPUM68KState *env = &cpu->env;
53
* otherwise PreserveFPState(), and then FPCXT_NS
33
54
* reads the same as FPCXT_S.
34
if (n < 8) {
55
*/
35
- float_status s = {};
56
- if (s->fp_excp_el) {
36
+ /* Use scratch float_status so any exceptions don't change CPU state */
57
- gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
37
+ float_status s = env->fp_status;
58
- syn_uncategorized(), s->fp_excp_el);
38
env->fregs[n].d = float64_to_floatx80(ldq_be_p(mem_buf), &s);
59
+ if (!vfp_access_check_m(s, true)) {
39
return 8;
60
/*
61
* This was only a conditional exception, so override
62
* gen_exception_insn()'s default to DISAS_NORETURN
63
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
64
s->base.is_jmp = DISAS_NEXT;
65
break;
66
}
67
- gen_preserve_fp_state(s);
68
tmp = tcg_temp_new_i32();
69
sfpa = tcg_temp_new_i32();
70
fpscr = tcg_temp_new_i32();
71
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
72
index XXXXXXX..XXXXXXX 100644
73
--- a/target/arm/translate-vfp.c
74
+++ b/target/arm/translate-vfp.c
75
@@ -XXX,XX +XXX,XX @@ static inline long vfp_f16_offset(unsigned reg, bool top)
76
* Generate code for M-profile lazy FP state preservation if needed;
77
* this corresponds to the pseudocode PreserveFPState() function.
78
*/
79
-void gen_preserve_fp_state(DisasContext *s)
80
+static void gen_preserve_fp_state(DisasContext *s)
81
{
82
if (s->v7m_lspact) {
83
/*
84
@@ -XXX,XX +XXX,XX @@ static bool vfp_access_check_a(DisasContext *s, bool ignore_vfp_enabled)
85
* If VFP is enabled, do the necessary M-profile lazy-FP handling and then
86
* return true. If not, emit code to generate an appropriate exception and
87
* return false.
88
+ * skip_context_update is true to skip the "update FP context" part of this.
89
*/
90
-static bool vfp_access_check_m(DisasContext *s)
91
+bool vfp_access_check_m(DisasContext *s, bool skip_context_update)
92
{
93
if (s->fp_excp_el) {
94
/*
95
@@ -XXX,XX +XXX,XX @@ static bool vfp_access_check_m(DisasContext *s)
96
/* Trigger lazy-state preservation if necessary */
97
gen_preserve_fp_state(s);
98
99
- /* Update ownership of FP context and create new FP context if needed */
100
- gen_update_fp_context(s);
101
+ if (!skip_context_update) {
102
+ /* Update ownership of FP context and create new FP context if needed */
103
+ gen_update_fp_context(s);
104
+ }
105
106
return true;
107
}
108
@@ -XXX,XX +XXX,XX @@ static bool vfp_access_check_m(DisasContext *s)
109
bool vfp_access_check(DisasContext *s)
110
{
111
if (arm_dc_feature(s, ARM_FEATURE_M)) {
112
- return vfp_access_check_m(s);
113
+ return vfp_access_check_m(s, false);
114
} else {
115
return vfp_access_check_a(s, false);
116
}
40
}
117
--
41
--
118
2.20.1
42
2.34.1
119
120
1
In a CPU with MVE, the VMOV (vector lane to general-purpose register)
1
In the helper functions flcmps and flcmpd we use a scratch float_status
2
and VMOV (general-purpose register to vector lane) insns are not
2
so that we don't change the CPU state if the comparison raises any
3
predicated, but they are subject to beatwise execution if they
3
floating point exception flags. Instead of zero-initializing this
4
are not in an IT block.
4
scratch float_status, initialize it as a copy of env->fp_status. This
5
avoids the need to explicitly initialize settings like the NaN
6
propagation rule or others we might add to softfloat in future.
5
7
6
Since our implementation always executes all 4 beats in one tick,
8
To do this we need to pass the CPU env pointer in to the helper.
7
this means only that we need to handle PSR.ECI:
8
* we must do the usual check for bad ECI state
9
* we must advance ECI state if the insn succeeds
10
* if ECI says we should not be executing the beat corresponding
11
to the lane of the vector register being accessed then we
12
should skip performing the move
13
14
Note that if PSR.ECI is non-zero then we cannot be in an IT block.
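
The beat-skipping rule for the MVE VMOV case can be sketched standalone. The offset formula and the thresholds below follow the mve_skip_vmov() hunk in this patch; the toy_eci enum and the function names are hypothetical, and the enum values are not the real PSR.ECI encodings. Each beat covers 4 bytes of the 16-byte Q register, so an ECI state that says beats A0..An have already executed means accesses to the corresponding low bytes must be skipped.

```c
#include <stdbool.h>

/* Illustrative ECI states; these are NOT the architectural encodings. */
typedef enum {
    TOY_ECI_NONE,
    TOY_ECI_A0,
    TOY_ECI_A0A1,
    TOY_ECI_A0A1A2,
    TOY_ECI_A0A1A2B0
} toy_eci;

/* Byte offset into the Q register: odd D registers are the upper 8 bytes. */
static int vmov_byte_offset(int vn, int index, int size)
{
    return (index << size) + ((vn & 1) * 8);
}

/* Return true if the move falls in a beat that ECI says was already done. */
static bool toy_skip_vmov(toy_eci eci, int vn, int index, int size)
{
    int ofs = vmov_byte_offset(vn, index, size);

    switch (eci) {
    case TOY_ECI_NONE:
        return false;
    case TOY_ECI_A0:
        return ofs < 4;
    case TOY_ECI_A0A1:
        return ofs < 8;
    case TOY_ECI_A0A1A2:
    case TOY_ECI_A0A1A2B0:
        return ofs < 12;
    }
    return false;
}
```

For example, a 32-bit access (size 2) to element 1 of an odd D register lands at byte offset 12, which is in the fourth beat, so it is performed for every ECI state shown here.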
15
9
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
18
Message-id: 20210617121628.20116-45-peter.maydell@linaro.org
12
Message-id: 20241202131347.498124-33-peter.maydell@linaro.org
19
---
13
---
20
target/arm/translate-a32.h | 2 +
14
target/sparc/helper.h | 4 ++--
21
target/arm/translate-mve.c | 4 +-
15
target/sparc/fop_helper.c | 8 ++++----
22
target/arm/translate-vfp.c | 77 +++++++++++++++++++++++++++++++++++---
16
target/sparc/translate.c | 4 ++--
23
3 files changed, 75 insertions(+), 8 deletions(-)
17
3 files changed, 8 insertions(+), 8 deletions(-)
24
18
25
diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
19
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
26
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/translate-a32.h
21
--- a/target/sparc/helper.h
28
+++ b/target/arm/translate-a32.h
22
+++ b/target/sparc/helper.h
29
@@ -XXX,XX +XXX,XX @@ long neon_full_reg_offset(unsigned reg);
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(fcmpd, TCG_CALL_NO_WG, i32, env, f64, f64)
30
long neon_element_offset(int reg, int element, MemOp memop);
24
DEF_HELPER_FLAGS_3(fcmped, TCG_CALL_NO_WG, i32, env, f64, f64)
31
void gen_rev16(TCGv_i32 dest, TCGv_i32 var);
25
DEF_HELPER_FLAGS_3(fcmpq, TCG_CALL_NO_WG, i32, env, i128, i128)
32
void clear_eci_state(DisasContext *s);
26
DEF_HELPER_FLAGS_3(fcmpeq, TCG_CALL_NO_WG, i32, env, i128, i128)
33
+bool mve_eci_check(DisasContext *s);
27
-DEF_HELPER_FLAGS_2(flcmps, TCG_CALL_NO_RWG_SE, i32, f32, f32)
34
+void mve_update_and_store_eci(DisasContext *s);
28
-DEF_HELPER_FLAGS_2(flcmpd, TCG_CALL_NO_RWG_SE, i32, f64, f64)
35
29
+DEF_HELPER_FLAGS_3(flcmps, TCG_CALL_NO_RWG_SE, i32, env, f32, f32)
36
static inline TCGv_i32 load_cpu_offset(int offset)
30
+DEF_HELPER_FLAGS_3(flcmpd, TCG_CALL_NO_RWG_SE, i32, env, f64, f64)
37
{
31
DEF_HELPER_2(raise_exception, noreturn, env, int)
38
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
32
33
DEF_HELPER_FLAGS_3(faddd, TCG_CALL_NO_WG, f64, env, f64, f64)
34
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
39
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate-mve.c
36
--- a/target/sparc/fop_helper.c
41
+++ b/target/arm/translate-mve.c
37
+++ b/target/sparc/fop_helper.c
42
@@ -XXX,XX +XXX,XX @@ static bool mve_check_qreg_bank(DisasContext *s, int qmask)
38
@@ -XXX,XX +XXX,XX @@ uint32_t helper_fcmpeq(CPUSPARCState *env, Int128 src1, Int128 src2)
43
return qmask < 8;
39
return finish_fcmp(env, r, GETPC());
44
}
40
}
45
41
46
-static bool mve_eci_check(DisasContext *s)
42
-uint32_t helper_flcmps(float32 src1, float32 src2)
47
+bool mve_eci_check(DisasContext *s)
43
+uint32_t helper_flcmps(CPUSPARCState *env, float32 src1, float32 src2)
48
{
44
{
49
/*
45
/*
50
* This is a beatwise insn: check that ECI is valid (not a
46
* FLCMP never raises an exception nor modifies any FSR fields.
51
@@ -XXX,XX +XXX,XX @@ static void mve_update_eci(DisasContext *s)
47
* Perform the comparison with a dummy fp environment.
52
}
48
*/
49
- float_status discard = { };
50
+ float_status discard = env->fp_status;
51
FloatRelation r;
52
53
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
54
@@ -XXX,XX +XXX,XX @@ uint32_t helper_flcmps(float32 src1, float32 src2)
55
g_assert_not_reached();
53
}
56
}
54
57
55
-static void mve_update_and_store_eci(DisasContext *s)
58
-uint32_t helper_flcmpd(float64 src1, float64 src2)
56
+void mve_update_and_store_eci(DisasContext *s)
59
+uint32_t helper_flcmpd(CPUSPARCState *env, float64 src1, float64 src2)
57
{
60
{
58
/*
61
- float_status discard = { };
59
* For insns which don't call a helper function that will call
62
+ float_status discard = env->fp_status;
60
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
63
FloatRelation r;
64
65
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
66
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
61
index XXXXXXX..XXXXXXX 100644
67
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/translate-vfp.c
68
--- a/target/sparc/translate.c
63
+++ b/target/arm/translate-vfp.c
69
+++ b/target/sparc/translate.c
64
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
70
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPs(DisasContext *dc, arg_FLCMPs *a)
65
return true;
71
72
src1 = gen_load_fpr_F(dc, a->rs1);
73
src2 = gen_load_fpr_F(dc, a->rs2);
74
- gen_helper_flcmps(cpu_fcc[a->cc], src1, src2);
75
+ gen_helper_flcmps(cpu_fcc[a->cc], tcg_env, src1, src2);
76
return advance_pc(dc);
66
}
77
}
67
78
68
+static bool mve_skip_vmov(DisasContext *s, int vn, int index, int size)
79
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPd(DisasContext *dc, arg_FLCMPd *a)
69
+{
80
70
+ /*
81
src1 = gen_load_fpr_D(dc, a->rs1);
71
+ * In a CPU with MVE, the VMOV (vector lane to general-purpose register)
82
src2 = gen_load_fpr_D(dc, a->rs2);
72
+ * and VMOV (general-purpose register to vector lane) insns are not
83
- gen_helper_flcmpd(cpu_fcc[a->cc], src1, src2);
73
+ * predicated, but they are subject to beatwise execution if they are
84
+ gen_helper_flcmpd(cpu_fcc[a->cc], tcg_env, src1, src2);
74
+ * not in an IT block.
85
return advance_pc(dc);
75
+ *
76
+ * Since our implementation always executes all 4 beats in one tick,
77
+ * this means only that if PSR.ECI says we should not be executing
78
+ * the beat corresponding to the lane of the vector register being
79
+ * accessed then we should skip performing the move, and that we need
80
+ * to do the usual check for bad ECI state and advance of ECI state.
81
+ *
82
+ * Note that if PSR.ECI is non-zero then we cannot be in an IT block.
83
+ *
84
+ * Return true if this VMOV scalar <-> gpreg should be skipped because
85
+ * the MVE PSR.ECI state says we skip the beat where the store happens.
86
+ */
87
+
88
+ /* Calculate the byte offset into Qn which we're going to access */
89
+ int ofs = (index << size) + ((vn & 1) * 8);
90
+
91
+ if (!dc_isar_feature(aa32_mve, s)) {
92
+ return false;
93
+ }
94
+
95
+ switch (s->eci) {
96
+ case ECI_NONE:
97
+ return false;
98
+ case ECI_A0:
99
+ return ofs < 4;
100
+ case ECI_A0A1:
101
+ return ofs < 8;
102
+ case ECI_A0A1A2:
103
+ case ECI_A0A1A2B0:
104
+ return ofs < 12;
105
+ default:
106
+ g_assert_not_reached();
107
+ }
108
+}
109
+
110
static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
111
{
112
/* VMOV scalar to general purpose register */
113
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
114
return false;
115
}
116
117
+ if (dc_isar_feature(aa32_mve, s)) {
118
+ if (!mve_eci_check(s)) {
119
+ return true;
120
+ }
121
+ }
122
+
123
if (!vfp_access_check(s)) {
124
return true;
125
}
126
127
- tmp = tcg_temp_new_i32();
128
- read_neon_element32(tmp, a->vn, a->index, a->size | (a->u ? 0 : MO_SIGN));
129
- store_reg(s, a->rt, tmp);
130
+ if (!mve_skip_vmov(s, a->vn, a->index, a->size)) {
131
+ tmp = tcg_temp_new_i32();
132
+ read_neon_element32(tmp, a->vn, a->index,
133
+ a->size | (a->u ? 0 : MO_SIGN));
134
+ store_reg(s, a->rt, tmp);
135
+ }
136
137
+ if (dc_isar_feature(aa32_mve, s)) {
138
+ mve_update_and_store_eci(s);
139
+ }
140
return true;
141
}
86
}
142
87
143
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a)
144
return false;
145
}
146
147
+ if (dc_isar_feature(aa32_mve, s)) {
148
+ if (!mve_eci_check(s)) {
149
+ return true;
150
+ }
151
+ }
152
+
153
if (!vfp_access_check(s)) {
154
return true;
155
}
156
157
- tmp = load_reg(s, a->rt);
158
- write_neon_element32(tmp, a->vn, a->index, a->size);
159
- tcg_temp_free_i32(tmp);
160
+ if (!mve_skip_vmov(s, a->vn, a->index, a->size)) {
161
+ tmp = load_reg(s, a->rt);
162
+ write_neon_element32(tmp, a->vn, a->index, a->size);
163
+ tcg_temp_free_i32(tmp);
164
+ }
165
166
+ if (dc_isar_feature(aa32_mve, s)) {
167
+ mve_update_and_store_eci(s);
168
+ }
169
return true;
170
}
171
172
--
88
--
173
2.20.1
89
2.34.1
174
175
1
Implement the MVE VMULL insn, which multiplies two single
1
In the helper_compute_fprf functions, we pass a dummy float_status
2
width integer elements to produce a double width result.
2
in to the is_signaling_nan() function. This is unnecessary, because
3
we have convenient access to the CPU env pointer here and that
4
is already set up with the correct values for the snan_bit_is_one
5
and no_signaling_nans config settings. is_signaling_nan() doesn't
6
ever update the fp_status with any exception flags, so there is
7
no reason not to use env->fp_status here.
8
9
Use env->fp_status instead of the dummy fp_status.
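
To illustrate why the signaling-NaN test needs the snan_bit_is_one setting at all, here is a rough sketch for single precision. toy_f32_is_snan() is a hypothetical helper written for this note, not the softfloat implementation; it assumes the usual convention that the most significant fraction bit distinguishes quiet from signaling NaNs, with snan_bit_is_one inverting the meaning of that bit. It only reads its arguments, which mirrors the observation that the check never writes exception flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: classify raw float32 bits as a signaling NaN. */
static bool toy_f32_is_snan(uint32_t bits, bool snan_bit_is_one)
{
    /* NaN: exponent all ones and a non-zero fraction. */
    bool is_nan = (bits & 0x7f800000u) == 0x7f800000u &&
                  (bits & 0x007fffffu) != 0;
    /* Most significant fraction bit. */
    bool top_frac_bit = (bits & 0x00400000u) != 0;

    /* Signaling when the top fraction bit matches the configured sense. */
    return is_nan && (top_frac_bit == snan_bit_is_one);
}
```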
3
10
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-19-peter.maydell@linaro.org
13
Message-id: 20241202131347.498124-34-peter.maydell@linaro.org
7
---
14
---
8
target/arm/helper-mve.h | 14 ++++++++++++++
15
target/ppc/fpu_helper.c | 3 +--
9
target/arm/mve.decode | 5 +++++
16
1 file changed, 1 insertion(+), 2 deletions(-)
10
target/arm/mve_helper.c | 34 ++++++++++++++++++++++++++++++++++
11
target/arm/translate-mve.c | 4 ++++
12
4 files changed, 57 insertions(+)
13
17
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
18
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
20
--- a/target/ppc/fpu_helper.c
17
+++ b/target/arm/helper-mve.h
21
+++ b/target/ppc/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vhsubsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
@@ -XXX,XX +XXX,XX @@ void helper_compute_fprf_##tp(CPUPPCState *env, tp arg) \
19
DEF_HELPER_FLAGS_4(mve_vhsubub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
} else if (tp##_is_infinity(arg)) { \
20
DEF_HELPER_FLAGS_4(mve_vhsubuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
fprf = neg ? 0x09 << FPSCR_FPRF : 0x05 << FPSCR_FPRF; \
21
DEF_HELPER_FLAGS_4(mve_vhsubuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
} else { \
22
+
26
- float_status dummy = { }; /* snan_bit_is_one = 0 */ \
23
+DEF_HELPER_FLAGS_4(mve_vmullbsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
- if (tp##_is_signaling_nan(arg, &dummy)) { \
24
+DEF_HELPER_FLAGS_4(mve_vmullbsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+ if (tp##_is_signaling_nan(arg, &env->fp_status)) { \
25
+DEF_HELPER_FLAGS_4(mve_vmullbsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
fprf = 0x00 << FPSCR_FPRF; \
26
+DEF_HELPER_FLAGS_4(mve_vmullbub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
} else { \
27
+DEF_HELPER_FLAGS_4(mve_vmullbuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
fprf = 0x11 << FPSCR_FPRF; \
28
+DEF_HELPER_FLAGS_4(mve_vmullbuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+
30
+DEF_HELPER_FLAGS_4(mve_vmulltsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
+DEF_HELPER_FLAGS_4(mve_vmulltsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
+DEF_HELPER_FLAGS_4(mve_vmulltsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
+DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
35
+DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
36
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
37
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/mve.decode
39
+++ b/target/arm/mve.decode
40
@@ -XXX,XX +XXX,XX @@ VHADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 0 ... 0 @2op
41
VHSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op
42
VHSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op
43
44
+VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
45
+VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op
46
+VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
47
+VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op
48
+
49
# Vector miscellaneous
50
51
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/mve_helper.c
55
+++ b/target/arm/mve_helper.c
56
@@ -XXX,XX +XXX,XX @@ DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
57
DO_2OP(OP##h, 2, int16_t, FN) \
58
DO_2OP(OP##w, 4, int32_t, FN)
59
60
+/*
61
+ * "Long" operations where two half-sized inputs (taken from either the
62
+ * top or the bottom of the input vector) produce a double-width result.
63
+ * Here ESIZE, TYPE are for the input, and LESIZE, LTYPE for the output.
64
+ */
65
+#define DO_2OP_L(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \
66
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, void *vm) \
67
+ { \
68
+ LTYPE *d = vd; \
69
+ TYPE *n = vn, *m = vm; \
70
+ uint16_t mask = mve_element_mask(env); \
71
+ unsigned le; \
72
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
73
+ LTYPE r = FN((LTYPE)n[H##ESIZE(le * 2 + TOP)], \
74
+ m[H##ESIZE(le * 2 + TOP)]); \
75
+ mergemask(&d[H##LESIZE(le)], r, mask); \
76
+ } \
77
+ mve_advance_vpt(env); \
78
+ }
79
+
80
#define DO_AND(N, M) ((N) & (M))
81
#define DO_BIC(N, M) ((N) & ~(M))
82
#define DO_ORR(N, M) ((N) | (M))
83
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vadd, DO_ADD)
84
DO_2OP_U(vsub, DO_SUB)
85
DO_2OP_U(vmul, DO_MUL)
86
87
+DO_2OP_L(vmullbsb, 0, 1, int8_t, 2, int16_t, DO_MUL)
88
+DO_2OP_L(vmullbsh, 0, 2, int16_t, 4, int32_t, DO_MUL)
89
+DO_2OP_L(vmullbsw, 0, 4, int32_t, 8, int64_t, DO_MUL)
90
+DO_2OP_L(vmullbub, 0, 1, uint8_t, 2, uint16_t, DO_MUL)
91
+DO_2OP_L(vmullbuh, 0, 2, uint16_t, 4, uint32_t, DO_MUL)
92
+DO_2OP_L(vmullbuw, 0, 4, uint32_t, 8, uint64_t, DO_MUL)
93
+
94
+DO_2OP_L(vmulltsb, 1, 1, int8_t, 2, int16_t, DO_MUL)
95
+DO_2OP_L(vmulltsh, 1, 2, int16_t, 4, int32_t, DO_MUL)
96
+DO_2OP_L(vmulltsw, 1, 4, int32_t, 8, int64_t, DO_MUL)
97
+DO_2OP_L(vmulltub, 1, 1, uint8_t, 2, uint16_t, DO_MUL)
98
+DO_2OP_L(vmulltuh, 1, 2, uint16_t, 4, uint32_t, DO_MUL)
99
+DO_2OP_L(vmulltuw, 1, 4, uint32_t, 8, uint64_t, DO_MUL)
100
+
101
/*
102
* Because the computation type is at least twice as large as required,
103
* these work for both signed and unsigned source types.
104
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
105
index XXXXXXX..XXXXXXX 100644
106
--- a/target/arm/translate-mve.c
107
+++ b/target/arm/translate-mve.c
108
@@ -XXX,XX +XXX,XX @@ DO_2OP(VHADD_S, vhadds)
109
DO_2OP(VHADD_U, vhaddu)
110
DO_2OP(VHSUB_S, vhsubs)
111
DO_2OP(VHSUB_U, vhsubu)
112
+DO_2OP(VMULL_BS, vmullbs)
113
+DO_2OP(VMULL_BU, vmullbu)
114
+DO_2OP(VMULL_TS, vmullts)
115
+DO_2OP(VMULL_TU, vmulltu)
116
--
32
--
117
2.20.1
33
2.34.1
118
119
1
These days the Arm architecture has a wide range of fine-grained
1
From: Richard Henderson <richard.henderson@linaro.org>
2
optional extra architectural features. We implement quite a lot
3
of these but by no means all of them. Document what we do implement,
4
so that users can find out without having to dig through back-issues
5
of our Changelog on the wiki.
6
2
3
Now that float_status has a bunch of fp parameters,
4
it is easier to copy an existing structure than create
5
one from scratch. Begin by copying the structure that
6
corresponds to the FPSR and make only the adjustments
7
required for BFloat16 semantics.
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Message-id: 20241203203949.483774-2-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Message-id: 20210617140328.28622-1-peter.maydell@linaro.org
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
---
14
---
12
docs/system/arm/emulation.rst | 102 ++++++++++++++++++++++++++++++++++
15
target/arm/tcg/vec_helper.c | 20 +++++++-------------
13
docs/system/target-arm.rst | 6 ++
16
1 file changed, 7 insertions(+), 13 deletions(-)
14
2 files changed, 108 insertions(+)
15
create mode 100644 docs/system/arm/emulation.rst
16
17
17
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
18
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
18
new file mode 100644
19
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX
20
--- a/target/arm/tcg/vec_helper.c
20
--- /dev/null
21
+++ b/target/arm/tcg/vec_helper.c
21
+++ b/docs/system/arm/emulation.rst
22
@@ -XXX,XX +XXX,XX @@ bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
22
@@ -XXX,XX +XXX,XX @@
23
* no effect on AArch32 instructions.
23
+A-profile CPU architecture support
24
*/
24
+==================================
25
bool ebf = is_a64(env) && env->vfp.fpcr & FPCR_EBF;
26
- *statusp = (float_status){
27
- .tininess_before_rounding = float_tininess_before_rounding,
28
- .float_rounding_mode = float_round_to_odd_inf,
29
- .flush_to_zero = true,
30
- .flush_inputs_to_zero = true,
31
- .default_nan_mode = true,
32
- };
25
+
33
+
26
+QEMU's TCG emulation includes support for the Armv5, Armv6, Armv7 and
34
+ *statusp = env->vfp.fp_status;
27
+Armv8 versions of the A-profile architecture. It also has support for
35
+ set_default_nan_mode(true, statusp);
28
+the following architecture extensions:
36
29
+
37
if (ebf) {
30
+- FEAT_AA32BF16 (AArch32 BFloat16 instructions)
38
- float_status *fpst = &env->vfp.fp_status;
31
+- FEAT_AA32HPD (AArch32 hierarchical permission disables)
39
- set_flush_to_zero(get_flush_to_zero(fpst), statusp);
32
+- FEAT_AA32I8MM (AArch32 Int8 matrix multiplication instructions)
40
- set_flush_inputs_to_zero(get_flush_inputs_to_zero(fpst), statusp);
33
+- FEAT_AES (AESD and AESE instructions)
41
- set_float_rounding_mode(get_float_rounding_mode(fpst), statusp);
34
+- FEAT_BF16 (AArch64 BFloat16 instructions)
42
-
35
+- FEAT_BTI (Branch Target Identification)
43
/* EBF=1 needs to do a step with round-to-odd semantics */
36
+- FEAT_DIT (Data Independent Timing instructions)
44
*oddstatusp = *statusp;
37
+- FEAT_DPB (DC CVAP instruction)
45
set_float_rounding_mode(float_round_to_odd, oddstatusp);
38
+- FEAT_DotProd (Advanced SIMD dot product instructions)
46
+ } else {
39
+- FEAT_FCMA (Floating-point complex number instructions)
47
+ set_flush_to_zero(true, statusp);
40
+- FEAT_FHM (Floating-point half-precision multiplication instructions)
48
+ set_flush_inputs_to_zero(true, statusp);
41
+- FEAT_FP16 (Half-precision floating-point data processing)
49
+ set_float_rounding_mode(float_round_to_odd_inf, statusp);
42
+- FEAT_FRINTTS (Floating-point to integer instructions)
50
}
43
+- FEAT_FlagM (Flag manipulation instructions v2)
51
-
44
+- FEAT_FlagM2 (Enhancements to flag manipulation instructions)
52
return ebf;
45
+- FEAT_HPDS (Hierarchical permission disables)
53
}
46
+- FEAT_I8MM (AArch64 Int8 matrix multiplication instructions)
47
+- FEAT_JSCVT (JavaScript conversion instructions)
48
+- FEAT_LOR (Limited ordering regions)
49
+- FEAT_LRCPC (Load-acquire RCpc instructions)
50
+- FEAT_LRCPC2 (Load-acquire RCpc instructions v2)
51
+- FEAT_LSE (Large System Extensions)
52
+- FEAT_MTE (Memory Tagging Extension)
53
+- FEAT_MTE2 (Memory Tagging Extension)
54
+- FEAT_PAN (Privileged access never)
55
+- FEAT_PAN2 (AT S1E1R and AT S1E1W instruction variants affected by PSTATE.PAN)
56
+- FEAT_PAuth (Pointer authentication)
57
+- FEAT_PMULL (PMULL, PMULL2 instructions)
58
+- FEAT_PMUv3p1 (PMU Extensions v3.1)
59
+- FEAT_PMUv3p4 (PMU Extensions v3.4)
60
+- FEAT_RDM (Advanced SIMD rounding double multiply accumulate instructions)
61
+- FEAT_RNG (Random number generator)
62
+- FEAT_SB (Speculation Barrier)
63
+- FEAT_SEL2 (Secure EL2)
64
+- FEAT_SHA1 (SHA1 instructions)
65
+- FEAT_SHA256 (SHA256 instructions)
66
+- FEAT_SHA3 (Advanced SIMD SHA3 instructions)
67
+- FEAT_SHA512 (Advanced SIMD SHA512 instructions)
68
+- FEAT_SM3 (Advanced SIMD SM3 instructions)
69
+- FEAT_SM4 (Advanced SIMD SM4 instructions)
70
+- FEAT_SPECRES (Speculation restriction instructions)
71
+- FEAT_SSBS (Speculative Store Bypass Safe)
72
+- FEAT_TLBIOS (TLB invalidate instructions in Outer Shareable domain)
73
+- FEAT_TLBIRANGE (TLB invalidate range instructions)
74
+- FEAT_TTCNP (Translation table Common not private translations)
75
+- FEAT_TTST (Small translation tables)
76
+- FEAT_UAO (Unprivileged Access Override control)
77
+- FEAT_VHE (Virtualization Host Extensions)
78
+- FEAT_VMID16 (16-bit VMID)
79
+- FEAT_XNX (Translation table stage 2 Unprivileged Execute-never)
80
+- SVE (The Scalable Vector Extension)
81
+- SVE2 (The Scalable Vector Extension v2)
82
+
83
+For information on the specifics of these extensions, please refer
84
+to the `Armv8-A Arm Architecture Reference Manual
85
+<https://developer.arm.com/documentation/ddi0487/latest>`_.
86
+
87
+When a specific named CPU is being emulated, only those features which
88
+are present in hardware for that CPU are emulated. (If a feature is
89
+not in the list above then it is not supported, even if the real
90
+hardware should have it.) The ``max`` CPU enables all features.
91
+
92
+R-profile CPU architecture support
93
+==================================
94
+
95
+QEMU's TCG emulation support for R-profile CPUs is currently limited.
96
+We emulate only the Cortex-R5 and Cortex-R5F CPUs.
97
+
98
+M-profile CPU architecture support
99
+==================================
100
+
101
+QEMU's TCG emulation includes support for Armv6-M, Armv7-M, Armv8-M, and
102
+Armv8.1-M versions of the M-profile architecture. It also has support
103
+for the following architecture extensions:
104
+
105
+- FP (Floating-point Extension)
106
+- FPCXT (FPCXT access instructions)
107
+- HP (Half-precision floating-point instructions)
108
+- LOB (Low Overhead loops and Branch future)
109
+- M (Main Extension)
110
+- MPU (Memory Protection Unit Extension)
111
+- PXN (Privileged Execute Never)
112
+- RAS (Reliability, Serviceability and Availability): "minimum RAS Extension" only
113
+- S (Security Extension)
114
+- ST (System Timer Extension)
115
+
116
+For information on the specifics of these extensions, please refer
117
+to the `Armv8-M Arm Architecture Reference Manual
118
+<https://developer.arm.com/documentation/ddi0553/latest>`_.
119
+
120
+When a specific named CPU is being emulated, only those features which
121
+are present in hardware for that CPU are emulated. (If a feature is
122
+not in the list above then it is not supported, even if the real
123
+hardware should have it.) There is no equivalent of the ``max`` CPU for
124
+M-profile.
125
diff --git a/docs/system/target-arm.rst b/docs/system/target-arm.rst
126
index XXXXXXX..XXXXXXX 100644
127
--- a/docs/system/target-arm.rst
128
+++ b/docs/system/target-arm.rst
129
@@ -XXX,XX +XXX,XX @@ undocumented; you can get a complete list by running
130
arm/virt
131
arm/xlnx-versal-virt
132
133
+Emulated CPU architecture support
134
+=================================
135
+
136
+.. toctree::
137
+ arm/emulation
138
+
139
Arm CPU features
140
================
141
54
142
--
55
--
143
2.20.1
56
2.34.1
144
57
145
58
1
Implement the MVE VMLALDAV insn, which multiplies pairs of integer
1
Currently we hardcode the default NaN value in parts64_default_nan()
2
elements, accumulating them into a 64-bit result in a pair of
2
using a compile-time ifdef ladder. This is awkward for two cases:
3
general-purpose registers.
3
* for single-QEMU-binary we can't hard-code target-specifics like this
4
* for Arm FEAT_AFP the default NaN value depends on FPCR.AH
5
(specifically the sign bit is different)
6
7
Add a field to float_status to specify the default NaN value; fall
8
back to the old ifdef behaviour if these are not set.
9
10
The default NaN value is specified by setting a uint8_t to a
11
pattern corresponding to the sign and upper fraction parts of
12
the NaN; the lower bits of the fraction are set from bit 0 of
13
the pattern.
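
As a worked illustration of the pattern encoding described above, the sketch below expands an 8-bit pattern into float32 bits. It assumes that bit 7 of the pattern is the sign and bits [6:0] are the top seven fraction bits, with bit 0 replicated into the remaining lower fraction bits; that exact bit layout is an assumption of this note, and default_nan32_from_pattern() is a hypothetical helper, not a softfloat function.

```c
#include <stdint.h>

/* Hypothetical helper: expand an 8-bit default-NaN pattern to float32 bits. */
static uint32_t default_nan32_from_pattern(uint8_t pattern)
{
    uint32_t sign = (uint32_t)(pattern >> 7);
    /* Top seven fraction bits, placed at fraction bits [22:16]. */
    uint32_t top7 = pattern & 0x7fu;
    /* Remaining 16 fraction bits are all copies of pattern bit 0. */
    uint32_t low16 = (pattern & 1u) ? 0xffffu : 0u;
    uint32_t frac = (top7 << 16) | low16;

    /* Exponent field is all ones for any NaN. */
    return (sign << 31) | (0xffu << 23) | frac;
}
```

Under these assumptions a pattern of 0x40 gives 0x7fc00000 (positive, only the top fraction bit set), 0xc0 gives the same NaN with the sign bit set, and 0x7f gives an all-ones fraction, 0x7fffffff.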
4
14
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
16
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210617121628.20116-20-peter.maydell@linaro.org
17
Message-id: 20241202131347.498124-35-peter.maydell@linaro.org
8
---
18
---
9
target/arm/helper-mve.h | 8 ++++
19
include/fpu/softfloat-helpers.h | 11 +++++++
10
target/arm/translate.h | 10 ++++
20
include/fpu/softfloat-types.h | 10 ++++++
11
target/arm/mve.decode | 15 ++++++
21
fpu/softfloat-specialize.c.inc | 55 ++++++++++++++++++++-------------
12
target/arm/mve_helper.c | 34 ++++++++++++++
22
3 files changed, 54 insertions(+), 22 deletions(-)
13
target/arm/translate-mve.c | 96 ++++++++++++++++++++++++++++++++++++++
14
5 files changed, 163 insertions(+)
15
23
16
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
24
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
17
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper-mve.h
26
--- a/include/fpu/softfloat-helpers.h
19
+++ b/target/arm/helper-mve.h
27
+++ b/include/fpu/softfloat-helpers.h
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmulltsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
@@ -XXX,XX +XXX,XX @@ static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
21
DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
status->float_infzeronan_rule = rule;
22
DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
+
25
+DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
26
+DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
27
+DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
28
+DEF_HELPER_FLAGS_4(mve_vmlaldavxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
29
+
30
+DEF_HELPER_FLAGS_4(mve_vmlaldavuh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
31
+DEF_HELPER_FLAGS_4(mve_vmlaldavuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
32
diff --git a/target/arm/translate.h b/target/arm/translate.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate.h
35
+++ b/target/arm/translate.h
36
@@ -XXX,XX +XXX,XX @@ static inline int negate(DisasContext *s, int x)
37
return -x;
38
}
30
}
39
31
40
+static inline int plus_1(DisasContext *s, int x)
32
+static inline void set_float_default_nan_pattern(uint8_t dnan_pattern,
33
+ float_status *status)
41
+{
34
+{
42
+ return x + 1;
35
+ status->default_nan_pattern = dnan_pattern;
43
+}
36
+}
44
+
37
+
45
static inline int plus_2(DisasContext *s, int x)
38
static inline void set_flush_to_zero(bool val, float_status *status)
46
{
39
{
47
return x + 2;
40
status->flush_to_zero = val;
48
@@ -XXX,XX +XXX,XX @@ static inline int times_4(DisasContext *s, int x)
41
@@ -XXX,XX +XXX,XX @@ static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status
49
return x * 4;
42
return status->float_infzeronan_rule;
50
}
43
}
51
44
52
+static inline int times_2_plus_1(DisasContext *s, int x)
45
+static inline uint8_t get_float_default_nan_pattern(float_status *status)
53
+{
46
+{
54
+ return x * 2 + 1;
47
+ return status->default_nan_pattern;
55
+}
48
+}
56
+
49
+
57
static inline int arm_dc_feature(DisasContext *dc, int feature)
50
static inline bool get_flush_to_zero(float_status *status)
58
{
51
{
59
return (dc->features & (1ULL << feature)) != 0;
52
return status->flush_to_zero;
60
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
53
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
61
index XXXXXXX..XXXXXXX 100644
54
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/mve.decode
55
--- a/include/fpu/softfloat-types.h
63
+++ b/target/arm/mve.decode
56
+++ b/include/fpu/softfloat-types.h
64
@@ -XXX,XX +XXX,XX @@ VNEG_fp 1111 1111 1 . 11 .. 01 ... 0 0111 11 . 0 ... 0 @1op
57
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
65
VDUP 1110 1110 1 1 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=0
58
/* should denormalised inputs go to zero and set the input_denormal flag? */
66
VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 1 1 0000 @vdup size=1
59
bool flush_inputs_to_zero;
67
VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
60
bool default_nan_mode;
61
+ /*
62
+ * The pattern to use for the default NaN. Here the high bit specifies
63
+ * the default NaN's sign bit, and bits 6..0 specify the high bits of the
64
+ * fractional part. The low bits of the fractional part are copies of bit 0.
65
+ * The exponent of the default NaN is (as for any NaN) always all 1s.
66
+ * Note that a value of 0 here is not a valid NaN. The target must set
67
+ * this to the correct non-zero value, or we will assert when trying to
68
+ * create a default NaN.
69
+ */
70
+ uint8_t default_nan_pattern;
71
/*
72
* The flags below are not used on all specializations and may
73
* constant fold away (see snan_bit_is_one()/no_signalling_nans() in
74
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
75
index XXXXXXX..XXXXXXX 100644
76
--- a/fpu/softfloat-specialize.c.inc
77
+++ b/fpu/softfloat-specialize.c.inc
78
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
79
{
80
bool sign = 0;
81
uint64_t frac;
82
+ uint8_t dnan_pattern = status->default_nan_pattern;
83
84
+ if (dnan_pattern == 0) {
85
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
86
- /* !snan_bit_is_one, set all bits */
87
- frac = (1ULL << DECOMPOSED_BINARY_POINT) - 1;
88
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
89
+ /* Sign bit clear, all frac bits set */
90
+ dnan_pattern = 0b01111111;
91
+#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
92
|| defined(TARGET_MICROBLAZE)
93
- /* !snan_bit_is_one, set sign and msb */
94
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
95
- sign = 1;
96
+ /* Sign bit set, most significant frac bit set */
97
+ dnan_pattern = 0b11000000;
98
#elif defined(TARGET_HPPA)
99
- /* snan_bit_is_one, set msb-1. */
100
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 2);
101
+ /* Sign bit clear, msb-1 frac bit set */
102
+ dnan_pattern = 0b00100000;
103
#elif defined(TARGET_HEXAGON)
104
- sign = 1;
105
- frac = ~0ULL;
106
+ /* Sign bit set, all frac bits set. */
107
+ dnan_pattern = 0b11111111;
108
#else
109
- /*
110
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
111
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
112
- * do not have floating-point.
113
- */
114
- if (snan_bit_is_one(status)) {
115
- /* set all bits other than msb */
116
- frac = (1ULL << (DECOMPOSED_BINARY_POINT - 1)) - 1;
117
- } else {
118
- /* set msb */
119
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
120
- }
121
+ /*
122
+ * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
123
+ * S390, SH4, TriCore, and Xtensa. Our other supported targets
124
+ * do not have floating-point.
125
+ */
126
+ if (snan_bit_is_one(status)) {
127
+ /* sign bit clear, set all frac bits other than msb */
128
+ dnan_pattern = 0b00111111;
129
+ } else {
130
+ /* sign bit clear, set frac msb */
131
+ dnan_pattern = 0b01000000;
132
+ }
133
#endif
134
+ }
135
+ assert(dnan_pattern != 0);
68
+
136
+
69
+# multiply-add long dual accumulate
137
+ sign = dnan_pattern >> 7;
70
+# rdahi: bits [3:1] from insn, bit 0 is 1
71
+# rdalo: bits [3:1] from insn, bit 0 is 0
72
+%rdahi 20:3 !function=times_2_plus_1
73
+%rdalo 13:3 !function=times_2
74
+# size bit is 0 for 16 bit, 1 for 32 bit
75
+%size_16 16:1 !function=plus_1
76
+
77
+&vmlaldav rdahi rdalo size qn qm x a
78
+
79
+@vmlaldav .... .... . ... ... . ... . .... .... qm:3 . \
80
+ qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav
81
+VMLALDAV_S 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
82
+VMLALDAV_U 1111 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav
83
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
84
index XXXXXXX..XXXXXXX 100644
85
--- a/target/arm/mve_helper.c
86
+++ b/target/arm/mve_helper.c
87
@@ -XXX,XX +XXX,XX @@ DO_2OP_S(vhadds, do_vhadd_s)
88
DO_2OP_U(vhaddu, do_vhadd_u)
89
DO_2OP_S(vhsubs, do_vhsub_s)
90
DO_2OP_U(vhsubu, do_vhsub_u)
91
+
92
+
93
+/*
94
+ * Multiply add long dual accumulate ops.
95
+ */
96
+#define DO_LDAV(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC) \
97
+ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
98
+ void *vm, uint64_t a) \
99
+ { \
100
+ uint16_t mask = mve_element_mask(env); \
101
+ unsigned e; \
102
+ TYPE *n = vn, *m = vm; \
103
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
104
+ if (mask & 1) { \
105
+ if (e & 1) { \
106
+ a ODDACC \
107
+ (int64_t)n[H##ESIZE(e - 1 * XCHG)] * m[H##ESIZE(e)]; \
108
+ } else { \
109
+ a EVENACC \
110
+ (int64_t)n[H##ESIZE(e + 1 * XCHG)] * m[H##ESIZE(e)]; \
111
+ } \
112
+ } \
113
+ } \
114
+ mve_advance_vpt(env); \
115
+ return a; \
116
+ }
117
+
118
+DO_LDAV(vmlaldavsh, 2, int16_t, false, +=, +=)
119
+DO_LDAV(vmlaldavxsh, 2, int16_t, true, +=, +=)
120
+DO_LDAV(vmlaldavsw, 4, int32_t, false, +=, +=)
121
+DO_LDAV(vmlaldavxsw, 4, int32_t, true, +=, +=)
122
+
123
+DO_LDAV(vmlaldavuh, 2, uint16_t, false, +=, +=)
124
+DO_LDAV(vmlaldavuw, 4, uint32_t, false, +=, +=)
125
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
126
index XXXXXXX..XXXXXXX 100644
127
--- a/target/arm/translate-mve.c
128
+++ b/target/arm/translate-mve.c
129
@@ -XXX,XX +XXX,XX @@
130
typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
131
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
132
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
133
+typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
134
135
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
136
static inline long mve_qreg_offset(unsigned reg)
137
@@ -XXX,XX +XXX,XX @@ static void mve_update_eci(DisasContext *s)
138
}
139
}
140
141
+static bool mve_skip_first_beat(DisasContext *s)
142
+{
143
+ /* Return true if PSR.ECI says we must skip the first beat of this insn */
144
+ switch (s->eci) {
145
+ case ECI_NONE:
146
+ return false;
147
+ case ECI_A0:
148
+ case ECI_A0A1:
149
+ case ECI_A0A1A2:
150
+ case ECI_A0A1A2B0:
151
+ return true;
152
+ default:
153
+ g_assert_not_reached();
154
+ }
155
+}
156
+
157
static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
158
{
159
TCGv_i32 addr;
160
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMULL_BS, vmullbs)
161
DO_2OP(VMULL_BU, vmullbu)
162
DO_2OP(VMULL_TS, vmullts)
163
DO_2OP(VMULL_TU, vmulltu)
164
+
165
+static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
166
+ MVEGenDualAccOpFn *fn)
167
+{
168
+ TCGv_ptr qn, qm;
169
+ TCGv_i64 rda;
170
+ TCGv_i32 rdalo, rdahi;
171
+
172
+ if (!dc_isar_feature(aa32_mve, s) ||
173
+ !mve_check_qreg_bank(s, a->qn | a->qm) ||
174
+ !fn) {
175
+ return false;
176
+ }
177
+ /*
138
+ /*
178
+ * rdahi == 13 is UNPREDICTABLE; rdahi == 15 is a related
139
+ * Place default_nan_pattern [6:0] into bits [62:56],
179
+ * encoding; rdalo always has bit 0 clear so cannot be 13 or 15.
140
+ * and replicate bit [0] down into [55:0]
180
+ */
141
+ */
181
+ if (a->rdahi == 13 || a->rdahi == 15) {
142
+ frac = deposit64(0, DECOMPOSED_BINARY_POINT - 7, 7, dnan_pattern);
182
+ return false;
143
+ frac = deposit64(frac, 0, DECOMPOSED_BINARY_POINT - 7, -(dnan_pattern & 1));
183
+ }
144
184
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
145
*p = (FloatParts64) {
185
+ return true;
146
.cls = float_class_qnan,
186
+ }
187
+
188
+ qn = mve_qreg_ptr(a->qn);
189
+ qm = mve_qreg_ptr(a->qm);
190
+
191
+ /*
192
+ * This insn is subject to beat-wise execution. Partial execution
193
+ * of an A=0 (no-accumulate) insn which does not execute the first
194
+ * beat must start with the current rda value, not 0.
195
+ */
196
+ if (a->a || mve_skip_first_beat(s)) {
197
+ rda = tcg_temp_new_i64();
198
+ rdalo = load_reg(s, a->rdalo);
199
+ rdahi = load_reg(s, a->rdahi);
200
+ tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
201
+ tcg_temp_free_i32(rdalo);
202
+ tcg_temp_free_i32(rdahi);
203
+ } else {
204
+ rda = tcg_const_i64(0);
205
+ }
206
+
207
+ fn(rda, cpu_env, qn, qm, rda);
208
+ tcg_temp_free_ptr(qn);
209
+ tcg_temp_free_ptr(qm);
210
+
211
+ rdalo = tcg_temp_new_i32();
212
+ rdahi = tcg_temp_new_i32();
213
+ tcg_gen_extrl_i64_i32(rdalo, rda);
214
+ tcg_gen_extrh_i64_i32(rdahi, rda);
215
+ store_reg(s, a->rdalo, rdalo);
216
+ store_reg(s, a->rdahi, rdahi);
217
+ tcg_temp_free_i64(rda);
218
+ mve_update_eci(s);
219
+ return true;
220
+}
221
+
222
+static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a)
223
+{
224
+ static MVEGenDualAccOpFn * const fns[4][2] = {
225
+ { NULL, NULL },
226
+ { gen_helper_mve_vmlaldavsh, gen_helper_mve_vmlaldavxsh },
227
+ { gen_helper_mve_vmlaldavsw, gen_helper_mve_vmlaldavxsw },
228
+ { NULL, NULL },
229
+ };
230
+ return do_long_dual_acc(s, a, fns[a->size][a->x]);
231
+}
232
+
233
+static bool trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a)
234
+{
235
+ static MVEGenDualAccOpFn * const fns[4][2] = {
236
+ { NULL, NULL },
237
+ { gen_helper_mve_vmlaldavuh, NULL },
238
+ { gen_helper_mve_vmlaldavuw, NULL },
239
+ { NULL, NULL },
240
+ };
241
+ return do_long_dual_acc(s, a, fns[a->size][a->x]);
242
+}
243
--
147
--
244
2.20.1
148
2.34.1
245
246
1
Implement MVE VHADD and VHSUB insns, which perform an addition
1
Set the default NaN pattern explicitly for the tests/fp code.
2
or subtraction and then halve the result.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-18-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-36-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-mve.h | 14 ++++++++++++++
7
tests/fp/fp-bench.c | 1 +
9
target/arm/mve.decode | 5 +++++
8
tests/fp/fp-test-log2.c | 1 +
10
target/arm/mve_helper.c | 25 +++++++++++++++++++++++++
9
tests/fp/fp-test.c | 1 +
11
target/arm/translate-mve.c | 4 ++++
10
3 files changed, 3 insertions(+)
12
4 files changed, 48 insertions(+)
13
11
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
14
--- a/tests/fp/fp-bench.c
17
+++ b/target/arm/helper-mve.h
15
+++ b/tests/fp/fp-bench.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vabdsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
19
DEF_HELPER_FLAGS_4(mve_vabdub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
20
DEF_HELPER_FLAGS_4(mve_vabduh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
21
DEF_HELPER_FLAGS_4(mve_vabduw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
22
+
20
+ set_float_default_nan_pattern(0b01000000, &soft_status);
23
+DEF_HELPER_FLAGS_4(mve_vhaddsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
24
+DEF_HELPER_FLAGS_4(mve_vhaddsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
f = bench_funcs[operation][precision];
25
+DEF_HELPER_FLAGS_4(mve_vhaddsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
g_assert(f);
26
+DEF_HELPER_FLAGS_4(mve_vhaddub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
diff --git a/tests/fp/fp-test-log2.c b/tests/fp/fp-test-log2.c
27
+DEF_HELPER_FLAGS_4(mve_vhadduh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+DEF_HELPER_FLAGS_4(mve_vhadduw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+
30
+DEF_HELPER_FLAGS_4(mve_vhsubsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
+DEF_HELPER_FLAGS_4(mve_vhsubsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
+DEF_HELPER_FLAGS_4(mve_vhsubsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
+DEF_HELPER_FLAGS_4(mve_vhsubub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+DEF_HELPER_FLAGS_4(mve_vhsubuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
35
+DEF_HELPER_FLAGS_4(mve_vhsubuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
36
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
37
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/mve.decode
26
--- a/tests/fp/fp-test-log2.c
39
+++ b/target/arm/mve.decode
27
+++ b/tests/fp/fp-test-log2.c
40
@@ -XXX,XX +XXX,XX @@ VMIN_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 1 ... 0 @2op
28
@@ -XXX,XX +XXX,XX @@ int main(int ac, char **av)
41
VABD_S 111 0 1111 0 . .. ... 0 ... 0 0111 . 1 . 0 ... 0 @2op
29
int i;
42
VABD_U 111 1 1111 0 . .. ... 0 ... 0 0111 . 1 . 0 ... 0 @2op
30
43
31
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
44
+VHADD_S 111 0 1111 0 . .. ... 0 ... 0 0000 . 1 . 0 ... 0 @2op
32
+ set_float_default_nan_pattern(0b01000000, &qsf);
45
+VHADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 0 ... 0 @2op
33
set_float_rounding_mode(float_round_nearest_even, &qsf);
46
+VHSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op
34
47
+VHSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op
35
test.d = 0.0;
48
+
36
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
49
# Vector miscellaneous
50
51
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
53
index XXXXXXX..XXXXXXX 100644
37
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/mve_helper.c
38
--- a/tests/fp/fp-test.c
55
+++ b/target/arm/mve_helper.c
39
+++ b/tests/fp/fp-test.c
56
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vminu, DO_MIN)
40
@@ -XXX,XX +XXX,XX @@ void run_test(void)
57
41
*/
58
DO_2OP_S(vabds, DO_ABD)
42
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
59
DO_2OP_U(vabdu, DO_ABD)
43
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
60
+
44
+ set_float_default_nan_pattern(0b01000000, &qsf);
61
+static inline uint32_t do_vhadd_u(uint32_t n, uint32_t m)
45
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
62
+{
46
63
+ return ((uint64_t)n + m) >> 1;
47
genCases_setLevel(test_level);
64
+}
65
+
66
+static inline int32_t do_vhadd_s(int32_t n, int32_t m)
67
+{
68
+ return ((int64_t)n + m) >> 1;
69
+}
70
+
71
+static inline uint32_t do_vhsub_u(uint32_t n, uint32_t m)
72
+{
73
+ return ((uint64_t)n - m) >> 1;
74
+}
75
+
76
+static inline int32_t do_vhsub_s(int32_t n, int32_t m)
77
+{
78
+ return ((int64_t)n - m) >> 1;
79
+}
80
+
81
+DO_2OP_S(vhadds, do_vhadd_s)
82
+DO_2OP_U(vhaddu, do_vhadd_u)
83
+DO_2OP_S(vhsubs, do_vhsub_s)
84
+DO_2OP_U(vhsubu, do_vhsub_u)
85
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
86
index XXXXXXX..XXXXXXX 100644
87
--- a/target/arm/translate-mve.c
88
+++ b/target/arm/translate-mve.c
89
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMIN_S, vmins)
90
DO_2OP(VMIN_U, vminu)
91
DO_2OP(VABD_S, vabds)
92
DO_2OP(VABD_U, vabdu)
93
+DO_2OP(VHADD_S, vhadds)
94
+DO_2OP(VHADD_U, vhaddu)
95
+DO_2OP(VHSUB_S, vhsubs)
96
+DO_2OP(VHSUB_U, vhsubu)
97
--
48
--
98
2.20.1
49
2.34.1
99
100
1
Implement the MVE VABD insn.
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
parts64_default_nan().
2
3
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-17-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-37-peter.maydell@linaro.org
6
---
7
---
7
target/arm/helper-mve.h | 7 +++++++
8
target/microblaze/cpu.c | 2 ++
8
target/arm/mve.decode | 3 +++
9
fpu/softfloat-specialize.c.inc | 3 +--
9
target/arm/mve_helper.c | 5 +++++
10
2 files changed, 3 insertions(+), 2 deletions(-)
10
target/arm/translate-mve.c | 2 ++
11
4 files changed, 17 insertions(+)
12
11
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
14
--- a/target/microblaze/cpu.c
16
+++ b/target/arm/helper-mve.h
15
+++ b/target/microblaze/cpu.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vminsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_reset_hold(Object *obj, ResetType type)
18
DEF_HELPER_FLAGS_4(mve_vminub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
* this architecture.
19
DEF_HELPER_FLAGS_4(mve_vminuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
*/
20
DEF_HELPER_FLAGS_4(mve_vminuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
21
+
20
+ /* Default NaN: sign bit set, most significant frac bit set */
22
+DEF_HELPER_FLAGS_4(mve_vabdsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
23
+DEF_HELPER_FLAGS_4(mve_vabdsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
24
+DEF_HELPER_FLAGS_4(mve_vabdsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
#if defined(CONFIG_USER_ONLY)
25
+DEF_HELPER_FLAGS_4(mve_vabdub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
/* start in user mode with interrupts enabled. */
26
+DEF_HELPER_FLAGS_4(mve_vabduh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
27
+DEF_HELPER_FLAGS_4(mve_vabduw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
29
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
27
--- a/fpu/softfloat-specialize.c.inc
31
+++ b/target/arm/mve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
32
@@ -XXX,XX +XXX,XX @@ VMAX_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
33
VMIN_S 111 0 1111 0 . .. ... 0 ... 0 0110 . 1 . 1 ... 0 @2op
30
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
34
VMIN_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 1 ... 0 @2op
31
/* Sign bit clear, all frac bits set */
35
32
dnan_pattern = 0b01111111;
36
+VABD_S 111 0 1111 0 . .. ... 0 ... 0 0111 . 1 . 0 ... 0 @2op
33
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
37
+VABD_U 111 1 1111 0 . .. ... 0 ... 0 0111 . 1 . 0 ... 0 @2op
34
- || defined(TARGET_MICROBLAZE)
38
+
35
+#elif defined(TARGET_I386) || defined(TARGET_X86_64)
39
# Vector miscellaneous
36
/* Sign bit set, most significant frac bit set */
40
37
dnan_pattern = 0b11000000;
41
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
38
#elif defined(TARGET_HPPA)
42
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/mve_helper.c
45
+++ b/target/arm/mve_helper.c
46
@@ -XXX,XX +XXX,XX @@ DO_2OP_S(vmaxs, DO_MAX)
47
DO_2OP_U(vmaxu, DO_MAX)
48
DO_2OP_S(vmins, DO_MIN)
49
DO_2OP_U(vminu, DO_MIN)
50
+
51
+#define DO_ABD(N, M) ((N) >= (M) ? (N) - (M) : (M) - (N))
52
+
53
+DO_2OP_S(vabds, DO_ABD)
54
+DO_2OP_U(vabdu, DO_ABD)
55
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/translate-mve.c
58
+++ b/target/arm/translate-mve.c
59
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMAX_S, vmaxs)
60
DO_2OP(VMAX_U, vmaxu)
61
DO_2OP(VMIN_S, vmins)
62
DO_2OP(VMIN_U, vminu)
63
+DO_2OP(VABD_S, vabds)
64
+DO_2OP(VABD_U, vabdu)
65
--
39
--
66
2.20.1
40
2.34.1
67
68
1
Implement the MVE VMAX and VMIN insns.
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
parts64_default_nan().
2
3
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-16-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-38-peter.maydell@linaro.org
6
---
7
---
7
target/arm/helper-mve.h | 14 ++++++++++++++
8
target/i386/tcg/fpu_helper.c | 4 ++++
8
target/arm/mve.decode | 5 +++++
9
fpu/softfloat-specialize.c.inc | 3 ---
9
target/arm/mve_helper.c | 14 ++++++++++++++
10
2 files changed, 4 insertions(+), 3 deletions(-)
10
target/arm/translate-mve.c | 4 ++++
11
4 files changed, 37 insertions(+)
12
11
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
14
--- a/target/i386/tcg/fpu_helper.c
16
+++ b/target/arm/helper-mve.h
15
+++ b/target/i386/tcg/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrmulhsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
18
DEF_HELPER_FLAGS_4(mve_vrmulhub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
*/
19
DEF_HELPER_FLAGS_4(mve_vrmulhuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
20
DEF_HELPER_FLAGS_4(mve_vrmulhuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
21
+
20
+ /* Default NaN: sign bit set, most significant frac bit set */
22
+DEF_HELPER_FLAGS_4(mve_vmaxsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
23
+DEF_HELPER_FLAGS_4(mve_vmaxsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
+ set_float_default_nan_pattern(0b11000000, &env->mmx_status);
24
+DEF_HELPER_FLAGS_4(mve_vmaxsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
+ set_float_default_nan_pattern(0b11000000, &env->sse_status);
25
+DEF_HELPER_FLAGS_4(mve_vmaxub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
}
26
+DEF_HELPER_FLAGS_4(mve_vmaxuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
27
+DEF_HELPER_FLAGS_4(mve_vmaxuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
static inline uint8_t save_exception_flags(CPUX86State *env)
28
+
27
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
+DEF_HELPER_FLAGS_4(mve_vminsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
+DEF_HELPER_FLAGS_4(mve_vminsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
+DEF_HELPER_FLAGS_4(mve_vminsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
+DEF_HELPER_FLAGS_4(mve_vminub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
+DEF_HELPER_FLAGS_4(mve_vminuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+DEF_HELPER_FLAGS_4(mve_vminuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
35
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
36
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
37
--- a/target/arm/mve.decode
29
--- a/fpu/softfloat-specialize.c.inc
38
+++ b/target/arm/mve.decode
30
+++ b/fpu/softfloat-specialize.c.inc
39
@@ -XXX,XX +XXX,XX @@ VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
31
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
40
VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
32
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
41
VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
33
/* Sign bit clear, all frac bits set */
42
34
dnan_pattern = 0b01111111;
43
+VMAX_S 111 0 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
35
-#elif defined(TARGET_I386) || defined(TARGET_X86_64)
44
+VMAX_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
36
- /* Sign bit set, most significant frac bit set */
45
+VMIN_S 111 0 1111 0 . .. ... 0 ... 0 0110 . 1 . 1 ... 0 @2op
37
- dnan_pattern = 0b11000000;
46
+VMIN_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 1 ... 0 @2op
38
#elif defined(TARGET_HPPA)
47
+
39
/* Sign bit clear, msb-1 frac bit set */
48
# Vector miscellaneous
40
dnan_pattern = 0b00100000;
49
50
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
51
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/target/arm/mve_helper.c
54
+++ b/target/arm/mve_helper.c
55
@@ -XXX,XX +XXX,XX @@ DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
56
DO_2OP(OP##h, 2, uint16_t, FN) \
57
DO_2OP(OP##w, 4, uint32_t, FN)
58
59
+/* provide signed 2-op helpers for all sizes */
60
+#define DO_2OP_S(OP, FN) \
61
+ DO_2OP(OP##b, 1, int8_t, FN) \
62
+ DO_2OP(OP##h, 2, int16_t, FN) \
63
+ DO_2OP(OP##w, 4, int32_t, FN)
64
+
65
#define DO_AND(N, M) ((N) & (M))
66
#define DO_BIC(N, M) ((N) & ~(M))
67
#define DO_ORR(N, M) ((N) | (M))
68
@@ -XXX,XX +XXX,XX @@ DO_2OP(vrmulhsw, 4, int32_t, do_rmulh_w)
69
DO_2OP(vrmulhub, 1, uint8_t, do_rmulh_b)
70
DO_2OP(vrmulhuh, 2, uint16_t, do_rmulh_h)
71
DO_2OP(vrmulhuw, 4, uint32_t, do_rmulh_w)
72
+
73
+#define DO_MAX(N, M) ((N) >= (M) ? (N) : (M))
74
+#define DO_MIN(N, M) ((N) >= (M) ? (M) : (N))
75
+
76
+DO_2OP_S(vmaxs, DO_MAX)
77
+DO_2OP_U(vmaxu, DO_MAX)
78
+DO_2OP_S(vmins, DO_MIN)
79
+DO_2OP_U(vminu, DO_MIN)
80
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
81
index XXXXXXX..XXXXXXX 100644
82
--- a/target/arm/translate-mve.c
83
+++ b/target/arm/translate-mve.c
84
@@ -XXX,XX +XXX,XX @@ DO_2OP(VMULH_S, vmulhs)
85
DO_2OP(VMULH_U, vmulhu)
86
DO_2OP(VRMULH_S, vrmulhs)
87
DO_2OP(VRMULH_U, vrmulhu)
88
+DO_2OP(VMAX_S, vmaxs)
89
+DO_2OP(VMAX_U, vmaxu)
90
+DO_2OP(VMIN_S, vmins)
91
+DO_2OP(VMIN_U, vminu)
92
--
41
--
93
2.20.1
42
2.34.1
94
95
1
Implement the MVE VMULH insn, which performs a vector
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
multiply and returns the high half of the result.
2
parts64_default_nan().
3
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-14-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-39-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-mve.h | 7 +++++++
8
target/hppa/fpu_helper.c | 2 ++
9
target/arm/mve.decode | 3 +++
9
fpu/softfloat-specialize.c.inc | 3 ---
10
target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++
10
2 files changed, 2 insertions(+), 3 deletions(-)
11
target/arm/translate-mve.c | 2 ++
12
4 files changed, 38 insertions(+)
13
11
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
14
--- a/target/hppa/fpu_helper.c
17
+++ b/target/arm/helper-mve.h
15
+++ b/target/hppa/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vsubw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
19
DEF_HELPER_FLAGS_4(mve_vmulb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
20
DEF_HELPER_FLAGS_4(mve_vmulh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
/* For inf * 0 + NaN, return the input NaN */
21
DEF_HELPER_FLAGS_4(mve_vmulw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
+
20
+ /* Default NaN: sign bit clear, msb-1 frac bit set */
23
+DEF_HELPER_FLAGS_4(mve_vmulhsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
+ set_float_default_nan_pattern(0b00100000, &env->fp_status);
24
+DEF_HELPER_FLAGS_4(mve_vmulhsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
}
25
+DEF_HELPER_FLAGS_4(mve_vmulhsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
26
+DEF_HELPER_FLAGS_4(mve_vmulhub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
void cpu_hppa_loaded_fr0(CPUHPPAState *env)
27
+DEF_HELPER_FLAGS_4(mve_vmulhuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
28
+DEF_HELPER_FLAGS_4(mve_vmulhuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
30
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/mve.decode
27
--- a/fpu/softfloat-specialize.c.inc
32
+++ b/target/arm/mve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
33
@@ -XXX,XX +XXX,XX @@ VADD 1110 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
34
VSUB 1111 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
30
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
35
VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
31
/* Sign bit clear, all frac bits set */
36
32
dnan_pattern = 0b01111111;
37
+VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
33
-#elif defined(TARGET_HPPA)
38
+VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
34
- /* Sign bit clear, msb-1 frac bit set */
39
+
35
- dnan_pattern = 0b00100000;
40
# Vector miscellaneous
36
#elif defined(TARGET_HEXAGON)
41
37
/* Sign bit set, all frac bits set. */
42
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
38
dnan_pattern = 0b11111111;
43
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/target/arm/mve_helper.c
46
+++ b/target/arm/mve_helper.c
47
@@ -XXX,XX +XXX,XX @@ DO_2OP(veor, 8, uint64_t, DO_EOR)
48
DO_2OP_U(vadd, DO_ADD)
49
DO_2OP_U(vsub, DO_SUB)
50
DO_2OP_U(vmul, DO_MUL)
51
+
52
+/*
53
+ * Because the computation type is at least twice as large as required,
54
+ * these work for both signed and unsigned source types.
55
+ */
56
+static inline uint8_t do_mulh_b(int32_t n, int32_t m)
57
+{
58
+ return (n * m) >> 8;
59
+}
60
+
61
+static inline uint16_t do_mulh_h(int32_t n, int32_t m)
62
+{
63
+ return (n * m) >> 16;
64
+}
65
+
66
+static inline uint32_t do_mulh_w(int64_t n, int64_t m)
67
+{
68
+ return (n * m) >> 32;
69
+}
70
+
71
+DO_2OP(vmulhsb, 1, int8_t, do_mulh_b)
72
+DO_2OP(vmulhsh, 2, int16_t, do_mulh_h)
73
+DO_2OP(vmulhsw, 4, int32_t, do_mulh_w)
74
+DO_2OP(vmulhub, 1, uint8_t, do_mulh_b)
75
+DO_2OP(vmulhuh, 2, uint16_t, do_mulh_h)
76
+DO_2OP(vmulhuw, 4, uint32_t, do_mulh_w)
77
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
78
index XXXXXXX..XXXXXXX 100644
79
--- a/target/arm/translate-mve.c
80
+++ b/target/arm/translate-mve.c
81
@@ -XXX,XX +XXX,XX @@ DO_LOGIC(VEOR, gen_helper_mve_veor)
82
DO_2OP(VADD, vadd)
83
DO_2OP(VSUB, vsub)
84
DO_2OP(VMUL, vmul)
85
+DO_2OP(VMULH_S, vmulhs)
86
+DO_2OP(VMULH_U, vmulhu)
87
--
39
--
88
2.20.1
40
2.34.1
89
90
diff view generated by jsdifflib
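The VMULH helpers in the patch above return the high half of each lane's product, and rely on the computation type being at least twice as wide as the element. A minimal standalone sketch of that idea for 8-bit lanes follows. The names `mulh_b`, `mulh_s8` and `mulh_u8` are hypothetical and only illustrate the technique; they are not the QEMU helpers. The sketch assumes the compiler implements `>>` on negative signed values as an arithmetic shift, which GCC and Clang do.

```c
#include <stdint.h>

/*
 * Sketch only: high 8 bits of an 8x8-bit product. The operands have
 * already been widened to int32_t, so the full 16-bit product is exact
 * whether the caller's element type was signed or unsigned.
 * Assumption: >> on a negative value is an arithmetic shift.
 */
static inline uint8_t mulh_b(int32_t n, int32_t m)
{
    return (uint8_t)((n * m) >> 8);
}

/* Hypothetical wrappers: the argument type decides sign vs zero extension. */
static inline int8_t mulh_s8(int8_t n, int8_t m)
{
    return (int8_t)mulh_b(n, m);
}

static inline uint8_t mulh_u8(uint8_t n, uint8_t m)
{
    return mulh_b(n, m);
}
```

The same body serves both signedness variants because the widening conversion at the call site, not the helper, decides how the element is extended.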
Implement the MVE vector logical operations operating
on two registers.

Set the default NaN pattern explicitly for the alpha target.
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-12-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-40-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-mve.h | 6 ++++++
7
target/alpha/cpu.c | 2 ++
9
target/arm/mve.decode | 9 +++++++++
8
1 file changed, 2 insertions(+)
10
target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++
11
target/arm/translate-mve.c | 37 +++++++++++++++++++++++++++++++++++++
12
4 files changed, 78 insertions(+)
13
9
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
12
--- a/target/alpha/cpu.c
17
+++ b/target/arm/helper-mve.h
13
+++ b/target/alpha/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
14
@@ -XXX,XX +XXX,XX @@ static void alpha_cpu_initfn(Object *obj)
19
DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr)
15
* operand in Fa. That is float_2nan_prop_ba.
20
DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
16
*/
21
DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr)
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
22
+
18
+ /* Default NaN: sign bit clear, msb frac bit set */
23
+DEF_HELPER_FLAGS_4(mve_vand, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
+DEF_HELPER_FLAGS_4(mve_vbic, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
#if defined(CONFIG_USER_ONLY)
25
+DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
env->flags = ENV_FLAG_PS_USER | ENV_FLAG_FEN;
26
+DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
22
cpu_alpha_store_fpcr(env, (uint64_t)(FPCR_INVD | FPCR_DZED | FPCR_OVFD
27
+DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
31
+++ b/target/arm/mve.decode
32
@@ -XXX,XX +XXX,XX @@
33
34
&vldr_vstr rn qd imm p a w size l u
35
&1op qd qm size
36
+&2op qd qm qn size
37
38
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
39
# Note that both Rn and Qd are 3 bits only (no D bit)
40
@@ -XXX,XX +XXX,XX @@
41
42
@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
43
@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
44
+@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
45
46
# Vector loads and stores
47
48
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \
49
VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
50
size=2 p=1
51
52
+# Vector 2-op
53
+VAND 1110 1111 0 . 00 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
54
+VBIC 1110 1111 0 . 01 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
55
+VORR 1110 1111 0 . 10 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
56
+VORN 1110 1111 0 . 11 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
57
+VEOR 1111 1111 0 . 00 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
58
+
59
# Vector miscellaneous
60
61
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
62
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
63
index XXXXXXX..XXXXXXX 100644
64
--- a/target/arm/mve_helper.c
65
+++ b/target/arm/mve_helper.c
66
@@ -XXX,XX +XXX,XX @@ DO_1OP(vnegw, 4, int32_t, DO_NEG)
67
/* We can do these 64 bits at a time */
68
DO_1OP(vfnegh, 8, uint64_t, DO_FNEGH)
69
DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
70
+
71
+#define DO_2OP(OP, ESIZE, TYPE, FN) \
72
+ void HELPER(glue(mve_, OP))(CPUARMState *env, \
73
+ void *vd, void *vn, void *vm) \
74
+ { \
75
+ TYPE *d = vd, *n = vn, *m = vm; \
76
+ uint16_t mask = mve_element_mask(env); \
77
+ unsigned e; \
78
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
79
+ mergemask(&d[H##ESIZE(e)], \
80
+ FN(n[H##ESIZE(e)], m[H##ESIZE(e)]), mask); \
81
+ } \
82
+ mve_advance_vpt(env); \
83
+ }
84
+
85
+#define DO_AND(N, M) ((N) & (M))
86
+#define DO_BIC(N, M) ((N) & ~(M))
87
+#define DO_ORR(N, M) ((N) | (M))
88
+#define DO_ORN(N, M) ((N) | ~(M))
89
+#define DO_EOR(N, M) ((N) ^ (M))
90
+
91
+DO_2OP(vand, 8, uint64_t, DO_AND)
92
+DO_2OP(vbic, 8, uint64_t, DO_BIC)
93
+DO_2OP(vorr, 8, uint64_t, DO_ORR)
94
+DO_2OP(vorn, 8, uint64_t, DO_ORN)
95
+DO_2OP(veor, 8, uint64_t, DO_EOR)
96
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
97
index XXXXXXX..XXXXXXX 100644
98
--- a/target/arm/translate-mve.c
99
+++ b/target/arm/translate-mve.c
100
@@ -XXX,XX +XXX,XX @@
101
102
typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
103
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
104
+typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
105
106
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
107
static inline long mve_qreg_offset(unsigned reg)
108
@@ -XXX,XX +XXX,XX @@ static bool trans_VNEG_fp(DisasContext *s, arg_1op *a)
109
}
110
return do_1op(s, a, fns[a->size]);
111
}
112
+
113
+static bool do_2op(DisasContext *s, arg_2op *a, MVEGenTwoOpFn fn)
114
+{
115
+ TCGv_ptr qd, qn, qm;
116
+
117
+ if (!dc_isar_feature(aa32_mve, s) ||
118
+ !mve_check_qreg_bank(s, a->qd | a->qn | a->qm) ||
119
+ !fn) {
120
+ return false;
121
+ }
122
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
123
+ return true;
124
+ }
125
+
126
+ qd = mve_qreg_ptr(a->qd);
127
+ qn = mve_qreg_ptr(a->qn);
128
+ qm = mve_qreg_ptr(a->qm);
129
+ fn(cpu_env, qd, qn, qm);
130
+ tcg_temp_free_ptr(qd);
131
+ tcg_temp_free_ptr(qn);
132
+ tcg_temp_free_ptr(qm);
133
+ mve_update_eci(s);
134
+ return true;
135
+}
136
+
137
+#define DO_LOGIC(INSN, HELPER) \
138
+ static bool trans_##INSN(DisasContext *s, arg_2op *a) \
139
+ { \
140
+ return do_2op(s, a, HELPER); \
141
+ }
142
+
143
+DO_LOGIC(VAND, gen_helper_mve_vand)
144
+DO_LOGIC(VBIC, gen_helper_mve_vbic)
145
+DO_LOGIC(VORR, gen_helper_mve_vorr)
146
+DO_LOGIC(VORN, gen_helper_mve_vorn)
147
+DO_LOGIC(VEOR, gen_helper_mve_veor)
148
--
23
--
149
2.20.1
24
2.34.1
150
151
diff view generated by jsdifflib
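The DO_2OP macro in the patch above walks the vector one element at a time, shifts the predicate mask by ESIZE bits per element, and lets mergemask() write only the enabled bytes. A rough standalone sketch of that pattern for VBIC on two 64-bit lanes is below. It assumes one predicate bit per byte and ignores host endianness (the H macros); `expand_byte_mask` and `vbic_predicated` are hypothetical names, not QEMU functions.

```c
#include <stdint.h>

/* Assumption: bit i of the mask enables byte i of a 64-bit lane. */
static uint64_t expand_byte_mask(uint8_t mask)
{
    uint64_t bytes = 0;
    for (int i = 0; i < 8; i++) {
        if (mask & (1u << i)) {
            bytes |= (uint64_t)0xff << (8 * i);
        }
    }
    return bytes;
}

/* Hypothetical predicated VBIC over a 128-bit vector held as two lanes. */
static void vbic_predicated(uint64_t d[2], const uint64_t n[2],
                            const uint64_t m[2], uint16_t mask)
{
    for (unsigned e = 0; e < 2; e++, mask >>= 8) {
        uint64_t enabled = expand_byte_mask((uint8_t)mask);
        uint64_t r = n[e] & ~m[e];                /* same as DO_BIC */
        d[e] = (d[e] & ~enabled) | (r & enabled); /* keep disabled bytes */
    }
}
```

Bytes whose predicate bit is clear keep their old destination value, which is why the merge reads `d` before writing it.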
In the code for handling VFP system register accesses there is some
stray whitespace after a unary '-' operator, and also some incorrect
indent in a couple of function prototypes. We're about to move this
code to another file, so fix the code style issues first so
checkpatch doesn't complain about the code-movement patch.

Set the default NaN pattern explicitly for the arm target.
This includes setting it for the old linux-user nwfpe emulation.
For nwfpe, our default doesn't match the real kernel, but we
avoid making a behaviour change in this commit.
7
Cc: qemu-stable@nongnu.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20210618141019.10671-2-peter.maydell@linaro.org
8
Message-id: 20241202131347.498124-41-peter.maydell@linaro.org
11
---
9
---
12
target/arm/translate-vfp.c | 11 +++++------
10
linux-user/arm/nwfpe/fpa11.c | 5 +++++
13
1 file changed, 5 insertions(+), 6 deletions(-)
11
target/arm/cpu.c | 2 ++
12
2 files changed, 7 insertions(+)
14
13
15
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
14
diff --git a/linux-user/arm/nwfpe/fpa11.c b/linux-user/arm/nwfpe/fpa11.c
16
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-vfp.c
16
--- a/linux-user/arm/nwfpe/fpa11.c
18
+++ b/target/arm/translate-vfp.c
17
+++ b/linux-user/arm/nwfpe/fpa11.c
19
@@ -XXX,XX +XXX,XX @@ static void gen_branch_fpInactive(DisasContext *s, TCGCond cond,
18
@@ -XXX,XX +XXX,XX @@ void resetFPA11(void)
19
* this late date.
20
*/
21
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &fpa11->fp_status);
22
+ /*
23
+ * Use the same default NaN value as Arm VFP. This doesn't match
24
+ * the Linux kernel's nwfpe emulation, which uses an all-1s value.
25
+ */
26
+ set_float_default_nan_pattern(0b01000000, &fpa11->fp_status);
20
}
27
}
21
28
22
static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
29
void SetRoundingMode(const unsigned int opcode)
23
-
30
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
24
fp_sysreg_loadfn *loadfn,
31
index XXXXXXX..XXXXXXX 100644
25
- void *opaque)
32
--- a/target/arm/cpu.c
26
+ void *opaque)
33
+++ b/target/arm/cpu.c
34
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
35
* the pseudocode function the arguments are in the order c, a, b.
36
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
37
* and the input NaN if it is signalling
38
+ * * Default NaN has sign bit clear, msb frac bit set
39
*/
40
static void arm_set_default_fp_behaviours(float_status *s)
27
{
41
{
28
/* Do a write to an M-profile floating point system register */
42
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
29
TCGv_i32 tmp;
43
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
30
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
44
set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
45
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
46
+ set_float_default_nan_pattern(0b01000000, s);
31
}
47
}
32
48
33
static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
49
static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
34
- fp_sysreg_storefn *storefn,
35
- void *opaque)
36
+ fp_sysreg_storefn *storefn,
37
+ void *opaque)
38
{
39
/* Do a read from an M-profile floating point system register */
40
TCGv_i32 tmp;
41
@@ -XXX,XX +XXX,XX @@ static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
42
TCGv_i32 addr;
43
44
if (!a->a) {
45
- offset = - offset;
46
+ offset = -offset;
47
}
48
49
addr = load_reg(s, a->rn);
50
@@ -XXX,XX +XXX,XX @@ static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque)
51
TCGv_i32 value = tcg_temp_new_i32();
52
53
if (!a->a) {
54
- offset = - offset;
55
+ offset = -offset;
56
}
57
58
addr = load_reg(s, a->rn);
59
--
50
--
60
2.20.1
51
2.34.1
61
62
diff view generated by jsdifflib
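The 8-bit values passed to set_float_default_nan_pattern() in these patches are described by their comments as a sign bit followed by the top fraction bits. A sketch of how such a pattern could expand to a float32 bit pattern follows. It assumes bit 7 is the sign, bits 6..0 are the top seven fraction bits, and bit 0 is copied into the remaining low fraction bits. That last point is inferred from the "all frac bits set" comments rather than stated in this section. `default_nan_f32` is a hypothetical name, not a softfloat function.

```c
#include <stdint.h>

/*
 * Sketch: expand an 8-bit default-NaN pattern into float32 bits.
 * Assumed encoding: bit 7 is the sign, bits 6..0 are the top seven
 * fraction bits, and bit 0 is replicated into the low 16 fraction
 * bits. The exponent field is all ones, as for any NaN.
 */
static uint32_t default_nan_f32(uint8_t pattern)
{
    uint32_t sign = (uint32_t)(pattern >> 7) << 31;
    uint32_t exp = (uint32_t)0xff << 23;
    uint32_t frac_hi = (uint32_t)(pattern & 0x7f) << 16;
    uint32_t frac_lo = (pattern & 1) ? 0xffffu : 0;
    return sign | exp | frac_hi | frac_lo;
}
```

Under these assumptions the Arm pattern 0b01000000 gives 0x7fc00000, and the all-1s value mentioned for the kernel's nwfpe would correspond to pattern 0b11111111.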
Implement the MVE VQDMLSDH and VQRDMLSDH insns, which are
like VQDMLADH and VQRDMLADH except that products are subtracted
rather than added.

Set the default NaN pattern explicitly for loongarch.
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210617121628.20116-38-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-42-peter.maydell@linaro.org
8
---
6
---
9
target/arm/helper-mve.h | 16 ++++++++++++++
7
target/loongarch/tcg/fpu_helper.c | 2 ++
10
target/arm/mve.decode | 5 +++++
8
1 file changed, 2 insertions(+)
11
target/arm/mve_helper.c | 44 ++++++++++++++++++++++++++++++++++++++
12
target/arm/translate-mve.c | 4 ++++
13
4 files changed, 69 insertions(+)
14
9
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
16
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
12
--- a/target/loongarch/tcg/fpu_helper.c
18
+++ b/target/arm/helper-mve.h
13
+++ b/target/loongarch/tcg/fpu_helper.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrdmladhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
14
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
20
DEF_HELPER_FLAGS_4(mve_vqrdmladhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
15
*/
21
DEF_HELPER_FLAGS_4(mve_vqrdmladhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
17
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
23
+DEF_HELPER_FLAGS_4(mve_vqdmlsdhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
+ /* Default NaN: sign bit clear, msb frac bit set */
24
+DEF_HELPER_FLAGS_4(mve_vqdmlsdhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
25
+DEF_HELPER_FLAGS_4(mve_vqdmlsdhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
+
27
+DEF_HELPER_FLAGS_4(mve_vqdmlsdhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+DEF_HELPER_FLAGS_4(mve_vqdmlsdhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
+DEF_HELPER_FLAGS_4(mve_vqdmlsdhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
30
+
31
+DEF_HELPER_FLAGS_4(mve_vqrdmlsdhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
+DEF_HELPER_FLAGS_4(mve_vqrdmlsdhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
+DEF_HELPER_FLAGS_4(mve_vqrdmlsdhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
34
+
35
+DEF_HELPER_FLAGS_4(mve_vqrdmlsdhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
36
+DEF_HELPER_FLAGS_4(mve_vqrdmlsdhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
37
+DEF_HELPER_FLAGS_4(mve_vqrdmlsdhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
38
+
39
DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
42
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/mve.decode
45
+++ b/target/arm/mve.decode
46
@@ -XXX,XX +XXX,XX @@ VQDMLADHX 1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op
47
VQRDMLADH 1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op
48
VQRDMLADHX 1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op
49
50
+VQDMLSDH 1111 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op
51
+VQDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op
52
+VQRDMLSDH 1111 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op
53
+VQRDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op
54
+
55
# Vector miscellaneous
56
57
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
58
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/mve_helper.c
61
+++ b/target/arm/mve_helper.c
62
@@ -XXX,XX +XXX,XX @@ static int32_t do_vqdmladh_w(int32_t a, int32_t b, int32_t c, int32_t d,
63
return r >> 32;
64
}
20
}
65
21
66
+static int8_t do_vqdmlsdh_b(int8_t a, int8_t b, int8_t c, int8_t d,
22
int ieee_ex_to_loongarch(int xcpt)
67
+ int round, bool *sat)
68
+{
69
+ int64_t r = ((int64_t)a * b - (int64_t)c * d) * 2 + (round << 7);
70
+ return do_sat_bhw(r, INT16_MIN, INT16_MAX, sat) >> 8;
71
+}
72
+
73
+static int16_t do_vqdmlsdh_h(int16_t a, int16_t b, int16_t c, int16_t d,
74
+ int round, bool *sat)
75
+{
76
+ int64_t r = ((int64_t)a * b - (int64_t)c * d) * 2 + (round << 15);
77
+ return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat) >> 16;
78
+}
79
+
80
+static int32_t do_vqdmlsdh_w(int32_t a, int32_t b, int32_t c, int32_t d,
81
+ int round, bool *sat)
82
+{
83
+ int64_t m1 = (int64_t)a * b;
84
+ int64_t m2 = (int64_t)c * d;
85
+ int64_t r;
86
+ /* The same ordering issue as in do_vqdmladh_w applies here too */
87
+ if (ssub64_overflow(m1, m2, &r) ||
88
+ sadd64_overflow(r, (round << 30), &r) ||
89
+ sadd64_overflow(r, r, &r)) {
90
+ *sat = true;
91
+ return r < 0 ? INT32_MAX : INT32_MIN;
92
+ }
93
+ return r >> 32;
94
+}
95
+
96
DO_VQDMLADH_OP(vqdmladhb, 1, int8_t, 0, 0, do_vqdmladh_b)
97
DO_VQDMLADH_OP(vqdmladhh, 2, int16_t, 0, 0, do_vqdmladh_h)
98
DO_VQDMLADH_OP(vqdmladhw, 4, int32_t, 0, 0, do_vqdmladh_w)
99
@@ -XXX,XX +XXX,XX @@ DO_VQDMLADH_OP(vqrdmladhxb, 1, int8_t, 1, 1, do_vqdmladh_b)
100
DO_VQDMLADH_OP(vqrdmladhxh, 2, int16_t, 1, 1, do_vqdmladh_h)
101
DO_VQDMLADH_OP(vqrdmladhxw, 4, int32_t, 1, 1, do_vqdmladh_w)
102
103
+DO_VQDMLADH_OP(vqdmlsdhb, 1, int8_t, 0, 0, do_vqdmlsdh_b)
104
+DO_VQDMLADH_OP(vqdmlsdhh, 2, int16_t, 0, 0, do_vqdmlsdh_h)
105
+DO_VQDMLADH_OP(vqdmlsdhw, 4, int32_t, 0, 0, do_vqdmlsdh_w)
106
+DO_VQDMLADH_OP(vqdmlsdhxb, 1, int8_t, 1, 0, do_vqdmlsdh_b)
107
+DO_VQDMLADH_OP(vqdmlsdhxh, 2, int16_t, 1, 0, do_vqdmlsdh_h)
108
+DO_VQDMLADH_OP(vqdmlsdhxw, 4, int32_t, 1, 0, do_vqdmlsdh_w)
109
+
110
+DO_VQDMLADH_OP(vqrdmlsdhb, 1, int8_t, 0, 1, do_vqdmlsdh_b)
111
+DO_VQDMLADH_OP(vqrdmlsdhh, 2, int16_t, 0, 1, do_vqdmlsdh_h)
112
+DO_VQDMLADH_OP(vqrdmlsdhw, 4, int32_t, 0, 1, do_vqdmlsdh_w)
113
+DO_VQDMLADH_OP(vqrdmlsdhxb, 1, int8_t, 1, 1, do_vqdmlsdh_b)
114
+DO_VQDMLADH_OP(vqrdmlsdhxh, 2, int16_t, 1, 1, do_vqdmlsdh_h)
115
+DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w)
116
+
117
#define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN) \
118
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \
119
uint32_t rm) \
120
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
121
index XXXXXXX..XXXXXXX 100644
122
--- a/target/arm/translate-mve.c
123
+++ b/target/arm/translate-mve.c
124
@@ -XXX,XX +XXX,XX @@ DO_2OP(VQDMLADH, vqdmladh)
125
DO_2OP(VQDMLADHX, vqdmladhx)
126
DO_2OP(VQRDMLADH, vqrdmladh)
127
DO_2OP(VQRDMLADHX, vqrdmladhx)
128
+DO_2OP(VQDMLSDH, vqdmlsdh)
129
+DO_2OP(VQDMLSDHX, vqdmlsdhx)
130
+DO_2OP(VQRDMLSDH, vqrdmlsdh)
131
+DO_2OP(VQRDMLSDHX, vqrdmlsdhx)
132
133
static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
134
MVEGenTwoOpScalarFn fn)
135
--
23
--
136
2.20.1
24
2.34.1
137
138
diff view generated by jsdifflib
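The halfword VQDMLSDH helper above subtracts one product from the other, doubles the result, optionally adds a rounding constant, saturates to 32 bits and returns the high half. A standalone sketch of that arithmetic is below. `sat_range` is a hypothetical stand-in for do_sat_bhw(), whose body is not shown in this patch, and `qdmlsdh_h` is likewise an illustrative name. The sketch assumes `>>` on negative values is an arithmetic shift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for do_sat_bhw(): clamp and record saturation. */
static int64_t sat_range(int64_t v, int64_t lo, int64_t hi, bool *sat)
{
    if (v < lo) {
        *sat = true;
        return lo;
    }
    if (v > hi) {
        *sat = true;
        return hi;
    }
    return v;
}

/* Sketch of the halfword case: (a*b - c*d) * 2 (+ rounding), high half. */
static int16_t qdmlsdh_h(int16_t a, int16_t b, int16_t c, int16_t d,
                         int round, bool *sat)
{
    int64_t r = ((int64_t)a * b - (int64_t)c * d) * 2
                + ((int64_t)round << 15);
    return (int16_t)(sat_range(r, INT32_MIN, INT32_MAX, sat) >> 16);
}
```

Doing the arithmetic in 64 bits means the halfword case cannot overflow before the clamp. The word-sized helper in the patch has no wider type available, so it uses explicit overflow checks instead.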
Implement the MVE VDUP insn, which duplicates a value from
a general-purpose register into every lane of a vector
register (subject to predication).

Set the default NaN pattern explicitly for m68k.
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210617121628.20116-11-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-43-peter.maydell@linaro.org
8
---
6
---
9
target/arm/helper-mve.h | 2 ++
7
target/m68k/cpu.c | 2 ++
10
target/arm/mve.decode | 10 ++++++++++
8
fpu/softfloat-specialize.c.inc | 2 +-
11
target/arm/mve_helper.c | 16 ++++++++++++++++
9
2 files changed, 3 insertions(+), 1 deletion(-)
12
target/arm/translate-mve.c | 27 +++++++++++++++++++++++++++
13
4 files changed, 55 insertions(+)
14
10
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
11
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
16
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
13
--- a/target/m68k/cpu.c
18
+++ b/target/arm/helper-mve.h
14
+++ b/target/m68k/cpu.c
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32)
15
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
20
DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32)
16
* preceding paragraph for nonsignaling NaNs.
21
DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32)
17
*/
22
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
23
+DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32)
19
+ /* Default NaN: sign bit clear, all frac bits set */
24
+
20
+ set_float_default_nan_pattern(0b01111111, &env->fp_status);
25
DEF_HELPER_FLAGS_3(mve_vclsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
26
DEF_HELPER_FLAGS_3(mve_vclsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
nan = floatx80_default_nan(&env->fp_status);
27
DEF_HELPER_FLAGS_3(mve_vclsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
23
for (i = 0; i < 8; i++) {
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
26
--- a/fpu/softfloat-specialize.c.inc
31
+++ b/target/arm/mve.decode
27
+++ b/fpu/softfloat-specialize.c.inc
32
@@ -XXX,XX +XXX,XX @@
28
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
33
29
uint8_t dnan_pattern = status->default_nan_pattern;
34
%qd 22:1 13:3
30
35
%qm 5:1 1:3
31
if (dnan_pattern == 0) {
36
+%qn 7:1 17:3
32
-#if defined(TARGET_SPARC) || defined(TARGET_M68K)
37
33
+#if defined(TARGET_SPARC)
38
&vldr_vstr rn qd imm p a w size l u
34
/* Sign bit clear, all frac bits set */
39
&1op qd qm size
35
dnan_pattern = 0b01111111;
40
@@ -XXX,XX +XXX,XX @@ VABS 1111 1111 1 . 11 .. 01 ... 0 0011 01 . 0 ... 0 @1op
36
#elif defined(TARGET_HEXAGON)
41
VABS_fp 1111 1111 1 . 11 .. 01 ... 0 0111 01 . 0 ... 0 @1op
42
VNEG 1111 1111 1 . 11 .. 01 ... 0 0011 11 . 0 ... 0 @1op
43
VNEG_fp 1111 1111 1 . 11 .. 01 ... 0 0111 11 . 0 ... 0 @1op
44
+
45
+&vdup qd rt size
46
+# Qd is in the fields usually named Qn
47
+@vdup .... .... . . .. ... . rt:4 .... . . . . .... qd=%qn &vdup
48
+
49
+# B and E bits encode size, which we decode here to the usual size values
50
+VDUP 1110 1110 1 1 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=0
51
+VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 1 1 0000 @vdup size=1
52
+VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
53
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
54
index XXXXXXX..XXXXXXX 100644
55
--- a/target/arm/mve_helper.c
56
+++ b/target/arm/mve_helper.c
57
@@ -XXX,XX +XXX,XX @@ static void mergemask_sq(int64_t *d, int64_t r, uint16_t mask)
58
uint64_t *: mergemask_uq, \
59
int64_t *: mergemask_sq)(D, R, M)
60
61
+void HELPER(mve_vdup)(CPUARMState *env, void *vd, uint32_t val)
62
+{
63
+ /*
64
+ * The generated code already replicated an 8 or 16 bit constant
65
+ * into the 32-bit value, so we only need to write the 32-bit
66
+ * value to all elements of the Qreg, allowing for predication.
67
+ */
68
+ uint32_t *d = vd;
69
+ uint16_t mask = mve_element_mask(env);
70
+ unsigned e;
71
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) {
72
+ mergemask(&d[H4(e)], val, mask);
73
+ }
74
+ mve_advance_vpt(env);
75
+}
76
+
77
#define DO_1OP(OP, ESIZE, TYPE, FN) \
78
void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
79
{ \
80
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
81
index XXXXXXX..XXXXXXX 100644
82
--- a/target/arm/translate-mve.c
83
+++ b/target/arm/translate-mve.c
84
@@ -XXX,XX +XXX,XX @@ DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h)
85
DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w)
86
DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w)
87
88
+static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
89
+{
90
+ TCGv_ptr qd;
91
+ TCGv_i32 rt;
92
+
93
+ if (!dc_isar_feature(aa32_mve, s) ||
94
+ !mve_check_qreg_bank(s, a->qd)) {
95
+ return false;
96
+ }
97
+ if (a->rt == 13 || a->rt == 15) {
98
+ /* UNPREDICTABLE; we choose to UNDEF */
99
+ return false;
100
+ }
101
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
102
+ return true;
103
+ }
104
+
105
+ qd = mve_qreg_ptr(a->qd);
106
+ rt = load_reg(s, a->rt);
107
+ tcg_gen_dup_i32(a->size, rt, rt);
108
+ gen_helper_mve_vdup(cpu_env, qd, rt);
109
+ tcg_temp_free_ptr(qd);
110
+ tcg_temp_free_i32(rt);
111
+ mve_update_eci(s);
112
+ return true;
113
+}
114
+
115
static bool do_1op(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn)
116
{
117
TCGv_ptr qd, qm;
118
--
37
--
119
2.20.1
38
2.34.1
120
121
diff view generated by jsdifflib
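In the VDUP patch above, the translator replicates the 8-bit or 16-bit source across a 32-bit value before calling the helper, so the helper only ever stores 32-bit words. A small sketch of that replication step in plain C follows. `dup32` is a hypothetical name used for illustration and is not the TCG operation itself.

```c
#include <stdint.h>

/*
 * Sketch: replicate the low element of val across 32 bits.
 * size uses the decoded values from the patch above:
 * 0 = byte, 1 = halfword, 2 = word.
 */
static uint32_t dup32(unsigned size, uint32_t val)
{
    switch (size) {
    case 0:
        return (val & 0xff) * 0x01010101u;
    case 1:
        return (val & 0xffff) * 0x00010001u;
    default:
        return val;
    }
}
```

Multiplying by a constant with one set bit per element position copies the element into every position without a loop.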
1
Implement the MVE VNEG insn (both integer and floating point forms).
1
Set the default NaN pattern explicitly for MIPS. Note that this
2
is our only target which currently changes the default NaN
3
at runtime (which it was previously doing indirectly when it
4
changed the snan_bit_is_one setting).
2
5
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-9-peter.maydell@linaro.org
8
Message-id: 20241202131347.498124-44-peter.maydell@linaro.org
6
---
9
---
7
target/arm/helper-mve.h | 6 ++++++
10
target/mips/fpu_helper.h | 7 +++++++
8
target/arm/mve.decode | 2 ++
11
target/mips/msa.c | 3 +++
9
target/arm/mve_helper.c | 12 ++++++++++++
12
2 files changed, 10 insertions(+)
10
target/arm/translate-mve.c | 15 +++++++++++++++
11
4 files changed, 35 insertions(+)
12
13
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
14
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
16
--- a/target/mips/fpu_helper.h
16
+++ b/target/arm/helper-mve.h
17
+++ b/target/mips/fpu_helper.h
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vabsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
18
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
18
DEF_HELPER_FLAGS_3(mve_vabsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
19
set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
19
DEF_HELPER_FLAGS_3(mve_vfabsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
20
nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
20
DEF_HELPER_FLAGS_3(mve_vfabss, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
21
+
22
+ /*
22
+DEF_HELPER_FLAGS_3(mve_vnegb, TCG_CALL_NO_WG, void, env, ptr, ptr)
23
+ * With nan2008, the default NaN value has the sign bit clear and the
23
+DEF_HELPER_FLAGS_3(mve_vnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
24
+ * frac msb set; with the older mode, the sign bit is clear, and all
24
+DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr)
25
+ * frac bits except the msb are set.
25
+DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr)
26
+ */
26
+DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr)
27
+ set_float_default_nan_pattern(nan2008 ? 0b01000000 : 0b00111111,
27
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
28
+ &env->active_fpu.fp_status);
29
30
}
31
32
diff --git a/target/mips/msa.c b/target/mips/msa.c
28
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/mve.decode
34
--- a/target/mips/msa.c
30
+++ b/target/arm/mve.decode
35
+++ b/target/mips/msa.c
31
@@ -XXX,XX +XXX,XX @@ VMVN 1111 1111 1 . 11 00 00 ... 0 0101 11 . 0 ... 0 @1op_nosz
36
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
32
37
/* Inf * 0 + NaN returns the input NaN */
33
VABS 1111 1111 1 . 11 .. 01 ... 0 0011 01 . 0 ... 0 @1op
38
set_float_infzeronan_rule(float_infzeronan_dnan_never,
34
VABS_fp 1111 1111 1 . 11 .. 01 ... 0 0111 01 . 0 ... 0 @1op
39
&env->active_tc.msa_fp_status);
35
+VNEG 1111 1111 1 . 11 .. 01 ... 0 0011 11 . 0 ... 0 @1op
40
+ /* Default NaN: sign bit clear, frac msb set */
36
+VNEG_fp 1111 1111 1 . 11 .. 01 ... 0 0111 11 . 0 ... 0 @1op
41
+ set_float_default_nan_pattern(0b01000000,
37
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
42
+ &env->active_tc.msa_fp_status);
38
index XXXXXXX..XXXXXXX 100644
39
--- a/target/arm/mve_helper.c
40
+++ b/target/arm/mve_helper.c
41
@@ -XXX,XX +XXX,XX @@ DO_1OP(vabsw, 4, int32_t, DO_ABS)
42
/* We can do these 64 bits at a time */
43
DO_1OP(vfabsh, 8, uint64_t, DO_FABSH)
44
DO_1OP(vfabss, 8, uint64_t, DO_FABSS)
45
+
46
+#define DO_NEG(N) (-(N))
47
+#define DO_FNEGH(N) ((N) ^ dup_const(MO_16, 0x8000))
48
+#define DO_FNEGS(N) ((N) ^ dup_const(MO_32, 0x80000000))
49
+
50
+DO_1OP(vnegb, 1, int8_t, DO_NEG)
51
+DO_1OP(vnegh, 2, int16_t, DO_NEG)
52
+DO_1OP(vnegw, 4, int32_t, DO_NEG)
53
+
54
+/* We can do these 64 bits at a time */
55
+DO_1OP(vfnegh, 8, uint64_t, DO_FNEGH)
56
+DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
57
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/translate-mve.c
60
+++ b/target/arm/translate-mve.c
61
@@ -XXX,XX +XXX,XX @@ static bool do_1op(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn)
62
DO_1OP(VCLZ, vclz)
63
DO_1OP(VCLS, vcls)
64
DO_1OP(VABS, vabs)
65
+DO_1OP(VNEG, vneg)
66
67
static bool trans_VREV16(DisasContext *s, arg_1op *a)
68
{
69
@@ -XXX,XX +XXX,XX @@ static bool trans_VABS_fp(DisasContext *s, arg_1op *a)
70
}
71
return do_1op(s, a, fns[a->size]);
72
}
43
}
73
+
74
+static bool trans_VNEG_fp(DisasContext *s, arg_1op *a)
75
+{
76
+ static MVEGenOneOpFn * const fns[] = {
77
+ NULL,
78
+ gen_helper_mve_vfnegh,
79
+ gen_helper_mve_vfnegs,
80
+ NULL,
81
+ };
82
+ if (!dc_isar_feature(aa32_mve_fp, s)) {
83
+ return false;
84
+ }
85
+ return do_1op(s, a, fns[a->size]);
86
+}
87
--
44
--
88
2.20.1
45
2.34.1
89
90
diff view generated by jsdifflib
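The floating-point VNEG helpers above negate values 64 bits at a time by XORing with a constant that has only the sign bit of each element set. A standalone sketch of that trick is below. `fneg_h_x4` and `fneg_s_x2` are hypothetical names, and the constants are written out directly rather than built with dup_const().

```c
#include <stdint.h>

/* Negate four packed half-precision values by flipping each sign bit. */
static uint64_t fneg_h_x4(uint64_t lanes)
{
    return lanes ^ UINT64_C(0x8000800080008000);
}

/* Negate two packed single-precision values the same way. */
static uint64_t fneg_s_x2(uint64_t lanes)
{
    return lanes ^ UINT64_C(0x8000000080000000);
}
```

Because only the sign bit changes, this also flips the sign of NaNs and zeroes without raising any floating-point exception.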
1
Implement the MVE VABS functions (both integer and floating point).
1
Set the default NaN pattern explicitly for openrisc.
2
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-8-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-45-peter.maydell@linaro.org
6
---
6
---
7
target/arm/helper-mve.h | 6 ++++++
7
target/openrisc/cpu.c | 2 ++
8
target/arm/mve.decode | 3 +++
8
1 file changed, 2 insertions(+)
9
target/arm/mve_helper.c | 13 +++++++++++++
10
target/arm/translate-mve.c | 15 +++++++++++++++
11
4 files changed, 37 insertions(+)
12
9
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
12
--- a/target/openrisc/cpu.c
16
+++ b/target/arm/helper-mve.h
13
+++ b/target/openrisc/cpu.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vrev64h, TCG_CALL_NO_WG, void, env, ptr, ptr)
14
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_reset_hold(Object *obj, ResetType type)
18
DEF_HELPER_FLAGS_3(mve_vrev64w, TCG_CALL_NO_WG, void, env, ptr, ptr)
15
*/
19
16
set_float_2nan_prop_rule(float_2nan_prop_x87, &cpu->env.fp_status);
20
DEF_HELPER_FLAGS_3(mve_vmvn, TCG_CALL_NO_WG, void, env, ptr, ptr)
17
21
+
18
+ /* Default NaN: sign bit clear, frac msb set */
22
+DEF_HELPER_FLAGS_3(mve_vabsb, TCG_CALL_NO_WG, void, env, ptr, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &cpu->env.fp_status);
23
+DEF_HELPER_FLAGS_3(mve_vabsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
20
24
+DEF_HELPER_FLAGS_3(mve_vabsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
#ifndef CONFIG_USER_ONLY
25
+DEF_HELPER_FLAGS_3(mve_vfabsh, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
cpu->env.picmr = 0x00000000;
26
+DEF_HELPER_FLAGS_3(mve_vfabss, TCG_CALL_NO_WG, void, env, ptr, ptr)
27
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/mve.decode
30
+++ b/target/arm/mve.decode
31
@@ -XXX,XX +XXX,XX @@ VREV32 1111 1111 1 . 11 .. 00 ... 0 0000 11 . 0 ... 0 @1op
32
VREV64 1111 1111 1 . 11 .. 00 ... 0 0000 01 . 0 ... 0 @1op
33
34
VMVN 1111 1111 1 . 11 00 00 ... 0 0101 11 . 0 ... 0 @1op_nosz
35
+
36
+VABS 1111 1111 1 . 11 .. 01 ... 0 0011 01 . 0 ... 0 @1op
37
+VABS_fp 1111 1111 1 . 11 .. 01 ... 0 0111 01 . 0 ... 0 @1op
38
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/mve_helper.c
41
+++ b/target/arm/mve_helper.c
42
@@ -XXX,XX +XXX,XX @@
43
#include "exec/helper-proto.h"
44
#include "exec/cpu_ldst.h"
45
#include "exec/exec-all.h"
46
+#include "tcg/tcg.h"
47
48
static uint16_t mve_element_mask(CPUARMState *env)
49
{
50
@@ -XXX,XX +XXX,XX @@ DO_1OP(vrev64w, 8, uint64_t, wswap64)
51
#define DO_NOT(N) (~(N))
52
53
DO_1OP(vmvn, 8, uint64_t, DO_NOT)
54
+
55
+#define DO_ABS(N) ((N) < 0 ? -(N) : (N))
56
+#define DO_FABSH(N) ((N) & dup_const(MO_16, 0x7fff))
57
+#define DO_FABSS(N) ((N) & dup_const(MO_32, 0x7fffffff))
58
+
59
+DO_1OP(vabsb, 1, int8_t, DO_ABS)
60
+DO_1OP(vabsh, 2, int16_t, DO_ABS)
61
+DO_1OP(vabsw, 4, int32_t, DO_ABS)
62
+
63
+/* We can do these 64 bits at a time */
64
+DO_1OP(vfabsh, 8, uint64_t, DO_FABSH)
65
+DO_1OP(vfabss, 8, uint64_t, DO_FABSS)
66
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/target/arm/translate-mve.c
69
+++ b/target/arm/translate-mve.c
70
@@ -XXX,XX +XXX,XX @@ static bool do_1op(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn)
71
72
DO_1OP(VCLZ, vclz)
73
DO_1OP(VCLS, vcls)
74
+DO_1OP(VABS, vabs)
75
76
static bool trans_VREV16(DisasContext *s, arg_1op *a)
77
{
78
@@ -XXX,XX +XXX,XX @@ static bool trans_VMVN(DisasContext *s, arg_1op *a)
79
{
80
return do_1op(s, a, gen_helper_mve_vmvn);
81
}
82
+
83
+static bool trans_VABS_fp(DisasContext *s, arg_1op *a)
84
+{
85
+ static MVEGenOneOpFn * const fns[] = {
86
+ NULL,
87
+ gen_helper_mve_vfabsh,
88
+ gen_helper_mve_vfabss,
89
+ NULL,
90
+ };
91
+ if (!dc_isar_feature(aa32_mve_fp, s)) {
92
+ return false;
93
+ }
94
+ return do_1op(s, a, fns[a->size]);
95
+}
96
--
23
--
97
2.20.1
24
2.34.1
98
99
1
Implement the MVE VMVN(register) operation. Note that for
1
Set the default NaN pattern explicitly for ppc.
2
predication this operation is byte-by-byte.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-7-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-46-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-mve.h | 2 ++
7
target/ppc/cpu_init.c | 4 ++++
9
target/arm/mve.decode | 3 +++
8
1 file changed, 4 insertions(+)
10
target/arm/mve_helper.c | 4 ++++
11
target/arm/translate-mve.c | 5 +++++
12
4 files changed, 14 insertions(+)
13
9
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
12
--- a/target/ppc/cpu_init.c
17
+++ b/target/arm/helper-mve.h
13
+++ b/target/ppc/cpu_init.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vrev32h, TCG_CALL_NO_WG, void, env, ptr, ptr)
14
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_3(mve_vrev64b, TCG_CALL_NO_WG, void, env, ptr, ptr)
15
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
20
DEF_HELPER_FLAGS_3(mve_vrev64h, TCG_CALL_NO_WG, void, env, ptr, ptr)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);
21
DEF_HELPER_FLAGS_3(mve_vrev64w, TCG_CALL_NO_WG, void, env, ptr, ptr)
17
18
+ /* Default NaN: sign bit clear, set frac msb */
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
20
+ set_float_default_nan_pattern(0b01000000, &env->vec_status);
22
+
21
+
23
+DEF_HELPER_FLAGS_3(mve_vmvn, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
24
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
23
ppc_spr_t *spr = &env->spr_cb[i];
25
index XXXXXXX..XXXXXXX 100644
24
26
--- a/target/arm/mve.decode
27
+++ b/target/arm/mve.decode
28
@@ -XXX,XX +XXX,XX @@
29
@vldst_wn ... u:1 ... . . . . l:1 . rn:3 qd:3 . ... .. imm:7 &vldr_vstr
30
31
@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
32
+@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
33
34
# Vector loads and stores
35
36
@@ -XXX,XX +XXX,XX @@ VCLZ 1111 1111 1 . 11 .. 00 ... 0 0100 11 . 0 ... 0 @1op
37
VREV16 1111 1111 1 . 11 .. 00 ... 0 0001 01 . 0 ... 0 @1op
38
VREV32 1111 1111 1 . 11 .. 00 ... 0 0000 11 . 0 ... 0 @1op
39
VREV64 1111 1111 1 . 11 .. 00 ... 0 0000 01 . 0 ... 0 @1op
40
+
41
+VMVN 1111 1111 1 . 11 00 00 ... 0 0101 11 . 0 ... 0 @1op_nosz
42
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/mve_helper.c
45
+++ b/target/arm/mve_helper.c
46
@@ -XXX,XX +XXX,XX @@ DO_1OP(vrev32h, 4, uint32_t, hswap32)
47
DO_1OP(vrev64b, 8, uint64_t, bswap64)
48
DO_1OP(vrev64h, 8, uint64_t, hswap64)
49
DO_1OP(vrev64w, 8, uint64_t, wswap64)
50
+
51
+#define DO_NOT(N) (~(N))
52
+
53
+DO_1OP(vmvn, 8, uint64_t, DO_NOT)
54
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
55
index XXXXXXX..XXXXXXX 100644
56
--- a/target/arm/translate-mve.c
57
+++ b/target/arm/translate-mve.c
58
@@ -XXX,XX +XXX,XX @@ static bool trans_VREV64(DisasContext *s, arg_1op *a)
59
};
60
return do_1op(s, a, fns[a->size]);
61
}
62
+
63
+static bool trans_VMVN(DisasContext *s, arg_1op *a)
64
+{
65
+ return do_1op(s, a, gen_helper_mve_vmvn);
66
+}
67
--
25
--
68
2.20.1
26
2.34.1
69
70
1
Implement the MVE VPST insn, which sets the predicate mask
1
Set the default NaN pattern explicitly for sh4. Note that sh4
2
fields in the VPR to the immediate value encoded in the insn.
2
is one of the only three targets (the others being HPPA and
3
sometimes MIPS) that has snan_bit_is_one set.
3
4
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-27-peter.maydell@linaro.org
7
Message-id: 20241202131347.498124-47-peter.maydell@linaro.org
7
---
8
---
8
target/arm/mve.decode | 4 +++
9
target/sh4/cpu.c | 2 ++
9
target/arm/translate-mve.c | 59 ++++++++++++++++++++++++++++++++++++++
10
1 file changed, 2 insertions(+)
10
2 files changed, 63 insertions(+)
11
11
12
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
12
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
13
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/mve.decode
14
--- a/target/sh4/cpu.c
15
+++ b/target/arm/mve.decode
15
+++ b/target/sh4/cpu.c
16
@@ -XXX,XX +XXX,XX @@ VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
16
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_reset_hold(Object *obj, ResetType type)
17
VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
17
set_flush_to_zero(1, &env->fp_status);
18
VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
18
#endif
19
VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
19
set_default_nan_mode(1, &env->fp_status);
20
+
20
+ /* sign bit clear, set all frac bits other than msb */
21
+# Predicate operations
21
+ set_float_default_nan_pattern(0b00111111, &env->fp_status);
22
+%mask_22_13 22:1 13:3
23
+VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
24
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
25
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/translate-mve.c
27
+++ b/target/arm/translate-mve.c
28
@@ -XXX,XX +XXX,XX @@ static void mve_update_eci(DisasContext *s)
29
}
30
}
22
}
31
23
32
+static void mve_update_and_store_eci(DisasContext *s)
24
static void superh_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
33
+{
34
+ /*
35
+ * For insns which don't call a helper function that will call
36
+ * mve_advance_vpt(), this version updates s->eci and also stores
37
+ * it out to the CPUState field.
38
+ */
39
+ if (s->eci) {
40
+ mve_update_eci(s);
41
+ store_cpu_field(tcg_constant_i32(s->eci << 4), condexec_bits);
42
+ }
43
+}
44
+
45
static bool mve_skip_first_beat(DisasContext *s)
46
{
47
/* Return true if PSR.ECI says we must skip the first beat of this insn */
48
@@ -XXX,XX +XXX,XX @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a)
49
};
50
return do_long_dual_acc(s, a, fns[a->x]);
51
}
52
+
53
+static bool trans_VPST(DisasContext *s, arg_VPST *a)
54
+{
55
+ TCGv_i32 vpr;
56
+
57
+ /* mask == 0 is a "related encoding" */
58
+ if (!dc_isar_feature(aa32_mve, s) || !a->mask) {
59
+ return false;
60
+ }
61
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
62
+ return true;
63
+ }
64
+ /*
65
+ * Set the VPR mask fields. We take advantage of MASK01 and MASK23
66
+ * being adjacent fields in the register.
67
+ *
68
+ * This insn is not predicated, but it is subject to beat-wise
69
+ * execution, and the mask is updated on the odd-numbered beats.
70
+ * So if PSR.ECI says we should skip beat 1, we mustn't update the
71
+ * 01 mask field.
72
+ */
73
+ vpr = load_cpu_field(v7m.vpr);
74
+ switch (s->eci) {
75
+ case ECI_NONE:
76
+ case ECI_A0:
77
+ /* Update both 01 and 23 fields */
78
+ tcg_gen_deposit_i32(vpr, vpr,
79
+ tcg_constant_i32(a->mask | (a->mask << 4)),
80
+ R_V7M_VPR_MASK01_SHIFT,
81
+ R_V7M_VPR_MASK01_LENGTH + R_V7M_VPR_MASK23_LENGTH);
82
+ break;
83
+ case ECI_A0A1:
84
+ case ECI_A0A1A2:
85
+ case ECI_A0A1A2B0:
86
+ /* Update only the 23 mask field */
87
+ tcg_gen_deposit_i32(vpr, vpr,
88
+ tcg_constant_i32(a->mask),
89
+ R_V7M_VPR_MASK23_SHIFT, R_V7M_VPR_MASK23_LENGTH);
90
+ break;
91
+ default:
92
+ g_assert_not_reached();
93
+ }
94
+ store_cpu_field(vpr, v7m.vpr);
95
+ mve_update_and_store_eci(s);
96
+ return true;
97
+}
98
--
25
--
99
2.20.1
26
2.34.1
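
The VPST translation above relies on MASK01 and MASK23 being adjacent fields, so a single deposit can write both. A rough standalone sketch of that bit manipulation follows. The field positions (MASK01 in bits [19:16], MASK23 in bits [23:20]) are an assumption made for this example, and deposit32_sketch() and vpst_update_vpr() are hypothetical names, not the TCG or QEMU API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed VPR layout for this sketch only. */
enum {
    MASK01_SHIFT = 16, MASK01_LENGTH = 4,
    MASK23_SHIFT = 20, MASK23_LENGTH = 4,
};

/* Replace 'length' bits of 'value' starting at bit 'start' with 'field'. */
static uint32_t deposit32_sketch(uint32_t value, int start, int length,
                                 uint32_t field)
{
    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    uint32_t mask = ((1u << length) - 1) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/*
 * Model of the two switch arms in trans_VPST: when beat 1 has already
 * executed (the ECI_A0A1 and later states), only MASK23 is written;
 * otherwise both fields get the same 4-bit mask in one deposit.
 */
static uint32_t vpst_update_vpr(uint32_t vpr, uint32_t mask4,
                                int beat1_already_done)
{
    if (beat1_already_done) {
        return deposit32_sketch(vpr, MASK23_SHIFT, MASK23_LENGTH, mask4);
    }
    return deposit32_sketch(vpr, MASK01_SHIFT,
                            MASK01_LENGTH + MASK23_LENGTH,
                            mask4 | (mask4 << 4));
}
```

In both cases the low 16 bits of the register are left untouched, which matches the patch only modifying the mask fields.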
100
101
1
Implement the MVE instructions VREV16, VREV32 and VREV64.
1
Set the default NaN pattern explicitly for rx.
2
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-6-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-48-peter.maydell@linaro.org
6
---
6
---
7
target/arm/helper-mve.h | 7 +++++++
7
target/rx/cpu.c | 2 ++
8
target/arm/mve.decode | 4 ++++
8
1 file changed, 2 insertions(+)
9
target/arm/mve_helper.c | 7 +++++++
10
target/arm/translate-mve.c | 33 +++++++++++++++++++++++++++++++++
11
4 files changed, 51 insertions(+)
12
9
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
12
--- a/target/rx/cpu.c
16
+++ b/target/arm/helper-mve.h
13
+++ b/target/rx/cpu.c
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vclsw, TCG_CALL_NO_WG, void, env, ptr, ptr)
14
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_reset_hold(Object *obj, ResetType type)
18
DEF_HELPER_FLAGS_3(mve_vclzb, TCG_CALL_NO_WG, void, env, ptr, ptr)
15
* then prefer dest over source", which is float_2nan_prop_s_ab.
19
DEF_HELPER_FLAGS_3(mve_vclzh, TCG_CALL_NO_WG, void, env, ptr, ptr)
16
*/
20
DEF_HELPER_FLAGS_3(mve_vclzw, TCG_CALL_NO_WG, void, env, ptr, ptr)
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
21
+
18
+ /* Default NaN value: sign bit clear, set frac msb */
22
+DEF_HELPER_FLAGS_3(mve_vrev16b, TCG_CALL_NO_WG, void, env, ptr, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
23
+DEF_HELPER_FLAGS_3(mve_vrev32b, TCG_CALL_NO_WG, void, env, ptr, ptr)
20
}
24
+DEF_HELPER_FLAGS_3(mve_vrev32h, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
25
+DEF_HELPER_FLAGS_3(mve_vrev64b, TCG_CALL_NO_WG, void, env, ptr, ptr)
22
static ObjectClass *rx_cpu_class_by_name(const char *cpu_model)
26
+DEF_HELPER_FLAGS_3(mve_vrev64h, TCG_CALL_NO_WG, void, env, ptr, ptr)
27
+DEF_HELPER_FLAGS_3(mve_vrev64w, TCG_CALL_NO_WG, void, env, ptr, ptr)
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
31
+++ b/target/arm/mve.decode
32
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
33
34
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
35
VCLZ 1111 1111 1 . 11 .. 00 ... 0 0100 11 . 0 ... 0 @1op
36
+
37
+VREV16 1111 1111 1 . 11 .. 00 ... 0 0001 01 . 0 ... 0 @1op
38
+VREV32 1111 1111 1 . 11 .. 00 ... 0 0000 11 . 0 ... 0 @1op
39
+VREV64 1111 1111 1 . 11 .. 00 ... 0 0000 01 . 0 ... 0 @1op
40
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/mve_helper.c
43
+++ b/target/arm/mve_helper.c
44
@@ -XXX,XX +XXX,XX @@ DO_1OP(vclsw, 4, int32_t, clrsb32)
45
DO_1OP(vclzb, 1, uint8_t, DO_CLZ_B)
46
DO_1OP(vclzh, 2, uint16_t, DO_CLZ_H)
47
DO_1OP(vclzw, 4, uint32_t, clz32)
48
+
49
+DO_1OP(vrev16b, 2, uint16_t, bswap16)
50
+DO_1OP(vrev32b, 4, uint32_t, bswap32)
51
+DO_1OP(vrev32h, 4, uint32_t, hswap32)
52
+DO_1OP(vrev64b, 8, uint64_t, bswap64)
53
+DO_1OP(vrev64h, 8, uint64_t, hswap64)
54
+DO_1OP(vrev64w, 8, uint64_t, wswap64)
55
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/translate-mve.c
58
+++ b/target/arm/translate-mve.c
59
@@ -XXX,XX +XXX,XX @@ static bool do_1op(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn)
60
61
DO_1OP(VCLZ, vclz)
62
DO_1OP(VCLS, vcls)
63
+
64
+static bool trans_VREV16(DisasContext *s, arg_1op *a)
65
+{
66
+ static MVEGenOneOpFn * const fns[] = {
67
+ gen_helper_mve_vrev16b,
68
+ NULL,
69
+ NULL,
70
+ NULL,
71
+ };
72
+ return do_1op(s, a, fns[a->size]);
73
+}
74
+
75
+static bool trans_VREV32(DisasContext *s, arg_1op *a)
76
+{
77
+ static MVEGenOneOpFn * const fns[] = {
78
+ gen_helper_mve_vrev32b,
79
+ gen_helper_mve_vrev32h,
80
+ NULL,
81
+ NULL,
82
+ };
83
+ return do_1op(s, a, fns[a->size]);
84
+}
85
+
86
+static bool trans_VREV64(DisasContext *s, arg_1op *a)
87
+{
88
+ static MVEGenOneOpFn * const fns[] = {
89
+ gen_helper_mve_vrev64b,
90
+ gen_helper_mve_vrev64h,
91
+ gen_helper_mve_vrev64w,
92
+ NULL,
93
+ };
94
+ return do_1op(s, a, fns[a->size]);
95
+}
96
--
23
--
97
2.20.1
24
2.34.1
98
99
1
Implement the MVE VCLZ insn (and the necessary machinery
1
Set the default NaN pattern explicitly for s390x.
2
for MVE 1-input vector ops).
3
4
Note that for non-load instructions predication is always performed
5
at a byte level granularity regardless of element size (R_ZLSJ),
6
and so the masking logic here differs from that used in the VLDR
7
and VSTR helpers.
8
2
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20210617121628.20116-4-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-49-peter.maydell@linaro.org
12
---
6
---
13
target/arm/helper-mve.h | 4 ++
7
target/s390x/cpu.c | 2 ++
14
target/arm/mve.decode | 8 ++++
8
1 file changed, 2 insertions(+)
15
target/arm/mve_helper.c | 82 ++++++++++++++++++++++++++++++++++++++
16
target/arm/translate-mve.c | 38 ++++++++++++++++++
17
4 files changed, 132 insertions(+)
18
9
19
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
20
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper-mve.h
12
--- a/target/s390x/cpu.c
22
+++ b/target/arm/helper-mve.h
13
+++ b/target/s390x/cpu.c
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vldrh_uw, TCG_CALL_NO_WG, void, env, ptr, i32)
14
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
24
DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32)
15
set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
25
DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32)
16
set_float_infzeronan_rule(float_infzeronan_dnan_always,
26
DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32)
17
&env->fpu_status);
27
+
18
+ /* Default NaN value: sign bit clear, frac msb set */
28
+DEF_HELPER_FLAGS_3(mve_vclzb, TCG_CALL_NO_WG, void, env, ptr, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fpu_status);
29
+DEF_HELPER_FLAGS_3(mve_vclzh, TCG_CALL_NO_WG, void, env, ptr, ptr)
20
/* fall through */
30
+DEF_HELPER_FLAGS_3(mve_vclzw, TCG_CALL_NO_WG, void, env, ptr, ptr)
21
case RESET_TYPE_S390_CPU_NORMAL:
31
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
22
env->psw.mask &= ~PSW_MASK_RI;
32
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/mve.decode
34
+++ b/target/arm/mve.decode
35
@@ -XXX,XX +XXX,XX @@
36
#
37
38
%qd 22:1 13:3
39
+%qm 5:1 1:3
40
41
&vldr_vstr rn qd imm p a w size l u
42
+&1op qd qm size
43
44
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
45
# Note that both Rn and Qd are 3 bits only (no D bit)
46
@vldst_wn ... u:1 ... . . . . l:1 . rn:3 qd:3 . ... .. imm:7 &vldr_vstr
47
48
+@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
49
+
50
# Vector loads and stores
51
52
# Widening loads and narrowing stores:
53
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \
54
size=1 p=1
55
VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \
56
size=2 p=1
57
+
58
+# Vector miscellaneous
59
+
60
+VCLZ 1111 1111 1 . 11 .. 00 ... 0 0100 11 . 0 ... 0 @1op
61
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/mve_helper.c
64
+++ b/target/arm/mve_helper.c
65
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
66
67
#undef DO_VLDR
68
#undef DO_VSTR
69
+
70
+/*
71
+ * The mergemask(D, R, M) macro performs the operation "*D = R" but
72
+ * storing only the bytes which correspond to 1 bits in M,
73
+ * leaving other bytes in *D unchanged. We use _Generic
74
+ * to select the correct implementation based on the type of D.
75
+ */
76
+
77
+static void mergemask_ub(uint8_t *d, uint8_t r, uint16_t mask)
78
+{
79
+ if (mask & 1) {
80
+ *d = r;
81
+ }
82
+}
83
+
84
+static void mergemask_sb(int8_t *d, int8_t r, uint16_t mask)
85
+{
86
+ mergemask_ub((uint8_t *)d, r, mask);
87
+}
88
+
89
+static void mergemask_uh(uint16_t *d, uint16_t r, uint16_t mask)
90
+{
91
+ uint16_t bmask = expand_pred_b_data[mask & 3];
92
+ *d = (*d & ~bmask) | (r & bmask);
93
+}
94
+
95
+static void mergemask_sh(int16_t *d, int16_t r, uint16_t mask)
96
+{
97
+ mergemask_uh((uint16_t *)d, r, mask);
98
+}
99
+
100
+static void mergemask_uw(uint32_t *d, uint32_t r, uint16_t mask)
101
+{
102
+ uint32_t bmask = expand_pred_b_data[mask & 0xf];
103
+ *d = (*d & ~bmask) | (r & bmask);
104
+}
105
+
106
+static void mergemask_sw(int32_t *d, int32_t r, uint16_t mask)
107
+{
108
+ mergemask_uw((uint32_t *)d, r, mask);
109
+}
110
+
111
+static void mergemask_uq(uint64_t *d, uint64_t r, uint16_t mask)
112
+{
113
+ uint64_t bmask = expand_pred_b_data[mask & 0xff];
114
+ *d = (*d & ~bmask) | (r & bmask);
115
+}
116
+
117
+static void mergemask_sq(int64_t *d, int64_t r, uint16_t mask)
118
+{
119
+ mergemask_uq((uint64_t *)d, r, mask);
120
+}
121
+
122
+#define mergemask(D, R, M) \
123
+ _Generic(D, \
124
+ uint8_t *: mergemask_ub, \
125
+ int8_t *: mergemask_sb, \
126
+ uint16_t *: mergemask_uh, \
127
+ int16_t *: mergemask_sh, \
128
+ uint32_t *: mergemask_uw, \
129
+ int32_t *: mergemask_sw, \
130
+ uint64_t *: mergemask_uq, \
131
+ int64_t *: mergemask_sq)(D, R, M)
132
+
133
+#define DO_1OP(OP, ESIZE, TYPE, FN) \
134
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \
135
+ { \
136
+ TYPE *d = vd, *m = vm; \
137
+ uint16_t mask = mve_element_mask(env); \
138
+ unsigned e; \
139
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
140
+ mergemask(&d[H##ESIZE(e)], FN(m[H##ESIZE(e)]), mask); \
141
+ } \
142
+ mve_advance_vpt(env); \
143
+ }
144
+
145
+#define DO_CLZ_B(N) (clz32(N) - 24)
146
+#define DO_CLZ_H(N) (clz32(N) - 16)
147
+
148
+DO_1OP(vclzb, 1, uint8_t, DO_CLZ_B)
149
+DO_1OP(vclzh, 2, uint16_t, DO_CLZ_H)
150
+DO_1OP(vclzw, 4, uint32_t, clz32)
151
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
152
index XXXXXXX..XXXXXXX 100644
153
--- a/target/arm/translate-mve.c
154
+++ b/target/arm/translate-mve.c
155
@@ -XXX,XX +XXX,XX @@
156
#include "decode-mve.c.inc"
157
158
typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
159
+typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
160
161
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
162
static inline long mve_qreg_offset(unsigned reg)
163
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR(DisasContext *s, arg_VLDR_VSTR *a)
164
DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h)
165
DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w)
166
DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w)
167
+
168
+static bool do_1op(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn)
169
+{
170
+ TCGv_ptr qd, qm;
171
+
172
+ if (!dc_isar_feature(aa32_mve, s) ||
173
+ !mve_check_qreg_bank(s, a->qd | a->qm) ||
174
+ !fn) {
175
+ return false;
176
+ }
177
+
178
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
179
+ return true;
180
+ }
181
+
182
+ qd = mve_qreg_ptr(a->qd);
183
+ qm = mve_qreg_ptr(a->qm);
184
+ fn(cpu_env, qd, qm);
185
+ tcg_temp_free_ptr(qd);
186
+ tcg_temp_free_ptr(qm);
187
+ mve_update_eci(s);
188
+ return true;
189
+}
190
+
191
+#define DO_1OP(INSN, FN) \
192
+ static bool trans_##INSN(DisasContext *s, arg_1op *a) \
193
+ { \
194
+ static MVEGenOneOpFn * const fns[] = { \
195
+ gen_helper_mve_##FN##b, \
196
+ gen_helper_mve_##FN##h, \
197
+ gen_helper_mve_##FN##w, \
198
+ NULL, \
199
+ }; \
200
+ return do_1op(s, a, fns[a->size]); \
201
+ }
202
+
203
+DO_1OP(VCLZ, vclz)
204
--
23
--
205
2.20.1
24
2.34.1
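
The commit message for the VCLZ patch notes that predication for non-load instructions is applied per byte regardless of element size. The sketch below illustrates that behaviour for 32-bit elements in plain C. It computes the byte mask with a loop instead of the expand_pred_b_data table, ignores the host-endianness H macros, and uses invented names (expand_pred_bytes32, clz32_sketch, clz_w_predicated), so it should be read as an illustration under those assumptions and not as the helper's actual code.

```c
#include <stdint.h>

/* Expand the low 4 predicate bits into a byte mask: bit i selects byte i. */
static uint32_t expand_pred_bytes32(uint16_t mask)
{
    uint32_t bmask = 0;
    for (unsigned i = 0; i < 4; i++) {
        if (mask & (1u << i)) {
            bmask |= 0xffu << (8 * i);
        }
    }
    return bmask;
}

/* Store r into *d, but only the bytes whose predicate bit is set. */
static void mergemask_u32(uint32_t *d, uint32_t r, uint16_t mask)
{
    uint32_t bmask = expand_pred_bytes32(mask);
    *d = (*d & ~bmask) | (r & bmask);
}

/* Portable count-leading-zeros that returns 32 for an input of 0. */
static uint32_t clz32_sketch(uint32_t v)
{
    uint32_t n = 0;
    while (n < 32 && !(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Predicated CLZ over the four 32-bit elements of a 128-bit vector. */
static void clz_w_predicated(uint32_t d[4], const uint32_t m[4], uint16_t mask)
{
    /* Each element consumes 4 predicate bits, one per byte. */
    for (unsigned e = 0; e < 4; e++, mask >>= 4) {
        mergemask_u32(&d[e], clz32_sketch(m[e]), mask);
    }
}
```

A partially set predicate nibble therefore updates only part of an element, leaving the remaining bytes of the destination as they were.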
206
207
1
Implement the variants of MVE VLDR (encodings T1, T2) which perform
1
Set the default NaN pattern explicitly for SPARC, and remove
2
"widening" loads where bytes or halfwords are loaded from memory and
2
the ifdef from parts64_default_nan.
3
zero or sign-extended into halfword or word length vector elements,
4
and the narrowing MVE VSTR (encodings T1, T2) where bytes or
5
halfwords are stored from halfword or word elements.
6
3
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20210617121628.20116-3-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-50-peter.maydell@linaro.org
10
---
7
---
11
target/arm/helper-mve.h | 10 ++++++++++
8
target/sparc/cpu.c | 2 ++
12
target/arm/mve.decode | 25 +++++++++++++++++++++++--
9
fpu/softfloat-specialize.c.inc | 5 +----
13
target/arm/mve_helper.c | 11 +++++++++++
10
2 files changed, 3 insertions(+), 4 deletions(-)
14
target/arm/translate-mve.c | 14 ++++++++++++++
15
4 files changed, 58 insertions(+), 2 deletions(-)
16
11
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
18
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
14
--- a/target/sparc/cpu.c
20
+++ b/target/arm/helper-mve.h
15
+++ b/target/sparc/cpu.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vldrw, TCG_CALL_NO_WG, void, env, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
22
DEF_HELPER_FLAGS_3(mve_vstrb, TCG_CALL_NO_WG, void, env, ptr, i32)
17
set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
23
DEF_HELPER_FLAGS_3(mve_vstrh, TCG_CALL_NO_WG, void, env, ptr, i32)
18
/* For inf * 0 + NaN, return the input NaN */
24
DEF_HELPER_FLAGS_3(mve_vstrw, TCG_CALL_NO_WG, void, env, ptr, i32)
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
25
+
20
+ /* Default NaN value: sign bit clear, all frac bits set */
26
+DEF_HELPER_FLAGS_3(mve_vldrb_sh, TCG_CALL_NO_WG, void, env, ptr, i32)
21
+ set_float_default_nan_pattern(0b01111111, &env->fp_status);
27
+DEF_HELPER_FLAGS_3(mve_vldrb_sw, TCG_CALL_NO_WG, void, env, ptr, i32)
22
28
+DEF_HELPER_FLAGS_3(mve_vldrb_uh, TCG_CALL_NO_WG, void, env, ptr, i32)
23
cpu_exec_realizefn(cs, &local_err);
29
+DEF_HELPER_FLAGS_3(mve_vldrb_uw, TCG_CALL_NO_WG, void, env, ptr, i32)
24
if (local_err != NULL) {
30
+DEF_HELPER_FLAGS_3(mve_vldrh_sw, TCG_CALL_NO_WG, void, env, ptr, i32)
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
31
+DEF_HELPER_FLAGS_3(mve_vldrh_uw, TCG_CALL_NO_WG, void, env, ptr, i32)
32
+DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32)
33
+DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32)
34
+DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32)
35
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
36
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
37
--- a/target/arm/mve.decode
27
--- a/fpu/softfloat-specialize.c.inc
38
+++ b/target/arm/mve.decode
28
+++ b/fpu/softfloat-specialize.c.inc
39
@@ -XXX,XX +XXX,XX @@
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
40
30
uint8_t dnan_pattern = status->default_nan_pattern;
41
%qd 22:1 13:3
31
42
32
if (dnan_pattern == 0) {
43
-&vldr_vstr rn qd imm p a w size l
33
-#if defined(TARGET_SPARC)
44
+&vldr_vstr rn qd imm p a w size l u
34
- /* Sign bit clear, all frac bits set */
45
35
- dnan_pattern = 0b01111111;
46
-@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd
36
-#elif defined(TARGET_HEXAGON)
47
+@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
37
+#if defined(TARGET_HEXAGON)
48
+# Note that both Rn and Qd are 3 bits only (no D bit)
38
/* Sign bit set, all frac bits set. */
49
+@vldst_wn ... u:1 ... . . . . l:1 . rn:3 qd:3 . ... .. imm:7 &vldr_vstr
39
dnan_pattern = 0b11111111;
50
40
#else
51
# Vector loads and stores
52
53
+# Widening loads and narrowing stores:
54
+# for these P=0 W=0 is 'related encoding'; sz=11 is 'related encoding'
55
+# This means we need to expand out to multiple patterns for P, W, SZ.
56
+# For stores the U bit must be 0 but we catch that in the trans_ function.
57
+# The naming scheme here is "VLDSTB_H == in-memory byte load/store to/from
58
+# signed halfword element in register", etc.
59
+VLDSTB_H 111 . 110 0 a:1 0 1 . 0 ... ... 0 111 01 ....... @vldst_wn \
60
+ p=0 w=1 size=1
61
+VLDSTB_H 111 . 110 1 a:1 0 w:1 . 0 ... ... 0 111 01 ....... @vldst_wn \
62
+ p=1 size=1
63
+VLDSTB_W 111 . 110 0 a:1 0 1 . 0 ... ... 0 111 10 ....... @vldst_wn \
64
+ p=0 w=1 size=2
65
+VLDSTB_W 111 . 110 1 a:1 0 w:1 . 0 ... ... 0 111 10 ....... @vldst_wn \
66
+ p=1 size=2
67
+VLDSTH_W 111 . 110 0 a:1 0 1 . 1 ... ... 0 111 10 ....... @vldst_wn \
68
+ p=0 w=1 size=2
69
+VLDSTH_W 111 . 110 1 a:1 0 w:1 . 1 ... ... 0 111 10 ....... @vldst_wn \
70
+ p=1 size=2
71
+
72
# Non-widening loads/stores (P=0 W=0 is 'related encoding')
73
VLDR_VSTR 1110110 0 a:1 . 1 . .... ... 111100 ....... @vldr_vstr \
74
size=0 p=0 w=1
75
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
76
index XXXXXXX..XXXXXXX 100644
77
--- a/target/arm/mve_helper.c
78
+++ b/target/arm/mve_helper.c
79
@@ -XXX,XX +XXX,XX @@ DO_VSTR(vstrb, 1, stb, 1, uint8_t)
80
DO_VSTR(vstrh, 2, stw, 2, uint16_t)
81
DO_VSTR(vstrw, 4, stl, 4, uint32_t)
82
83
+DO_VLDR(vldrb_sh, 1, ldsb, 2, int16_t)
84
+DO_VLDR(vldrb_sw, 1, ldsb, 4, int32_t)
85
+DO_VLDR(vldrb_uh, 1, ldub, 2, uint16_t)
86
+DO_VLDR(vldrb_uw, 1, ldub, 4, uint32_t)
87
+DO_VLDR(vldrh_sw, 2, ldsw, 4, int32_t)
88
+DO_VLDR(vldrh_uw, 2, lduw, 4, uint32_t)
89
+
90
+DO_VSTR(vstrb_h, 1, stb, 2, int16_t)
91
+DO_VSTR(vstrb_w, 1, stb, 4, int32_t)
92
+DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
93
+
94
#undef DO_VLDR
95
#undef DO_VSTR
96
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
97
index XXXXXXX..XXXXXXX 100644
98
--- a/target/arm/translate-mve.c
99
+++ b/target/arm/translate-mve.c
100
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR(DisasContext *s, arg_VLDR_VSTR *a)
101
};
102
return do_ldst(s, a, ldstfns[a->size][a->l]);
103
}
104
+
105
+#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST) \
106
+ static bool trans_##OP(DisasContext *s, arg_VLDR_VSTR *a) \
107
+ { \
108
+ static MVEGenLdStFn * const ldstfns[2][2] = { \
109
+ { gen_helper_mve_##ST, gen_helper_mve_##SLD }, \
110
+ { NULL, gen_helper_mve_##ULD }, \
111
+ }; \
112
+ return do_ldst(s, a, ldstfns[a->u][a->l]); \
113
+ }
114
+
115
+DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h)
116
+DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w)
117
+DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w)
118
--
41
--
119
2.20.1
42
2.34.1
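
The per-target patches in this series pass an 8-bit value such as 0b01000000, 0b00111111 or 0b01111111 to set_float_default_nan_pattern(). As a sketch of how such a pattern could expand to a full single-precision bit pattern, the function below assumes this encoding: bit 7 is the sign, bits 6..0 are the top seven fraction bits, and bit 0 is also replicated into all lower fraction bits. That encoding and the name default_nan_f32() are assumptions made for illustration; the authoritative definition is in the softfloat sources.

```c
#include <stdint.h>

/*
 * Expand an 8-bit default-NaN pattern into a float32 bit pattern.
 * Assumed layout: bit 7 = sign, bits 6..0 = top seven fraction bits,
 * bit 0 replicated into the remaining low fraction bits.
 */
static uint32_t default_nan_f32(uint8_t pattern)
{
    uint32_t sign = (uint32_t)(pattern >> 7) << 31;
    uint32_t exponent = 0xffu << 23;                       /* all ones: NaN/Inf */
    uint32_t frac_top = (uint32_t)(pattern & 0x7f) << 16;  /* fraction bits 22..16 */
    uint32_t frac_low = (pattern & 1) ? 0xffffu : 0u;      /* fraction bits 15..0 */
    return sign | exponent | frac_top | frac_low;
}
```

Under these assumptions the "frac msb set" pattern 0b01000000 gives 0x7fc00000, the SPARC pattern 0b01111111 gives 0x7fffffff, and the sh4 pattern 0b00111111 gives 0x7fbfffff, where the clear fraction msb is consistent with the snan_bit_is_one remark in the sh4 commit message.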
120
121
1
Implement the MVE VRMULH insn, which performs a rounding multiply
1
Set the default NaN pattern explicitly for xtensa.
2
and then returns the high half.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210617121628.20116-15-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-51-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-mve.h | 7 +++++++
7
target/xtensa/cpu.c | 2 ++
9
target/arm/mve.decode | 3 +++
8
1 file changed, 2 insertions(+)
10
target/arm/mve_helper.c | 22 ++++++++++++++++++++++
11
target/arm/translate-mve.c | 2 ++
12
4 files changed, 34 insertions(+)
13
9
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
10
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
12
--- a/target/xtensa/cpu.c
17
+++ b/target/arm/helper-mve.h
13
+++ b/target/xtensa/cpu.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmulhsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
14
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
19
DEF_HELPER_FLAGS_4(mve_vmulhub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
15
/* For inf * 0 + NaN, return the input NaN */
20
DEF_HELPER_FLAGS_4(mve_vmulhuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
21
DEF_HELPER_FLAGS_4(mve_vmulhuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
17
set_no_signaling_nans(!dfpu, &env->fp_status);
22
+
18
+ /* Default NaN value: sign bit clear, set frac msb */
23
+DEF_HELPER_FLAGS_4(mve_vrmulhsb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
+DEF_HELPER_FLAGS_4(mve_vrmulhsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
xtensa_use_first_nan(env, !dfpu);
25
+DEF_HELPER_FLAGS_4(mve_vrmulhsw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
26
+DEF_HELPER_FLAGS_4(mve_vrmulhub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
+DEF_HELPER_FLAGS_4(mve_vrmulhuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
+DEF_HELPER_FLAGS_4(mve_vrmulhuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/mve.decode
32
+++ b/target/arm/mve.decode
33
@@ -XXX,XX +XXX,XX @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
34
VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
35
VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
36
37
+VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
38
+VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
39
+
40
# Vector miscellaneous
41
42
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
43
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/target/arm/mve_helper.c
46
+++ b/target/arm/mve_helper.c
47
@@ -XXX,XX +XXX,XX @@ static inline uint32_t do_mulh_w(int64_t n, int64_t m)
48
return (n * m) >> 32;
49
}
21
}
50
22
51
+static inline uint8_t do_rmulh_b(int32_t n, int32_t m)
52
+{
53
+ return (n * m + (1U << 7)) >> 8;
54
+}
55
+
56
+static inline uint16_t do_rmulh_h(int32_t n, int32_t m)
57
+{
58
+ return (n * m + (1U << 15)) >> 16;
59
+}
60
+
61
+static inline uint32_t do_rmulh_w(int64_t n, int64_t m)
62
+{
63
+ return (n * m + (1U << 31)) >> 32;
64
+}
65
+
66
DO_2OP(vmulhsb, 1, int8_t, do_mulh_b)
67
DO_2OP(vmulhsh, 2, int16_t, do_mulh_h)
68
DO_2OP(vmulhsw, 4, int32_t, do_mulh_w)
69
DO_2OP(vmulhub, 1, uint8_t, do_mulh_b)
70
DO_2OP(vmulhuh, 2, uint16_t, do_mulh_h)
71
DO_2OP(vmulhuw, 4, uint32_t, do_mulh_w)
72
+
73
+DO_2OP(vrmulhsb, 1, int8_t, do_rmulh_b)
74
+DO_2OP(vrmulhsh, 2, int16_t, do_rmulh_h)
75
+DO_2OP(vrmulhsw, 4, int32_t, do_rmulh_w)
76
+DO_2OP(vrmulhub, 1, uint8_t, do_rmulh_b)
77
+DO_2OP(vrmulhuh, 2, uint16_t, do_rmulh_h)
78
+DO_2OP(vrmulhuw, 4, uint32_t, do_rmulh_w)
79
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
80
index XXXXXXX..XXXXXXX 100644
81
--- a/target/arm/translate-mve.c
82
+++ b/target/arm/translate-mve.c
83
@@ -XXX,XX +XXX,XX @@ DO_2OP(VSUB, vsub)
84
DO_2OP(VMUL, vmul)
85
DO_2OP(VMULH_S, vmulhs)
86
DO_2OP(VMULH_U, vmulhu)
87
+DO_2OP(VRMULH_S, vrmulhs)
88
+DO_2OP(VRMULH_U, vrmulhu)
89
--
23
--
90
2.20.1
24
2.34.1
91
92
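The difference between the existing do_mulh_* helpers and the new do_rmulh_* helpers is only the rounding constant added before the shift. The sketch below illustrates this for the byte and word sizes. It is not the QEMU code: the function names are hypothetical, and the arithmetic is done in unsigned types so the example does not depend on how a compiler shifts negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating high half of an 8-bit x 8-bit product. */
static inline uint8_t sketch_mulh_b(int32_t n, int32_t m)
{
    return (uint32_t)(n * m) >> 8;
}

/* Rounding high half: add half of the discarded range, then shift. */
static inline uint8_t sketch_rmulh_b(int32_t n, int32_t m)
{
    return ((uint32_t)(n * m) + (1U << 7)) >> 8;
}

/*
 * Word-sized variant. The inputs are 32-bit lane values already
 * sign- or zero-extended to 64 bits; multiplying as uint64_t keeps
 * the low 64 bits of the product without signed overflow.
 */
static inline uint32_t sketch_rmulh_w(int64_t n, int64_t m)
{
    return ((uint64_t)n * (uint64_t)m + (UINT64_C(1) << 31)) >> 32;
}

/* Small self-check; call it from a main() to exercise the helpers. */
static inline void sketch_rmulh_self_check(void)
{
    /* 16 * 8 = 128: truncation gives 0, rounding gives 1. */
    assert(sketch_mulh_b(16, 8) == 0);
    assert(sketch_rmulh_b(16, 8) == 1);
    /* 0x7fffffff * 2 is just below 2^32, so it rounds up to 1. */
    assert(sketch_rmulh_w(0x7fffffff, 2) == 1);
}
```

The same function serves signed and unsigned lanes because the caller extends each lane to the wider argument type before the multiply, as the DO_2OP instantiations in the patch do.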
diff view generated by jsdifflib
1
vfp_access_check and its helper routine full_vfp_access_check() have
1
Set the default NaN pattern explicitly for hexagon.
2
gradually grown and are now an awkward mix of A-profile only and
2
Remove the ifdef from parts64_default_nan(); the only
3
M-profile only pieces. Refactor it into an A-profile only and an
3
remaining unconverted targets all use the default case.
4
M-profile only version, taking advantage of the fact that now the
5
only direct call to full_vfp_access_check() is in A-profile-only
6
code.
7
4
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20210618141019.10671-7-peter.maydell@linaro.org
7
Message-id: 20241202131347.498124-52-peter.maydell@linaro.org
11
---
8
---
12
target/arm/translate-vfp.c | 79 +++++++++++++++++++++++---------------
9
target/hexagon/cpu.c | 2 ++
13
1 file changed, 48 insertions(+), 31 deletions(-)
10
fpu/softfloat-specialize.c.inc | 5 -----
11
2 files changed, 2 insertions(+), 5 deletions(-)
14
12
15
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
13
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
16
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-vfp.c
15
--- a/target/hexagon/cpu.c
18
+++ b/target/arm/translate-vfp.c
16
+++ b/target/hexagon/cpu.c
19
@@ -XXX,XX +XXX,XX @@ static void gen_update_fp_context(DisasContext *s)
17
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_reset_hold(Object *obj, ResetType type)
18
19
set_default_nan_mode(1, &env->fp_status);
20
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
21
+ /* Default NaN value: sign bit set, all frac bits set */
22
+ set_float_default_nan_pattern(0b11111111, &env->fp_status);
20
}
23
}
21
24
22
/*
25
static void hexagon_cpu_disas_set_info(CPUState *s, disassemble_info *info)
23
- * Check that VFP access is enabled. If it is, do the necessary
26
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
24
- * M-profile lazy-FP handling and then return true.
27
index XXXXXXX..XXXXXXX 100644
25
- * If not, emit code to generate an appropriate exception and
28
--- a/fpu/softfloat-specialize.c.inc
26
- * return false.
29
+++ b/fpu/softfloat-specialize.c.inc
27
+ * Check that VFP access is enabled, A-profile specific version.
30
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
28
+ *
31
uint8_t dnan_pattern = status->default_nan_pattern;
29
+ * If VFP is enabled, return true. If not, emit code to generate an
32
30
+ * appropriate exception and return false.
33
if (dnan_pattern == 0) {
31
* The ignore_vfp_enabled argument specifies that we should ignore
34
-#if defined(TARGET_HEXAGON)
32
- * whether VFP is enabled via FPEXC[EN]: this should be true for FMXR/FMRX
35
- /* Sign bit set, all frac bits set. */
33
+ * whether VFP is enabled via FPEXC.EN: this should be true for FMXR/FMRX
36
- dnan_pattern = 0b11111111;
34
* accesses to FPSID, FPEXC, MVFR0, MVFR1, MVFR2, and false for all other insns.
37
-#else
35
*/
38
/*
36
-static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
39
* This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
37
+static bool vfp_access_check_a(DisasContext *s, bool ignore_vfp_enabled)
40
* S390, SH4, TriCore, and Xtensa. Our other supported targets
38
{
41
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
39
if (s->fp_excp_el) {
42
/* sign bit clear, set frac msb */
40
- if (arm_dc_feature(s, ARM_FEATURE_M)) {
43
dnan_pattern = 0b01000000;
41
- /*
44
}
42
- * M-profile mostly catches the "FPU disabled" case early, in
45
-#endif
43
- * disas_m_nocp(), but a few insns (eg LCTP, WLSTP, DLSTP)
44
- * which do coprocessor-checks are outside the large ranges of
45
- * the encoding space handled by the patterns in m-nocp.decode,
46
- * and for them we may need to raise NOCP here.
47
- */
48
- gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
49
- syn_uncategorized(), s->fp_excp_el);
50
- } else {
51
- gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
52
- syn_fp_access_trap(1, 0xe, false),
53
- s->fp_excp_el);
54
- }
55
+ gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
56
+ syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
57
return false;
58
}
46
}
59
47
assert(dnan_pattern != 0);
60
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
61
unallocated_encoding(s);
62
return false;
63
}
64
+ return true;
65
+}
66
67
- if (arm_dc_feature(s, ARM_FEATURE_M)) {
68
- /* Handle M-profile lazy FP state mechanics */
69
-
70
- /* Trigger lazy-state preservation if necessary */
71
- gen_preserve_fp_state(s);
72
-
73
- /* Update ownership of FP context and create new FP context if needed */
74
- gen_update_fp_context(s);
75
+/*
76
+ * Check that VFP access is enabled, M-profile specific version.
77
+ *
78
+ * If VFP is enabled, do the necessary M-profile lazy-FP handling and then
79
+ * return true. If not, emit code to generate an appropriate exception and
80
+ * return false.
81
+ */
82
+static bool vfp_access_check_m(DisasContext *s)
83
+{
84
+ if (s->fp_excp_el) {
85
+ /*
86
+ * M-profile mostly catches the "FPU disabled" case early, in
87
+ * disas_m_nocp(), but a few insns (eg LCTP, WLSTP, DLSTP)
88
+ * which do coprocessor-checks are outside the large ranges of
89
+ * the encoding space handled by the patterns in m-nocp.decode,
90
+ * and for them we may need to raise NOCP here.
91
+ */
92
+ gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
93
+ syn_uncategorized(), s->fp_excp_el);
94
+ return false;
95
}
96
97
+ /* Handle M-profile lazy FP state mechanics */
98
+
99
+ /* Trigger lazy-state preservation if necessary */
100
+ gen_preserve_fp_state(s);
101
+
102
+ /* Update ownership of FP context and create new FP context if needed */
103
+ gen_update_fp_context(s);
104
+
105
return true;
106
}
107
108
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
109
*/
110
bool vfp_access_check(DisasContext *s)
111
{
112
- return full_vfp_access_check(s, false);
113
+ if (arm_dc_feature(s, ARM_FEATURE_M)) {
114
+ return vfp_access_check_m(s);
115
+ } else {
116
+ return vfp_access_check_a(s, false);
117
+ }
118
}
119
120
static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
121
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
122
return false;
123
}
124
125
- if (!full_vfp_access_check(s, ignore_vfp_enabled)) {
126
+ /*
127
+ * Call vfp_access_check_a() directly, because we need to tell
128
+ * it to ignore FPEXC.EN for some register accesses.
129
+ */
130
+ if (!vfp_access_check_a(s, ignore_vfp_enabled)) {
131
return true;
132
}
133
48
134
--
49
--
135
2.20.1
50
2.34.1
136
137
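The shape of the vfp_access_check() refactoring above (one check per profile plus a thin dispatcher) can be sketched outside QEMU as follows. Everything here is a hypothetical stand-in: SketchCtx replaces DisasContext, the Raised enum replaces the generated exceptions, and the FPEXC.EN handling of the A-profile version is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { RAISED_NONE, RAISED_UDEF, RAISED_NOCP } Raised;

/* Hypothetical stand-in for the fields of DisasContext used here. */
typedef struct {
    bool m_profile;        /* stands in for arm_dc_feature(s, ARM_FEATURE_M) */
    int fp_excp_el;        /* non-zero: FP access traps to this EL */
    Raised raised;         /* which exception the check generated */
    bool lazy_fp_handled;  /* M-profile lazy FP state work was done */
} SketchCtx;

/* A-profile version: a trapped access becomes UDEF. */
static inline bool sketch_check_a(SketchCtx *s)
{
    if (s->fp_excp_el) {
        s->raised = RAISED_UDEF;
        return false;
    }
    return true;
}

/* M-profile version: a trapped access becomes NOCP, else do lazy FP work. */
static inline bool sketch_check_m(SketchCtx *s)
{
    if (s->fp_excp_el) {
        s->raised = RAISED_NOCP;
        return false;
    }
    s->lazy_fp_handled = true;
    return true;
}

/* The public entry point only selects the profile-specific version. */
static inline bool sketch_access_check(SketchCtx *s)
{
    return s->m_profile ? sketch_check_m(s) : sketch_check_a(s);
}

/* Small self-check; call it from a main() to exercise the dispatcher. */
static inline void sketch_access_self_check(void)
{
    SketchCtx m = { true, 1, RAISED_NONE, false };

    assert(!sketch_access_check(&m));
    assert(m.raised == RAISED_NOCP);
}
```

Splitting by profile keeps each function free of feature tests, and a caller that needs the extra A-profile argument can call the A-profile version directly, as trans_VMSR_VMRS() does in the patch.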
diff view generated by jsdifflib
1
The virt_is_acpi_enabled() function is specific to the virt board, as
1
Set the default NaN pattern explicitly for riscv.
2
is the check for its 'ras' property. Use the new acpi_ghes_present()
3
function to check whether we should report memory errors via
4
acpi_ghes_record_errors().
5
6
This avoids a link error if QEMU was built without support for the
7
virt board, and provides a mechanism that can be used by any future
8
board models that want to add ACPI memory error reporting support
9
(they only need to call acpi_ghes_add_fw_cfg()).
10
2
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Reviewed-by: Dongjiu Geng <gengdongjiu1@gmail.com>
5
Message-id: 20241202131347.498124-53-peter.maydell@linaro.org
14
Message-id: 20210603171259.27962-4-peter.maydell@linaro.org
15
---
6
---
16
target/arm/kvm64.c | 6 +-----
7
target/riscv/cpu.c | 2 ++
17
1 file changed, 1 insertion(+), 5 deletions(-)
8
1 file changed, 2 insertions(+)
18
9
19
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
10
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
20
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/kvm64.c
12
--- a/target/riscv/cpu.c
22
+++ b/target/arm/kvm64.c
13
+++ b/target/riscv/cpu.c
23
@@ -XXX,XX +XXX,XX @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
14
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type)
24
{
15
cs->exception_index = RISCV_EXCP_NONE;
25
ram_addr_t ram_addr;
16
env->load_res = -1;
26
hwaddr paddr;
17
set_default_nan_mode(1, &env->fp_status);
27
- Object *obj = qdev_get_machine();
18
+ /* Default NaN value: sign bit clear, frac msb set */
28
- VirtMachineState *vms = VIRT_MACHINE(obj);
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
29
- bool acpi_enabled = virt_is_acpi_enabled(vms);
20
env->vill = true;
30
21
31
assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
22
#ifndef CONFIG_USER_ONLY
32
33
- if (acpi_enabled && addr &&
34
- object_property_get_bool(obj, "ras", NULL)) {
35
+ if (acpi_ghes_present() && addr) {
36
ram_addr = qemu_ram_addr_from_host(addr);
37
if (ram_addr != RAM_ADDR_INVALID &&
38
kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
39
--
23
--
40
2.20.1
24
2.34.1
41
42
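The kvm64.c change above replaces a board-specific test with a query on the GHES state itself. A minimal sketch of that pattern follows, assuming a single registered state object. SketchGhesState, sketch_registered and the function names are hypothetical: the real code finds the state through the ACPI GED device object rather than through a global pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for AcpiGhesState. */
typedef struct {
    bool present;   /* set once when the board creates the GHES table */
} SketchGhesState;

/* Stands in for the object lookup; NULL means the board has no GHES. */
static SketchGhesState *sketch_registered;

/* Called by a board that wants memory error reporting. */
static inline void sketch_ghes_add(SketchGhesState *s)
{
    s->present = true;
    sketch_registered = s;
}

/* Board-independent query, usable from generic code. */
static inline bool sketch_ghes_present(void)
{
    return sketch_registered != NULL && sketch_registered->present;
}

/* Caller side: only try to record an error when GHES exists. */
static inline bool sketch_should_record(const void *addr)
{
    return sketch_ghes_present() && addr != NULL;
}

/* Small self-check; run it before any board has registered a state. */
static inline void sketch_ghes_self_check(void)
{
    int dummy = 0;

    assert(!sketch_ghes_present());
    assert(!sketch_should_record(&dummy));
}
```

Because the query no longer names the virt board, generic code can link against it even when that board is not compiled in, which is the link error the commit message describes.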
diff view generated by jsdifflib
1
Factor the code in full_vfp_access_check() which updates the
1
Set the default NaN pattern explicitly for tricore.
2
ownership of the FP context and creates a new FP context
3
out into its own function.
4
2
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210618141019.10671-6-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-54-peter.maydell@linaro.org
8
---
6
---
9
target/arm/translate-vfp.c | 104 +++++++++++++++++++++----------------
7
target/tricore/helper.c | 2 ++
10
1 file changed, 58 insertions(+), 46 deletions(-)
8
1 file changed, 2 insertions(+)
11
9
12
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
10
diff --git a/target/tricore/helper.c b/target/tricore/helper.c
13
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-vfp.c
12
--- a/target/tricore/helper.c
15
+++ b/target/arm/translate-vfp.c
13
+++ b/target/tricore/helper.c
16
@@ -XXX,XX +XXX,XX @@ void gen_preserve_fp_state(DisasContext *s)
14
@@ -XXX,XX +XXX,XX @@ void fpu_set_state(CPUTriCoreState *env)
17
}
15
set_flush_to_zero(1, &env->fp_status);
16
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
17
set_default_nan_mode(1, &env->fp_status);
18
+ /* Default NaN pattern: sign bit clear, frac msb set */
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
18
}
20
}
19
21
20
+/*
22
uint32_t psw_read(CPUTriCoreState *env)
21
+ * Generate code for M-profile FP context handling: update the
22
+ * ownership of the FP context, and create a new context if
23
+ * necessary. This corresponds to the parts of the pseudocode
24
+ * ExecuteFPCheck() after the initial PreserveFPState() call.
25
+ */
26
+static void gen_update_fp_context(DisasContext *s)
27
+{
28
+ /* Update ownership of FP context: set FPCCR.S to match current state */
29
+ if (s->v8m_fpccr_s_wrong) {
30
+ TCGv_i32 tmp;
31
+
32
+ tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
33
+ if (s->v8m_secure) {
34
+ tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
35
+ } else {
36
+ tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
37
+ }
38
+ store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
39
+ /* Don't need to do this for any further FP insns in this TB */
40
+ s->v8m_fpccr_s_wrong = false;
41
+ }
42
+
43
+ if (s->v7m_new_fp_ctxt_needed) {
44
+ /*
45
+ * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA,
46
+ * the FPSCR, and VPR.
47
+ */
48
+ TCGv_i32 control, fpscr;
49
+ uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
50
+
51
+ fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
52
+ gen_helper_vfp_set_fpscr(cpu_env, fpscr);
53
+ tcg_temp_free_i32(fpscr);
54
+ if (dc_isar_feature(aa32_mve, s)) {
55
+ TCGv_i32 z32 = tcg_const_i32(0);
56
+ store_cpu_field(z32, v7m.vpr);
57
+ }
58
+
59
+ /*
60
+ * We don't need to arrange to end the TB, because the only
61
+ * parts of FPSCR which we cache in the TB flags are the VECLEN
62
+ * and VECSTRIDE, and those don't exist for M-profile.
63
+ */
64
+
65
+ if (s->v8m_secure) {
66
+ bits |= R_V7M_CONTROL_SFPA_MASK;
67
+ }
68
+ control = load_cpu_field(v7m.control[M_REG_S]);
69
+ tcg_gen_ori_i32(control, control, bits);
70
+ store_cpu_field(control, v7m.control[M_REG_S]);
71
+ /* Don't need to do this for any further FP insns in this TB */
72
+ s->v7m_new_fp_ctxt_needed = false;
73
+ }
74
+}
75
+
76
/*
77
* Check that VFP access is enabled. If it is, do the necessary
78
* M-profile lazy-FP handling and then return true.
79
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
80
/* Trigger lazy-state preservation if necessary */
81
gen_preserve_fp_state(s);
82
83
- /* Update ownership of FP context: set FPCCR.S to match current state */
84
- if (s->v8m_fpccr_s_wrong) {
85
- TCGv_i32 tmp;
86
-
87
- tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
88
- if (s->v8m_secure) {
89
- tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
90
- } else {
91
- tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
92
- }
93
- store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
94
- /* Don't need to do this for any further FP insns in this TB */
95
- s->v8m_fpccr_s_wrong = false;
96
- }
97
-
98
- if (s->v7m_new_fp_ctxt_needed) {
99
- /*
100
- * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA,
101
- * the FPSCR, and VPR.
102
- */
103
- TCGv_i32 control, fpscr;
104
- uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
105
-
106
- fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
107
- gen_helper_vfp_set_fpscr(cpu_env, fpscr);
108
- tcg_temp_free_i32(fpscr);
109
- if (dc_isar_feature(aa32_mve, s)) {
110
- TCGv_i32 z32 = tcg_const_i32(0);
111
- store_cpu_field(z32, v7m.vpr);
112
- }
113
-
114
- /*
115
- * We don't need to arrange to end the TB, because the only
116
- * parts of FPSCR which we cache in the TB flags are the VECLEN
117
- * and VECSTRIDE, and those don't exist for M-profile.
118
- */
119
-
120
- if (s->v8m_secure) {
121
- bits |= R_V7M_CONTROL_SFPA_MASK;
122
- }
123
- control = load_cpu_field(v7m.control[M_REG_S]);
124
- tcg_gen_ori_i32(control, control, bits);
125
- store_cpu_field(control, v7m.control[M_REG_S]);
126
- /* Don't need to do this for any further FP insns in this TB */
127
- s->v7m_new_fp_ctxt_needed = false;
128
- }
129
+ /* Update ownership of FP context and create new FP context if needed */
130
+ gen_update_fp_context(s);
131
}
132
133
return true;
134
--
23
--
135
2.20.1
24
2.34.1
136
137
diff view generated by jsdifflib
1
Allow code elsewhere in the system to check whether the ACPI GHES
1
Now that all our targets have been converted to explicitly specify
2
table is present, so it can determine whether it is OK to try to
2
their pattern for the default NaN value we can remove the remaining
3
record an error by calling acpi_ghes_record_errors().
3
fallback code in parts64_default_nan().
4
5
(We don't need to migrate the new 'present' field in AcpiGhesState,
6
because it is set once at system initialization and doesn't change.)
7
4
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Dongjiu Geng <gengdongjiu1@gmail.com>
7
Message-id: 20241202131347.498124-55-peter.maydell@linaro.org
11
Message-id: 20210603171259.27962-3-peter.maydell@linaro.org
12
---
8
---
13
include/hw/acpi/ghes.h | 9 +++++++++
9
fpu/softfloat-specialize.c.inc | 14 --------------
14
hw/acpi/ghes-stub.c | 5 +++++
10
1 file changed, 14 deletions(-)
15
hw/acpi/ghes.c | 17 +++++++++++++++++
16
3 files changed, 31 insertions(+)
17
11
18
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
12
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
19
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
20
--- a/include/hw/acpi/ghes.h
14
--- a/fpu/softfloat-specialize.c.inc
21
+++ b/include/hw/acpi/ghes.h
15
+++ b/fpu/softfloat-specialize.c.inc
22
@@ -XXX,XX +XXX,XX @@ enum {
16
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
23
17
uint64_t frac;
24
typedef struct AcpiGhesState {
18
uint8_t dnan_pattern = status->default_nan_pattern;
25
uint64_t ghes_addr_le;
19
26
+ bool present; /* True if GHES is present at all on this board */
20
- if (dnan_pattern == 0) {
27
} AcpiGhesState;
21
- /*
28
22
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
29
void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
23
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
30
@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker,
24
- * do not have floating-point.
31
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
25
- */
32
GArray *hardware_errors);
26
- if (snan_bit_is_one(status)) {
33
int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
27
- /* sign bit clear, set all frac bits other than msb */
34
+
28
- dnan_pattern = 0b00111111;
35
+/**
29
- } else {
36
+ * acpi_ghes_present: Report whether ACPI GHES table is present
30
- /* sign bit clear, set frac msb */
37
+ *
31
- dnan_pattern = 0b01000000;
38
+ * Returns: true if the system has an ACPI GHES table and it is
32
- }
39
+ * safe to call acpi_ghes_record_errors() to record a memory error.
33
- }
40
+ */
34
assert(dnan_pattern != 0);
41
+bool acpi_ghes_present(void);
35
42
#endif
36
sign = dnan_pattern >> 7;
43
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/hw/acpi/ghes-stub.c
46
+++ b/hw/acpi/ghes-stub.c
47
@@ -XXX,XX +XXX,XX @@ int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
48
{
49
return -1;
50
}
51
+
52
+bool acpi_ghes_present(void)
53
+{
54
+ return false;
55
+}
56
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/hw/acpi/ghes.c
59
+++ b/hw/acpi/ghes.c
60
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
61
/* Create a read-write fw_cfg file for Address */
62
fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
63
NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
64
+
65
+ ags->present = true;
66
}
67
68
int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
69
@@ -XXX,XX +XXX,XX @@ int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
70
71
return ret;
72
}
73
+
74
+bool acpi_ghes_present(void)
75
+{
76
+ AcpiGedState *acpi_ged_state;
77
+ AcpiGhesState *ags;
78
+
79
+ acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
80
+ NULL));
81
+
82
+ if (!acpi_ged_state) {
83
+ return false;
84
+ }
85
+ ags = &acpi_ged_state->ghes_state;
86
+ return ags->present;
87
+}
88
--
37
--
89
2.20.1
38
2.34.1
90
91
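The 8-bit default NaN patterns used throughout this series (0b01000000, 0b11111111, 0b00111111) can be read as a compressed description of the NaN's sign and fraction. The sketch below expands such a pattern into a single-precision bit pattern. It assumes, based on the comments in the patches, that bit 7 is the sign, bits 6..0 are the top seven fraction bits, and bit 0 is replicated into all lower fraction bits. The function name is hypothetical and this is not the softfloat implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Expand an 8-bit default-NaN pattern into an IEEE binary32 encoding. */
static inline uint32_t sketch_default_nan_f32(uint8_t pattern)
{
    uint32_t sign = pattern >> 7;                 /* bit 7: sign */
    uint32_t top7 = pattern & 0x7f;               /* bits 6..0: frac[22:16] */
    uint32_t fill = (pattern & 1) ? 0xffff : 0;   /* bit 0 replicated down */
    uint32_t frac = (top7 << 16) | fill;          /* 23-bit fraction */

    /* An all-ones exponent marks the value as a NaN. */
    return (sign << 31) | (0xffu << 23) | frac;
}

/* Small self-check; call it from a main() to exercise the expansion. */
static inline void sketch_dnan_self_check(void)
{
    /* Sign bit clear, frac msb set: the common quiet NaN. */
    assert(sketch_default_nan_f32(0x40) == 0x7fc00000u);
}
```

Under the same assumption, the hexagon pattern 0b11111111 expands to 0xffffffff and the snan_bit_is_one pattern 0b00111111 expands to 0x7fbfffff.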
diff view generated by jsdifflib
1
From: Peter Collingbourne <pcc@google.com>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
MTE3 introduces an asymmetric tag checking mode, in which loads are
3
Inline pickNaNMulAdd into its only caller. This makes
4
checked synchronously and stores are checked asynchronously. Add
4
one assert redundant with the immediately preceding IF.
5
support for it.
6
5
7
Signed-off-by: Peter Collingbourne <pcc@google.com>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
9
Message-id: 20210616195614.11785-1-pcc@google.com
8
Message-id: 20241203203949.483774-3-richard.henderson@linaro.org
10
[PMM: Add line to emulation.rst]
9
[PMM: keep comment from old code in new location]
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
11
---
13
docs/system/arm/emulation.rst | 1 +
12
fpu/softfloat-parts.c.inc | 41 +++++++++++++++++++++++++-
14
target/arm/cpu64.c | 2 +-
13
fpu/softfloat-specialize.c.inc | 54 ----------------------------------
15
target/arm/mte_helper.c | 82 ++++++++++++++++++++++-------------
14
2 files changed, 40 insertions(+), 55 deletions(-)
16
3 files changed, 53 insertions(+), 32 deletions(-)
17
15
18
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/docs/system/arm/emulation.rst
18
--- a/fpu/softfloat-parts.c.inc
21
+++ b/docs/system/arm/emulation.rst
19
+++ b/fpu/softfloat-parts.c.inc
22
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
23
- FEAT_LSE (Large System Extensions)
21
}
24
- FEAT_MTE (Memory Tagging Extension)
22
25
- FEAT_MTE2 (Memory Tagging Extension)
23
if (s->default_nan_mode) {
26
+- FEAT_MTE3 (MTE Asymmetric Fault Handling)
24
+ /*
27
- FEAT_PAN (Privileged access never)
25
+ * We guarantee not to require the target to tell us how to
28
- FEAT_PAN2 (AT S1E1R and AT S1E1W instruction variants affected by PSTATE.PAN)
26
+ * pick a NaN if we're always returning the default NaN.
29
- FEAT_PAuth (Pointer authentication)
27
+ * But if we're not in default-NaN mode then the target must
30
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
28
+ * specify.
29
+ */
30
which = 3;
31
+ } else if (infzero) {
32
+ /*
33
+ * Inf * 0 + NaN -- some implementations return the
34
+ * default NaN here, and some return the input NaN.
35
+ */
36
+ switch (s->float_infzeronan_rule) {
37
+ case float_infzeronan_dnan_never:
38
+ which = 2;
39
+ break;
40
+ case float_infzeronan_dnan_always:
41
+ which = 3;
42
+ break;
43
+ case float_infzeronan_dnan_if_qnan:
44
+ which = is_qnan(c->cls) ? 3 : 2;
45
+ break;
46
+ default:
47
+ g_assert_not_reached();
48
+ }
49
} else {
50
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
51
+ FloatClass cls[3] = { a->cls, b->cls, c->cls };
52
+ Float3NaNPropRule rule = s->float_3nan_prop_rule;
53
+
54
+ assert(rule != float_3nan_prop_none);
55
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
56
+ /* We have at least one SNaN input and should prefer it */
57
+ do {
58
+ which = rule & R_3NAN_1ST_MASK;
59
+ rule >>= R_3NAN_1ST_LENGTH;
60
+ } while (!is_snan(cls[which]));
61
+ } else {
62
+ do {
63
+ which = rule & R_3NAN_1ST_MASK;
64
+ rule >>= R_3NAN_1ST_LENGTH;
65
+ } while (!is_nan(cls[which]));
66
+ }
67
}
68
69
if (which == 3) {
70
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
31
index XXXXXXX..XXXXXXX 100644
71
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/cpu64.c
72
--- a/fpu/softfloat-specialize.c.inc
33
+++ b/target/arm/cpu64.c
73
+++ b/fpu/softfloat-specialize.c.inc
34
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
74
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
35
* during realize if the board provides no tag memory, much like
36
* we do for EL2 with the virtualization=on property.
37
*/
38
- t = FIELD_DP64(t, ID_AA64PFR1, MTE, 2);
39
+ t = FIELD_DP64(t, ID_AA64PFR1, MTE, 3);
40
cpu->isar.id_aa64pfr1 = t;
41
42
t = cpu->isar.id_aa64mmfr0;
43
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/target/arm/mte_helper.c
46
+++ b/target/arm/mte_helper.c
47
@@ -XXX,XX +XXX,XX @@ void HELPER(stzgm_tags)(CPUARMState *env, uint64_t ptr, uint64_t val)
48
}
75
}
49
}
76
}
50
77
51
+static void mte_sync_check_fail(CPUARMState *env, uint32_t desc,
78
-/*----------------------------------------------------------------------------
52
+ uint64_t dirty_ptr, uintptr_t ra)
79
-| Select which NaN to propagate for a three-input operation.
53
+{
80
-| For the moment we assume that no CPU needs the 'larger significand'
54
+ int is_write, syn;
81
-| information.
55
+
82
-| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
56
+ env->exception.vaddress = dirty_ptr;
83
-*----------------------------------------------------------------------------*/
57
+
84
-static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
58
+ is_write = FIELD_EX32(desc, MTEDESC, WRITE);
85
- bool infzero, bool have_snan, float_status *status)
59
+ syn = syn_data_abort_no_iss(arm_current_el(env) != 0, 0, 0, 0, 0, is_write,
86
-{
60
+ 0x11);
87
- FloatClass cls[3] = { a_cls, b_cls, c_cls };
61
+ raise_exception_ra(env, EXCP_DATA_ABORT, syn, exception_target_el(env), ra);
88
- Float3NaNPropRule rule = status->float_3nan_prop_rule;
62
+ g_assert_not_reached();
89
- int which;
63
+}
64
+
65
+static void mte_async_check_fail(CPUARMState *env, uint64_t dirty_ptr,
66
+ uintptr_t ra, ARMMMUIdx arm_mmu_idx, int el)
67
+{
68
+ int select;
69
+
70
+ if (regime_has_2_ranges(arm_mmu_idx)) {
71
+ select = extract64(dirty_ptr, 55, 1);
72
+ } else {
73
+ select = 0;
74
+ }
75
+ env->cp15.tfsr_el[el] |= 1 << select;
76
+#ifdef CONFIG_USER_ONLY
77
+ /*
78
+ * Stand in for a timer irq, setting _TIF_MTE_ASYNC_FAULT,
79
+ * which then sends a SIGSEGV when the thread is next scheduled.
80
+ * This cpu will return to the main loop at the end of the TB,
81
+ * which is rather sooner than "normal". But the alternative
82
+ * is waiting until the next syscall.
83
+ */
84
+ qemu_cpu_kick(env_cpu(env));
85
+#endif
86
+}
87
+
88
/* Record a tag check failure. */
89
static void mte_check_fail(CPUARMState *env, uint32_t desc,
90
uint64_t dirty_ptr, uintptr_t ra)
91
{
92
int mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
93
ARMMMUIdx arm_mmu_idx = core_to_aa64_mmu_idx(mmu_idx);
94
- int el, reg_el, tcf, select, is_write, syn;
95
+ int el, reg_el, tcf;
96
uint64_t sctlr;
97
98
reg_el = regime_el(env, arm_mmu_idx);
99
@@ -XXX,XX +XXX,XX @@ static void mte_check_fail(CPUARMState *env, uint32_t desc,
100
switch (tcf) {
101
case 1:
102
/* Tag check fail causes a synchronous exception. */
103
- env->exception.vaddress = dirty_ptr;
104
-
90
-
105
- is_write = FIELD_EX32(desc, MTEDESC, WRITE);
91
- /*
106
- syn = syn_data_abort_no_iss(arm_current_el(env) != 0, 0, 0, 0, 0,
92
- * We guarantee not to require the target to tell us how to
107
- is_write, 0x11);
93
- * pick a NaN if we're always returning the default NaN.
108
- raise_exception_ra(env, EXCP_DATA_ABORT, syn,
94
- * But if we're not in default-NaN mode then the target must
109
- exception_target_el(env), ra);
95
- * specify.
110
- /* noreturn, but fall through to the assert anyway */
96
- */
111
+ mte_sync_check_fail(env, desc, dirty_ptr, ra);
97
- assert(!status->default_nan_mode);
112
+ break;
98
-
113
99
- if (infzero) {
114
case 0:
100
- /*
115
/*
101
- * Inf * 0 + NaN -- some implementations return the default NaN here,
116
@@ -XXX,XX +XXX,XX @@ static void mte_check_fail(CPUARMState *env, uint32_t desc,
102
- * and some return the input NaN.
117
103
- */
118
case 2:
104
- switch (status->float_infzeronan_rule) {
119
/* Tag check fail causes asynchronous flag set. */
105
- case float_infzeronan_dnan_never:
120
- if (regime_has_2_ranges(arm_mmu_idx)) {
106
- return 2;
121
- select = extract64(dirty_ptr, 55, 1);
107
- case float_infzeronan_dnan_always:
122
- } else {
108
- return 3;
123
- select = 0;
109
- case float_infzeronan_dnan_if_qnan:
110
- return is_qnan(c_cls) ? 3 : 2;
111
- default:
112
- g_assert_not_reached();
124
- }
113
- }
125
- env->cp15.tfsr_el[el] |= 1 << select;
114
- }
126
-#ifdef CONFIG_USER_ONLY
115
-
127
- /*
116
- assert(rule != float_3nan_prop_none);
128
- * Stand in for a timer irq, setting _TIF_MTE_ASYNC_FAULT,
117
- if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
129
- * which then sends a SIGSEGV when the thread is next scheduled.
118
- /* We have at least one SNaN input and should prefer it */
130
- * This cpu will return to the main loop at the end of the TB,
119
- do {
131
- * which is rather sooner than "normal". But the alternative
120
- which = rule & R_3NAN_1ST_MASK;
132
- * is waiting until the next syscall.
121
- rule >>= R_3NAN_1ST_LENGTH;
133
- */
122
- } while (!is_snan(cls[which]));
134
- qemu_cpu_kick(env_cpu(env));
123
- } else {
135
-#endif
124
- do {
136
+ mte_async_check_fail(env, dirty_ptr, ra, arm_mmu_idx, el);
125
- which = rule & R_3NAN_1ST_MASK;
137
break;
126
- rule >>= R_3NAN_1ST_LENGTH;
138
127
- } while (!is_nan(cls[which]));
139
- default:
128
- }
140
- /* Case 3: Reserved. */
129
- return which;
141
- qemu_log_mask(LOG_GUEST_ERROR,
130
-}
142
- "Tag check failure with SCTLR_EL%d.TCF%s "
131
-
143
- "set to reserved value %d\n",
132
/*----------------------------------------------------------------------------
144
- reg_el, el ? "" : "0", tcf);
133
| Returns 1 if the double-precision floating-point value `a' is a quiet
145
+ case 3:
134
| NaN; otherwise returns 0.
146
+ /*
147
+ * Tag check fail causes asynchronous flag set for stores, or
148
+ * a synchronous exception for loads.
149
+ */
150
+ if (FIELD_EX32(desc, MTEDESC, WRITE)) {
151
+ mte_async_check_fail(env, dirty_ptr, ra, arm_mmu_idx, el);
152
+ } else {
153
+ mte_sync_check_fail(env, desc, dirty_ptr, ra);
154
+ }
155
break;
156
}
157
}
158
--
135
--
159
2.20.1
136
2.34.1
160
137
161
138
1
Implement the scalar variants of the MVE VHADD and VHSUB insns.
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Remove "3" as a special case for which and simply
4
branch to return the desired value.
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-4-richard.henderson@linaro.org
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-25-peter.maydell@linaro.org
6
---
10
---
7
target/arm/helper-mve.h | 16 ++++++++++++++++
11
fpu/softfloat-parts.c.inc | 20 ++++++++++----------
8
target/arm/mve.decode | 4 ++++
12
1 file changed, 10 insertions(+), 10 deletions(-)
9
target/arm/mve_helper.c | 8 ++++++++
10
target/arm/translate-mve.c | 4 ++++
11
4 files changed, 32 insertions(+)
12
13
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
14
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
16
--- a/fpu/softfloat-parts.c.inc
16
+++ b/target/arm/helper-mve.h
17
+++ b/fpu/softfloat-parts.c.inc
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vmul_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
18
DEF_HELPER_FLAGS_4(mve_vmul_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
* But if we're not in default-NaN mode then the target must
19
DEF_HELPER_FLAGS_4(mve_vmul_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
* specify.
20
21
*/
21
+DEF_HELPER_FLAGS_4(mve_vhadds_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
- which = 3;
22
+DEF_HELPER_FLAGS_4(mve_vhadds_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
+ goto default_nan;
23
+DEF_HELPER_FLAGS_4(mve_vhadds_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
} else if (infzero) {
25
/*
26
* Inf * 0 + NaN -- some implementations return the
27
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
28
*/
29
switch (s->float_infzeronan_rule) {
30
case float_infzeronan_dnan_never:
31
- which = 2;
32
break;
33
case float_infzeronan_dnan_always:
34
- which = 3;
35
- break;
36
+ goto default_nan;
37
case float_infzeronan_dnan_if_qnan:
38
- which = is_qnan(c->cls) ? 3 : 2;
39
+ if (is_qnan(c->cls)) {
40
+ goto default_nan;
41
+ }
42
break;
43
default:
44
g_assert_not_reached();
45
}
46
+ which = 2;
47
} else {
48
FloatClass cls[3] = { a->cls, b->cls, c->cls };
49
Float3NaNPropRule rule = s->float_3nan_prop_rule;
50
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
51
}
52
}
53
54
- if (which == 3) {
55
- parts_default_nan(a, s);
56
- return a;
57
- }
58
-
59
switch (which) {
60
case 0:
61
break;
62
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
63
parts_silence_nan(a, s);
64
}
65
return a;
24
+
66
+
25
+DEF_HELPER_FLAGS_4(mve_vhaddu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
67
+ default_nan:
26
+DEF_HELPER_FLAGS_4(mve_vhaddu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
68
+ parts_default_nan(a, s);
27
+DEF_HELPER_FLAGS_4(mve_vhaddu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
69
+ return a;
28
+
70
}
29
+DEF_HELPER_FLAGS_4(mve_vhsubs_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(mve_vhsubs_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(mve_vhsubs_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
+
33
+DEF_HELPER_FLAGS_4(mve_vhsubu_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(mve_vhsubu_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(mve_vhsubu_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
+
37
DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
38
DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
39
DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64)
40
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/mve.decode
43
+++ b/target/arm/mve.decode
44
@@ -XXX,XX +XXX,XX @@ VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_no
45
VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
46
VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar
47
VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar
48
+VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
49
+VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar
50
+VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
51
+VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/mve_helper.c
55
+++ b/target/arm/mve_helper.c
56
@@ -XXX,XX +XXX,XX @@ DO_2OP_U(vhsubu, do_vhsub_u)
57
DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \
58
DO_2OP_SCALAR(OP##h, 2, uint16_t, FN) \
59
DO_2OP_SCALAR(OP##w, 4, uint32_t, FN)
60
+#define DO_2OP_SCALAR_S(OP, FN) \
61
+ DO_2OP_SCALAR(OP##b, 1, int8_t, FN) \
62
+ DO_2OP_SCALAR(OP##h, 2, int16_t, FN) \
63
+ DO_2OP_SCALAR(OP##w, 4, int32_t, FN)
64
65
DO_2OP_SCALAR_U(vadd_scalar, DO_ADD)
66
DO_2OP_SCALAR_U(vsub_scalar, DO_SUB)
67
DO_2OP_SCALAR_U(vmul_scalar, DO_MUL)
68
+DO_2OP_SCALAR_S(vhadds_scalar, do_vhadd_s)
69
+DO_2OP_SCALAR_U(vhaddu_scalar, do_vhadd_u)
70
+DO_2OP_SCALAR_S(vhsubs_scalar, do_vhsub_s)
71
+DO_2OP_SCALAR_U(vhsubu_scalar, do_vhsub_u)
72
71
73
/*
72
/*
74
* Multiply add long dual accumulate ops.
75
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
76
index XXXXXXX..XXXXXXX 100644
77
--- a/target/arm/translate-mve.c
78
+++ b/target/arm/translate-mve.c
79
@@ -XXX,XX +XXX,XX @@ static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
80
DO_2OP_SCALAR(VADD_scalar, vadd_scalar)
81
DO_2OP_SCALAR(VSUB_scalar, vsub_scalar)
82
DO_2OP_SCALAR(VMUL_scalar, vmul_scalar)
83
+DO_2OP_SCALAR(VHADD_S_scalar, vhadds_scalar)
84
+DO_2OP_SCALAR(VHADD_U_scalar, vhaddu_scalar)
85
+DO_2OP_SCALAR(VHSUB_S_scalar, vhsubs_scalar)
86
+DO_2OP_SCALAR(VHSUB_U_scalar, vhsubu_scalar)
87
88
static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
89
MVEGenDualAccOpFn *fn)
90
--
73
--
91
2.20.1
74
2.34.1
92
75
93
76
1
Implement the MVE VADD, VSUB and VMUL insns.
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Assign the pointer return value to 'a' directly,
4
rather than going through an intermediary index.
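For illustration only, the refactoring described above can be sketched outside QEMU. This is a minimal sketch, not the softfloat code: `Parts`, `pick_by_index` and `pick_by_pointer` are hypothetical names, and the "first NaN wins" selection is an assumed stand-in for the real propagation rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for FloatPartsN; only the field used here. */
typedef struct Parts {
    int is_nan;
} Parts;

/* Before: compute an index, then translate it to a pointer with a switch. */
Parts *pick_by_index(Parts *a, Parts *b, Parts *c)
{
    int which;

    assert(a && b && c);
    if (a->is_nan) {
        which = 0;
    } else if (b->is_nan) {
        which = 1;
    } else {
        which = 2;
    }

    switch (which) {
    case 0:
        return a;
    case 1:
        return b;
    default:
        return c;
    }
}

/* After: keep the candidates in an array and assign the pointer directly. */
Parts *pick_by_pointer(Parts *a, Parts *b, Parts *c)
{
    Parts *val[3] = { a, b, c };
    Parts *ret = c;

    assert(a && b && c);
    for (size_t i = 0; i < 3; i++) {
        if (val[i]->is_nan) {
            ret = val[i];
            break;
        }
    }
    return ret;
}
```

Both functions return the same operand for the same inputs; the second form simply has no index-to-pointer translation step left to maintain.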
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-5-richard.henderson@linaro.org
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20210617121628.20116-13-peter.maydell@linaro.org
6
---
10
---
7
target/arm/helper-mve.h | 12 ++++++++++++
11
fpu/softfloat-parts.c.inc | 32 ++++++++++----------------------
8
target/arm/mve.decode | 5 +++++
12
1 file changed, 10 insertions(+), 22 deletions(-)
9
target/arm/mve_helper.c | 14 ++++++++++++++
10
target/arm/translate-mve.c | 16 ++++++++++++++++
11
4 files changed, 47 insertions(+)
12
13
13
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
14
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-mve.h
16
--- a/fpu/softfloat-parts.c.inc
16
+++ b/target/arm/helper-mve.h
17
+++ b/fpu/softfloat-parts.c.inc
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vbic, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
18
DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
19
FloatPartsN *c, float_status *s,
19
DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
20
int ab_mask, int abc_mask)
20
DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
21
{
21
+
22
- int which;
22
+DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
23
bool infzero = (ab_mask == float_cmask_infzero);
23
+DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
24
bool have_snan = (abc_mask & float_cmask_snan);
24
+DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
25
+ FloatPartsN *ret;
25
+
26
26
+DEF_HELPER_FLAGS_4(mve_vsubb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
27
if (unlikely(have_snan)) {
27
+DEF_HELPER_FLAGS_4(mve_vsubh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
28
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
28
+DEF_HELPER_FLAGS_4(mve_vsubw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
29
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
29
+
30
default:
30
+DEF_HELPER_FLAGS_4(mve_vmulb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
31
g_assert_not_reached();
31
+DEF_HELPER_FLAGS_4(mve_vmulh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
32
}
32
+DEF_HELPER_FLAGS_4(mve_vmulw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
33
- which = 2;
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
+ ret = c;
34
index XXXXXXX..XXXXXXX 100644
35
} else {
35
--- a/target/arm/mve.decode
36
- FloatClass cls[3] = { a->cls, b->cls, c->cls };
36
+++ b/target/arm/mve.decode
37
+ FloatPartsN *val[3] = { a, b, c };
37
@@ -XXX,XX +XXX,XX @@
38
Float3NaNPropRule rule = s->float_3nan_prop_rule;
38
39
39
@1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm
40
assert(rule != float_3nan_prop_none);
40
@1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0
41
if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
41
+@2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn
42
/* We have at least one SNaN input and should prefer it */
42
@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
43
do {
43
44
- which = rule & R_3NAN_1ST_MASK;
44
# Vector loads and stores
45
+ ret = val[rule & R_3NAN_1ST_MASK];
45
@@ -XXX,XX +XXX,XX @@ VORR 1110 1111 0 . 10 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
46
rule >>= R_3NAN_1ST_LENGTH;
46
VORN 1110 1111 0 . 11 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
47
- } while (!is_snan(cls[which]));
47
VEOR 1111 1111 0 . 00 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz
48
+ } while (!is_snan(ret->cls));
48
49
} else {
49
+VADD 1110 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
50
do {
50
+VSUB 1111 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
51
- which = rule & R_3NAN_1ST_MASK;
51
+VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
52
+ ret = val[rule & R_3NAN_1ST_MASK];
52
+
53
rule >>= R_3NAN_1ST_LENGTH;
53
# Vector miscellaneous
54
- } while (!is_nan(cls[which]));
54
55
+ } while (!is_nan(ret->cls));
55
VCLS 1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
56
}
56
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/target/arm/mve_helper.c
59
+++ b/target/arm/mve_helper.c
60
@@ -XXX,XX +XXX,XX @@ DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
61
mve_advance_vpt(env); \
62
}
57
}
63
58
64
+/* provide unsigned 2-op helpers for all sizes */
59
- switch (which) {
65
+#define DO_2OP_U(OP, FN) \
60
- case 0:
66
+ DO_2OP(OP##b, 1, uint8_t, FN) \
61
- break;
67
+ DO_2OP(OP##h, 2, uint16_t, FN) \
62
- case 1:
68
+ DO_2OP(OP##w, 4, uint32_t, FN)
63
- a = b;
69
+
64
- break;
70
#define DO_AND(N, M) ((N) & (M))
65
- case 2:
71
#define DO_BIC(N, M) ((N) & ~(M))
66
- a = c;
72
#define DO_ORR(N, M) ((N) | (M))
67
- break;
73
@@ -XXX,XX +XXX,XX @@ DO_2OP(vbic, 8, uint64_t, DO_BIC)
68
- default:
74
DO_2OP(vorr, 8, uint64_t, DO_ORR)
69
- g_assert_not_reached();
75
DO_2OP(vorn, 8, uint64_t, DO_ORN)
70
+ if (is_snan(ret->cls)) {
76
DO_2OP(veor, 8, uint64_t, DO_EOR)
71
+ parts_silence_nan(ret, s);
77
+
72
}
78
+#define DO_ADD(N, M) ((N) + (M))
73
- if (is_snan(a->cls)) {
79
+#define DO_SUB(N, M) ((N) - (M))
74
- parts_silence_nan(a, s);
80
+#define DO_MUL(N, M) ((N) * (M))
75
- }
81
+
76
- return a;
82
+DO_2OP_U(vadd, DO_ADD)
77
+ return ret;
83
+DO_2OP_U(vsub, DO_SUB)
78
84
+DO_2OP_U(vmul, DO_MUL)
79
default_nan:
85
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
80
parts_default_nan(a, s);
86
index XXXXXXX..XXXXXXX 100644
87
--- a/target/arm/translate-mve.c
88
+++ b/target/arm/translate-mve.c
89
@@ -XXX,XX +XXX,XX @@ DO_LOGIC(VBIC, gen_helper_mve_vbic)
90
DO_LOGIC(VORR, gen_helper_mve_vorr)
91
DO_LOGIC(VORN, gen_helper_mve_vorn)
92
DO_LOGIC(VEOR, gen_helper_mve_veor)
93
+
94
+#define DO_2OP(INSN, FN) \
95
+ static bool trans_##INSN(DisasContext *s, arg_2op *a) \
96
+ { \
97
+ static MVEGenTwoOpFn * const fns[] = { \
98
+ gen_helper_mve_##FN##b, \
99
+ gen_helper_mve_##FN##h, \
100
+ gen_helper_mve_##FN##w, \
101
+ NULL, \
102
+ }; \
103
+ return do_2op(s, a, fns[a->size]); \
104
+ }
105
+
106
+DO_2OP(VADD, vadd)
107
+DO_2OP(VSUB, vsub)
108
+DO_2OP(VMUL, vmul)
109
--
81
--
110
2.20.1
82
2.34.1
111
83
112
84
1
If the guest makes an FPCXT_NS access when the FPU is disabled,
1
From: Richard Henderson <richard.henderson@linaro.org>
2
one of two things happens:
3
* if there is no active FP context, then the insn behaves the
4
same way as if the FPU was enabled: writes ignored, reads
5
same value as FPDSCR_NS
6
* if there is an active FP context, then we take a NOCP
7
exception
8
2
9
Add code to the sysreg read/write functions which emits
3
While all indices into val[] should be in [0-2], the mask
10
code to take the NOCP exception in the latter case.
4
applied is two bits. To help static analysis see there is
5
no possibility of read beyond the end of the array, pad the
6
array to 4 entries, with the final being (implicitly) NULL.
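The padding idea can be sketched in isolation. This is a minimal sketch under stated assumptions, not the QEMU code: `FIELD_MASK`, `FIELD_LENGTH`, `Operand` and `first_nan_by_rule` are hypothetical names, and the 2-bit-per-selector rule encoding is assumed for the example. Unlike the patch, the sketch also checks the padded slot explicitly so that it never dereferences NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-bit selector layout, assumed for this sketch. */
#define FIELD_MASK   0x3u
#define FIELD_LENGTH 2

typedef struct Operand {
    int is_nan;
} Operand;

/*
 * Walk a rule made of 2-bit operand selectors and return the first
 * selected operand that is a NaN, or NULL if none is found.
 */
const Operand *first_nan_by_rule(const Operand *a, const Operand *b,
                                 const Operand *c, unsigned rule)
{
    /*
     * Sized by the mask, not by the operand count: every masked index
     * is in bounds, and the unused slot 3 is implicitly NULL.
     */
    const Operand *val[FIELD_MASK + 1] = { a, b, c };

    assert(a && b && c);
    for (int i = 0; i < 3; i++) {
        const Operand *cand = val[rule & FIELD_MASK];

        if (cand == NULL) {
            return NULL;    /* selector 3 names no operand */
        }
        if (cand->is_nan) {
            return cand;
        }
        rule >>= FIELD_LENGTH;
    }
    return NULL;
}
```

With a three-entry array, a static analyser cannot rule out an access to `val[3]` because the mask admits that value; sizing the array as mask plus one makes the bound evident from the declaration alone.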
11
7
12
At the moment this will never be used, because the NOCP checks in
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
m-nocp.decode happen first, and so the trans functions are never
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
14
called when the FPU is disabled. The code will be needed when we
10
Message-id: 20241203203949.483774-6-richard.henderson@linaro.org
15
move the sysreg access insns to before the NOCP patterns in the
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
following commit.
12
---
13
fpu/softfloat-parts.c.inc | 2 +-
14
1 file changed, 1 insertion(+), 1 deletion(-)
17
15
18
Cc: qemu-stable@nongnu.org
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
21
Message-id: 20210618141019.10671-3-peter.maydell@linaro.org
22
---
23
target/arm/translate-vfp.c | 32 ++++++++++++++++++++++++++++++--
24
1 file changed, 30 insertions(+), 2 deletions(-)
25
26
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
27
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/translate-vfp.c
18
--- a/fpu/softfloat-parts.c.inc
29
+++ b/target/arm/translate-vfp.c
19
+++ b/fpu/softfloat-parts.c.inc
30
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
31
lab_end = gen_new_label();
21
}
32
/* fpInactive case: write is a NOP, so branch to end */
22
ret = c;
33
gen_branch_fpInactive(s, TCG_COND_NE, lab_end);
23
} else {
34
- /* !fpInactive: PreserveFPState(), and reads same as FPCXT_S */
24
- FloatPartsN *val[3] = { a, b, c };
35
+ /*
25
+ FloatPartsN *val[R_3NAN_1ST_MASK + 1] = { a, b, c };
36
+ * !fpInactive: if FPU disabled, take NOCP exception;
26
Float3NaNPropRule rule = s->float_3nan_prop_rule;
37
+ * otherwise PreserveFPState(), and then FPCXT_NS writes
27
38
+ * behave the same as FPCXT_S writes.
28
assert(rule != float_3nan_prop_none);
39
+ */
40
+ if (s->fp_excp_el) {
41
+ gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
42
+ syn_uncategorized(), s->fp_excp_el);
43
+ /*
44
+ * This was only a conditional exception, so override
45
+ * gen_exception_insn()'s default to DISAS_NORETURN
46
+ */
47
+ s->base.is_jmp = DISAS_NEXT;
48
+ break;
49
+ }
50
gen_preserve_fp_state(s);
51
/* fall through */
52
case ARM_VFP_FPCXT_S:
53
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
54
tcg_gen_br(lab_end);
55
56
gen_set_label(lab_active);
57
- /* !fpInactive: Reads the same as FPCXT_S, but side effects differ */
58
+ /*
59
+ * !fpInactive: if FPU disabled, take NOCP exception;
60
+ * otherwise PreserveFPState(), and then FPCXT_NS
61
+ * reads the same as FPCXT_S.
62
+ */
63
+ if (s->fp_excp_el) {
64
+ gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
65
+ syn_uncategorized(), s->fp_excp_el);
66
+ /*
67
+ * This was only a conditional exception, so override
68
+ * gen_exception_insn()'s default to DISAS_NORETURN
69
+ */
70
+ s->base.is_jmp = DISAS_NEXT;
71
+ break;
72
+ }
73
gen_preserve_fp_state(s);
74
tmp = tcg_temp_new_i32();
75
sfpa = tcg_temp_new_i32();
76
--
29
--
77
2.20.1
30
2.34.1
78
31
79
32
1
The M-profile architecture requires that accesses to FPCXT_NS when
1
From: Richard Henderson <richard.henderson@linaro.org>
2
there is no active FP state must not take a NOCP fault even if the
3
FPU is disabled. We were not implementing this correctly, because
4
in our decode we catch the NOCP faults early in m-nocp.decode.
5
2
6
Fix this bug by moving all the handling of M-profile FP system
3
This function is part of the public interface and
7
register accesses from vfp.decode into m-nocp.decode and putting
4
is not "specialized" to any target in any way.
8
it above the NOCP blocks. This provides the correct behaviour:
9
* for accesses other than FPCXT_NS the trans functions call
10
vfp_access_check(), which will check for FPU disabled and
11
raise a NOCP exception if necessary
12
* for FPCXT_NS we have the special case code that doesn't
13
call vfp_access_check()
14
* when these trans functions want to raise an UNDEF they return
15
false, so the decoder will fall through into the NOCP blocks.
16
This means that NOCP correctly takes precedence over UNDEF
17
for these insns. (This is a difference from the other insns
18
handled by m-nocp.decode, where UNDEF takes precedence and
19
which we implement by having those trans functions call
20
unallocated_encoding() in the appropriate places.)
21
5
22
[Note for backport to stable: this commit has a semantic dependency
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
23
on commit 9a486856e9173af, which was not marked as cc-stable because
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
24
we didn't know we'd need it for a for-stable bugfix.]
8
Message-id: 20241203203949.483774-7-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
fpu/softfloat.c | 52 ++++++++++++++++++++++++++++++++++
12
fpu/softfloat-specialize.c.inc | 52 ----------------------------------
13
2 files changed, 52 insertions(+), 52 deletions(-)
25
14
26
Cc: qemu-stable@nongnu.org
15
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
27
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
28
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
29
Message-id: 20210618141019.10671-4-peter.maydell@linaro.org
30
---
31
target/arm/translate-a32.h | 1 +
32
target/arm/m-nocp.decode | 24 ++
33
target/arm/vfp.decode | 14 -
34
target/arm/translate-m-nocp.c | 514 +++++++++++++++++++++++++++++++++
35
target/arm/translate-vfp.c | 517 +---------------------------------
36
5 files changed, 542 insertions(+), 528 deletions(-)
37
38
diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
39
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate-a32.h
17
--- a/fpu/softfloat.c
41
+++ b/target/arm/translate-a32.h
18
+++ b/fpu/softfloat.c
42
@@ -XXX,XX +XXX,XX @@ bool disas_neon_shared(DisasContext *s, uint32_t insn);
19
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
43
void load_reg_var(DisasContext *s, TCGv_i32 var, int reg);
20
*zExpPtr = 1 - shiftCount;
44
void arm_gen_condlabel(DisasContext *s);
21
}
45
bool vfp_access_check(DisasContext *s);
22
46
+void gen_preserve_fp_state(DisasContext *s);
23
+/*----------------------------------------------------------------------------
47
void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop);
24
+| Takes two extended double-precision floating-point values `a' and `b', one
48
void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop);
25
+| of which is a NaN, and returns the appropriate NaN result. If either `a' or
49
void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop);
26
+| `b' is a signaling NaN, the invalid exception is raised.
50
diff --git a/target/arm/m-nocp.decode b/target/arm/m-nocp.decode
27
+*----------------------------------------------------------------------------*/
51
index XXXXXXX..XXXXXXX 100644
52
--- a/target/arm/m-nocp.decode
53
+++ b/target/arm/m-nocp.decode
54
@@ -XXX,XX +XXX,XX @@
55
56
&nocp cp
57
58
+# M-profile VLDR/VSTR to sysreg
59
+%vldr_sysreg 22:1 13:3
60
+%imm7_0x4 0:7 !function=times_4
61
+
28
+
62
+&vldr_sysreg rn reg imm a w p
29
+floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
63
+@vldr_sysreg .... ... . a:1 . . . rn:4 ... . ... .. ....... \
30
+{
64
+ reg=%vldr_sysreg imm=%imm7_0x4 &vldr_sysreg
31
+ bool aIsLargerSignificand;
32
+ FloatClass a_cls, b_cls;
65
+
33
+
66
{
34
+ /* This is not complete, but is good enough for pickNaN. */
67
# Special cases which do not take an early NOCP: VLLDM and VLSTM
35
+ a_cls = (!floatx80_is_any_nan(a)
68
VLLDM_VLSTM 1110 1100 001 l:1 rn:4 0000 1010 op:1 000 0000
36
+ ? float_class_normal
69
@@ -XXX,XX +XXX,XX @@
37
+ : floatx80_is_signaling_nan(a, status)
70
VSCCLRM 1110 1100 1.01 1111 .... 1011 imm:7 0 vd=%vd_dp size=3
38
+ ? float_class_snan
71
VSCCLRM 1110 1100 1.01 1111 .... 1010 imm:8 vd=%vd_sp size=2
39
+ : float_class_qnan);
72
40
+ b_cls = (!floatx80_is_any_nan(b)
73
+ # FP system register accesses: these are a special case because accesses
41
+ ? float_class_normal
74
+ # to FPCXT_NS succeed even if the FPU is disabled. We therefore need
42
+ : floatx80_is_signaling_nan(b, status)
75
+ # to handle them before the big NOCP blocks. Note that within these
43
+ ? float_class_snan
76
+ # insns NOCP still has higher priority than UNDEFs; this is implemented
44
+ : float_class_qnan);
77
+ # by their returning 'false' for UNDEF so as to fall through into the
78
+ # NOCP check (in contrast to VLLDM etc, which call unallocated_encoding()
79
+ # for the UNDEFs there that must take precedence over NOCP.)
80
+
45
+
81
+ VMSR_VMRS ---- 1110 111 l:1 reg:4 rt:4 1010 0001 0000
46
+ if (is_snan(a_cls) || is_snan(b_cls)) {
82
+
47
+ float_raise(float_flag_invalid, status);
83
+ # P=0 W=0 is SEE "Related encodings", so split into two patterns
84
+ VLDR_sysreg ---- 110 1 . . w:1 1 .... ... 0 111 11 ....... @vldr_sysreg p=1
85
+ VLDR_sysreg ---- 110 0 . . 1 1 .... ... 0 111 11 ....... @vldr_sysreg p=0 w=1
86
+ VSTR_sysreg ---- 110 1 . . w:1 0 .... ... 0 111 11 ....... @vldr_sysreg p=1
87
+ VSTR_sysreg ---- 110 0 . . 1 0 .... ... 0 111 11 ....... @vldr_sysreg p=0 w=1
88
+
89
NOCP 111- 1110 ---- ---- ---- cp:4 ---- ---- &nocp
90
NOCP 111- 110- ---- ---- ---- cp:4 ---- ---- &nocp
91
# From v8.1M onwards this range will also NOCP:
92
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
93
index XXXXXXX..XXXXXXX 100644
94
--- a/target/arm/vfp.decode
95
+++ b/target/arm/vfp.decode
96
@@ -XXX,XX +XXX,XX @@ VLDR_VSTR_hp ---- 1101 u:1 .0 l:1 rn:4 .... 1001 imm:8 vd=%vd_sp
97
VLDR_VSTR_sp ---- 1101 u:1 .0 l:1 rn:4 .... 1010 imm:8 vd=%vd_sp
98
VLDR_VSTR_dp ---- 1101 u:1 .0 l:1 rn:4 .... 1011 imm:8 vd=%vd_dp
99
100
-# M-profile VLDR/VSTR to sysreg
101
-%vldr_sysreg 22:1 13:3
102
-%imm7_0x4 0:7 !function=times_4
103
-
104
-&vldr_sysreg rn reg imm a w p
105
-@vldr_sysreg .... ... . a:1 . . . rn:4 ... . ... .. ....... \
106
- reg=%vldr_sysreg imm=%imm7_0x4 &vldr_sysreg
107
-
108
-# P=0 W=0 is SEE "Related encodings", so split into two patterns
109
-VLDR_sysreg ---- 110 1 . . w:1 1 .... ... 0 111 11 ....... @vldr_sysreg p=1
110
-VLDR_sysreg ---- 110 0 . . 1 1 .... ... 0 111 11 ....... @vldr_sysreg p=0 w=1
111
-VSTR_sysreg ---- 110 1 . . w:1 0 .... ... 0 111 11 ....... @vldr_sysreg p=1
112
-VSTR_sysreg ---- 110 0 . . 1 0 .... ... 0 111 11 ....... @vldr_sysreg p=0 w=1
113
-
114
# We split the load/store multiple up into two patterns to avoid
115
# overlap with other insns in the "Advanced SIMD load/store and 64-bit move"
116
# grouping:
117
diff --git a/target/arm/translate-m-nocp.c b/target/arm/translate-m-nocp.c
118
index XXXXXXX..XXXXXXX 100644
119
--- a/target/arm/translate-m-nocp.c
120
+++ b/target/arm/translate-m-nocp.c
121
@@ -XXX,XX +XXX,XX @@
122
123
#include "qemu/osdep.h"
124
#include "tcg/tcg-op.h"
125
+#include "tcg/tcg-op-gvec.h"
126
#include "translate.h"
127
#include "translate-a32.h"
128
129
@@ -XXX,XX +XXX,XX @@ static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
130
return true;
131
}
132
133
+/*
134
+ * M-profile provides two different sets of instructions that can
135
+ * access floating point system registers: VMSR/VMRS (which move
136
+ * to/from a general purpose register) and VLDR/VSTR sysreg (which
137
+ * move directly to/from memory). In some cases there are also side
138
+ * effects which must happen after any write to memory (which could
139
+ * cause an exception). So we implement the common logic for the
140
+ * sysreg access in gen_M_fp_sysreg_write() and gen_M_fp_sysreg_read(),
141
+ * which take pointers to callback functions which will perform the
142
+ * actual "read/write general purpose register" and "read/write
143
+ * memory" operations.
144
+ */
145
+
146
+/*
147
+ * Emit code to store the sysreg to its final destination; frees the
148
+ * TCG temp 'value' it is passed.
149
+ */
150
+typedef void fp_sysreg_storefn(DisasContext *s, void *opaque, TCGv_i32 value);
151
+/*
152
+ * Emit code to load the value to be copied to the sysreg; returns
153
+ * a new TCG temporary
154
+ */
155
+typedef TCGv_i32 fp_sysreg_loadfn(DisasContext *s, void *opaque);
156
+
157
+/* Common decode/access checks for fp sysreg read/write */
158
+typedef enum FPSysRegCheckResult {
159
+ FPSysRegCheckFailed, /* caller should return false */
160
+ FPSysRegCheckDone, /* caller should return true */
161
+ FPSysRegCheckContinue, /* caller should continue generating code */
162
+} FPSysRegCheckResult;
163
+
164
+static FPSysRegCheckResult fp_sysreg_checks(DisasContext *s, int regno)
165
+{
166
+ if (!dc_isar_feature(aa32_fpsp_v2, s) && !dc_isar_feature(aa32_mve, s)) {
167
+ return FPSysRegCheckFailed;
168
+ }
48
+ }
169
+
49
+
170
+ switch (regno) {
50
+ if (status->default_nan_mode) {
171
+ case ARM_VFP_FPSCR:
51
+ return floatx80_default_nan(status);
172
+ case QEMU_VFP_FPSCR_NZCV:
173
+ break;
174
+ case ARM_VFP_FPSCR_NZCVQC:
175
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
176
+ return FPSysRegCheckFailed;
177
+ }
178
+ break;
179
+ case ARM_VFP_FPCXT_S:
180
+ case ARM_VFP_FPCXT_NS:
181
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
182
+ return FPSysRegCheckFailed;
183
+ }
184
+ if (!s->v8m_secure) {
185
+ return FPSysRegCheckFailed;
186
+ }
187
+ break;
188
+ case ARM_VFP_VPR:
189
+ case ARM_VFP_P0:
190
+ if (!dc_isar_feature(aa32_mve, s)) {
191
+ return FPSysRegCheckFailed;
192
+ }
193
+ break;
194
+ default:
195
+ return FPSysRegCheckFailed;
196
+ }
52
+ }
197
+
53
+
198
+ /*
54
+ if (a.low < b.low) {
199
+ * FPCXT_NS is a special case: it has specific handling for
55
+ aIsLargerSignificand = 0;
200
+ * "current FP state is inactive", and must do the PreserveFPState()
56
+ } else if (b.low < a.low) {
201
+ * but not the usual full set of actions done by ExecuteFPCheck().
57
+ aIsLargerSignificand = 1;
202
+ * So we don't call vfp_access_check() and the callers must handle this.
58
+ } else {
203
+ */
59
+ aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
204
+ if (regno != ARM_VFP_FPCXT_NS && !vfp_access_check(s)) {
205
+ return FPSysRegCheckDone;
206
+ }
207
+ return FPSysRegCheckContinue;
208
+}
209
+
210
+static void gen_branch_fpInactive(DisasContext *s, TCGCond cond,
211
+ TCGLabel *label)
212
+{
213
+ /*
214
+ * FPCXT_NS is a special case: it has specific handling for
215
+ * "current FP state is inactive", and must do the PreserveFPState()
216
+ * but not the usual full set of actions done by ExecuteFPCheck().
217
+ * We don't have a TB flag that matches the fpInactive check, so we
218
+ * do it at runtime as we don't expect FPCXT_NS accesses to be frequent.
219
+ *
220
+ * Emit code that checks fpInactive and does a conditional
221
+ * branch to label based on it:
222
+ * if cond is TCG_COND_NE then branch if fpInactive != 0 (ie if inactive)
223
+ * if cond is TCG_COND_EQ then branch if fpInactive == 0 (ie if active)
224
+ */
225
+ assert(cond == TCG_COND_EQ || cond == TCG_COND_NE);
226
+
227
+ /* fpInactive = FPCCR_NS.ASPEN == 1 && CONTROL.FPCA == 0 */
228
+ TCGv_i32 aspen, fpca;
229
+ aspen = load_cpu_field(v7m.fpccr[M_REG_NS]);
230
+ fpca = load_cpu_field(v7m.control[M_REG_S]);
231
+ tcg_gen_andi_i32(aspen, aspen, R_V7M_FPCCR_ASPEN_MASK);
232
+ tcg_gen_xori_i32(aspen, aspen, R_V7M_FPCCR_ASPEN_MASK);
233
+ tcg_gen_andi_i32(fpca, fpca, R_V7M_CONTROL_FPCA_MASK);
234
+ tcg_gen_or_i32(fpca, fpca, aspen);
235
+ tcg_gen_brcondi_i32(tcg_invert_cond(cond), fpca, 0, label);
236
+ tcg_temp_free_i32(aspen);
237
+ tcg_temp_free_i32(fpca);
238
+}
239
+
240
+static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
241
+ fp_sysreg_loadfn *loadfn,
242
+ void *opaque)
243
+{
244
+ /* Do a write to an M-profile floating point system register */
245
+ TCGv_i32 tmp;
246
+ TCGLabel *lab_end = NULL;
247
+
248
+ switch (fp_sysreg_checks(s, regno)) {
249
+ case FPSysRegCheckFailed:
250
+ return false;
251
+ case FPSysRegCheckDone:
252
+ return true;
253
+ case FPSysRegCheckContinue:
254
+ break;
255
+ }
60
+ }
256
+
61
+
257
+ switch (regno) {
62
+ if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
258
+ case ARM_VFP_FPSCR:
63
+ if (is_snan(b_cls)) {
259
+ tmp = loadfn(s, opaque);
64
+ return floatx80_silence_nan(b, status);
260
+ gen_helper_vfp_set_fpscr(cpu_env, tmp);
261
+ tcg_temp_free_i32(tmp);
262
+ gen_lookup_tb(s);
263
+ break;
264
+ case ARM_VFP_FPSCR_NZCVQC:
265
+ {
266
+ TCGv_i32 fpscr;
267
+ tmp = loadfn(s, opaque);
268
+ if (dc_isar_feature(aa32_mve, s)) {
269
+ /* QC is only present for MVE; otherwise RES0 */
270
+ TCGv_i32 qc = tcg_temp_new_i32();
271
+ tcg_gen_andi_i32(qc, tmp, FPCR_QC);
272
+ /*
273
+ * The 4 vfp.qc[] fields need only be "zero" vs "non-zero";
274
+ * here writing the same value into all elements is simplest.
275
+ */
276
+ tcg_gen_gvec_dup_i32(MO_32, offsetof(CPUARMState, vfp.qc),
277
+ 16, 16, qc);
278
+ }
65
+ }
279
+ tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
66
+ return b;
280
+ fpscr = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
67
+ } else {
281
+ tcg_gen_andi_i32(fpscr, fpscr, ~FPCR_NZCV_MASK);
68
+ if (is_snan(a_cls)) {
282
+ tcg_gen_or_i32(fpscr, fpscr, tmp);
69
+ return floatx80_silence_nan(a, status);
283
+ store_cpu_field(fpscr, vfp.xregs[ARM_VFP_FPSCR]);
284
+ tcg_temp_free_i32(tmp);
285
+ break;
286
+ }
287
+ case ARM_VFP_FPCXT_NS:
288
+ lab_end = gen_new_label();
289
+ /* fpInactive case: write is a NOP, so branch to end */
290
+ gen_branch_fpInactive(s, TCG_COND_NE, lab_end);
291
+ /*
292
+ * !fpInactive: if FPU disabled, take NOCP exception;
293
+ * otherwise PreserveFPState(), and then FPCXT_NS writes
294
+ * behave the same as FPCXT_S writes.
295
+ */
296
+ if (s->fp_excp_el) {
297
+ gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
298
+ syn_uncategorized(), s->fp_excp_el);
299
+ /*
300
+ * This was only a conditional exception, so override
301
+ * gen_exception_insn()'s default to DISAS_NORETURN
302
+ */
303
+ s->base.is_jmp = DISAS_NEXT;
304
+ break;
305
+ }
70
+ }
306
+ gen_preserve_fp_state(s);
71
+ return a;
307
+ /* fall through */
308
+ case ARM_VFP_FPCXT_S:
309
+ {
310
+ TCGv_i32 sfpa, control;
311
+ /*
312
+ * Set FPSCR and CONTROL.SFPA from value; the new FPSCR takes
313
+ * bits [27:0] from value and zeroes bits [31:28].
314
+ */
315
+ tmp = loadfn(s, opaque);
316
+ sfpa = tcg_temp_new_i32();
317
+ tcg_gen_shri_i32(sfpa, tmp, 31);
318
+ control = load_cpu_field(v7m.control[M_REG_S]);
319
+ tcg_gen_deposit_i32(control, control, sfpa,
320
+ R_V7M_CONTROL_SFPA_SHIFT, 1);
321
+ store_cpu_field(control, v7m.control[M_REG_S]);
322
+ tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
323
+ gen_helper_vfp_set_fpscr(cpu_env, tmp);
324
+ tcg_temp_free_i32(tmp);
325
+ tcg_temp_free_i32(sfpa);
326
+ break;
327
+ }
328
+ case ARM_VFP_VPR:
329
+ /* Behaves as NOP if not privileged */
330
+ if (IS_USER(s)) {
331
+ break;
332
+ }
333
+ tmp = loadfn(s, opaque);
334
+ store_cpu_field(tmp, v7m.vpr);
335
+ break;
336
+ case ARM_VFP_P0:
337
+ {
338
+ TCGv_i32 vpr;
339
+ tmp = loadfn(s, opaque);
340
+ vpr = load_cpu_field(v7m.vpr);
341
+ tcg_gen_deposit_i32(vpr, vpr, tmp,
342
+ R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
343
+ store_cpu_field(vpr, v7m.vpr);
344
+ tcg_temp_free_i32(tmp);
345
+ break;
346
+ }
347
+ default:
348
+ g_assert_not_reached();
349
+ }
350
+ if (lab_end) {
351
+ gen_set_label(lab_end);
352
+ }
353
+ return true;
354
+}
355
+
356
+static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
357
+ fp_sysreg_storefn *storefn,
358
+ void *opaque)
359
+{
360
+ /* Do a read from an M-profile floating point system register */
361
+ TCGv_i32 tmp;
362
+ TCGLabel *lab_end = NULL;
363
+ bool lookup_tb = false;
364
+
365
+ switch (fp_sysreg_checks(s, regno)) {
366
+ case FPSysRegCheckFailed:
367
+ return false;
368
+ case FPSysRegCheckDone:
369
+ return true;
370
+ case FPSysRegCheckContinue:
371
+ break;
372
+ }
373
+
374
+ if (regno == ARM_VFP_FPSCR_NZCVQC && !dc_isar_feature(aa32_mve, s)) {
375
+ /* QC is RES0 without MVE, so NZCVQC simplifies to NZCV */
376
+ regno = QEMU_VFP_FPSCR_NZCV;
377
+ }
378
+
379
+ switch (regno) {
380
+ case ARM_VFP_FPSCR:
381
+ tmp = tcg_temp_new_i32();
382
+ gen_helper_vfp_get_fpscr(tmp, cpu_env);
383
+ storefn(s, opaque, tmp);
384
+ break;
385
+ case ARM_VFP_FPSCR_NZCVQC:
386
+ tmp = tcg_temp_new_i32();
387
+ gen_helper_vfp_get_fpscr(tmp, cpu_env);
388
+ tcg_gen_andi_i32(tmp, tmp, FPCR_NZCVQC_MASK);
389
+ storefn(s, opaque, tmp);
390
+ break;
391
+ case QEMU_VFP_FPSCR_NZCV:
392
+ /*
393
+ * Read just NZCV; this is a special case to avoid the
394
+ * helper call for the "VMRS to CPSR.NZCV" insn.
395
+ */
396
+ tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
397
+ tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
398
+ storefn(s, opaque, tmp);
399
+ break;
400
+ case ARM_VFP_FPCXT_S:
401
+ {
402
+ TCGv_i32 control, sfpa, fpscr;
403
+ /* Bits [27:0] from FPSCR, bit [31] from CONTROL.SFPA */
404
+ tmp = tcg_temp_new_i32();
405
+ sfpa = tcg_temp_new_i32();
406
+ gen_helper_vfp_get_fpscr(tmp, cpu_env);
407
+ tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
408
+ control = load_cpu_field(v7m.control[M_REG_S]);
409
+ tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
410
+ tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
411
+ tcg_gen_or_i32(tmp, tmp, sfpa);
412
+ tcg_temp_free_i32(sfpa);
413
+ /*
414
+ * Store result before updating FPSCR etc, in case
415
+ * it is a memory write which causes an exception.
416
+ */
417
+ storefn(s, opaque, tmp);
418
+ /*
419
+ * Now we must reset FPSCR from FPDSCR_NS, and clear
420
+ * CONTROL.SFPA; so we'll end the TB here.
421
+ */
422
+ tcg_gen_andi_i32(control, control, ~R_V7M_CONTROL_SFPA_MASK);
423
+ store_cpu_field(control, v7m.control[M_REG_S]);
424
+ fpscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
425
+ gen_helper_vfp_set_fpscr(cpu_env, fpscr);
426
+ tcg_temp_free_i32(fpscr);
427
+ lookup_tb = true;
428
+ break;
429
+ }
430
+ case ARM_VFP_FPCXT_NS:
431
+ {
432
+ TCGv_i32 control, sfpa, fpscr, fpdscr, zero;
433
+ TCGLabel *lab_active = gen_new_label();
434
+
435
+ lookup_tb = true;
436
+
437
+ gen_branch_fpInactive(s, TCG_COND_EQ, lab_active);
438
+ /* fpInactive case: reads as FPDSCR_NS */
439
+ TCGv_i32 tmp = load_cpu_field(v7m.fpdscr[M_REG_NS]);
440
+ storefn(s, opaque, tmp);
441
+ lab_end = gen_new_label();
442
+ tcg_gen_br(lab_end);
443
+
444
+ gen_set_label(lab_active);
445
+ /*
446
+ * !fpInactive: if FPU disabled, take NOCP exception;
447
+ * otherwise PreserveFPState(), and then FPCXT_NS
448
+ * reads the same as FPCXT_S.
449
+ */
450
+ if (s->fp_excp_el) {
451
+ gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
452
+ syn_uncategorized(), s->fp_excp_el);
453
+ /*
454
+ * This was only a conditional exception, so override
455
+ * gen_exception_insn()'s default to DISAS_NORETURN
456
+ */
457
+ s->base.is_jmp = DISAS_NEXT;
458
+ break;
459
+ }
460
+ gen_preserve_fp_state(s);
461
+ tmp = tcg_temp_new_i32();
462
+ sfpa = tcg_temp_new_i32();
463
+ fpscr = tcg_temp_new_i32();
464
+ gen_helper_vfp_get_fpscr(fpscr, cpu_env);
465
+ tcg_gen_andi_i32(tmp, fpscr, ~FPCR_NZCV_MASK);
466
+ control = load_cpu_field(v7m.control[M_REG_S]);
467
+ tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
468
+ tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
469
+ tcg_gen_or_i32(tmp, tmp, sfpa);
470
+ tcg_temp_free_i32(control);
471
+ /* Store result before updating FPSCR, in case it faults */
472
+ storefn(s, opaque, tmp);
473
+ /* If SFPA is zero then set FPSCR from FPDSCR_NS */
474
+ fpdscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
475
+ zero = tcg_const_i32(0);
476
+ tcg_gen_movcond_i32(TCG_COND_EQ, fpscr, sfpa, zero, fpdscr, fpscr);
477
+ gen_helper_vfp_set_fpscr(cpu_env, fpscr);
478
+ tcg_temp_free_i32(zero);
479
+ tcg_temp_free_i32(sfpa);
480
+ tcg_temp_free_i32(fpdscr);
481
+ tcg_temp_free_i32(fpscr);
482
+ break;
483
+ }
484
+ case ARM_VFP_VPR:
485
+ /* Behaves as NOP if not privileged */
486
+ if (IS_USER(s)) {
487
+ break;
488
+ }
489
+ tmp = load_cpu_field(v7m.vpr);
490
+ storefn(s, opaque, tmp);
491
+ break;
492
+ case ARM_VFP_P0:
493
+ tmp = load_cpu_field(v7m.vpr);
494
+ tcg_gen_extract_i32(tmp, tmp, R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
495
+ storefn(s, opaque, tmp);
496
+ break;
497
+ default:
498
+ g_assert_not_reached();
499
+ }
500
+
501
+ if (lab_end) {
502
+ gen_set_label(lab_end);
503
+ }
504
+ if (lookup_tb) {
505
+ gen_lookup_tb(s);
506
+ }
507
+ return true;
508
+}
509
+
510
+static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value)
511
+{
512
+ arg_VMSR_VMRS *a = opaque;
513
+
514
+ if (a->rt == 15) {
515
+ /* Set the 4 flag bits in the CPSR */
516
+ gen_set_nzcv(value);
517
+ tcg_temp_free_i32(value);
518
+ } else {
519
+ store_reg(s, a->rt, value);
520
+ }
72
+ }
521
+}
73
+}
522
+
74
+
523
+static TCGv_i32 gpr_to_fp_sysreg(DisasContext *s, void *opaque)
75
/*----------------------------------------------------------------------------
524
+{
76
| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
525
+ arg_VMSR_VMRS *a = opaque;
77
| and extended significand formed by the concatenation of `zSig0' and `zSig1',
526
+
78
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
527
+ return load_reg(s, a->rt);
528
+}
529
+
530
+static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
531
+{
532
+ /*
533
+ * Accesses to R15 are UNPREDICTABLE; we choose to undef.
534
+ * FPSCR -> r15 is a special case which writes to the PSR flags;
535
+ * set a->reg to a special value to tell gen_M_fp_sysreg_read()
536
+ * we only care about the top 4 bits of FPSCR there.
537
+ */
538
+ if (a->rt == 15) {
539
+ if (a->l && a->reg == ARM_VFP_FPSCR) {
540
+ a->reg = QEMU_VFP_FPSCR_NZCV;
541
+ } else {
542
+ return false;
543
+ }
544
+ }
545
+
546
+ if (a->l) {
547
+ /* VMRS, move FP system register to gp register */
548
+ return gen_M_fp_sysreg_read(s, a->reg, fp_sysreg_to_gpr, a);
549
+ } else {
550
+ /* VMSR, move gp register to FP system register */
551
+ return gen_M_fp_sysreg_write(s, a->reg, gpr_to_fp_sysreg, a);
552
+ }
553
+}
554
+
555
+static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
556
+{
557
+ arg_vldr_sysreg *a = opaque;
558
+ uint32_t offset = a->imm;
559
+ TCGv_i32 addr;
560
+
561
+ if (!a->a) {
562
+ offset = -offset;
563
+ }
564
+
565
+ addr = load_reg(s, a->rn);
566
+ if (a->p) {
567
+ tcg_gen_addi_i32(addr, addr, offset);
568
+ }
569
+
570
+ if (s->v8m_stackcheck && a->rn == 13 && a->w) {
571
+ gen_helper_v8m_stackcheck(cpu_env, addr);
572
+ }
573
+
574
+ gen_aa32_st_i32(s, value, addr, get_mem_index(s),
575
+ MO_UL | MO_ALIGN | s->be_data);
576
+ tcg_temp_free_i32(value);
577
+
578
+ if (a->w) {
579
+ /* writeback */
580
+ if (!a->p) {
581
+ tcg_gen_addi_i32(addr, addr, offset);
582
+ }
583
+ store_reg(s, a->rn, addr);
584
+ } else {
585
+ tcg_temp_free_i32(addr);
586
+ }
587
+}
588
+
589
+static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque)
590
+{
591
+ arg_vldr_sysreg *a = opaque;
592
+ uint32_t offset = a->imm;
593
+ TCGv_i32 addr;
594
+ TCGv_i32 value = tcg_temp_new_i32();
595
+
596
+ if (!a->a) {
597
+ offset = -offset;
598
+ }
599
+
600
+ addr = load_reg(s, a->rn);
601
+ if (a->p) {
602
+ tcg_gen_addi_i32(addr, addr, offset);
603
+ }
604
+
605
+ if (s->v8m_stackcheck && a->rn == 13 && a->w) {
606
+ gen_helper_v8m_stackcheck(cpu_env, addr);
607
+ }
608
+
609
+ gen_aa32_ld_i32(s, value, addr, get_mem_index(s),
610
+ MO_UL | MO_ALIGN | s->be_data);
611
+
612
+ if (a->w) {
613
+ /* writeback */
614
+ if (!a->p) {
615
+ tcg_gen_addi_i32(addr, addr, offset);
616
+ }
617
+ store_reg(s, a->rn, addr);
618
+ } else {
619
+ tcg_temp_free_i32(addr);
620
+ }
621
+ return value;
622
+}
623
+
624
+static bool trans_VLDR_sysreg(DisasContext *s, arg_vldr_sysreg *a)
625
+{
626
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
627
+ return false;
628
+ }
629
+ if (a->rn == 15) {
630
+ return false;
631
+ }
632
+ return gen_M_fp_sysreg_write(s, a->reg, memory_to_fp_sysreg, a);
633
+}
634
+
635
+static bool trans_VSTR_sysreg(DisasContext *s, arg_vldr_sysreg *a)
636
+{
637
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
638
+ return false;
639
+ }
640
+ if (a->rn == 15) {
641
+ return false;
642
+ }
643
+ return gen_M_fp_sysreg_read(s, a->reg, fp_sysreg_to_memory, a);
644
+}
645
+
646
static bool trans_NOCP(DisasContext *s, arg_nocp *a)
647
{
648
/*
649
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
650
index XXXXXXX..XXXXXXX 100644
79
index XXXXXXX..XXXXXXX 100644
651
--- a/target/arm/translate-vfp.c
80
--- a/fpu/softfloat-specialize.c.inc
652
+++ b/target/arm/translate-vfp.c
81
+++ b/fpu/softfloat-specialize.c.inc
653
@@ -XXX,XX +XXX,XX @@ static inline long vfp_f16_offset(unsigned reg, bool top)
82
@@ -XXX,XX +XXX,XX @@ floatx80 floatx80_silence_nan(floatx80 a, float_status *status)
654
* Generate code for M-profile lazy FP state preservation if needed;
83
return a;
655
* this corresponds to the pseudocode PreserveFPState() function.
656
*/
657
-static void gen_preserve_fp_state(DisasContext *s)
658
+void gen_preserve_fp_state(DisasContext *s)
659
{
660
if (s->v7m_lspact) {
661
/*
662
@@ -XXX,XX +XXX,XX @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
663
return true;
664
}
84
}
665
85
666
-/*
86
-/*----------------------------------------------------------------------------
667
- * M-profile provides two different sets of instructions that can
87
-| Takes two extended double-precision floating-point values `a' and `b', one
668
- * access floating point system registers: VMSR/VMRS (which move
88
-| of which is a NaN, and returns the appropriate NaN result. If either `a' or
669
- * to/from a general purpose register) and VLDR/VSTR sysreg (which
89
-| `b' is a signaling NaN, the invalid exception is raised.
670
- * move directly to/from memory). In some cases there are also side
90
-*----------------------------------------------------------------------------*/
671
- * effects which must happen after any write to memory (which could
672
- * cause an exception). So we implement the common logic for the
673
- * sysreg access in gen_M_fp_sysreg_write() and gen_M_fp_sysreg_read(),
674
- * which take pointers to callback functions which will perform the
675
- * actual "read/write general purpose register" and "read/write
676
- * memory" operations.
677
- */
678
-
91
-
679
-/*
92
-floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
680
- * Emit code to store the sysreg to its final destination; frees the
93
-{
681
- * TCG temp 'value' it is passed.
94
- bool aIsLargerSignificand;
682
- */
95
- FloatClass a_cls, b_cls;
683
-typedef void fp_sysreg_storefn(DisasContext *s, void *opaque, TCGv_i32 value);
684
-/*
685
- * Emit code to load the value to be copied to the sysreg; returns
686
- * a new TCG temporary
687
- */
688
-typedef TCGv_i32 fp_sysreg_loadfn(DisasContext *s, void *opaque);
689
-
96
-
690
-/* Common decode/access checks for fp sysreg read/write */
97
- /* This is not complete, but is good enough for pickNaN. */
691
-typedef enum FPSysRegCheckResult {
98
- a_cls = (!floatx80_is_any_nan(a)
692
- FPSysRegCheckFailed, /* caller should return false */
99
- ? float_class_normal
693
- FPSysRegCheckDone, /* caller should return true */
100
- : floatx80_is_signaling_nan(a, status)
694
- FPSysRegCheckContinue, /* caller should continue generating code */
101
- ? float_class_snan
695
-} FPSysRegCheckResult;
102
- : float_class_qnan);
103
- b_cls = (!floatx80_is_any_nan(b)
104
- ? float_class_normal
105
- : floatx80_is_signaling_nan(b, status)
106
- ? float_class_snan
107
- : float_class_qnan);
696
-
108
-
697
-static FPSysRegCheckResult fp_sysreg_checks(DisasContext *s, int regno)
109
- if (is_snan(a_cls) || is_snan(b_cls)) {
698
-{
110
- float_raise(float_flag_invalid, status);
699
- if (!dc_isar_feature(aa32_fpsp_v2, s) && !dc_isar_feature(aa32_mve, s)) {
700
- return FPSysRegCheckFailed;
701
- }
111
- }
702
-
112
-
703
- switch (regno) {
113
- if (status->default_nan_mode) {
704
- case ARM_VFP_FPSCR:
114
- return floatx80_default_nan(status);
705
- case QEMU_VFP_FPSCR_NZCV:
706
- break;
707
- case ARM_VFP_FPSCR_NZCVQC:
708
- if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
709
- return FPSysRegCheckFailed;
710
- }
711
- break;
712
- case ARM_VFP_FPCXT_S:
713
- case ARM_VFP_FPCXT_NS:
714
- if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
715
- return FPSysRegCheckFailed;
716
- }
717
- if (!s->v8m_secure) {
718
- return FPSysRegCheckFailed;
719
- }
720
- break;
721
- case ARM_VFP_VPR:
722
- case ARM_VFP_P0:
723
- if (!dc_isar_feature(aa32_mve, s)) {
724
- return FPSysRegCheckFailed;
725
- }
726
- break;
727
- default:
728
- return FPSysRegCheckFailed;
729
- }
115
- }
730
-
116
-
731
- /*
117
- if (a.low < b.low) {
732
- * FPCXT_NS is a special case: it has specific handling for
118
- aIsLargerSignificand = 0;
733
- * "current FP state is inactive", and must do the PreserveFPState()
119
- } else if (b.low < a.low) {
734
- * but not the usual full set of actions done by ExecuteFPCheck().
120
- aIsLargerSignificand = 1;
735
- * So we don't call vfp_access_check() and the callers must handle this.
121
- } else {
736
- */
122
- aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
737
- if (regno != ARM_VFP_FPCXT_NS && !vfp_access_check(s)) {
738
- return FPSysRegCheckDone;
739
- }
740
- return FPSysRegCheckContinue;
741
-}
742
-
743
-static void gen_branch_fpInactive(DisasContext *s, TCGCond cond,
744
- TCGLabel *label)
745
-{
746
- /*
747
- * FPCXT_NS is a special case: it has specific handling for
748
- * "current FP state is inactive", and must do the PreserveFPState()
749
- * but not the usual full set of actions done by ExecuteFPCheck().
750
- * We don't have a TB flag that matches the fpInactive check, so we
751
- * do it at runtime as we don't expect FPCXT_NS accesses to be frequent.
752
- *
753
- * Emit code that checks fpInactive and does a conditional
754
- * branch to label based on it:
755
- * if cond is TCG_COND_NE then branch if fpInactive != 0 (ie if inactive)
756
- * if cond is TCG_COND_EQ then branch if fpInactive == 0 (ie if active)
757
- */
758
- assert(cond == TCG_COND_EQ || cond == TCG_COND_NE);
759
-
760
- /* fpInactive = FPCCR_NS.ASPEN == 1 && CONTROL.FPCA == 0 */
761
- TCGv_i32 aspen, fpca;
762
- aspen = load_cpu_field(v7m.fpccr[M_REG_NS]);
763
- fpca = load_cpu_field(v7m.control[M_REG_S]);
764
- tcg_gen_andi_i32(aspen, aspen, R_V7M_FPCCR_ASPEN_MASK);
765
- tcg_gen_xori_i32(aspen, aspen, R_V7M_FPCCR_ASPEN_MASK);
766
- tcg_gen_andi_i32(fpca, fpca, R_V7M_CONTROL_FPCA_MASK);
767
- tcg_gen_or_i32(fpca, fpca, aspen);
768
- tcg_gen_brcondi_i32(tcg_invert_cond(cond), fpca, 0, label);
769
- tcg_temp_free_i32(aspen);
770
- tcg_temp_free_i32(fpca);
771
-}
772
-
773
-static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
774
- fp_sysreg_loadfn *loadfn,
775
- void *opaque)
776
-{
777
- /* Do a write to an M-profile floating point system register */
778
- TCGv_i32 tmp;
779
- TCGLabel *lab_end = NULL;
780
-
781
- switch (fp_sysreg_checks(s, regno)) {
782
- case FPSysRegCheckFailed:
783
- return false;
784
- case FPSysRegCheckDone:
785
- return true;
786
- case FPSysRegCheckContinue:
787
- break;
788
- }
123
- }
789
-
124
-
790
- switch (regno) {
125
- if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
791
- case ARM_VFP_FPSCR:
126
- if (is_snan(b_cls)) {
792
- tmp = loadfn(s, opaque);
127
- return floatx80_silence_nan(b, status);
793
- gen_helper_vfp_set_fpscr(cpu_env, tmp);
794
- tcg_temp_free_i32(tmp);
795
- gen_lookup_tb(s);
796
- break;
797
- case ARM_VFP_FPSCR_NZCVQC:
798
- {
799
- TCGv_i32 fpscr;
800
- tmp = loadfn(s, opaque);
801
- if (dc_isar_feature(aa32_mve, s)) {
802
- /* QC is only present for MVE; otherwise RES0 */
803
- TCGv_i32 qc = tcg_temp_new_i32();
804
- tcg_gen_andi_i32(qc, tmp, FPCR_QC);
805
- /*
806
- * The 4 vfp.qc[] fields need only be "zero" vs "non-zero";
807
- * here writing the same value into all elements is simplest.
808
- */
809
- tcg_gen_gvec_dup_i32(MO_32, offsetof(CPUARMState, vfp.qc),
810
- 16, 16, qc);
811
- }
128
- }
812
- tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
129
- return b;
813
- fpscr = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
130
- } else {
814
- tcg_gen_andi_i32(fpscr, fpscr, ~FPCR_NZCV_MASK);
131
- if (is_snan(a_cls)) {
815
- tcg_gen_or_i32(fpscr, fpscr, tmp);
132
- return floatx80_silence_nan(a, status);
816
- store_cpu_field(fpscr, vfp.xregs[ARM_VFP_FPSCR]);
817
- tcg_temp_free_i32(tmp);
818
- break;
819
- }
820
- case ARM_VFP_FPCXT_NS:
821
- lab_end = gen_new_label();
822
- /* fpInactive case: write is a NOP, so branch to end */
823
- gen_branch_fpInactive(s, TCG_COND_NE, lab_end);
824
- /*
825
- * !fpInactive: if FPU disabled, take NOCP exception;
826
- * otherwise PreserveFPState(), and then FPCXT_NS writes
827
- * behave the same as FPCXT_S writes.
828
- */
829
- if (s->fp_excp_el) {
830
- gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
831
- syn_uncategorized(), s->fp_excp_el);
832
- /*
833
- * This was only a conditional exception, so override
834
- * gen_exception_insn()'s default to DISAS_NORETURN
835
- */
836
- s->base.is_jmp = DISAS_NEXT;
837
- break;
838
- }
133
- }
839
- gen_preserve_fp_state(s);
134
- return a;
840
- /* fall through */
841
- case ARM_VFP_FPCXT_S:
842
- {
843
- TCGv_i32 sfpa, control;
844
- /*
845
- * Set FPSCR and CONTROL.SFPA from value; the new FPSCR takes
846
- * bits [27:0] from value and zeroes bits [31:28].
847
- */
848
- tmp = loadfn(s, opaque);
849
- sfpa = tcg_temp_new_i32();
850
- tcg_gen_shri_i32(sfpa, tmp, 31);
851
- control = load_cpu_field(v7m.control[M_REG_S]);
852
- tcg_gen_deposit_i32(control, control, sfpa,
853
- R_V7M_CONTROL_SFPA_SHIFT, 1);
854
- store_cpu_field(control, v7m.control[M_REG_S]);
855
- tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
856
- gen_helper_vfp_set_fpscr(cpu_env, tmp);
857
- tcg_temp_free_i32(tmp);
858
- tcg_temp_free_i32(sfpa);
859
- break;
860
- }
861
- case ARM_VFP_VPR:
862
- /* Behaves as NOP if not privileged */
863
- if (IS_USER(s)) {
864
- break;
865
- }
866
- tmp = loadfn(s, opaque);
867
- store_cpu_field(tmp, v7m.vpr);
868
- break;
869
- case ARM_VFP_P0:
870
- {
871
- TCGv_i32 vpr;
872
- tmp = loadfn(s, opaque);
873
- vpr = load_cpu_field(v7m.vpr);
874
- tcg_gen_deposit_i32(vpr, vpr, tmp,
875
- R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
876
- store_cpu_field(vpr, v7m.vpr);
877
- tcg_temp_free_i32(tmp);
878
- break;
879
- }
880
- default:
881
- g_assert_not_reached();
882
- }
883
- if (lab_end) {
884
- gen_set_label(lab_end);
885
- }
886
- return true;
887
-}
888
-
889
-static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
890
- fp_sysreg_storefn *storefn,
891
- void *opaque)
892
-{
893
- /* Do a read from an M-profile floating point system register */
894
- TCGv_i32 tmp;
895
- TCGLabel *lab_end = NULL;
896
- bool lookup_tb = false;
897
-
898
- switch (fp_sysreg_checks(s, regno)) {
899
- case FPSysRegCheckFailed:
900
- return false;
901
- case FPSysRegCheckDone:
902
- return true;
903
- case FPSysRegCheckContinue:
904
- break;
905
- }
906
-
907
- if (regno == ARM_VFP_FPSCR_NZCVQC && !dc_isar_feature(aa32_mve, s)) {
908
- /* QC is RES0 without MVE, so NZCVQC simplifies to NZCV */
909
- regno = QEMU_VFP_FPSCR_NZCV;
910
- }
911
-
912
- switch (regno) {
913
- case ARM_VFP_FPSCR:
914
- tmp = tcg_temp_new_i32();
915
- gen_helper_vfp_get_fpscr(tmp, cpu_env);
916
- storefn(s, opaque, tmp);
917
- break;
918
- case ARM_VFP_FPSCR_NZCVQC:
919
- tmp = tcg_temp_new_i32();
920
- gen_helper_vfp_get_fpscr(tmp, cpu_env);
921
- tcg_gen_andi_i32(tmp, tmp, FPCR_NZCVQC_MASK);
922
- storefn(s, opaque, tmp);
923
- break;
924
- case QEMU_VFP_FPSCR_NZCV:
925
- /*
926
- * Read just NZCV; this is a special case to avoid the
927
- * helper call for the "VMRS to CPSR.NZCV" insn.
928
- */
929
- tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
930
- tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
931
- storefn(s, opaque, tmp);
932
- break;
933
- case ARM_VFP_FPCXT_S:
934
- {
935
- TCGv_i32 control, sfpa, fpscr;
936
- /* Bits [27:0] from FPSCR, bit [31] from CONTROL.SFPA */
937
- tmp = tcg_temp_new_i32();
938
- sfpa = tcg_temp_new_i32();
939
- gen_helper_vfp_get_fpscr(tmp, cpu_env);
940
- tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
941
- control = load_cpu_field(v7m.control[M_REG_S]);
942
- tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
943
- tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
944
- tcg_gen_or_i32(tmp, tmp, sfpa);
945
- tcg_temp_free_i32(sfpa);
946
- /*
947
- * Store result before updating FPSCR etc, in case
948
- * it is a memory write which causes an exception.
949
- */
950
- storefn(s, opaque, tmp);
951
- /*
952
- * Now we must reset FPSCR from FPDSCR_NS, and clear
953
- * CONTROL.SFPA; so we'll end the TB here.
954
- */
955
- tcg_gen_andi_i32(control, control, ~R_V7M_CONTROL_SFPA_MASK);
956
- store_cpu_field(control, v7m.control[M_REG_S]);
957
- fpscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
958
- gen_helper_vfp_set_fpscr(cpu_env, fpscr);
959
- tcg_temp_free_i32(fpscr);
960
- lookup_tb = true;
961
- break;
962
- }
963
- case ARM_VFP_FPCXT_NS:
964
- {
965
- TCGv_i32 control, sfpa, fpscr, fpdscr, zero;
966
- TCGLabel *lab_active = gen_new_label();
967
-
968
- lookup_tb = true;
969
-
970
- gen_branch_fpInactive(s, TCG_COND_EQ, lab_active);
971
- /* fpInactive case: reads as FPDSCR_NS */
972
- TCGv_i32 tmp = load_cpu_field(v7m.fpdscr[M_REG_NS]);
973
- storefn(s, opaque, tmp);
974
- lab_end = gen_new_label();
975
- tcg_gen_br(lab_end);
976
-
977
- gen_set_label(lab_active);
978
- /*
979
- * !fpInactive: if FPU disabled, take NOCP exception;
980
- * otherwise PreserveFPState(), and then FPCXT_NS
981
- * reads the same as FPCXT_S.
982
- */
983
- if (s->fp_excp_el) {
984
- gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
985
- syn_uncategorized(), s->fp_excp_el);
986
- /*
987
- * This was only a conditional exception, so override
988
- * gen_exception_insn()'s default to DISAS_NORETURN
989
- */
990
- s->base.is_jmp = DISAS_NEXT;
991
- break;
992
- }
993
- gen_preserve_fp_state(s);
994
- tmp = tcg_temp_new_i32();
995
- sfpa = tcg_temp_new_i32();
996
- fpscr = tcg_temp_new_i32();
997
- gen_helper_vfp_get_fpscr(fpscr, cpu_env);
998
- tcg_gen_andi_i32(tmp, fpscr, ~FPCR_NZCV_MASK);
999
- control = load_cpu_field(v7m.control[M_REG_S]);
1000
- tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
1001
- tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
1002
- tcg_gen_or_i32(tmp, tmp, sfpa);
1003
- tcg_temp_free_i32(control);
1004
- /* Store result before updating FPSCR, in case it faults */
1005
- storefn(s, opaque, tmp);
1006
- /* If SFPA is zero then set FPSCR from FPDSCR_NS */
1007
- fpdscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
1008
- zero = tcg_const_i32(0);
1009
- tcg_gen_movcond_i32(TCG_COND_EQ, fpscr, sfpa, zero, fpdscr, fpscr);
1010
- gen_helper_vfp_set_fpscr(cpu_env, fpscr);
1011
- tcg_temp_free_i32(zero);
1012
- tcg_temp_free_i32(sfpa);
1013
- tcg_temp_free_i32(fpdscr);
1014
- tcg_temp_free_i32(fpscr);
1015
- break;
1016
- }
1017
- case ARM_VFP_VPR:
1018
- /* Behaves as NOP if not privileged */
1019
- if (IS_USER(s)) {
1020
- break;
1021
- }
1022
- tmp = load_cpu_field(v7m.vpr);
1023
- storefn(s, opaque, tmp);
1024
- break;
1025
- case ARM_VFP_P0:
1026
- tmp = load_cpu_field(v7m.vpr);
1027
- tcg_gen_extract_i32(tmp, tmp, R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
1028
- storefn(s, opaque, tmp);
1029
- break;
1030
- default:
1031
- g_assert_not_reached();
1032
- }
1033
-
1034
- if (lab_end) {
1035
- gen_set_label(lab_end);
1036
- }
1037
- if (lookup_tb) {
1038
- gen_lookup_tb(s);
1039
- }
1040
- return true;
1041
-}
1042
-
1043
-static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value)
1044
-{
1045
- arg_VMSR_VMRS *a = opaque;
1046
-
1047
- if (a->rt == 15) {
1048
- /* Set the 4 flag bits in the CPSR */
1049
- gen_set_nzcv(value);
1050
- tcg_temp_free_i32(value);
1051
- } else {
1052
- store_reg(s, a->rt, value);
1053
- }
135
- }
1054
-}
136
-}
1055
-
137
-
1056
-static TCGv_i32 gpr_to_fp_sysreg(DisasContext *s, void *opaque)
138
/*----------------------------------------------------------------------------
1057
-{
139
| Returns 1 if the quadruple-precision floating-point value `a' is a quiet
1058
- arg_VMSR_VMRS *a = opaque;
140
| NaN; otherwise returns 0.
1059
-
1060
- return load_reg(s, a->rt);
1061
-}
1062
-
1063
-static bool gen_M_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
1064
-{
1065
- /*
1066
- * Accesses to R15 are UNPREDICTABLE; we choose to undef.
1067
- * FPSCR -> r15 is a special case which writes to the PSR flags;
1068
- * set a->reg to a special value to tell gen_M_fp_sysreg_read()
1069
- * we only care about the top 4 bits of FPSCR there.
1070
- */
1071
- if (a->rt == 15) {
1072
- if (a->l && a->reg == ARM_VFP_FPSCR) {
1073
- a->reg = QEMU_VFP_FPSCR_NZCV;
1074
- } else {
1075
- return false;
1076
- }
1077
- }
1078
-
1079
- if (a->l) {
1080
- /* VMRS, move FP system register to gp register */
1081
- return gen_M_fp_sysreg_read(s, a->reg, fp_sysreg_to_gpr, a);
1082
- } else {
1083
- /* VMSR, move gp register to FP system register */
1084
- return gen_M_fp_sysreg_write(s, a->reg, gpr_to_fp_sysreg, a);
1085
- }
1086
-}
1087
-
1088
static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
1089
{
1090
TCGv_i32 tmp;
1091
bool ignore_vfp_enabled = false;
1092
1093
if (arm_dc_feature(s, ARM_FEATURE_M)) {
1094
- return gen_M_VMSR_VMRS(s, a);
1095
+ /* M profile version was already handled in m-nocp.decode */
1096
+ return false;
1097
}
1098
1099
if (!dc_isar_feature(aa32_fpsp_v2, s)) {
1100
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
1101
return true;
1102
}
1103
1104
-static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
1105
-{
1106
- arg_vldr_sysreg *a = opaque;
1107
- uint32_t offset = a->imm;
1108
- TCGv_i32 addr;
1109
-
1110
- if (!a->a) {
1111
- offset = -offset;
1112
- }
1113
-
1114
- addr = load_reg(s, a->rn);
1115
- if (a->p) {
1116
- tcg_gen_addi_i32(addr, addr, offset);
1117
- }
1118
-
1119
- if (s->v8m_stackcheck && a->rn == 13 && a->w) {
1120
- gen_helper_v8m_stackcheck(cpu_env, addr);
1121
- }
1122
-
1123
- gen_aa32_st_i32(s, value, addr, get_mem_index(s),
1124
- MO_UL | MO_ALIGN | s->be_data);
1125
- tcg_temp_free_i32(value);
1126
-
1127
- if (a->w) {
1128
- /* writeback */
1129
- if (!a->p) {
1130
- tcg_gen_addi_i32(addr, addr, offset);
1131
- }
1132
- store_reg(s, a->rn, addr);
1133
- } else {
1134
- tcg_temp_free_i32(addr);
1135
- }
1136
-}
1137
-
1138
-static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque)
1139
-{
1140
- arg_vldr_sysreg *a = opaque;
1141
- uint32_t offset = a->imm;
1142
- TCGv_i32 addr;
1143
- TCGv_i32 value = tcg_temp_new_i32();
1144
-
1145
- if (!a->a) {
1146
- offset = -offset;
1147
- }
1148
-
1149
- addr = load_reg(s, a->rn);
1150
- if (a->p) {
1151
- tcg_gen_addi_i32(addr, addr, offset);
1152
- }
1153
-
1154
- if (s->v8m_stackcheck && a->rn == 13 && a->w) {
1155
- gen_helper_v8m_stackcheck(cpu_env, addr);
1156
- }
1157
-
1158
- gen_aa32_ld_i32(s, value, addr, get_mem_index(s),
1159
- MO_UL | MO_ALIGN | s->be_data);
1160
-
1161
- if (a->w) {
1162
- /* writeback */
1163
- if (!a->p) {
1164
- tcg_gen_addi_i32(addr, addr, offset);
1165
- }
1166
- store_reg(s, a->rn, addr);
1167
- } else {
1168
- tcg_temp_free_i32(addr);
1169
- }
1170
- return value;
1171
-}
1172
-
1173
-static bool trans_VLDR_sysreg(DisasContext *s, arg_vldr_sysreg *a)
1174
-{
1175
- if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
1176
- return false;
1177
- }
1178
- if (a->rn == 15) {
1179
- return false;
1180
- }
1181
- return gen_M_fp_sysreg_write(s, a->reg, memory_to_fp_sysreg, a);
1182
-}
1183
-
1184
-static bool trans_VSTR_sysreg(DisasContext *s, arg_vldr_sysreg *a)
1185
-{
1186
- if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
1187
- return false;
1188
- }
1189
- if (a->rn == 15) {
1190
- return false;
1191
- }
1192
- return gen_M_fp_sysreg_read(s, a->reg, fp_sysreg_to_memory, a);
1193
-}
1194
1195
static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a)
1196
{
1197
--
141
--
1198
2.20.1
142
2.34.1
1199
1200
New patch
1
From: Richard Henderson <richard.henderson@linaro.org>
1
2
3
Unpacking and repacking the parts may be slightly more work
4
than we did before, but we get to reuse more code. For a
5
code path handling exceptional values, this is an improvement.
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241203203949.483774-8-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
fpu/softfloat.c | 43 +++++--------------------------------------
13
1 file changed, 5 insertions(+), 38 deletions(-)
14
15
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
16
index XXXXXXX..XXXXXXX 100644
17
--- a/fpu/softfloat.c
18
+++ b/fpu/softfloat.c
19
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
20
21
floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
22
{
23
- bool aIsLargerSignificand;
24
- FloatClass a_cls, b_cls;
25
+ FloatParts128 pa, pb, *pr;
26
27
- /* This is not complete, but is good enough for pickNaN. */
28
- a_cls = (!floatx80_is_any_nan(a)
29
- ? float_class_normal
30
- : floatx80_is_signaling_nan(a, status)
31
- ? float_class_snan
32
- : float_class_qnan);
33
- b_cls = (!floatx80_is_any_nan(b)
34
- ? float_class_normal
35
- : floatx80_is_signaling_nan(b, status)
36
- ? float_class_snan
37
- : float_class_qnan);
38
-
39
- if (is_snan(a_cls) || is_snan(b_cls)) {
40
- float_raise(float_flag_invalid, status);
41
- }
42
-
43
- if (status->default_nan_mode) {
44
+ if (!floatx80_unpack_canonical(&pa, a, status) ||
45
+ !floatx80_unpack_canonical(&pb, b, status)) {
46
return floatx80_default_nan(status);
47
}
48
49
- if (a.low < b.low) {
50
- aIsLargerSignificand = 0;
51
- } else if (b.low < a.low) {
52
- aIsLargerSignificand = 1;
53
- } else {
54
- aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
55
- }
56
-
57
- if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
58
- if (is_snan(b_cls)) {
59
- return floatx80_silence_nan(b, status);
60
- }
61
- return b;
62
- } else {
63
- if (is_snan(a_cls)) {
64
- return floatx80_silence_nan(a, status);
65
- }
66
- return a;
67
- }
68
+ pr = parts_pick_nan(&pa, &pb, status);
69
+ return floatx80_round_pack_canonical(pr, status);
70
}
71
72
/*----------------------------------------------------------------------------
73
--
74
2.34.1
1
The Arm MVE VDUP implementation would like to be able to emit code to
1
From: Richard Henderson <richard.henderson@linaro.org>
2
duplicate a byte or halfword value into an i32. We have code to do
2
3
this already in tcg-op-gvec.c, so all we need to do is make the
3
Inline pickNaN into its only caller. This makes one assert
4
functions global.
4
redundant with the immediately preceding IF.
5
5
6
For consistency with other functions made available to the frontends:
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
* we rename to tcg_gen_dup_*
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
* we expose both the _i32 and _i64 forms
8
Message-id: 20241203203949.483774-9-richard.henderson@linaro.org
9
* we provide the #define for a _tl form
10
11
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
15
---
10
---
16
include/tcg/tcg-op.h | 8 ++++++++
11
fpu/softfloat-parts.c.inc | 82 +++++++++++++++++++++++++----
17
include/tcg/tcg.h | 1 -
12
fpu/softfloat-specialize.c.inc | 96 ----------------------------------
18
tcg/tcg-op-gvec.c | 20 ++++++++++----------
13
2 files changed, 73 insertions(+), 105 deletions(-)
19
3 files changed, 18 insertions(+), 11 deletions(-)
14
20
15
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
21
diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
22
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
23
--- a/include/tcg/tcg-op.h
17
--- a/fpu/softfloat-parts.c.inc
24
+++ b/include/tcg/tcg-op.h
18
+++ b/fpu/softfloat-parts.c.inc
25
@@ -XXX,XX +XXX,XX @@ void tcg_gen_umin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
19
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
26
void tcg_gen_umax_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
20
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
27
void tcg_gen_abs_i32(TCGv_i32, TCGv_i32);
21
float_status *s)
28
22
{
29
+/* Replicate a value of size @vece from @in to all the lanes in @out */
23
+ int cmp, which;
30
+void tcg_gen_dup_i32(unsigned vece, TCGv_i32 out, TCGv_i32 in);
31
+
24
+
32
static inline void tcg_gen_discard_i32(TCGv_i32 arg)
25
if (is_snan(a->cls) || is_snan(b->cls)) {
33
{
26
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
34
tcg_gen_op1_i32(INDEX_op_discard, arg);
27
}
35
@@ -XXX,XX +XXX,XX @@ void tcg_gen_umin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
28
36
void tcg_gen_umax_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
29
if (s->default_nan_mode) {
37
void tcg_gen_abs_i64(TCGv_i64, TCGv_i64);
30
parts_default_nan(a, s);
38
31
- } else {
39
+/* Replicate a value of size @vece from @in to all the lanes in @out */
32
- int cmp = frac_cmp(a, b);
40
+void tcg_gen_dup_i64(unsigned vece, TCGv_i64 out, TCGv_i64 in);
33
- if (cmp == 0) {
34
- cmp = a->sign < b->sign;
35
- }
36
+ return a;
37
+ }
38
39
- if (pickNaN(a->cls, b->cls, cmp > 0, s)) {
40
- a = b;
41
- }
42
+ cmp = frac_cmp(a, b);
43
+ if (cmp == 0) {
44
+ cmp = a->sign < b->sign;
45
+ }
41
+
46
+
42
#if TCG_TARGET_REG_BITS == 64
47
+ switch (s->float_2nan_prop_rule) {
43
static inline void tcg_gen_discard_i64(TCGv_i64 arg)
48
+ case float_2nan_prop_s_ab:
44
{
49
if (is_snan(a->cls)) {
45
@@ -XXX,XX +XXX,XX @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t);
50
- parts_silence_nan(a, s);
46
#define tcg_gen_atomic_smax_fetch_tl tcg_gen_atomic_smax_fetch_i64
51
+ which = 0;
47
#define tcg_gen_atomic_umax_fetch_tl tcg_gen_atomic_umax_fetch_i64
52
+ } else if (is_snan(b->cls)) {
48
#define tcg_gen_dup_tl_vec tcg_gen_dup_i64_vec
53
+ which = 1;
49
+#define tcg_gen_dup_tl tcg_gen_dup_i64
54
+ } else if (is_qnan(a->cls)) {
50
#else
55
+ which = 0;
51
#define tcg_gen_movi_tl tcg_gen_movi_i32
56
+ } else {
52
#define tcg_gen_mov_tl tcg_gen_mov_i32
57
+ which = 1;
53
@@ -XXX,XX +XXX,XX @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t);
58
}
54
#define tcg_gen_atomic_smax_fetch_tl tcg_gen_atomic_smax_fetch_i32
59
+ break;
55
#define tcg_gen_atomic_umax_fetch_tl tcg_gen_atomic_umax_fetch_i32
60
+ case float_2nan_prop_s_ba:
56
#define tcg_gen_dup_tl_vec tcg_gen_dup_i32_vec
61
+ if (is_snan(b->cls)) {
57
+#define tcg_gen_dup_tl tcg_gen_dup_i32
62
+ which = 1;
58
#endif
63
+ } else if (is_snan(a->cls)) {
59
64
+ which = 0;
60
#if UINTPTR_MAX == UINT32_MAX
65
+ } else if (is_qnan(b->cls)) {
61
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
66
+ which = 1;
67
+ } else {
68
+ which = 0;
69
+ }
70
+ break;
71
+ case float_2nan_prop_ab:
72
+ which = is_nan(a->cls) ? 0 : 1;
73
+ break;
74
+ case float_2nan_prop_ba:
75
+ which = is_nan(b->cls) ? 1 : 0;
76
+ break;
77
+ case float_2nan_prop_x87:
78
+ /*
79
+ * This implements x87 NaN propagation rules:
80
+ * SNaN + QNaN => return the QNaN
81
+ * two SNaNs => return the one with the larger significand, silenced
82
+ * two QNaNs => return the one with the larger significand
83
+ * SNaN and a non-NaN => return the SNaN, silenced
84
+ * QNaN and a non-NaN => return the QNaN
85
+ *
86
+ * If we get down to comparing significands and they are the same,
87
+ * return the NaN with the positive sign bit (if any).
88
+ */
89
+ if (is_snan(a->cls)) {
90
+ if (is_snan(b->cls)) {
91
+ which = cmp > 0 ? 0 : 1;
92
+ } else {
93
+ which = is_qnan(b->cls) ? 1 : 0;
94
+ }
95
+ } else if (is_qnan(a->cls)) {
96
+ if (is_snan(b->cls) || !is_qnan(b->cls)) {
97
+ which = 0;
98
+ } else {
99
+ which = cmp > 0 ? 0 : 1;
100
+ }
101
+ } else {
102
+ which = 1;
103
+ }
104
+ break;
105
+ default:
106
+ g_assert_not_reached();
107
+ }
108
+
109
+ if (which) {
110
+ a = b;
111
+ }
112
+ if (is_snan(a->cls)) {
113
+ parts_silence_nan(a, s);
114
}
115
return a;
116
}
117
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
62
index XXXXXXX..XXXXXXX 100644
118
index XXXXXXX..XXXXXXX 100644
63
--- a/include/tcg/tcg.h
119
--- a/fpu/softfloat-specialize.c.inc
64
+++ b/include/tcg/tcg.h
120
+++ b/fpu/softfloat-specialize.c.inc
65
@@ -XXX,XX +XXX,XX @@ uint64_t dup_const(unsigned vece, uint64_t c);
121
@@ -XXX,XX +XXX,XX @@ bool float32_is_signaling_nan(float32 a_, float_status *status)
66
: (qemu_build_not_reached_always(), 0)) \
67
: dup_const(VECE, C))
68
69
-
70
/*
71
* Memory helpers that will be used by TCG generated code.
72
*/
73
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/tcg/tcg-op-gvec.c
76
+++ b/tcg/tcg-op-gvec.c
77
@@ -XXX,XX +XXX,XX @@ uint64_t (dup_const)(unsigned vece, uint64_t c)
78
}
79
80
/* Duplicate IN into OUT as per VECE. */
81
-static void gen_dup_i32(unsigned vece, TCGv_i32 out, TCGv_i32 in)
82
+void tcg_gen_dup_i32(unsigned vece, TCGv_i32 out, TCGv_i32 in)
83
{
84
switch (vece) {
85
case MO_8:
86
@@ -XXX,XX +XXX,XX @@ static void gen_dup_i32(unsigned vece, TCGv_i32 out, TCGv_i32 in)
87
}
122
}
88
}
123
}
89
124
90
-static void gen_dup_i64(unsigned vece, TCGv_i64 out, TCGv_i64 in)
125
-/*----------------------------------------------------------------------------
91
+void tcg_gen_dup_i64(unsigned vece, TCGv_i64 out, TCGv_i64 in)
126
-| Select which NaN to propagate for a two-input operation.
92
{
127
-| IEEE754 doesn't specify all the details of this, so the
93
switch (vece) {
128
-| algorithm is target-specific.
94
case MO_8:
129
-| The routine is passed various bits of information about the
95
@@ -XXX,XX +XXX,XX @@ static void do_dup(unsigned vece, uint32_t dofs, uint32_t oprsz,
130
-| two NaNs and should return 0 to select NaN a and 1 for NaN b.
96
&& (vece != MO_32 || !check_size_impl(oprsz, 4))) {
131
-| Note that signalling NaNs are always squashed to quiet NaNs
97
t_64 = tcg_temp_new_i64();
132
-| by the caller, by calling floatXX_silence_nan() before
98
tcg_gen_extu_i32_i64(t_64, in_32);
133
-| returning them.
99
- gen_dup_i64(vece, t_64, t_64);
134
-|
100
+ tcg_gen_dup_i64(vece, t_64, t_64);
135
-| aIsLargerSignificand is only valid if both a and b are NaNs
101
} else {
136
-| of some kind, and is true if a has the larger significand,
102
t_32 = tcg_temp_new_i32();
137
-| or if both a and b have the same significand but a is
103
- gen_dup_i32(vece, t_32, in_32);
138
-| positive but b is negative. It is only needed for the x87
104
+ tcg_gen_dup_i32(vece, t_32, in_32);
139
-| tie-break rule.
105
}
140
-*----------------------------------------------------------------------------*/
106
} else if (in_64) {
141
-
107
/* We are given a 64-bit variable input. */
142
-static int pickNaN(FloatClass a_cls, FloatClass b_cls,
108
t_64 = tcg_temp_new_i64();
143
- bool aIsLargerSignificand, float_status *status)
109
- gen_dup_i64(vece, t_64, in_64);
144
-{
110
+ tcg_gen_dup_i64(vece, t_64, in_64);
145
- /*
111
} else {
146
- * We guarantee not to require the target to tell us how to
112
/* We are given a constant input. */
147
- * pick a NaN if we're always returning the default NaN.
113
/* For 64-bit hosts, use 64-bit constants for "simple" constants
148
- * But if we're not in default-NaN mode then the target must
114
@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_2s(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
149
- * specify via set_float_2nan_prop_rule().
115
} else if (g->fni8 && check_size_impl(oprsz, 8)) {
150
- */
116
TCGv_i64 t64 = tcg_temp_new_i64();
151
- assert(!status->default_nan_mode);
117
152
-
118
- gen_dup_i64(g->vece, t64, c);
153
- switch (status->float_2nan_prop_rule) {
119
+ tcg_gen_dup_i64(g->vece, t64, c);
154
- case float_2nan_prop_s_ab:
120
expand_2s_i64(dofs, aofs, oprsz, t64, g->scalar_first, g->fni8);
155
- if (is_snan(a_cls)) {
121
tcg_temp_free_i64(t64);
156
- return 0;
122
} else if (g->fni4 && check_size_impl(oprsz, 4)) {
157
- } else if (is_snan(b_cls)) {
123
TCGv_i32 t32 = tcg_temp_new_i32();
158
- return 1;
124
159
- } else if (is_qnan(a_cls)) {
125
tcg_gen_extrl_i64_i32(t32, c);
160
- return 0;
126
- gen_dup_i32(g->vece, t32, t32);
161
- } else {
127
+ tcg_gen_dup_i32(g->vece, t32, t32);
162
- return 1;
128
expand_2s_i32(dofs, aofs, oprsz, t32, g->scalar_first, g->fni4);
163
- }
129
tcg_temp_free_i32(t32);
164
- break;
130
} else {
165
- case float_2nan_prop_s_ba:
131
@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_ands(unsigned vece, uint32_t dofs, uint32_t aofs,
166
- if (is_snan(b_cls)) {
132
TCGv_i64 c, uint32_t oprsz, uint32_t maxsz)
167
- return 1;
133
{
168
- } else if (is_snan(a_cls)) {
134
TCGv_i64 tmp = tcg_temp_new_i64();
169
- return 0;
135
- gen_dup_i64(vece, tmp, c);
170
- } else if (is_qnan(b_cls)) {
136
+ tcg_gen_dup_i64(vece, tmp, c);
171
- return 1;
137
tcg_gen_gvec_2s(dofs, aofs, oprsz, maxsz, tmp, &gop_ands);
172
- } else {
138
tcg_temp_free_i64(tmp);
173
- return 0;
139
}
174
- }
140
@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_xors(unsigned vece, uint32_t dofs, uint32_t aofs,
175
- break;
141
TCGv_i64 c, uint32_t oprsz, uint32_t maxsz)
176
- case float_2nan_prop_ab:
142
{
177
- if (is_nan(a_cls)) {
143
TCGv_i64 tmp = tcg_temp_new_i64();
178
- return 0;
144
- gen_dup_i64(vece, tmp, c);
179
- } else {
145
+ tcg_gen_dup_i64(vece, tmp, c);
180
- return 1;
146
tcg_gen_gvec_2s(dofs, aofs, oprsz, maxsz, tmp, &gop_xors);
181
- }
147
tcg_temp_free_i64(tmp);
182
- break;
148
}
183
- case float_2nan_prop_ba:
149
@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_ors(unsigned vece, uint32_t dofs, uint32_t aofs,
184
- if (is_nan(b_cls)) {
150
TCGv_i64 c, uint32_t oprsz, uint32_t maxsz)
185
- return 1;
151
{
186
- } else {
152
TCGv_i64 tmp = tcg_temp_new_i64();
187
- return 0;
153
- gen_dup_i64(vece, tmp, c);
188
- }
154
+ tcg_gen_dup_i64(vece, tmp, c);
189
- break;
155
tcg_gen_gvec_2s(dofs, aofs, oprsz, maxsz, tmp, &gop_ors);
190
- case float_2nan_prop_x87:
156
tcg_temp_free_i64(tmp);
191
- /*
157
}
192
- * This implements x87 NaN propagation rules:
193
- * SNaN + QNaN => return the QNaN
194
- * two SNaNs => return the one with the larger significand, silenced
195
- * two QNaNs => return the one with the larger significand
196
- * SNaN and a non-NaN => return the SNaN, silenced
197
- * QNaN and a non-NaN => return the QNaN
198
- *
199
- * If we get down to comparing significands and they are the same,
200
- * return the NaN with the positive sign bit (if any).
201
- */
202
- if (is_snan(a_cls)) {
203
- if (is_snan(b_cls)) {
204
- return aIsLargerSignificand ? 0 : 1;
205
- }
206
- return is_qnan(b_cls) ? 1 : 0;
207
- } else if (is_qnan(a_cls)) {
208
- if (is_snan(b_cls) || !is_qnan(b_cls)) {
209
- return 0;
210
- } else {
211
- return aIsLargerSignificand ? 0 : 1;
212
- }
213
- } else {
214
- return 1;
215
- }
216
- default:
217
- g_assert_not_reached();
218
- }
219
-}
220
-
221
/*----------------------------------------------------------------------------
222
| Returns 1 if the double-precision floating-point value `a' is a quiet
223
| NaN; otherwise returns 0.
158
--
224
--
159
2.20.1
225
2.34.1
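
As a reading aid for the tcg_gen_dup_* patch above: the operation it exposes replicates the low byte or halfword of a value into every lane of the output. The following is only a plain-C model of the resulting value, not the TCG op emitter itself; the names dup_u32 and ELEM_* are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical element-size codes, loosely mirroring MO_8/MO_16/MO_32. */
enum elem_size { ELEM_8 = 0, ELEM_16 = 1, ELEM_32 = 2 };

/* Replicate the low element of 'in' across all lanes of a 32-bit word. */
static uint32_t dup_u32(enum elem_size esz, uint32_t in)
{
    switch (esz) {
    case ELEM_8:
        /* Multiplying by 0x01010101 copies the byte into all four lanes. */
        return (in & 0xffu) * 0x01010101u;
    case ELEM_16:
        /* Multiplying by 0x00010001 copies the halfword into both lanes. */
        return (in & 0xffffu) * 0x00010001u;
    case ELEM_32:
        return in;
    }
    assert(!"invalid element size");
    return 0;
}
```

Bits above the selected element are masked off before replication, so callers in this model do not need to clear them first.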
160
226
161
227
New patch
1
From: Richard Henderson <richard.henderson@linaro.org>
1
2
3
Remember if there was an SNaN, and use that to simplify
4
float_2nan_prop_s_{ab,ba} to only the snan component.
5
Then, fall through to the corresponding
6
float_2nan_prop_{ab,ba} case to handle any remaining
7
nans, which must be quiet.
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Message-id: 20241203203949.483774-10-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
fpu/softfloat-parts.c.inc | 32 ++++++++++++--------------------
15
1 file changed, 12 insertions(+), 20 deletions(-)
16
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
18
index XXXXXXX..XXXXXXX 100644
19
--- a/fpu/softfloat-parts.c.inc
20
+++ b/fpu/softfloat-parts.c.inc
21
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
22
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
23
float_status *s)
24
{
25
+ bool have_snan = false;
26
int cmp, which;
27
28
if (is_snan(a->cls) || is_snan(b->cls)) {
29
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
30
+ have_snan = true;
31
}
32
33
if (s->default_nan_mode) {
34
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
35
36
switch (s->float_2nan_prop_rule) {
37
case float_2nan_prop_s_ab:
38
- if (is_snan(a->cls)) {
39
- which = 0;
40
- } else if (is_snan(b->cls)) {
41
- which = 1;
42
- } else if (is_qnan(a->cls)) {
43
- which = 0;
44
- } else {
45
- which = 1;
46
+ if (have_snan) {
47
+ which = is_snan(a->cls) ? 0 : 1;
48
+ break;
49
}
50
- break;
51
- case float_2nan_prop_s_ba:
52
- if (is_snan(b->cls)) {
53
- which = 1;
54
- } else if (is_snan(a->cls)) {
55
- which = 0;
56
- } else if (is_qnan(b->cls)) {
57
- which = 1;
58
- } else {
59
- which = 0;
60
- }
61
- break;
62
+ /* fall through */
63
case float_2nan_prop_ab:
64
which = is_nan(a->cls) ? 0 : 1;
65
break;
66
+ case float_2nan_prop_s_ba:
67
+ if (have_snan) {
68
+ which = is_snan(b->cls) ? 1 : 0;
69
+ break;
70
+ }
71
+ /* fall through */
72
case float_2nan_prop_ba:
73
which = is_nan(b->cls) ? 1 : 0;
74
break;
75
--
76
2.34.1
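
The control flow described in the commit message above (handle the SNaN component first, then fall through to the plain ab/ba rule for the remaining quiet NaNs) can be sketched in isolation as follows. This is a minimal model under stated assumptions: the enum names and pick_two_nan() are illustrative stand-ins, not the softfloat types, and the function returns an index (0 for a, 1 for b) as the code did at this point in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the softfloat classes and rules (hypothetical). */
enum cls { CLS_NORMAL, CLS_QNAN, CLS_SNAN };
enum rule { RULE_S_AB, RULE_S_BA, RULE_AB, RULE_BA };

static bool is_snan(enum cls c) { return c == CLS_SNAN; }
static bool is_nan(enum cls c) { return c != CLS_NORMAL; }

/* Return 0 to pick operand a, 1 to pick operand b; at least one is a NaN. */
static int pick_two_nan(enum rule r, enum cls a, enum cls b)
{
    bool have_snan = is_snan(a) || is_snan(b);

    switch (r) {
    case RULE_S_AB:
        if (have_snan) {
            return is_snan(a) ? 0 : 1;
        }
        /* fall through: any remaining NaNs must be quiet */
    case RULE_AB:
        return is_nan(a) ? 0 : 1;
    case RULE_S_BA:
        if (have_snan) {
            return is_snan(b) ? 1 : 0;
        }
        /* fall through: any remaining NaNs must be quiet */
    case RULE_BA:
        return is_nan(b) ? 1 : 0;
    }
    assert(!"unknown rule");
    return 0;
}
```

With the SNaN case handled up front, each s_* rule differs from its plain counterpart only by the early return, which is what makes the fall-through possible.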
New patch
1
From: Richard Henderson <richard.henderson@linaro.org>
1
2
3
Move the fractional comparison to the end of the
4
float_2nan_prop_x87 case. This is not required for
5
any other 2nan propagation rule. Reorganize the
6
x87 case itself to break out of the switch when the
7
fractional comparison is not required.
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Message-id: 20241203203949.483774-11-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
fpu/softfloat-parts.c.inc | 19 +++++++++----------
15
1 file changed, 9 insertions(+), 10 deletions(-)
16
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
18
index XXXXXXX..XXXXXXX 100644
19
--- a/fpu/softfloat-parts.c.inc
20
+++ b/fpu/softfloat-parts.c.inc
21
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
22
return a;
23
}
24
25
- cmp = frac_cmp(a, b);
26
- if (cmp == 0) {
27
- cmp = a->sign < b->sign;
28
- }
29
-
30
switch (s->float_2nan_prop_rule) {
31
case float_2nan_prop_s_ab:
32
if (have_snan) {
33
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
34
* return the NaN with the positive sign bit (if any).
35
*/
36
if (is_snan(a->cls)) {
37
- if (is_snan(b->cls)) {
38
- which = cmp > 0 ? 0 : 1;
39
- } else {
40
+ if (!is_snan(b->cls)) {
41
which = is_qnan(b->cls) ? 1 : 0;
42
+ break;
43
}
44
} else if (is_qnan(a->cls)) {
45
if (is_snan(b->cls) || !is_qnan(b->cls)) {
46
which = 0;
47
- } else {
48
- which = cmp > 0 ? 0 : 1;
49
+ break;
50
}
51
} else {
52
which = 1;
53
+ break;
54
}
55
+ cmp = frac_cmp(a, b);
56
+ if (cmp == 0) {
57
+ cmp = a->sign < b->sign;
58
+ }
59
+ which = cmp > 0 ? 0 : 1;
60
break;
61
default:
62
g_assert_not_reached();
63
--
64
2.34.1
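
To illustrate the reorganized x87 case from the patch above, here is a small standalone sketch in which the significand comparison runs only when both operands are the same kind of NaN. The struct and function names are assumptions made for illustration; the real code works on FloatPartsN and uses frac_cmp().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cls { CLS_NORMAL, CLS_QNAN, CLS_SNAN };

/* Hypothetical, much-reduced model of an unpacked float. */
struct parts {
    enum cls cls;
    bool sign;      /* true for negative */
    uint64_t frac;  /* significand payload */
};

/* Return 0 to pick a, 1 to pick b, following the x87 rules quoted above. */
static int pick_x87(const struct parts *a, const struct parts *b)
{
    int cmp;

    if (a->cls == CLS_SNAN) {
        if (b->cls != CLS_SNAN) {
            /* SNaN + QNaN gives the QNaN; SNaN + non-NaN gives the SNaN. */
            return b->cls == CLS_QNAN ? 1 : 0;
        }
    } else if (a->cls == CLS_QNAN) {
        if (b->cls != CLS_QNAN) {
            /* QNaN wins over an SNaN or a non-NaN. */
            return 0;
        }
    } else {
        /* a is not a NaN, so b must be. */
        return 1;
    }

    /* Only reached for two SNaNs or two QNaNs: compare significands. */
    cmp = (a->frac > b->frac) - (a->frac < b->frac);
    if (cmp == 0) {
        /* Tie: prefer the operand with the positive sign bit. */
        cmp = a->sign < b->sign;
    }
    return cmp > 0 ? 0 : 1;
}
```

Every early return corresponds to a "break" in the patch; only the final block needs the comparison, which is why it can move to the end of the case.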
1
A few subcases of VLDR/VSTR sysreg succeed but do not perform a
1
From: Richard Henderson <richard.henderson@linaro.org>
2
memory access:
3
* VSTR of VPR when unprivileged
4
* VLDR to VPR when unprivileged
5
* VLDR to FPCXT_NS when fpInactive
6
2
7
In these cases, even though we don't do the memory access we should
3
Replace the "index" selecting between A and B with a result variable
8
still update the base register and perform the stack limit check if
4
of the proper type. This improves clarity within the function.
9
the insn's addressing mode specifies writeback. Our implementation
10
failed to do this, because we handle these side-effects inside the
11
memory_to_fp_sysreg() and fp_sysreg_to_memory() callback functions,
12
which are only called if there's something to load or store.
13
5
14
Fix this by adding an extra argument to the callbacks which is set to
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
15
true to actually perform the access and false to only do side effects
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
16
like writeback, and calling the callback with do_access = false
8
Message-id: 20241203203949.483774-12-richard.henderson@linaro.org
17
for the three cases listed above.
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
fpu/softfloat-parts.c.inc | 28 +++++++++++++---------------
12
1 file changed, 13 insertions(+), 15 deletions(-)
18
13
19
This produces slightly suboptimal code for the case of a write
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
20
to FPCXT_NS when the FPU is inactive and the insn didn't have
21
side effects (ie no writeback, or via VMSR), in which case we'll
22
generate a conditional branch over an unconditional branch.
23
But this doesn't seem to be important enough to merit requiring
24
the callback to report back whether it generated any code or not.
25
26
Cc: qemu-stable@nongnu.org
27
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
28
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
29
Message-id: 20210618141019.10671-5-peter.maydell@linaro.org
30
---
31
target/arm/translate-m-nocp.c | 102 ++++++++++++++++++++++++----------
32
1 file changed, 72 insertions(+), 30 deletions(-)
33
34
diff --git a/target/arm/translate-m-nocp.c b/target/arm/translate-m-nocp.c
35
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
36
--- a/target/arm/translate-m-nocp.c
16
--- a/fpu/softfloat-parts.c.inc
37
+++ b/target/arm/translate-m-nocp.c
17
+++ b/fpu/softfloat-parts.c.inc
38
@@ -XXX,XX +XXX,XX @@ static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
39
19
float_status *s)
40
/*
20
{
41
* Emit code to store the sysreg to its final destination; frees the
21
bool have_snan = false;
42
- * TCG temp 'value' it is passed.
22
- int cmp, which;
43
+ * TCG temp 'value' it is passed. do_access is true to do the store,
23
+ FloatPartsN *ret;
44
+ * and false to skip it and only perform side-effects like base
24
+ int cmp;
45
+ * register writeback.
25
46
*/
26
if (is_snan(a->cls) || is_snan(b->cls)) {
47
-typedef void fp_sysreg_storefn(DisasContext *s, void *opaque, TCGv_i32 value);
27
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
48
+typedef void fp_sysreg_storefn(DisasContext *s, void *opaque, TCGv_i32 value,
28
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
49
+ bool do_access);
29
switch (s->float_2nan_prop_rule) {
50
/*
30
case float_2nan_prop_s_ab:
51
* Emit code to load the value to be copied to the sysreg; returns
31
if (have_snan) {
52
- * a new TCG temporary
32
- which = is_snan(a->cls) ? 0 : 1;
53
+ * a new TCG temporary. do_access is true to do the load,
33
+ ret = is_snan(a->cls) ? a : b;
54
+ * and false to skip it and only perform side-effects like base
55
+ * register writeback.
56
*/
57
-typedef TCGv_i32 fp_sysreg_loadfn(DisasContext *s, void *opaque);
58
+typedef TCGv_i32 fp_sysreg_loadfn(DisasContext *s, void *opaque,
59
+ bool do_access);
60
61
/* Common decode/access checks for fp sysreg read/write */
62
typedef enum FPSysRegCheckResult {
63
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
64
65
switch (regno) {
66
case ARM_VFP_FPSCR:
67
- tmp = loadfn(s, opaque);
68
+ tmp = loadfn(s, opaque, true);
69
gen_helper_vfp_set_fpscr(cpu_env, tmp);
70
tcg_temp_free_i32(tmp);
71
gen_lookup_tb(s);
72
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
73
case ARM_VFP_FPSCR_NZCVQC:
74
{
75
TCGv_i32 fpscr;
76
- tmp = loadfn(s, opaque);
77
+ tmp = loadfn(s, opaque, true);
78
if (dc_isar_feature(aa32_mve, s)) {
79
/* QC is only present for MVE; otherwise RES0 */
80
TCGv_i32 qc = tcg_temp_new_i32();
81
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
82
break;
83
}
84
case ARM_VFP_FPCXT_NS:
85
+ {
86
+ TCGLabel *lab_active = gen_new_label();
87
+
88
lab_end = gen_new_label();
89
- /* fpInactive case: write is a NOP, so branch to end */
90
- gen_branch_fpInactive(s, TCG_COND_NE, lab_end);
91
+ gen_branch_fpInactive(s, TCG_COND_EQ, lab_active);
92
+ /*
93
+ * fpInactive case: write is a NOP, so only do side effects
94
+ * like register writeback before we branch to end
95
+ */
96
+ loadfn(s, opaque, false);
97
+ tcg_gen_br(lab_end);
98
+
99
+ gen_set_label(lab_active);
100
/*
101
* !fpInactive: if FPU disabled, take NOCP exception;
102
* otherwise PreserveFPState(), and then FPCXT_NS writes
103
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
104
break;
34
break;
105
}
35
}
106
gen_preserve_fp_state(s);
36
/* fall through */
107
- /* fall through */
37
case float_2nan_prop_ab:
108
+ }
38
- which = is_nan(a->cls) ? 0 : 1;
109
+ /* fall through */
39
+ ret = is_nan(a->cls) ? a : b;
110
case ARM_VFP_FPCXT_S:
40
break;
111
{
41
case float_2nan_prop_s_ba:
112
TCGv_i32 sfpa, control;
42
if (have_snan) {
113
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
43
- which = is_snan(b->cls) ? 1 : 0;
114
* Set FPSCR and CONTROL.SFPA from value; the new FPSCR takes
44
+ ret = is_snan(b->cls) ? b : a;
115
* bits [27:0] from value and zeroes bits [31:28].
116
*/
117
- tmp = loadfn(s, opaque);
118
+ tmp = loadfn(s, opaque, true);
119
sfpa = tcg_temp_new_i32();
120
tcg_gen_shri_i32(sfpa, tmp, 31);
121
control = load_cpu_field(v7m.control[M_REG_S]);
122
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
123
case ARM_VFP_VPR:
124
/* Behaves as NOP if not privileged */
125
if (IS_USER(s)) {
126
+ loadfn(s, opaque, false);
127
break;
45
break;
128
}
46
}
129
- tmp = loadfn(s, opaque);
47
/* fall through */
130
+ tmp = loadfn(s, opaque, true);
48
case float_2nan_prop_ba:
131
store_cpu_field(tmp, v7m.vpr);
49
- which = is_nan(b->cls) ? 1 : 0;
50
+ ret = is_nan(b->cls) ? b : a;
132
break;
51
break;
133
case ARM_VFP_P0:
52
case float_2nan_prop_x87:
134
{
135
TCGv_i32 vpr;
136
- tmp = loadfn(s, opaque);
137
+ tmp = loadfn(s, opaque, true);
138
vpr = load_cpu_field(v7m.vpr);
139
tcg_gen_deposit_i32(vpr, vpr, tmp,
140
R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
141
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
142
case ARM_VFP_FPSCR:
143
tmp = tcg_temp_new_i32();
144
gen_helper_vfp_get_fpscr(tmp, cpu_env);
145
- storefn(s, opaque, tmp);
146
+ storefn(s, opaque, tmp, true);
147
break;
148
case ARM_VFP_FPSCR_NZCVQC:
149
tmp = tcg_temp_new_i32();
150
gen_helper_vfp_get_fpscr(tmp, cpu_env);
151
tcg_gen_andi_i32(tmp, tmp, FPCR_NZCVQC_MASK);
152
- storefn(s, opaque, tmp);
153
+ storefn(s, opaque, tmp, true);
154
break;
155
case QEMU_VFP_FPSCR_NZCV:
156
/*
53
/*
157
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
54
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
158
*/
55
*/
159
tmp = load_cpu_field(vfp.xregs[ARM_VFP_FPSCR]);
56
if (is_snan(a->cls)) {
160
tcg_gen_andi_i32(tmp, tmp, FPCR_NZCV_MASK);
57
if (!is_snan(b->cls)) {
161
- storefn(s, opaque, tmp);
58
- which = is_qnan(b->cls) ? 1 : 0;
162
+ storefn(s, opaque, tmp, true);
59
+ ret = is_qnan(b->cls) ? b : a;
163
break;
60
break;
164
case ARM_VFP_FPCXT_S:
61
}
165
{
62
} else if (is_qnan(a->cls)) {
166
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
63
if (is_snan(b->cls) || !is_qnan(b->cls)) {
167
* Store result before updating FPSCR etc, in case
64
- which = 0;
168
* it is a memory write which causes an exception.
65
+ ret = a;
169
*/
66
break;
170
- storefn(s, opaque, tmp);
67
}
171
+ storefn(s, opaque, tmp, true);
68
} else {
172
/*
69
- which = 1;
173
* Now we must reset FPSCR from FPDSCR_NS, and clear
70
+ ret = b;
174
* CONTROL.SFPA; so we'll end the TB here.
175
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
176
gen_branch_fpInactive(s, TCG_COND_EQ, lab_active);
177
/* fpInactive case: reads as FPDSCR_NS */
178
TCGv_i32 tmp = load_cpu_field(v7m.fpdscr[M_REG_NS]);
179
- storefn(s, opaque, tmp);
180
+ storefn(s, opaque, tmp, true);
181
lab_end = gen_new_label();
182
tcg_gen_br(lab_end);
183
184
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
185
tcg_gen_or_i32(tmp, tmp, sfpa);
186
tcg_temp_free_i32(control);
187
/* Store result before updating FPSCR, in case it faults */
188
- storefn(s, opaque, tmp);
189
+ storefn(s, opaque, tmp, true);
190
/* If SFPA is zero then set FPSCR from FPDSCR_NS */
191
fpdscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
192
zero = tcg_const_i32(0);
193
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
194
case ARM_VFP_VPR:
195
/* Behaves as NOP if not privileged */
196
if (IS_USER(s)) {
197
+ storefn(s, opaque, NULL, false);
198
break;
71
break;
199
}
72
}
200
tmp = load_cpu_field(v7m.vpr);
73
cmp = frac_cmp(a, b);
201
- storefn(s, opaque, tmp);
74
if (cmp == 0) {
202
+ storefn(s, opaque, tmp, true);
75
cmp = a->sign < b->sign;
203
break;
76
}
204
case ARM_VFP_P0:
77
- which = cmp > 0 ? 0 : 1;
205
tmp = load_cpu_field(v7m.vpr);
78
+ ret = cmp > 0 ? a : b;
206
tcg_gen_extract_i32(tmp, tmp, R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
207
- storefn(s, opaque, tmp);
208
+ storefn(s, opaque, tmp, true);
209
break;
79
break;
210
default:
80
default:
211
g_assert_not_reached();
81
g_assert_not_reached();
212
@@ -XXX,XX +XXX,XX @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
82
}
213
return true;
83
84
- if (which) {
85
- a = b;
86
+ if (is_snan(ret->cls)) {
87
+ parts_silence_nan(ret, s);
88
}
89
- if (is_snan(a->cls)) {
90
- parts_silence_nan(a, s);
91
- }
92
- return a;
93
+ return ret;
214
}
94
}
215
95
216
-static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value)
96
static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
217
+static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value,
218
+ bool do_access)
219
{
220
arg_VMSR_VMRS *a = opaque;
221
222
+ if (!do_access) {
223
+ return;
224
+ }
225
+
226
if (a->rt == 15) {
227
/* Set the 4 flag bits in the CPSR */
228
gen_set_nzcv(value);
229
@@ -XXX,XX +XXX,XX @@ static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value)
230
}
231
}
232
233
-static TCGv_i32 gpr_to_fp_sysreg(DisasContext *s, void *opaque)
234
+static TCGv_i32 gpr_to_fp_sysreg(DisasContext *s, void *opaque, bool do_access)
235
{
236
arg_VMSR_VMRS *a = opaque;
237
238
+ if (!do_access) {
239
+ return NULL;
240
+ }
241
return load_reg(s, a->rt);
242
}
243
244
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
245
}
246
}
247
248
-static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
249
+static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value,
250
+ bool do_access)
251
{
252
arg_vldr_sysreg *a = opaque;
253
uint32_t offset = a->imm;
254
@@ -XXX,XX +XXX,XX @@ static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
255
offset = -offset;
256
}
257
258
+ if (!do_access && !a->w) {
259
+ return;
260
+ }
261
+
262
addr = load_reg(s, a->rn);
263
if (a->p) {
264
tcg_gen_addi_i32(addr, addr, offset);
265
@@ -XXX,XX +XXX,XX @@ static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
266
gen_helper_v8m_stackcheck(cpu_env, addr);
267
}
268
269
- gen_aa32_st_i32(s, value, addr, get_mem_index(s),
270
- MO_UL | MO_ALIGN | s->be_data);
271
- tcg_temp_free_i32(value);
272
+ if (do_access) {
273
+ gen_aa32_st_i32(s, value, addr, get_mem_index(s),
274
+ MO_UL | MO_ALIGN | s->be_data);
275
+ tcg_temp_free_i32(value);
276
+ }
277
278
if (a->w) {
279
/* writeback */
280
@@ -XXX,XX +XXX,XX @@ static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value)
281
}
282
}
283
284
-static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque)
285
+static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque,
286
+ bool do_access)
287
{
288
arg_vldr_sysreg *a = opaque;
289
uint32_t offset = a->imm;
290
TCGv_i32 addr;
291
- TCGv_i32 value = tcg_temp_new_i32();
292
+ TCGv_i32 value = NULL;
293
294
if (!a->a) {
295
offset = -offset;
296
}
297
298
+ if (!do_access && !a->w) {
299
+ return NULL;
300
+ }
301
+
302
addr = load_reg(s, a->rn);
303
if (a->p) {
304
tcg_gen_addi_i32(addr, addr, offset);
305
@@ -XXX,XX +XXX,XX @@ static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque)
306
gen_helper_v8m_stackcheck(cpu_env, addr);
307
}
308
309
- gen_aa32_ld_i32(s, value, addr, get_mem_index(s),
310
- MO_UL | MO_ALIGN | s->be_data);
311
+ if (do_access) {
312
+ value = tcg_temp_new_i32();
313
+ gen_aa32_ld_i32(s, value, addr, get_mem_index(s),
314
+ MO_UL | MO_ALIGN | s->be_data);
315
+ }
316
317
if (a->w) {
318
/* writeback */
319
--
97
--
320
2.20.1
98
2.34.1
321
99
322
100
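Note on the do_access hunks above: the new flag lets a caller skip the actual load or store while still performing base-register writeback, and return early only when there is neither an access nor a writeback to do. The following is a minimal standalone sketch of that control flow for the store case. It is not QEMU code: ToyCpu, ToyArgs and toy_sysreg_to_memory are hypothetical stand-ins, plain C arithmetic replaces the TCG ops, and the body of the writeback step (which the hunks only show as a comment) is an assumption based on the usual pre/post-indexed addressing rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS  16
#define MEM_WORDS 64

/* Hypothetical stand-in for the CPU state; not a QEMU type. */
typedef struct {
    uint32_t regs[NUM_REGS];
    uint32_t mem[MEM_WORDS];    /* indexed by byte address / 4 */
} ToyCpu;

/* Hypothetical stand-in for arg_vldr_sysreg, reusing its field names. */
typedef struct {
    int rn;         /* base register number */
    uint32_t imm;   /* byte offset */
    bool a;         /* true: add the offset, false: subtract it */
    bool p;         /* pre-indexed addressing */
    bool w;         /* write the updated address back to rn */
} ToyArgs;

/*
 * Mirrors the shape of the patched fp_sysreg_to_memory():
 * the store is conditional on do_access, the writeback is not.
 */
static void toy_sysreg_to_memory(ToyCpu *cpu, const ToyArgs *args,
                                 uint32_t value, bool do_access)
{
    uint32_t offset = args->imm;
    uint32_t addr;

    assert(args->rn >= 0 && args->rn < NUM_REGS);

    if (!args->a) {
        offset = -offset;       /* unsigned negation wraps, as intended */
    }

    /* Nothing to store and nothing to write back: no work at all. */
    if (!do_access && !args->w) {
        return;
    }

    addr = cpu->regs[args->rn];
    if (args->p) {
        addr += offset;
    }

    if (do_access) {
        assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
        cpu->mem[addr / 4] = value;
    }

    if (args->w) {
        /* Assumed writeback rule: post-indexed forms add the offset here. */
        if (!args->p) {
            addr += offset;
        }
        cpu->regs[args->rn] = addr;
    }
}
```

With a pre-indexed, writeback-enabled access and do_access false, the base register still advances but memory is left untouched; with do_access true the same call also performs the store.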
1
From: Alexandre Iooss <erdnaxe@crans.org>
1
From: Leif Lindholm <quic_llindhol@quicinc.com>
2
2
3
This adds the target guide for BBC Micro:bit.
3
I'm migrating to Qualcomm's new open source email infrastructure, so
4
update my email address, and update the mailmap to match.
4
5
5
Information is taken from https://wiki.qemu.org/Features/MicroBit
6
Signed-off-by: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
6
and from hw/arm/nrf51_soc.c.
7
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
7
8
Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
8
Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Reviewed-by: Joel Stanley <joel@jms.id.au>
11
Message-id: 20241205114047.1125842-1-leif.lindholm@oss.qualcomm.com
11
Message-id: 20210621075625.540471-1-erdnaxe@crans.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
13
---
14
docs/system/arm/nrf.rst | 51 ++++++++++++++++++++++++++++++++++++++
14
MAINTAINERS | 2 +-
15
docs/system/target-arm.rst | 1 +
15
.mailmap | 5 +++--
16
MAINTAINERS | 1 +
16
2 files changed, 4 insertions(+), 3 deletions(-)
17
3 files changed, 53 insertions(+)
18
create mode 100644 docs/system/arm/nrf.rst
19
17
20
diff --git a/docs/system/arm/nrf.rst b/docs/system/arm/nrf.rst
21
new file mode 100644
22
index XXXXXXX..XXXXXXX
23
--- /dev/null
24
+++ b/docs/system/arm/nrf.rst
25
@@ -XXX,XX +XXX,XX @@
26
+Nordic nRF boards (``microbit``)
27
+================================
28
+
29
+The `Nordic nRF`_ chips are a family of ARM-based System-on-Chip that
30
+are designed to be used for low-power and short-range wireless solutions.
31
+
32
+.. _Nordic nRF: https://www.nordicsemi.com/Products
33
+
34
+The nRF51 series is the first series for short range wireless applications.
35
+It is superseded by the nRF52 series.
36
+The following machines are based on this chip:
37
+
38
+- ``microbit`` BBC micro:bit board with nRF51822 SoC
39
+
40
+There are other series such as nRF52, nRF53 and nRF91 which are currently not
41
+supported by QEMU.
42
+
43
+Supported devices
44
+-----------------
45
+
46
+ * ARM Cortex-M0 (ARMv6-M)
47
+ * Serial ports (UART)
48
+ * Clock controller
49
+ * Timers
50
+ * Random Number Generator (RNG)
51
+ * GPIO controller
52
+ * NVMC
53
+ * SWI
54
+
55
+Missing devices
56
+---------------
57
+
58
+ * Watchdog
59
+ * Real-Time Clock (RTC) controller
60
+ * TWI (i2c)
61
+ * SPI controller
62
+ * Analog to Digital Converter (ADC)
63
+ * Quadrature decoder
64
+ * Radio
65
+
66
+Boot options
67
+------------
68
+
69
+The Micro:bit machine can be started using the ``-device`` option to load a
70
+firmware in `ihex format`_. Example:
71
+
72
+.. _ihex format: https://en.wikipedia.org/wiki/Intel_HEX
73
+
74
+.. code-block:: bash
75
+
76
+ $ qemu-system-arm -M microbit -device loader,file=test.hex
77
diff --git a/docs/system/target-arm.rst b/docs/system/target-arm.rst
78
index XXXXXXX..XXXXXXX 100644
79
--- a/docs/system/target-arm.rst
80
+++ b/docs/system/target-arm.rst
81
@@ -XXX,XX +XXX,XX @@ undocumented; you can get a complete list by running
82
arm/digic
83
arm/musicpal
84
arm/gumstix
85
+ arm/nrf
86
arm/nseries
87
arm/nuvoton
88
arm/orangepi
89
diff --git a/MAINTAINERS b/MAINTAINERS
18
diff --git a/MAINTAINERS b/MAINTAINERS
90
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
91
--- a/MAINTAINERS
20
--- a/MAINTAINERS
92
+++ b/MAINTAINERS
21
+++ b/MAINTAINERS
93
@@ -XXX,XX +XXX,XX @@ F: hw/*/microbit*.c
22
@@ -XXX,XX +XXX,XX @@ F: include/hw/ssi/imx_spi.h
94
F: include/hw/*/nrf51*.h
23
SBSA-REF
95
F: include/hw/*/microbit*.h
24
M: Radoslaw Biernacki <rad@semihalf.com>
96
F: tests/qtest/microbit-test.c
25
M: Peter Maydell <peter.maydell@linaro.org>
97
+F: docs/system/arm/nrf.rst
26
-R: Leif Lindholm <quic_llindhol@quicinc.com>
98
27
+R: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
99
AVR Machines
28
R: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
100
-------------
29
L: qemu-arm@nongnu.org
30
S: Maintained
31
diff --git a/.mailmap b/.mailmap
32
index XXXXXXX..XXXXXXX 100644
33
--- a/.mailmap
34
+++ b/.mailmap
35
@@ -XXX,XX +XXX,XX @@ Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
36
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
37
James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
38
Juan Quintela <quintela@trasno.org> <quintela@redhat.com>
39
-Leif Lindholm <quic_llindhol@quicinc.com> <leif.lindholm@linaro.org>
40
-Leif Lindholm <quic_llindhol@quicinc.com> <leif@nuviainc.com>
41
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <quic_llindhol@quicinc.com>
42
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif.lindholm@linaro.org>
43
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif@nuviainc.com>
44
Luc Michel <luc@lmichel.fr> <luc.michel@git.antfield.fr>
45
Luc Michel <luc@lmichel.fr> <luc.michel@greensocs.com>
46
Luc Michel <luc@lmichel.fr> <lmichel@kalray.eu>
101
--
47
--
102
2.20.1
48
2.34.1
103
49
104
50
New patch
1
From: Vikram Garhwal <vikram.garhwal@bytedance.com>
1
2
3
Previously, the maintainer role was paused due to an inactive email address. Commit id:
4
c009d715721861984c4987bcc78b7ee183e86d75.
5
6
Signed-off-by: Vikram Garhwal <vikram.garhwal@bytedance.com>
7
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
8
Message-id: 20241204184205.12952-1-vikram.garhwal@bytedance.com
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
MAINTAINERS | 2 ++
12
1 file changed, 2 insertions(+)
13
14
diff --git a/MAINTAINERS b/MAINTAINERS
15
index XXXXXXX..XXXXXXX 100644
16
--- a/MAINTAINERS
17
+++ b/MAINTAINERS
18
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/fuzz-sb16-test.c
19
20
Xilinx CAN
21
M: Francisco Iglesias <francisco.iglesias@amd.com>
22
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
23
S: Maintained
24
F: hw/net/can/xlnx-*
25
F: include/hw/net/xlnx-*
26
@@ -XXX,XX +XXX,XX @@ F: include/hw/rx/
27
CAN bus subsystem and hardware
28
M: Pavel Pisa <pisa@cmp.felk.cvut.cz>
29
M: Francisco Iglesias <francisco.iglesias@amd.com>
30
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
31
S: Maintained
32
W: https://canbus.pages.fel.cvut.cz/
33
F: net/can/*
34
--
35
2.34.1