First arm pullreq of the cycle; this is mostly my softfloat NaN
handling series. (Lots more in my to-review queue, but I don't
like pullreqs growing too close to a hundred patches at a time :-))

thanks
-- PMM

The following changes since commit 97f2796a3736ed37a1b85dc1c76a6c45b829dd17:

  Open 10.0 development tree (2024-12-10 17:41:17 +0000)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20241211

for you to fetch changes up to 1abe28d519239eea5cf9620bb13149423e5665f8:

  MAINTAINERS: Add correct email address for Vikram Garhwal (2024-12-11 15:31:09 +0000)

----------------------------------------------------------------
target-arm queue:
 * hw/net/lan9118: Extract PHY model, reuse with imx_fec, fix bugs
 * fpu: Make muladd NaN handling runtime-selected, not compile-time
 * fpu: Make default NaN pattern runtime-selected, not compile-time
 * fpu: Minor NaN-related cleanups
 * MAINTAINERS: email address updates

----------------------------------------------------------------
Bernhard Beschow (5):
      hw/net/lan9118: Extract lan9118_phy
      hw/net/lan9118_phy: Reuse in imx_fec and consolidate implementations
      hw/net/lan9118_phy: Fix off-by-one error in MII_ANLPAR register
      hw/net/lan9118_phy: Reuse MII constants
      hw/net/lan9118_phy: Add missing 100 mbps full duplex advertisement

Leif Lindholm (1):
      MAINTAINERS: update email address for Leif Lindholm

Peter Maydell (54):
      fpu: handle raising Invalid for infzero in pick_nan_muladd
      fpu: Check for default_nan_mode before calling pickNaNMulAdd
      softfloat: Allow runtime choice of inf * 0 + NaN result
      tests/fp: Explicitly set inf-zero-nan rule
      target/arm: Set FloatInfZeroNaNRule explicitly
      target/s390: Set FloatInfZeroNaNRule explicitly
      target/ppc: Set FloatInfZeroNaNRule explicitly
      target/mips: Set FloatInfZeroNaNRule explicitly
      target/sparc: Set FloatInfZeroNaNRule explicitly
      target/xtensa: Set FloatInfZeroNaNRule explicitly
      target/x86: Set FloatInfZeroNaNRule explicitly
      target/loongarch: Set FloatInfZeroNaNRule explicitly
      target/hppa: Set FloatInfZeroNaNRule explicitly
      softfloat: Pass have_snan to pickNaNMulAdd
      softfloat: Allow runtime choice of NaN propagation for muladd
      tests/fp: Explicitly set 3-NaN propagation rule
      target/arm: Set Float3NaNPropRule explicitly
      target/loongarch: Set Float3NaNPropRule explicitly
      target/ppc: Set Float3NaNPropRule explicitly
      target/s390x: Set Float3NaNPropRule explicitly
      target/sparc: Set Float3NaNPropRule explicitly
      target/mips: Set Float3NaNPropRule explicitly
      target/xtensa: Set Float3NaNPropRule explicitly
      target/i386: Set Float3NaNPropRule explicitly
      target/hppa: Set Float3NaNPropRule explicitly
      fpu: Remove use_first_nan field from float_status
      target/m68k: Don't pass NULL float_status to floatx80_default_nan()
      softfloat: Create floatx80 default NaN from parts64_default_nan
      target/loongarch: Use normal float_status in fclass_s and fclass_d helpers
      target/m68k: In frem helper, initialize local float_status from env->fp_status
      target/m68k: Init local float_status from env fp_status in gdb get/set reg
      target/sparc: Initialize local scratch float_status from env->fp_status
      target/ppc: Use env->fp_status in helper_compute_fprf functions
      fpu: Allow runtime choice of default NaN value
      tests/fp: Set default NaN pattern explicitly
      target/microblaze: Set default NaN pattern explicitly
      target/i386: Set default NaN pattern explicitly
      target/hppa: Set default NaN pattern explicitly
      target/alpha: Set default NaN pattern explicitly
      target/arm: Set default NaN pattern explicitly
      target/loongarch: Set default NaN pattern explicitly
      target/m68k: Set default NaN pattern explicitly
      target/mips: Set default NaN pattern explicitly
      target/openrisc: Set default NaN pattern explicitly
      target/ppc: Set default NaN pattern explicitly
      target/sh4: Set default NaN pattern explicitly
      target/rx: Set default NaN pattern explicitly
      target/s390x: Set default NaN pattern explicitly
      target/sparc: Set default NaN pattern explicitly
      target/xtensa: Set default NaN pattern explicitly
      target/hexagon: Set default NaN pattern explicitly
      target/riscv: Set default NaN pattern explicitly
      target/tricore: Set default NaN pattern explicitly
      fpu: Remove default handling for dnan_pattern

Richard Henderson (11):
      target/arm: Copy entire float_status in is_ebf
      softfloat: Inline pickNaNMulAdd
      softfloat: Use goto for default nan case in pick_nan_muladd
      softfloat: Remove which from parts_pick_nan_muladd
      softfloat: Pad array size in pick_nan_muladd
      softfloat: Move propagateFloatx80NaN to softfloat.c
      softfloat: Use parts_pick_nan in propagateFloatx80NaN
      softfloat: Inline pickNaN
      softfloat: Share code between parts_pick_nan cases
      softfloat: Sink frac_cmp in parts_pick_nan until needed
      softfloat: Replace WHICH with RET in parts_pick_nan

Vikram Garhwal (1):
      MAINTAINERS: Add correct email address for Vikram Garhwal

 MAINTAINERS                       |   4 +-
 include/fpu/softfloat-helpers.h   |  38 +++-
 include/fpu/softfloat-types.h     |  89 +++++++-
 include/hw/net/imx_fec.h          |   9 +-
 include/hw/net/lan9118_phy.h      |  37 ++++
 include/hw/net/mii.h              |   6 +
 target/mips/fpu_helper.h          |  20 ++
 target/sparc/helper.h             |   4 +-
 fpu/softfloat.c                   |  19 ++
 hw/net/imx_fec.c                  | 146 ++------------
 hw/net/lan9118.c                  | 137 ++-----------
 hw/net/lan9118_phy.c              | 222 ++++++++++++++++++++
 linux-user/arm/nwfpe/fpa11.c      |   5 +
 target/alpha/cpu.c                |   2 +
 target/arm/cpu.c                  |  10 +
 target/arm/tcg/vec_helper.c       |  20 +-
 target/hexagon/cpu.c              |   2 +
 target/hppa/fpu_helper.c          |  12 ++
 target/i386/tcg/fpu_helper.c      |  12 ++
 target/loongarch/tcg/fpu_helper.c |  14 +-
 target/m68k/cpu.c                 |  14 +-
 target/m68k/fpu_helper.c          |   6 +-
 target/m68k/helper.c              |   6 +-
 target/microblaze/cpu.c           |   2 +
 target/mips/msa.c                 |  10 +
 target/openrisc/cpu.c             |   2 +
 target/ppc/cpu_init.c             |  19 ++
 target/ppc/fpu_helper.c           |   3 +-
 target/riscv/cpu.c                |   2 +
 target/rx/cpu.c                   |   2 +
 target/s390x/cpu.c                |   5 +
 target/sh4/cpu.c                  |   2 +
 target/sparc/cpu.c                |   6 +
 target/sparc/fop_helper.c         |   8 +-
 target/sparc/translate.c          |   4 +-
 target/tricore/helper.c           |   2 +
 target/xtensa/cpu.c               |   4 +
 target/xtensa/fpu_helper.c        |   3 +-
 tests/fp/fp-bench.c               |   7 +
 tests/fp/fp-test-log2.c           |   1 +
 tests/fp/fp-test.c                |   7 +
 fpu/softfloat-parts.c.inc         | 152 +++++++++++---
 fpu/softfloat-specialize.c.inc    | 412 ++------------------------------------
 .mailmap                          |   5 +-
 hw/net/Kconfig                    |   5 +
 hw/net/meson.build                |   1 +
 hw/net/trace-events               |  10 +-
 47 files changed, 778 insertions(+), 730 deletions(-)
 create mode 100644 include/hw/net/lan9118_phy.h
 create mode 100644 hw/net/lan9118_phy.c
From: Bernhard Beschow <shentey@gmail.com>

A very similar implementation of the same device exists in imx_fec. Prepare for
a common implementation by extracting a device model into its own files.

Some migration state has been moved into the new device model which breaks
migration compatibility for the following machines:
* smdkc210
* realview-*
* vexpress-*
* kzm
* mps2-*

While breaking migration ABI, fix the size of the MII registers to be 16 bit,
as defined by IEEE 802.3u.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-2-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/lan9118_phy.h |  37 ++++++++
 hw/net/lan9118.c             | 137 +++++-----------------------
 hw/net/lan9118_phy.c         | 169 +++++++++++++++++++++++++++++++++++
 hw/net/Kconfig               |   4 +
 hw/net/meson.build           |   1 +
 5 files changed, 233 insertions(+), 115 deletions(-)
 create mode 100644 include/hw/net/lan9118_phy.h
 create mode 100644 hw/net/lan9118_phy.c

diff --git a/include/hw/net/lan9118_phy.h b/include/hw/net/lan9118_phy.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/net/lan9118_phy.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * SMSC LAN9118 PHY emulation
+ *
+ * Copyright (c) 2009 CodeSourcery, LLC.
+ * Written by Paul Brook
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_NET_LAN9118_PHY_H
+#define HW_NET_LAN9118_PHY_H
+
+#include "qom/object.h"
+#include "hw/sysbus.h"
+
+#define TYPE_LAN9118_PHY "lan9118-phy"
+OBJECT_DECLARE_SIMPLE_TYPE(Lan9118PhyState, LAN9118_PHY)
+
+typedef struct Lan9118PhyState {
+    SysBusDevice parent_obj;
+
+    uint16_t status;
+    uint16_t control;
+    uint16_t advertise;
+    uint16_t ints;
+    uint16_t int_mask;
+    qemu_irq irq;
+    bool link_down;
+} Lan9118PhyState;
+
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down);
+void lan9118_phy_reset(Lan9118PhyState *s);
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg);
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val);
+
+#endif
diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118.c
+++ b/hw/net/lan9118.c
@@ -XXX,XX +XXX,XX @@
 #include "net/net.h"
 #include "net/eth.h"
 #include "hw/irq.h"
+#include "hw/net/lan9118_phy.h"
 #include "hw/net/lan9118.h"
 #include "hw/ptimer.h"
 #include "hw/qdev-properties.h"
@@ -XXX,XX +XXX,XX @@ do { printf("lan9118: " fmt , ## __VA_ARGS__); } while (0)
 #define MAC_CR_RXEN 0x00000004
 #define MAC_CR_RESERVED 0x7f404213

-#define PHY_INT_ENERGYON 0x80
-#define PHY_INT_AUTONEG_COMPLETE 0x40
-#define PHY_INT_FAULT 0x20
-#define PHY_INT_DOWN 0x10
-#define PHY_INT_AUTONEG_LP 0x08
-#define PHY_INT_PARFAULT 0x04
-#define PHY_INT_AUTONEG_PAGE 0x02
-
 #define GPT_TIMER_EN 0x20000000

 /*
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
     uint32_t mac_mii_data;
     uint32_t mac_flow;

-    uint32_t phy_status;
-    uint32_t phy_control;
-    uint32_t phy_advertise;
-    uint32_t phy_int;
-    uint32_t phy_int_mask;
+    Lan9118PhyState mii;
+    IRQState mii_irq;

     int32_t eeprom_writable;
     uint8_t eeprom[128];
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {

 static const VMStateDescription vmstate_lan9118 = {
     .name = "lan9118",
-    .version_id = 2,
-    .minimum_version_id = 1,
+    .version_id = 3,
+    .minimum_version_id = 3,
     .fields = (const VMStateField[]) {
         VMSTATE_PTIMER(timer, lan9118_state),
         VMSTATE_UINT32(irq_cfg, lan9118_state),
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118 = {
         VMSTATE_UINT32(mac_mii_acc, lan9118_state),
         VMSTATE_UINT32(mac_mii_data, lan9118_state),
         VMSTATE_UINT32(mac_flow, lan9118_state),
-        VMSTATE_UINT32(phy_status, lan9118_state),
-        VMSTATE_UINT32(phy_control, lan9118_state),
-        VMSTATE_UINT32(phy_advertise, lan9118_state),
-        VMSTATE_UINT32(phy_int, lan9118_state),
-        VMSTATE_UINT32(phy_int_mask, lan9118_state),
         VMSTATE_INT32(eeprom_writable, lan9118_state),
         VMSTATE_UINT8_ARRAY(eeprom, lan9118_state, 128),
         VMSTATE_INT32(tx_fifo_size, lan9118_state),
@@ -XXX,XX +XXX,XX @@ static void lan9118_reload_eeprom(lan9118_state *s)
     lan9118_mac_changed(s);
 }

-static void phy_update_irq(lan9118_state *s)
+static void lan9118_update_irq(void *opaque, int n, int level)
 {
-    if (s->phy_int & s->phy_int_mask) {
+    lan9118_state *s = opaque;
+
+    if (level) {
         s->int_sts |= PHY_INT;
     } else {
         s->int_sts &= ~PHY_INT;
@@ -XXX,XX +XXX,XX @@ static void phy_update_irq(lan9118_state *s)
     lan9118_update(s);
 }

-static void phy_update_link(lan9118_state *s)
-{
-    /* Autonegotiation status mirrors link status. */
-    if (qemu_get_queue(s->nic)->link_down) {
-        s->phy_status &= ~0x0024;
-        s->phy_int |= PHY_INT_DOWN;
-    } else {
-        s->phy_status |= 0x0024;
-        s->phy_int |= PHY_INT_ENERGYON;
-        s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
-    }
-    phy_update_irq(s);
-}
-
 static void lan9118_set_link(NetClientState *nc)
 {
-    phy_update_link(qemu_get_nic_opaque(nc));
-}
-
-static void phy_reset(lan9118_state *s)
-{
-    s->phy_status = 0x7809;
-    s->phy_control = 0x3000;
-    s->phy_advertise = 0x01e1;
-    s->phy_int_mask = 0;
-    s->phy_int = 0;
-    phy_update_link(s);
+    lan9118_phy_update_link(&LAN9118(qemu_get_nic_opaque(nc))->mii,
+                            nc->link_down);
 }

 static void lan9118_reset(DeviceState *d)
@@ -XXX,XX +XXX,XX @@ static void lan9118_reset(DeviceState *d)
     s->read_word_n = 0;
     s->write_word_n = 0;

-    phy_reset(s);
-
     s->eeprom_writable = 0;
     lan9118_reload_eeprom(s);
 }
@@ -XXX,XX +XXX,XX @@ static void do_tx_packet(lan9118_state *s)
     uint32_t status;

     /* FIXME: Honor TX disable, and allow queueing of packets. */
-    if (s->phy_control & 0x4000) {
+    if (s->mii.control & 0x4000) {
         /* This assumes the receive routine doesn't touch the VLANClient. */
         qemu_receive_packet(qemu_get_queue(s->nic), s->txp->data, s->txp->len);
     } else {
@@ -XXX,XX +XXX,XX @@ static void tx_fifo_push(lan9118_state *s, uint32_t val)
     }
 }

-static uint32_t do_phy_read(lan9118_state *s, int reg)
-{
-    uint32_t val;
-
-    switch (reg) {
-    case 0: /* Basic Control */
-        return s->phy_control;
-    case 1: /* Basic Status */
-        return s->phy_status;
-    case 2: /* ID1 */
-        return 0x0007;
-    case 3: /* ID2 */
-        return 0xc0d1;
-    case 4: /* Auto-neg advertisement */
-        return s->phy_advertise;
-    case 5: /* Auto-neg Link Partner Ability */
-        return 0x0f71;
-    case 6: /* Auto-neg Expansion */
-        return 1;
-        /* TODO 17, 18, 27, 29, 30, 31 */
-    case 29: /* Interrupt source. */
-        val = s->phy_int;
-        s->phy_int = 0;
-        phy_update_irq(s);
-        return val;
-    case 30: /* Interrupt mask */
-        return s->phy_int_mask;
-    default:
-        qemu_log_mask(LOG_GUEST_ERROR,
-                      "do_phy_read: PHY read reg %d\n", reg);
-        return 0;
-    }
-}
-
-static void do_phy_write(lan9118_state *s, int reg, uint32_t val)
-{
-    switch (reg) {
-    case 0: /* Basic Control */
-        if (val & 0x8000) {
-            phy_reset(s);
-            break;
-        }
-        s->phy_control = val & 0x7980;
-        /* Complete autonegotiation immediately. */
-        if (val & 0x1000) {
-            s->phy_status |= 0x0020;
-        }
-        break;
-    case 4: /* Auto-neg advertisement */
-        s->phy_advertise = (val & 0x2d7f) | 0x80;
-        break;
-        /* TODO 17, 18, 27, 31 */
-    case 30: /* Interrupt mask */
-        s->phy_int_mask = val & 0xff;
-        phy_update_irq(s);
-        break;
-    default:
-        qemu_log_mask(LOG_GUEST_ERROR,
-                      "do_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
-    }
-}
-
 static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
 {
     switch (reg) {
@@ -XXX,XX +XXX,XX @@ static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
         if (val & 2) {
             DPRINTF("PHY write %d = 0x%04x\n",
                     (val >> 6) & 0x1f, s->mac_mii_data);
-            do_phy_write(s, (val >> 6) & 0x1f, s->mac_mii_data);
+            lan9118_phy_write(&s->mii, (val >> 6) & 0x1f, s->mac_mii_data);
         } else {
-            s->mac_mii_data = do_phy_read(s, (val >> 6) & 0x1f);
+            s->mac_mii_data = lan9118_phy_read(&s->mii, (val >> 6) & 0x1f);
             DPRINTF("PHY read %d = 0x%04x\n",
                     (val >> 6) & 0x1f, s->mac_mii_data);
         }
@@ -XXX,XX +XXX,XX @@ static void lan9118_writel(void *opaque, hwaddr offset,
         break;
     case CSR_PMT_CTRL:
         if (val & 0x400) {
-            phy_reset(s);
+            lan9118_phy_reset(&s->mii);
         }
         s->pmt_ctrl &= ~0x34e;
         s->pmt_ctrl |= (val & 0x34e);
@@ -XXX,XX +XXX,XX @@ static void lan9118_realize(DeviceState *dev, Error **errp)
     const MemoryRegionOps *mem_ops =
         s->mode_16bit ? &lan9118_16bit_mem_ops : &lan9118_mem_ops;

+    qemu_init_irq(&s->mii_irq, lan9118_update_irq, s, 0);
+    object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
+    if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
+        return;
+    }
+    qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
+
     memory_region_init_io(&s->mmio, OBJECT(dev), mem_ops, s,
                           "lan9118-mmio", 0x100);
     sysbus_init_mmio(sbd, &s->mmio);
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/net/lan9118_phy.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * SMSC LAN9118 PHY emulation
+ *
+ * Copyright (c) 2009 CodeSourcery, LLC.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the GNU GPL v2
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/net/lan9118_phy.h"
+#include "hw/irq.h"
+#include "hw/resettable.h"
+#include "migration/vmstate.h"
+#include "qemu/log.h"
+
+#define PHY_INT_ENERGYON            (1 << 7)
+#define PHY_INT_AUTONEG_COMPLETE    (1 << 6)
+#define PHY_INT_FAULT               (1 << 5)
+#define PHY_INT_DOWN                (1 << 4)
+#define PHY_INT_AUTONEG_LP          (1 << 3)
+#define PHY_INT_PARFAULT            (1 << 2)
+#define PHY_INT_AUTONEG_PAGE        (1 << 1)
+
+static void lan9118_phy_update_irq(Lan9118PhyState *s)
+{
+    qemu_set_irq(s->irq, !!(s->ints & s->int_mask));
+}
+
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
+{
+    uint16_t val;
+
+    switch (reg) {
+    case 0: /* Basic Control */
+        return s->control;
+    case 1: /* Basic Status */
+        return s->status;
+    case 2: /* ID1 */
+        return 0x0007;
+    case 3: /* ID2 */
+        return 0xc0d1;
+    case 4: /* Auto-neg advertisement */
+        return s->advertise;
+    case 5: /* Auto-neg Link Partner Ability */
+        return 0x0f71;
+    case 6: /* Auto-neg Expansion */
+        return 1;
+        /* TODO 17, 18, 27, 29, 30, 31 */
+    case 29: /* Interrupt source. */
+        val = s->ints;
+        s->ints = 0;
+        lan9118_phy_update_irq(s);
+        return val;
+    case 30: /* Interrupt mask */
+        return s->int_mask;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "lan9118_phy_read: PHY read reg %d\n", reg);
+        return 0;
+    }
+}
+
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
+{
+    switch (reg) {
+    case 0: /* Basic Control */
+        if (val & 0x8000) {
+            lan9118_phy_reset(s);
+            break;
+        }
+        s->control = val & 0x7980;
+        /* Complete autonegotiation immediately. */
+        if (val & 0x1000) {
+            s->status |= 0x0020;
+        }
+        break;
+    case 4: /* Auto-neg advertisement */
+        s->advertise = (val & 0x2d7f) | 0x80;
+        break;
+        /* TODO 17, 18, 27, 31 */
+    case 30: /* Interrupt mask */
+        s->int_mask = val & 0xff;
+        lan9118_phy_update_irq(s);
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
+    }
+}
+
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
+{
+    s->link_down = link_down;
+
+    /* Autonegotiation status mirrors link status. */
+    if (link_down) {
+        s->status &= ~0x0024;
+        s->ints |= PHY_INT_DOWN;
+    } else {
+        s->status |= 0x0024;
+        s->ints |= PHY_INT_ENERGYON;
+        s->ints |= PHY_INT_AUTONEG_COMPLETE;
+    }
+    lan9118_phy_update_irq(s);
+}
+
+void lan9118_phy_reset(Lan9118PhyState *s)
+{
+    s->control = 0x3000;
+    s->status = 0x7809;
+    s->advertise = 0x01e1;
+    s->int_mask = 0;
+    s->ints = 0;
+    lan9118_phy_update_link(s, s->link_down);
+}
+
+static void lan9118_phy_reset_hold(Object *obj, ResetType type)
+{
+    Lan9118PhyState *s = LAN9118_PHY(obj);
+
+    lan9118_phy_reset(s);
+}
+
+static void lan9118_phy_init(Object *obj)
445
+{
446
+ Lan9118PhyState *s = LAN9118_PHY(obj);
447
+
448
+ qdev_init_gpio_out(DEVICE(s), &s->irq, 1);
449
+}
450
+
451
+static const VMStateDescription vmstate_lan9118_phy = {
452
+ .name = "lan9118-phy",
453
+ .version_id = 1,
454
+ .minimum_version_id = 1,
455
+ .fields = (const VMStateField[]) {
456
+ VMSTATE_UINT16(control, Lan9118PhyState),
457
+ VMSTATE_UINT16(status, Lan9118PhyState),
458
+ VMSTATE_UINT16(advertise, Lan9118PhyState),
459
+ VMSTATE_UINT16(ints, Lan9118PhyState),
460
+ VMSTATE_UINT16(int_mask, Lan9118PhyState),
461
+ VMSTATE_BOOL(link_down, Lan9118PhyState),
462
+ VMSTATE_END_OF_LIST()
463
+ }
464
+};
465
+
466
+static void lan9118_phy_class_init(ObjectClass *klass, void *data)
467
+{
468
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
469
+ DeviceClass *dc = DEVICE_CLASS(klass);
470
+
471
+ rc->phases.hold = lan9118_phy_reset_hold;
472
+ dc->vmsd = &vmstate_lan9118_phy;
473
+}
474
+
475
+static const TypeInfo types[] = {
476
+ {
477
+ .name = TYPE_LAN9118_PHY,
478
+ .parent = TYPE_SYS_BUS_DEVICE,
479
+ .instance_size = sizeof(Lan9118PhyState),
480
+ .instance_init = lan9118_phy_init,
481
+ .class_init = lan9118_phy_class_init,
482
+ }
483
+};
484
+
485
+DEFINE_TYPES(types)
486
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
487
index XXXXXXX..XXXXXXX 100644
488
--- a/hw/net/Kconfig
489
+++ b/hw/net/Kconfig
490
@@ -XXX,XX +XXX,XX @@ config VMXNET3_PCI
491
config SMC91C111
492
bool
493
494
+config LAN9118_PHY
495
+ bool
496
+
497
config LAN9118
498
bool
499
+ select LAN9118_PHY
500
select PTIMER
501
502
config NE2000_ISA
503
diff --git a/hw/net/meson.build b/hw/net/meson.build
504
index XXXXXXX..XXXXXXX 100644
505
--- a/hw/net/meson.build
506
+++ b/hw/net/meson.build
507
@@ -XXX,XX +XXX,XX @@ system_ss.add(when: 'CONFIG_VMXNET3_PCI', if_true: files('vmxnet3.c'))
508
509
system_ss.add(when: 'CONFIG_SMC91C111', if_true: files('smc91c111.c'))
510
system_ss.add(when: 'CONFIG_LAN9118', if_true: files('lan9118.c'))
511
+system_ss.add(when: 'CONFIG_LAN9118_PHY', if_true: files('lan9118_phy.c'))
512
system_ss.add(when: 'CONFIG_NE2000_ISA', if_true: files('ne2000-isa.c'))
513
system_ss.add(when: 'CONFIG_OPENCORES_ETH', if_true: files('opencores_eth.c'))
514
system_ss.add(when: 'CONFIG_XGMAC', if_true: files('xgmac.c'))
247
--
515
--
248
2.34.1
516
2.34.1
diff view generated by jsdifflib
From: Bernhard Beschow <shentey@gmail.com>

imx_fec models the same PHY as lan9118_phy. The code is almost the same with
imx_fec having more logging and tracing. Merge these improvements into
lan9118_phy and reuse in imx_fec to fix the code duplication.

Some migration state now resides in the new device model which breaks migration
compatibility for the following machines:
* imx25-pdk
* sabrelite
* mcimx7d-sabre
* mcimx6ul-evk

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-3-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/imx_fec.h |   9 ++-
 hw/net/imx_fec.c         | 146 ++++-----------------------------------
 hw/net/lan9118_phy.c     |  82 ++++++++++++++++------
 hw/net/Kconfig           |   1 +
 hw/net/trace-events      |  10 +--
 5 files changed, 85 insertions(+), 163 deletions(-)

diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/net/imx_fec.h
+++ b/include/hw/net/imx_fec.h
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(IMXFECState, IMX_FEC)
 #define TYPE_IMX_ENET "imx.enet"
 
 #include "hw/sysbus.h"
+#include "hw/net/lan9118_phy.h"
+#include "hw/irq.h"
 #include "net/net.h"
 
 #define ENET_EIR 1
@@ -XXX,XX +XXX,XX @@ struct IMXFECState {
     uint32_t tx_descriptor[ENET_TX_RING_NUM];
     uint32_t tx_ring_num;
 
-    uint32_t phy_status;
-    uint32_t phy_control;
-    uint32_t phy_advertise;
-    uint32_t phy_int;
-    uint32_t phy_int_mask;
+    Lan9118PhyState mii;
+    IRQState mii_irq;
     uint32_t phy_num;
     bool phy_connected;
     struct IMXFECState *phy_consumer;
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth_txdescs = {
 
 static const VMStateDescription vmstate_imx_eth = {
     .name = TYPE_IMX_FEC,
-    .version_id = 2,
-    .minimum_version_id = 2,
+    .version_id = 3,
+    .minimum_version_id = 3,
     .fields = (const VMStateField[]) {
         VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
         VMSTATE_UINT32(rx_descriptor, IMXFECState),
         VMSTATE_UINT32(tx_descriptor[0], IMXFECState),
-        VMSTATE_UINT32(phy_status, IMXFECState),
-        VMSTATE_UINT32(phy_control, IMXFECState),
-        VMSTATE_UINT32(phy_advertise, IMXFECState),
-        VMSTATE_UINT32(phy_int, IMXFECState),
-        VMSTATE_UINT32(phy_int_mask, IMXFECState),
         VMSTATE_END_OF_LIST()
     },
     .subsections = (const VMStateDescription * const []) {
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth = {
     },
 };
 
-#define PHY_INT_ENERGYON            (1 << 7)
-#define PHY_INT_AUTONEG_COMPLETE    (1 << 6)
-#define PHY_INT_FAULT               (1 << 5)
-#define PHY_INT_DOWN                (1 << 4)
-#define PHY_INT_AUTONEG_LP          (1 << 3)
-#define PHY_INT_PARFAULT            (1 << 2)
-#define PHY_INT_AUTONEG_PAGE        (1 << 1)
-
 static void imx_eth_update(IMXFECState *s);
 
 /*
@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
  * For now we don't handle any GPIO/interrupt line, so the OS will
  * have to poll for the PHY status.
  */
-static void imx_phy_update_irq(IMXFECState *s)
+static void imx_phy_update_irq(void *opaque, int n, int level)
 {
-    imx_eth_update(s);
-}
-
-static void imx_phy_update_link(IMXFECState *s)
-{
-    /* Autonegotiation status mirrors link status. */
-    if (qemu_get_queue(s->nic)->link_down) {
-        trace_imx_phy_update_link("down");
-        s->phy_status &= ~0x0024;
-        s->phy_int |= PHY_INT_DOWN;
-    } else {
-        trace_imx_phy_update_link("up");
-        s->phy_status |= 0x0024;
-        s->phy_int |= PHY_INT_ENERGYON;
-        s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
-    }
-
-    imx_phy_update_irq(s);
+    imx_eth_update(opaque);
 }
 
 static void imx_eth_set_link(NetClientState *nc)
 {
-    imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
-}
-
-static void imx_phy_reset(IMXFECState *s)
-{
-    trace_imx_phy_reset();
-
-    s->phy_status = 0x7809;
-    s->phy_control = 0x3000;
-    s->phy_advertise = 0x01e1;
-    s->phy_int_mask = 0;
-    s->phy_int = 0;
-    imx_phy_update_link(s);
+    lan9118_phy_update_link(&IMX_FEC(qemu_get_nic_opaque(nc))->mii,
+                            nc->link_down);
 }
 
 static uint32_t imx_phy_read(IMXFECState *s, int reg)
 {
-    uint32_t val;
     uint32_t phy = reg / 32;
 
     if (!s->phy_connected) {
@@ -XXX,XX +XXX,XX @@ static uint32_t imx_phy_read(IMXFECState *s, int reg)
 
     reg %= 32;
 
-    switch (reg) {
-    case 0: /* Basic Control */
-        val = s->phy_control;
-        break;
-    case 1: /* Basic Status */
-        val = s->phy_status;
-        break;
-    case 2: /* ID1 */
-        val = 0x0007;
-        break;
-    case 3: /* ID2 */
-        val = 0xc0d1;
-        break;
-    case 4: /* Auto-neg advertisement */
-        val = s->phy_advertise;
-        break;
-    case 5: /* Auto-neg Link Partner Ability */
-        val = 0x0f71;
-        break;
-    case 6: /* Auto-neg Expansion */
-        val = 1;
-        break;
-    case 29: /* Interrupt source. */
-        val = s->phy_int;
-        s->phy_int = 0;
-        imx_phy_update_irq(s);
-        break;
-    case 30: /* Interrupt mask */
-        val = s->phy_int_mask;
-        break;
-    case 17:
-    case 18:
-    case 27:
-    case 31:
-        qemu_log_mask(LOG_UNIMP, "[%s.phy]%s: reg %d not implemented\n",
-                      TYPE_IMX_FEC, __func__, reg);
-        val = 0;
-        break;
-    default:
-        qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
-                      TYPE_IMX_FEC, __func__, reg);
-        val = 0;
-        break;
-    }
-
-    trace_imx_phy_read(val, phy, reg);
-
-    return val;
+    return lan9118_phy_read(&s->mii, reg);
 }
 
 static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
@@ -XXX,XX +XXX,XX @@ static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
 
     reg %= 32;
 
-    trace_imx_phy_write(val, phy, reg);
-
-    switch (reg) {
-    case 0: /* Basic Control */
-        if (val & 0x8000) {
-            imx_phy_reset(s);
-        } else {
-            s->phy_control = val & 0x7980;
-            /* Complete autonegotiation immediately. */
-            if (val & 0x1000) {
-                s->phy_status |= 0x0020;
-            }
-        }
-        break;
-    case 4: /* Auto-neg advertisement */
-        s->phy_advertise = (val & 0x2d7f) | 0x80;
-        break;
-    case 30: /* Interrupt mask */
-        s->phy_int_mask = val & 0xff;
-        imx_phy_update_irq(s);
-        break;
-    case 17:
-    case 18:
-    case 27:
-    case 31:
-        qemu_log_mask(LOG_UNIMP, "[%s.phy)%s: reg %d not implemented\n",
-                      TYPE_IMX_FEC, __func__, reg);
-        break;
-    default:
-        qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
-                      TYPE_IMX_FEC, __func__, reg);
-        break;
-    }
+    lan9118_phy_write(&s->mii, reg, val);
 }
 
 static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
 
     s->rx_descriptor = 0;
     memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
-
-    /* We also reset the PHY */
-    imx_phy_reset(s);
 }
 
 static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
@@ -XXX,XX +XXX,XX @@ static void imx_eth_realize(DeviceState *dev, Error **errp)
     sysbus_init_irq(sbd, &s->irq[0]);
     sysbus_init_irq(sbd, &s->irq[1]);
 
+    qemu_init_irq(&s->mii_irq, imx_phy_update_irq, s, 0);
+    object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
+    if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
+        return;
+    }
+    qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
+
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
 
     s->nic = qemu_new_nic(&imx_eth_net_info, &s->conf,
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118_phy.c
+++ b/hw/net/lan9118_phy.c
@@ -XXX,XX +XXX,XX @@
  * Copyright (c) 2009 CodeSourcery, LLC.
  * Written by Paul Brook
  *
+ * Copyright (c) 2013 Jean-Christophe Dubois. <jcd@tribudubois.net>
+ *
  * This code is licensed under the GNU GPL v2
  *
  * Contributions after 2012-01-13 are licensed under the terms of the
@@ -XXX,XX +XXX,XX @@
 #include "hw/resettable.h"
 #include "migration/vmstate.h"
 #include "qemu/log.h"
+#include "trace.h"
 
 #define PHY_INT_ENERGYON            (1 << 7)
 #define PHY_INT_AUTONEG_COMPLETE    (1 << 6)
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
 
     switch (reg) {
     case 0: /* Basic Control */
-        return s->control;
+        val = s->control;
+        break;
     case 1: /* Basic Status */
-        return s->status;
+        val = s->status;
+        break;
     case 2: /* ID1 */
-        return 0x0007;
+        val = 0x0007;
+        break;
     case 3: /* ID2 */
-        return 0xc0d1;
+        val = 0xc0d1;
+        break;
     case 4: /* Auto-neg advertisement */
-        return s->advertise;
+        val = s->advertise;
+        break;
     case 5: /* Auto-neg Link Partner Ability */
-        return 0x0f71;
+        val = 0x0f71;
+        break;
     case 6: /* Auto-neg Expansion */
-        return 1;
-        /* TODO 17, 18, 27, 29, 30, 31 */
+        val = 1;
+        break;
     case 29: /* Interrupt source. */
         val = s->ints;
         s->ints = 0;
         lan9118_phy_update_irq(s);
-        return val;
+        break;
     case 30: /* Interrupt mask */
-        return s->int_mask;
+        val = s->int_mask;
+        break;
+    case 17:
+    case 18:
+    case 27:
+    case 31:
+        qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
+                      __func__, reg);
+        val = 0;
+        break;
     default:
-        qemu_log_mask(LOG_GUEST_ERROR,
-                      "lan9118_phy_read: PHY read reg %d\n", reg);
-        return 0;
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
+                      __func__, reg);
+        val = 0;
+        break;
     }
+
+    trace_lan9118_phy_read(val, reg);
+
+    return val;
 }
 
 void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
 {
+    trace_lan9118_phy_write(val, reg);
+
     switch (reg) {
     case 0: /* Basic Control */
         if (val & 0x8000) {
             lan9118_phy_reset(s);
-            break;
-        }
-        s->control = val & 0x7980;
-        /* Complete autonegotiation immediately. */
-        if (val & 0x1000) {
-            s->status |= 0x0020;
+        } else {
+            s->control = val & 0x7980;
+            /* Complete autonegotiation immediately. */
+            if (val & 0x1000) {
+                s->status |= 0x0020;
+            }
         }
         break;
     case 4: /* Auto-neg advertisement */
         s->advertise = (val & 0x2d7f) | 0x80;
         break;
-        /* TODO 17, 18, 27, 31 */
     case 30: /* Interrupt mask */
         s->int_mask = val & 0xff;
         lan9118_phy_update_irq(s);
         break;
+    case 17:
+    case 18:
+    case 27:
+    case 31:
+        qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
+                      __func__, reg);
+        break;
     default:
-        qemu_log_mask(LOG_GUEST_ERROR,
-                      "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
+                      __func__, reg);
+        break;
     }
 }
 
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
 
     /* Autonegotiation status mirrors link status. */
     if (link_down) {
+        trace_lan9118_phy_update_link("down");
         s->status &= ~0x0024;
         s->ints |= PHY_INT_DOWN;
     } else {
+        trace_lan9118_phy_update_link("up");
         s->status |= 0x0024;
         s->ints |= PHY_INT_ENERGYON;
         s->ints |= PHY_INT_AUTONEG_COMPLETE;
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
 
 void lan9118_phy_reset(Lan9118PhyState *s)
 {
+    trace_lan9118_phy_reset();
+
     s->control = 0x3000;
     s->status = 0x7809;
     s->advertise = 0x01e1;
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_phy = {
     .version_id = 1,
     .minimum_version_id = 1,
     .fields = (const VMStateField[]) {
-        VMSTATE_UINT16(control, Lan9118PhyState),
         VMSTATE_UINT16(status, Lan9118PhyState),
+        VMSTATE_UINT16(control, Lan9118PhyState),
         VMSTATE_UINT16(advertise, Lan9118PhyState),
         VMSTATE_UINT16(ints, Lan9118PhyState),
         VMSTATE_UINT16(int_mask, Lan9118PhyState),
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/Kconfig
+++ b/hw/net/Kconfig
@@ -XXX,XX +XXX,XX @@ config ALLWINNER_SUN8I_EMAC
 
 config IMX_FEC
     bool
+    select LAN9118_PHY
 
 config CADENCE
     bool
diff --git a/hw/net/trace-events b/hw/net/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -XXX,XX +XXX,XX @@ allwinner_sun8i_emac_set_link(bool active) "Set link: active=%u"
 allwinner_sun8i_emac_read(uint64_t offset, uint64_t val) "MMIO read: offset=0x%" PRIx64 " value=0x%" PRIx64
 allwinner_sun8i_emac_write(uint64_t offset, uint64_t val) "MMIO write: offset=0x%" PRIx64 " value=0x%" PRIx64
 
+# lan9118_phy.c
+lan9118_phy_read(uint16_t val, int reg) "[0x%02x] -> 0x%04" PRIx16
+lan9118_phy_write(uint16_t val, int reg) "[0x%02x] <- 0x%04" PRIx16
+lan9118_phy_update_link(const char *s) "%s"
+lan9118_phy_reset(void) ""
+
 # lance.c
 lance_mem_readw(uint64_t addr, uint32_t ret) "addr=0x%"PRIx64"val=0x%04x"
 lance_mem_writew(uint64_t addr, uint32_t val) "addr=0x%"PRIx64"val=0x%04x"
@@ -XXX,XX +XXX,XX @@ i82596_set_multicast(uint16_t count) "Added %d multicast entries"
 i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
 
 # imx_fec.c
-imx_phy_read(uint32_t val, int phy, int reg) "0x%04"PRIx32" <= phy[%d].reg[%d]"
 imx_phy_read_num(int phy, int configured) "read request from unconfigured phy %d (configured %d)"
-imx_phy_write(uint32_t val, int phy, int reg) "0x%04"PRIx32" => phy[%d].reg[%d]"
 imx_phy_write_num(int phy, int configured) "write request to unconfigured phy %d (configured %d)"
-imx_phy_update_link(const char *s) "%s"
-imx_phy_reset(void) ""
 imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
 imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
 imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
-- 
2.34.1
From: Bernhard Beschow <shentey@gmail.com>

Turns 0x70 into 0xe0 (== 0x70 << 1) which adds the missing MII_ANLPAR_TX and
fixes the MSB of selector field to be zero, as specified in the datasheet.

Fixes: 2a424990170b "LAN9118 emulation"
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-4-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/lan9118_phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118_phy.c
+++ b/hw/net/lan9118_phy.c
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
         val = s->advertise;
         break;
     case 5: /* Auto-neg Link Partner Ability */
-        val = 0x0f71;
+        val = 0x0fe1;
         break;
     case 6: /* Auto-neg Expansion */
         val = 1;
--
2.34.1
From: Bernhard Beschow <shentey@gmail.com>

Prefer named constants over magic values for better readability.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20241102125724.532843-5-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/mii.h |  6 +++++
 hw/net/lan9118_phy.c | 63 ++++++++++++++++++++++++++++----------------
 2 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/net/mii.h
+++ b/include/hw/net/mii.h
@@ -XXX,XX +XXX,XX @@
 #define MII_BMSR_JABBER (1 << 1) /* Jabber detected */
 #define MII_BMSR_EXTCAP (1 << 0) /* Ext-reg capability */
 
+#define MII_ANAR_RFAULT (1 << 13) /* Say we can detect faults */
 #define MII_ANAR_PAUSE_ASYM (1 << 11) /* Try for asymmetric pause */
 #define MII_ANAR_PAUSE (1 << 10) /* Try for pause */
 #define MII_ANAR_TXFD (1 << 8)
@@ -XXX,XX +XXX,XX @@
 #define MII_ANAR_10FD (1 << 6)
 #define MII_ANAR_10 (1 << 5)
 #define MII_ANAR_CSMACD (1 << 0)
+#define MII_ANAR_SELECT (0x001f) /* Selector bits */
 
 #define MII_ANLPAR_ACK (1 << 14)
 #define MII_ANLPAR_PAUSEASY (1 << 11) /* can pause asymmetrically */
@@ -XXX,XX +XXX,XX @@
 #define RTL8201CP_PHYID1 0x0000
 #define RTL8201CP_PHYID2 0x8201
 
+/* SMSC LAN9118 */
+#define SMSCLAN9118_PHYID1 0x0007
+#define SMSCLAN9118_PHYID2 0xc0d1
+
 /* RealTek 8211E */
 #define RTL8211E_PHYID1 0x001c
 #define RTL8211E_PHYID2 0xc915
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118_phy.c
+++ b/hw/net/lan9118_phy.c
@@ -XXX,XX +XXX,XX @@
 
 #include "qemu/osdep.h"
 #include "hw/net/lan9118_phy.h"
+#include "hw/net/mii.h"
 #include "hw/irq.h"
 #include "hw/resettable.h"
 #include "migration/vmstate.h"
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
     uint16_t val;
 
     switch (reg) {
-    case 0: /* Basic Control */
+    case MII_BMCR:
         val = s->control;
         break;
-    case 1: /* Basic Status */
+    case MII_BMSR:
         val = s->status;
         break;
-    case 2: /* ID1 */
-        val = 0x0007;
+    case MII_PHYID1:
+        val = SMSCLAN9118_PHYID1;
         break;
-    case 3: /* ID2 */
-        val = 0xc0d1;
+    case MII_PHYID2:
+        val = SMSCLAN9118_PHYID2;
         break;
-    case 4: /* Auto-neg advertisement */
+    case MII_ANAR:
         val = s->advertise;
         break;
-    case 5: /* Auto-neg Link Partner Ability */
-        val = 0x0fe1;
+    case MII_ANLPAR:
+        val = MII_ANLPAR_PAUSEASY | MII_ANLPAR_PAUSE | MII_ANLPAR_T4 |
+              MII_ANLPAR_TXFD | MII_ANLPAR_TX | MII_ANLPAR_10FD |
+              MII_ANLPAR_10 | MII_ANLPAR_CSMACD;
         break;
-    case 6: /* Auto-neg Expansion */
-        val = 1;
+    case MII_ANER:
+        val = MII_ANER_NWAY;
         break;
     case 29: /* Interrupt source. */
         val = s->ints;
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
     trace_lan9118_phy_write(val, reg);
 
     switch (reg) {
-    case 0: /* Basic Control */
-        if (val & 0x8000) {
+    case MII_BMCR:
+        if (val & MII_BMCR_RESET) {
             lan9118_phy_reset(s);
         } else {
-            s->control = val & 0x7980;
+            s->control = val & (MII_BMCR_LOOPBACK | MII_BMCR_SPEED100 |
+                                MII_BMCR_AUTOEN | MII_BMCR_PDOWN | MII_BMCR_FD |
+                                MII_BMCR_CTST);
             /* Complete autonegotiation immediately. */
-            if (val & 0x1000) {
-                s->status |= 0x0020;
+            if (val & MII_BMCR_AUTOEN) {
+                s->status |= MII_BMSR_AN_COMP;
             }
         }
         break;
-    case 4: /* Auto-neg advertisement */
-        s->advertise = (val & 0x2d7f) | 0x80;
+    case MII_ANAR:
+        s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
+                               MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
+                               MII_ANAR_SELECT))
+                       | MII_ANAR_TX;
         break;
     case 30: /* Interrupt mask */
         s->int_mask = val & 0xff;
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
     /* Autonegotiation status mirrors link status. */
     if (link_down) {
         trace_lan9118_phy_update_link("down");
-        s->status &= ~0x0024;
+        s->status &= ~(MII_BMSR_AN_COMP | MII_BMSR_LINK_ST);
         s->ints |= PHY_INT_DOWN;
     } else {
         trace_lan9118_phy_update_link("up");
-        s->status |= 0x0024;
+        s->status |= MII_BMSR_AN_COMP | MII_BMSR_LINK_ST;
         s->ints |= PHY_INT_ENERGYON;
         s->ints |= PHY_INT_AUTONEG_COMPLETE;
     }
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_reset(Lan9118PhyState *s)
 {
     trace_lan9118_phy_reset();
 
-    s->control = 0x3000;
-    s->status = 0x7809;
-    s->advertise = 0x01e1;
+    s->control = MII_BMCR_AUTOEN | MII_BMCR_SPEED100;
+    s->status = MII_BMSR_100TX_FD
+              | MII_BMSR_100TX_HD
+              | MII_BMSR_10T_FD
+              | MII_BMSR_10T_HD
+              | MII_BMSR_AUTONEG
+              | MII_BMSR_EXTCAP;
+    s->advertise = MII_ANAR_TXFD
+                 | MII_ANAR_TX
+                 | MII_ANAR_10FD
+                 | MII_ANAR_10
+                 | MII_ANAR_CSMACD;
     s->int_mask = 0;
     s->ints = 0;
     lan9118_phy_update_link(s, s->link_down);
--
2.34.1
From: Bernhard Beschow <shentey@gmail.com>

The real device advertises this mode and the device model already advertises
100 mbps half duplex and 10 mbps full+half duplex. So advertise this mode to
make the model more realistic.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20241102125724.532843-6-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/lan9118_phy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118_phy.c
+++ b/hw/net/lan9118_phy.c
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
         break;
     case MII_ANAR:
         s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
-                               MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
-                               MII_ANAR_SELECT))
+                               MII_ANAR_PAUSE | MII_ANAR_TXFD | MII_ANAR_10FD |
+                               MII_ANAR_10 | MII_ANAR_SELECT))
                        | MII_ANAR_TX;
         break;
     case 30: /* Interrupt mask */
--
2.34.1
For IEEE fused multiply-add, the (0 * inf) + NaN case should raise
Invalid for the multiplication of 0 by infinity. Currently we handle
this in the per-architecture ifdef ladder in pickNaNMulAdd().
However, since this isn't really architecture specific we can hoist
it up to the generic code.

For the cases where the infzero test in pickNaNMulAdd was
returning 2, we can delete the check entirely and allow the
code to fall into the normal pick-a-NaN handling, because this
will return 2 anyway (input 'c' being the only NaN in this case).
For the cases where infzero was returning 3 to indicate "return
the default NaN", we must retain that "return 3".

For Arm, this looks like it might be a behaviour change because we
used to set float_flag_invalid | float_flag_invalid_imz only if C is
a quiet NaN. However, it is not, because Arm target code never looks
at float_flag_invalid_imz, and for the (0 * inf) + SNaN case we
already raised float_flag_invalid via the "abc_mask &
float_cmask_snan" check in pick_nan_muladd.

For any target architecture using the "default implementation" at the
bottom of the ifdef, this is a behaviour change but will be fixing a
bug (where we failed to raise the Invalid exception for
(0 * inf) + QNaN). The architectures using the default case are:
 * hppa
 * i386
 * sh4
 * tricore

The x86, Tricore and SH4 CPU architecture manuals are clear that this
should have raised Invalid; HPPA is a bit vaguer but still seems
clear enough.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-2-peter.maydell@linaro.org
---
 fpu/softfloat-parts.c.inc      | 13 +++++++------
 fpu/softfloat-specialize.c.inc | 29 +----------------------------
 2 files changed, 8 insertions(+), 34 deletions(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
                                             int ab_mask, int abc_mask)
 {
     int which;
+    bool infzero = (ab_mask == float_cmask_infzero);
 
     if (unlikely(abc_mask & float_cmask_snan)) {
         float_raise(float_flag_invalid | float_flag_invalid_snan, s);
     }
 
-    which = pickNaNMulAdd(a->cls, b->cls, c->cls,
-                          ab_mask == float_cmask_infzero, s);
+    if (infzero) {
+        /* This is (0 * inf) + NaN or (inf * 0) + NaN */
+        float_raise(float_flag_invalid | float_flag_invalid_imz, s);
+    }
+
+    which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
 
     if (s->default_nan_mode || which == 3) {
-        /*
-         * Note that this check is after pickNaNMulAdd so that function
-         * has an opportunity to set the Invalid flag for infzero.
-         */
         parts_default_nan(a, s);
         return a;
     }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      * the default NaN
      */
     if (infzero && is_qnan(c_cls)) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
         return 3;
     }
 
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      * case sets InvalidOp and returns the default NaN
      */
     if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
         return 3;
     }
     /* Prefer sNaN over qNaN, in the a, b, c order. */
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
      * case sets InvalidOp and returns the input value 'c'
      */
-    if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
-        return 2;
-    }
     /* Prefer sNaN over qNaN, in the c, a, b order. */
     if (is_snan(c_cls)) {
         return 2;
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
      * case sets InvalidOp and returns the input value 'c'
      */
-    if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
-        return 2;
-    }
+
     /* Prefer sNaN over qNaN, in the c, a, b order. */
     if (is_snan(c_cls)) {
         return 2;
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      * to return an input NaN if we have one (ie c) rather than generating
      * a default NaN
      */
-    if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
-        return 2;
-    }
 
     /* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
      * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         return 1;
     }
 #elif defined(TARGET_RISCV)
-    /* For RISC-V, InvalidOp is set when multiplicands are Inf and zero */
-    if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
-    }
     return 3; /* default NaN */
 #elif defined(TARGET_S390X)
     if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
         return 3;
     }
 
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         return 2;
     }
 #elif defined(TARGET_SPARC)
-    /* For (inf,0,nan) return c. */
-    if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
-        return 2;
-    }
     /* Prefer SNaN over QNaN, order C, B, A. */
     if (is_snan(c_cls)) {
         return 2;
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      * For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
      * an input NaN if we have one (ie c).
      */
-    if (infzero) {
-        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
-        return 2;
-    }
     if (status->use_first_nan) {
         if (is_nan(a_cls)) {
             return 0;
--
2.34.1
If the target sets default_nan_mode then we're always going to return
the default NaN, and pickNaNMulAdd() no longer has any side effects.
For consistency with pickNaN(), check for default_nan_mode before
calling pickNaNMulAdd().

When we convert pickNaNMulAdd() to allow runtime selection of the NaN
propagation rule, this means we won't have to make the targets which
use default_nan_mode also set a propagation rule.

Since RISC-V always uses default_nan_mode, this allows us to remove
its ifdef case from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-3-peter.maydell@linaro.org
---
 fpu/softfloat-parts.c.inc      | 8 ++++++--
 fpu/softfloat-specialize.c.inc | 9 +++++++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
         float_raise(float_flag_invalid | float_flag_invalid_imz, s);
     }
 
-    which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
+    if (s->default_nan_mode) {
+        which = 3;
+    } else {
+        which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
+    }
 
-    if (s->default_nan_mode || which == 3) {
+    if (which == 3) {
         parts_default_nan(a, s);
         return a;
     }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
 static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
                          bool infzero, float_status *status)
 {
+    /*
+     * We guarantee not to require the target to tell us how to
+     * pick a NaN if we're always returning the default NaN.
+     * But if we're not in default-NaN mode then the target must
+     * specify.
+     */
+    assert(!status->default_nan_mode);
 #if defined(TARGET_ARM)
     /* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
      * the default NaN
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
     } else {
         return 1;
     }
-#elif defined(TARGET_RISCV)
-    return 3; /* default NaN */
 #elif defined(TARGET_S390X)
     if (infzero) {
         return 3;
--
2.34.1
IEEE 754 does not define a fixed rule for what NaN to return in
the case of a fused multiply-add of inf * 0 + NaN. Different
architectures thus do different things:
 * some return the default NaN
 * some return the input NaN
 * Arm returns the default NaN if the input NaN is quiet,
   and the input NaN if it is signalling

We want to make this logic runtime-selected rather than
hardcoded into the binary, because:
 * this will let us have multiple targets in one QEMU binary
 * the Arm FEAT_AFP architectural feature includes letting
   the guest select a NaN propagation rule at runtime

In this commit we add an enum for the propagation rule, the field in
float_status, and the corresponding getters and setters. We change
pickNaNMulAdd to honour this, but because all targets still leave
this field at its default 0 value, the fallback logic will pick the
rule type with the old ifdef ladder.

Note that four architectures both use the muladd softfloat functions
and did not have a branch of the ifdef ladder to specify their
behaviour (and so were ending up with the "default" case, probably
wrongly): i386, HPPA, SH4 and Tricore. SH4 and Tricore both set
default_nan_mode, and so will never get into pickNaNMulAdd(). For
HPPA and i386 we retain the same behaviour as the old default-case,
which is to not ever return the default NaN. This might not be
correct but it is not a behaviour change.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-4-peter.maydell@linaro.org
---
 include/fpu/softfloat-helpers.h | 11 ++++
 include/fpu/softfloat-types.h   | 23 +++++++++
 fpu/softfloat-specialize.c.inc  | 91 ++++++++++++++++++++++-----------
 3 files changed, 95 insertions(+), 30 deletions(-)

diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-helpers.h
+++ b/include/fpu/softfloat-helpers.h
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
     status->float_2nan_prop_rule = rule;
 }
 
+static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
+                                             float_status *status)
+{
+    status->float_infzeronan_rule = rule;
+}
+
 static inline void set_flush_to_zero(bool val, float_status *status)
 {
     status->flush_to_zero = val;
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
     return status->float_2nan_prop_rule;
 }
 
+static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
+{
+    return status->float_infzeronan_rule;
+}
+
 static inline bool get_flush_to_zero(float_status *status)
 {
     return status->flush_to_zero;
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
     float_2nan_prop_x87,
 } Float2NaNPropRule;
 
+/*
+ * Rule for result of fused multiply-add 0 * Inf + NaN.
+ * This must be a NaN, but implementations differ on whether this
+ * is the input NaN or the default NaN.
+ *
+ * You don't need to set this if default_nan_mode is enabled.
+ * When not in default-NaN mode, it is an error for the target
+ * not to set the rule in float_status if it uses muladd, and we
+ * will assert if we need to handle an input NaN and no rule was
+ * selected.
+ */
+typedef enum __attribute__((__packed__)) {
+    /* No propagation rule specified */
+    float_infzeronan_none = 0,
+    /* Result is never the default NaN (so always the input NaN) */
+    float_infzeronan_dnan_never,
+    /* Result is always the default NaN */
+    float_infzeronan_dnan_always,
+    /* Result is the default NaN if the input NaN is quiet */
+    float_infzeronan_dnan_if_qnan,
+} FloatInfZeroNaNRule;
+
 /*
  * Floating Point Status. Individual architectures may maintain
  * several versions of float_status for different functions. The
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
     FloatRoundMode float_rounding_mode;
     FloatX80RoundPrec floatx80_rounding_precision;
     Float2NaNPropRule float_2nan_prop_rule;
+    FloatInfZeroNaNRule float_infzeronan_rule;
     bool tininess_before_rounding;
     /* should denormalised results go to zero and set the inexact flag? */
     bool flush_to_zero;
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
 static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
                          bool infzero, float_status *status)
 {
+    FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
118
+
119
/*
120
* We guarantee not to require the target to tell us how to
121
* pick a NaN if we're always returning the default NaN.
122
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
123
* specify.
124
*/
125
assert(!status->default_nan_mode);
126
+
127
+ if (rule == float_infzeronan_none) {
128
+ /*
129
+ * Temporarily fall back to ifdef ladder
130
+ */
131
#if defined(TARGET_ARM)
132
- /* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
133
- * the default NaN
134
- */
135
- if (infzero && is_qnan(c_cls)) {
136
- return 3;
137
+ /*
138
+ * For ARM, the (inf,zero,qnan) case returns the default NaN,
139
+ * but (inf,zero,snan) returns the input NaN.
140
+ */
141
+ rule = float_infzeronan_dnan_if_qnan;
142
+#elif defined(TARGET_MIPS)
143
+ if (snan_bit_is_one(status)) {
144
+ /*
145
+ * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
146
+ * case sets InvalidOp and returns the default NaN
147
+ */
148
+ rule = float_infzeronan_dnan_always;
69
+ } else {
149
+ } else {
70
+ tcg_gen_deposit_i64(tcg_res[w], tcg_res[w], tcg_ele, o, 8 << esz);
150
+ /*
151
+ * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
152
+ * case sets InvalidOp and returns the input value 'c'
153
+ */
154
+ rule = float_infzeronan_dnan_never;
155
+ }
156
+#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
157
+ defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
158
+ defined(TARGET_I386) || defined(TARGET_LOONGARCH)
159
+ /*
160
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
161
+ * case sets InvalidOp and returns the input value 'c'
162
+ */
163
+ /*
164
+ * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
165
+ * to return an input NaN if we have one (ie c) rather than generating
166
+ * a default NaN
167
+ */
168
+ rule = float_infzeronan_dnan_never;
169
+#elif defined(TARGET_S390X)
170
+ rule = float_infzeronan_dnan_always;
171
+#endif
172
}
173
174
+ if (infzero) {
175
+ /*
176
+ * Inf * 0 + NaN -- some implementations return the default NaN here,
177
+ * and some return the input NaN.
178
+ */
179
+ switch (rule) {
180
+ case float_infzeronan_dnan_never:
181
+ return 2;
182
+ case float_infzeronan_dnan_always:
183
+ return 3;
184
+ case float_infzeronan_dnan_if_qnan:
185
+ return is_qnan(c_cls) ? 3 : 2;
186
+ default:
187
+ g_assert_not_reached();
71
+ }
188
+ }
72
+ }
189
+ }
73
+
190
+
74
+ for (int i = a->q; i >= 0; --i) {
191
+#if defined(TARGET_ARM)
75
+ write_vec_element(s, tcg_res[i], a->rd, i, MO_64);
192
+
76
+ }
193
/* This looks different from the ARM ARM pseudocode, because the ARM ARM
77
+ clear_vec_high(s, a->q, a->rd);
194
* puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
78
+ return true;
195
*/
79
+}
196
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
80
+
197
}
81
+static int permute_load_uzp(int i, int part, int elements)
198
#elif defined(TARGET_MIPS)
82
+{
199
if (snan_bit_is_one(status)) {
83
+ return 2 * i + part;
200
- /*
84
+}
201
- * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
85
+
202
- * case sets InvalidOp and returns the default NaN
86
+TRANS(UZP1, do_simd_permute, a, permute_load_uzp, 0)
203
- */
87
+TRANS(UZP2, do_simd_permute, a, permute_load_uzp, 1)
204
- if (infzero) {
88
+
205
- return 3;
89
+static int permute_load_trn(int i, int part, int elements)
206
- }
90
+{
207
/* Prefer sNaN over qNaN, in the a, b, c order. */
91
+ return (i & 1) * elements + (i & ~1) + part;
208
if (is_snan(a_cls)) {
92
+}
209
return 0;
93
+
210
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
94
+TRANS(TRN1, do_simd_permute, a, permute_load_trn, 0)
211
return 2;
95
+TRANS(TRN2, do_simd_permute, a, permute_load_trn, 1)
212
}
96
+
213
} else {
97
+static int permute_load_zip(int i, int part, int elements)
214
- /*
98
+{
215
- * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
99
+ return (i & 1) * elements + ((part * elements + i) >> 1);
216
- * case sets InvalidOp and returns the input value 'c'
100
+}
217
- */
101
+
218
/* Prefer sNaN over qNaN, in the c, a, b order. */
102
+TRANS(ZIP1, do_simd_permute, a, permute_load_zip, 0)
219
if (is_snan(c_cls)) {
103
+TRANS(ZIP2, do_simd_permute, a, permute_load_zip, 1)
220
return 2;
104
+
221
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
105
/*
222
}
106
* Cryptographic AES, SHA, SHA512
223
}
107
*/
224
#elif defined(TARGET_LOONGARCH64)
108
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
225
- /*
109
}
226
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
110
}
227
- * case sets InvalidOp and returns the input value 'c'
111
112
-/* ZIP/UZP/TRN
113
- * 31 30 29 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
114
- * +---+---+-------------+------+---+------+---+------------------+------+
115
- * | 0 | Q | 0 0 1 1 1 0 | size | 0 | Rm | 0 | opc | 1 0 | Rn | Rd |
116
- * +---+---+-------------+------+---+------+---+------------------+------+
117
- */
118
-static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
119
-{
120
- int rd = extract32(insn, 0, 5);
121
- int rn = extract32(insn, 5, 5);
122
- int rm = extract32(insn, 16, 5);
123
- int size = extract32(insn, 22, 2);
124
- /* opc field bits [1:0] indicate ZIP/UZP/TRN;
125
- * bit 2 indicates 1 vs 2 variant of the insn.
126
- */
228
- */
127
- int opcode = extract32(insn, 12, 2);
128
- bool part = extract32(insn, 14, 1);
129
- bool is_q = extract32(insn, 30, 1);
130
- int esize = 8 << size;
131
- int i;
132
- int datasize = is_q ? 128 : 64;
133
- int elements = datasize / esize;
134
- TCGv_i64 tcg_res[2], tcg_ele;
135
-
229
-
136
- if (opcode == 0 || (size == 3 && !is_q)) {
230
/* Prefer sNaN over qNaN, in the c, a, b order. */
137
- unallocated_encoding(s);
231
if (is_snan(c_cls)) {
138
- return;
232
return 2;
233
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
234
return 1;
235
}
236
#elif defined(TARGET_PPC)
237
- /* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
238
- * to return an input NaN if we have one (ie c) rather than generating
239
- * a default NaN
240
- */
241
-
242
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
243
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
244
*/
245
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
246
return 1;
247
}
248
#elif defined(TARGET_S390X)
249
- if (infzero) {
250
- return 3;
139
- }
251
- }
140
-
252
-
141
- if (!fp_access_check(s)) {
253
if (is_snan(a_cls)) {
142
- return;
254
return 0;
143
- }
255
} else if (is_snan(b_cls)) {
144
-
145
- tcg_res[0] = tcg_temp_new_i64();
146
- tcg_res[1] = is_q ? tcg_temp_new_i64() : NULL;
147
- tcg_ele = tcg_temp_new_i64();
148
-
149
- for (i = 0; i < elements; i++) {
150
- int o, w;
151
-
152
- switch (opcode) {
153
- case 1: /* UZP1/2 */
154
- {
155
- int midpoint = elements / 2;
156
- if (i < midpoint) {
157
- read_vec_element(s, tcg_ele, rn, 2 * i + part, size);
158
- } else {
159
- read_vec_element(s, tcg_ele, rm,
160
- 2 * (i - midpoint) + part, size);
161
- }
162
- break;
163
- }
164
- case 2: /* TRN1/2 */
165
- if (i & 1) {
166
- read_vec_element(s, tcg_ele, rm, (i & ~1) + part, size);
167
- } else {
168
- read_vec_element(s, tcg_ele, rn, (i & ~1) + part, size);
169
- }
170
- break;
171
- case 3: /* ZIP1/2 */
172
- {
173
- int base = part * elements / 2;
174
- if (i & 1) {
175
- read_vec_element(s, tcg_ele, rm, base + (i >> 1), size);
176
- } else {
177
- read_vec_element(s, tcg_ele, rn, base + (i >> 1), size);
178
- }
179
- break;
180
- }
181
- default:
182
- g_assert_not_reached();
183
- }
184
-
185
- w = (i * esize) / 64;
186
- o = (i * esize) % 64;
187
- if (o == 0) {
188
- tcg_gen_mov_i64(tcg_res[w], tcg_ele);
189
- } else {
190
- tcg_gen_shli_i64(tcg_ele, tcg_ele, o);
191
- tcg_gen_or_i64(tcg_res[w], tcg_res[w], tcg_ele);
192
- }
193
- }
194
-
195
- for (i = 0; i <= is_q; ++i) {
196
- write_vec_element(s, tcg_res[i], rd, i, MO_64);
197
- }
198
- clear_vec_high(s, is_q, rd);
199
-}
200
-
201
/*
202
* do_reduction_op helper
203
*
204
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
205
/* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
206
{ 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
207
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
208
- { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
209
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
210
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
211
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
212
--
256
--
213
2.34.1
257
2.34.1
diff view generated by jsdifflib
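The mapping from the new FloatInfZeroNaNRule values to pickNaNMulAdd()'s return convention can be sketched in isolation like this. This is an illustrative stand-alone mock, not QEMU's code: the enum mirrors the patch above, `pick_for_infzero` is a hypothetical helper, and the return values follow pickNaNMulAdd()'s convention of 2 = operand c, 3 = default NaN.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror of the FloatInfZeroNaNRule enum added in softfloat-types.h */
typedef enum {
    float_infzeronan_none = 0,
    float_infzeronan_dnan_never,    /* always the input NaN */
    float_infzeronan_dnan_always,   /* always the default NaN */
    float_infzeronan_dnan_if_qnan,  /* default NaN only for a quiet input NaN */
} FloatInfZeroNaNRule;

/* Return values follow pickNaNMulAdd(): 2 = operand c, 3 = default NaN */
static int pick_for_infzero(FloatInfZeroNaNRule rule, bool c_is_qnan)
{
    switch (rule) {
    case float_infzeronan_dnan_never:
        return 2;
    case float_infzeronan_dnan_always:
        return 3;
    case float_infzeronan_dnan_if_qnan:
        return c_is_qnan ? 3 : 2;
    default:
        /* float_infzeronan_none: the target forgot to select a rule */
        assert(0);
        return -1;
    }
}
```

The point of the design is that this switch is data-driven: the target CPU picks a rule once when it initialises its float_status, instead of the behaviour being compiled in per-target.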
Explicitly set a rule in the softfloat tests for the inf-zero-nan
muladd special case.  In meson.build we put -DTARGET_ARM in fpcflags,
and so we should select here the Arm rule of
float_infzeronan_dnan_if_qnan.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241202131347.498124-5-peter.maydell@linaro.org
---
 tests/fp/fp-bench.c | 5 +++++
 tests/fp/fp-test.c  | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-bench.c
+++ b/tests/fp/fp-bench.c
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
 {
     bench_func_t f;
 
+    /*
+     * These implementation-defined choices for various things IEEE
+     * doesn't specify match those used by the Arm architecture.
+     */
     set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
+    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
 
     f = bench_funcs[operation][precision];
     g_assert(f);
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-test.c
+++ b/tests/fp/fp-test.c
@@ -XXX,XX +XXX,XX @@ void run_test(void)
 {
     unsigned int i;
 
+    /*
+     * These implementation-defined choices for various things IEEE
+     * doesn't specify match those used by the Arm architecture.
+     */
     set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
+    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
 
     genCases_setLevel(test_level);
     verCases_maxErrorCount = n_max_errors;
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for the Arm target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-6-peter.maydell@linaro.org
---
 target/arm/cpu.c               | 3 +++
 fpu/softfloat-specialize.c.inc | 8 +-------
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
  * * tininess-before-rounding
  * * 2-input NaN propagation prefers SNaN over QNaN, and then
  *   operand A over operand B (see FPProcessNaNs() pseudocode)
+ * * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
+ *   and the input NaN if it is signalling
  */
 static void arm_set_default_fp_behaviours(float_status *s)
 {
     set_float_detect_tininess(float_tininess_before_rounding, s);
     set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
+    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
 }
 
 static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_ARM)
-        /*
-         * For ARM, the (inf,zero,qnan) case returns the default NaN,
-         * but (inf,zero,snan) returns the input NaN.
-         */
-        rule = float_infzeronan_dnan_if_qnan;
-#elif defined(TARGET_MIPS)
+#if defined(TARGET_MIPS)
         if (snan_bit_is_one(status)) {
             /*
              * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for s390, so we
can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-7-peter.maydell@linaro.org
---
 target/s390x/cpu.c             | 2 ++
 fpu/softfloat-specialize.c.inc | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
         set_float_detect_tininess(float_tininess_before_rounding,
                                   &env->fpu_status);
         set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
+        set_float_infzeronan_rule(float_infzeronan_dnan_always,
+                                  &env->fpu_status);
         /* fall through */
     case RESET_TYPE_S390_CPU_NORMAL:
         env->psw.mask &= ~PSW_MASK_RI;
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
          * a default NaN
          */
         rule = float_infzeronan_dnan_never;
-#elif defined(TARGET_S390X)
-        rule = float_infzeronan_dnan_always;
 #endif
     }
 
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for the PPC target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-8-peter.maydell@linaro.org
---
 target/ppc/cpu_init.c          | 7 +++++++
 fpu/softfloat-specialize.c.inc | 7 +------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index XXXXXXX..XXXXXXX 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
      */
     set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
     set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
+    /*
+     * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
+     * to return an input NaN if we have one (ie c) rather than generating
+     * a default NaN
+     */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);
 
     for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
         ppc_spr_t *spr = &env->spr_cb[i];
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
          */
             rule = float_infzeronan_dnan_never;
         }
-#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
+#elif defined(TARGET_SPARC) || \
     defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
          * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
          * case sets InvalidOp and returns the input value 'c'
          */
-        /*
-         * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
-         * to return an input NaN if we have one (ie c) rather than generating
-         * a default NaN
-         */
         rule = float_infzeronan_dnan_never;
 #endif
     }
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for the MIPS target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-9-peter.maydell@linaro.org
---
 target/mips/fpu_helper.h       |  9 +++++++++
 target/mips/msa.c              |  4 ++++
 fpu/softfloat-specialize.c.inc | 16 +---------------
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/fpu_helper.h
+++ b/target/mips/fpu_helper.h
@@ -XXX,XX +XXX,XX @@ static inline void restore_flush_mode(CPUMIPSState *env)
 static inline void restore_snan_bit_mode(CPUMIPSState *env)
 {
     bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
+    FloatInfZeroNaNRule izn_rule;
 
     /*
      * With nan2008, SNaNs are silenced in the usual way.
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
      */
     set_snan_bit_is_one(!nan2008, &env->active_fpu.fp_status);
     set_default_nan_mode(!nan2008, &env->active_fpu.fp_status);
+    /*
+     * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
+     * case sets InvalidOp and returns the default NaN.
+     * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
+     * case sets InvalidOp and returns the input value 'c'.
+     */
+    izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
+    set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
 }
 
 static inline void restore_fp_status(CPUMIPSState *env)
diff --git a/target/mips/msa.c b/target/mips/msa.c
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/msa.c
+++ b/target/mips/msa.c
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
 
     /* set proper signanling bit meaning ("1" means "quiet") */
     set_snan_bit_is_one(0, &env->active_tc.msa_fp_status);
+
+    /* Inf * 0 + NaN returns the input NaN */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never,
+                              &env->active_tc.msa_fp_status);
 }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_MIPS)
-        if (snan_bit_is_one(status)) {
-            /*
-             * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
-             * case sets InvalidOp and returns the default NaN
-             */
-            rule = float_infzeronan_dnan_always;
-        } else {
-            /*
-             * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
-             * case sets InvalidOp and returns the input value 'c'
-             */
-            rule = float_infzeronan_dnan_never;
-        }
-#elif defined(TARGET_SPARC) || \
+#if defined(TARGET_SPARC) || \
     defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
--
2.34.1
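The MIPS selection above depends on the runtime nan2008 mode bit, which is why it has to be computed in restore_snan_bit_mode() rather than set once at reset. As a stand-alone sketch (not QEMU code; `mips_infzero_rule` and the two-value enum are illustrative stand-ins for the real FloatInfZeroNaNRule):

```c
#include <stdbool.h>

typedef enum {
    rule_dnan_never = 1,   /* result is the input NaN (IEEE754-2008 mode) */
    rule_dnan_always,      /* result is the default NaN (IEEE754-1985 mode) */
} InfZeroRule;

/* nan2008 set: keep the input NaN 'c'; legacy encoding: default NaN */
static InfZeroRule mips_infzero_rule(bool nan2008)
{
    return nan2008 ? rule_dnan_never : rule_dnan_always;
}
```

Because the FCR31.NAN2008 bit is guest-writable state, the rule in float_status must be refreshed whenever FCR31 changes, exactly as the patch does.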
Set the FloatInfZeroNaNRule explicitly for the SPARC target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-10-peter.maydell@linaro.org
---
 target/sparc/cpu.c             | 2 ++
 fpu/softfloat-specialize.c.inc | 3 +--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
      * the CPU state struct so it won't get zeroed on reset.
      */
     set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
+    /* For inf * 0 + NaN, return the input NaN */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
 
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_SPARC) || \
-    defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
+#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
          * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for the xtensa target,
so we can remove the ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-11-peter.maydell@linaro.org
---
 target/xtensa/cpu.c            | 2 ++
 fpu/softfloat-specialize.c.inc | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/xtensa/cpu.c
+++ b/target/xtensa/cpu.c
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
     reset_mmu(env);
     cs->halted = env->runstall;
 #endif
+    /* For inf * 0 + NaN, return the input NaN */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
     set_no_signaling_nans(!dfpu, &env->fp_status);
     xtensa_use_first_nan(env, !dfpu);
 }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
+#if defined(TARGET_HPPA) || \
     defined(TARGET_I386) || defined(TARGET_LOONGARCH)
         /*
          * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for the x86 target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-12-peter.maydell@linaro.org
---
 target/i386/tcg/fpu_helper.c   | 7 +++++++
 fpu/softfloat-specialize.c.inc | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
      */
     set_float_2nan_prop_rule(float_2nan_prop_x87, &env->mmx_status);
     set_float_2nan_prop_rule(float_2nan_prop_x87, &env->sse_status);
+    /*
+     * Only SSE has multiply-add instructions. In the SDM Section 14.5.2
+     * "Fused-Multiply-ADD (FMA) Numeric Behavior" the NaN handling is
+     * specified -- for 0 * inf + NaN the input NaN is selected, and if
+     * there are multiple input NaNs they are selected in the order a, b, c.
+     */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
 }
 
 static inline uint8_t save_exception_flags(CPUX86State *env)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
          * Temporarily fall back to ifdef ladder
          */
 #if defined(TARGET_HPPA) || \
-    defined(TARGET_I386) || defined(TARGET_LOONGARCH)
+    defined(TARGET_LOONGARCH)
         /*
          * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
          * case sets InvalidOp and returns the input value 'c'
--
2.34.1
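The SDM behaviour quoted in the comment above — with multiple NaN inputs to an SSE FMA, the result NaN is taken from the operands in the order a, b, c — can be sketched as a plain selection function. This is an illustrative stand-in (`sse_fma_nan_choice` is hypothetical, and the quieting of signalling NaNs that real hardware performs is omitted):

```c
#include <math.h>

/* Pick the result NaN for an SSE FMA in operand order a, b, c.
 * Caller is expected to invoke this only when at least one input
 * is a NaN; otherwise it simply falls through to c. */
static double sse_fma_nan_choice(double a, double b, double c)
{
    if (isnan(a)) {
        return a;
    }
    if (isnan(b)) {
        return b;
    }
    return c;
}
```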
Set the FloatInfZeroNaNRule explicitly for the loongarch target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-13-peter.maydell@linaro.org
---
 target/loongarch/tcg/fpu_helper.c | 5 +++++
 fpu/softfloat-specialize.c.inc    | 7 +------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/loongarch/tcg/fpu_helper.c
+++ b/target/loongarch/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
                                 &env->fp_status);
     set_flush_to_zero(0, &env->fp_status);
     set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
+    /*
+     * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
+     * case sets InvalidOp and returns the input value 'c'
+     */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
 }
 
 int ieee_ex_to_loongarch(int xcpt)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         /*
          * Temporarily fall back to ifdef ladder
          */
-#if defined(TARGET_HPPA) || \
-    defined(TARGET_LOONGARCH)
-    /*
-     * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
-     * case sets InvalidOp and returns the input value 'c'
-     */
+#if defined(TARGET_HPPA)
         rule = float_infzeronan_dnan_never;
 #endif
     }
--
2.34.1
Set the FloatInfZeroNaNRule explicitly for the HPPA target,
so we can remove the ifdef from pickNaNMulAdd().

As this is the last target to be converted to explicitly setting
the rule, we can remove the fallback code in pickNaNMulAdd()
entirely.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-14-peter.maydell@linaro.org
---
 target/hppa/fpu_helper.c       |  2 ++
 fpu/softfloat-specialize.c.inc | 13 +------------
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/hppa/fpu_helper.c
+++ b/target/hppa/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
      * HPPA does note implement a CPU reset method at all...
      */
     set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
+    /* For inf * 0 + NaN, return the input NaN */
+    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
 }
 
 void cpu_hppa_loaded_fr0(CPUHPPAState *env)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
 static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
                          bool infzero, float_status *status)
 {
-    FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
-
     /*
      * We guarantee not to require the target to tell us how to
      * pick a NaN if we're always returning the default NaN.
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
      */
     assert(!status->default_nan_mode);
 
-    if (rule == float_infzeronan_none) {
-        /*
-         * Temporarily fall back to ifdef ladder
-         */
-#if defined(TARGET_HPPA)
-        rule = float_infzeronan_dnan_never;
-#endif
-    }
-
     if (infzero) {
         /*
          * Inf * 0 + NaN -- some implementations return the default NaN here,
          * and some return the input NaN.
          */
-        switch (rule) {
+        switch (status->float_infzeronan_rule) {
         case float_infzeronan_dnan_never:
             return 2;
         case float_infzeronan_dnan_always:
--
2.34.1
The new implementation of pickNaNMulAdd() will find it convenient
to know whether at least one of the three arguments to the muladd
was a signaling NaN. We already calculate that in the caller,
so pass it in as a new bool have_snan.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-15-peter.maydell@linaro.org
---
 fpu/softfloat-parts.c.inc      | 5 +++--
 fpu/softfloat-specialize.c.inc | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
 {
    int which;
    bool infzero = (ab_mask == float_cmask_infzero);
+    bool have_snan = (abc_mask & float_cmask_snan);

-    if (unlikely(abc_mask & float_cmask_snan)) {
+    if (unlikely(have_snan)) {
        float_raise(float_flag_invalid | float_flag_invalid_snan, s);
    }

@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
    if (s->default_nan_mode) {
        which = 3;
    } else {
-        which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
+        which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
    }

    if (which == 3) {
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
 | Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
 *----------------------------------------------------------------------------*/
 static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
-                         bool infzero, float_status *status)
+                         bool infzero, bool have_snan, float_status *status)
 {
    /*
     * We guarantee not to require the target to tell us how to
--
2.34.1
IEEE 754 does not define a fixed rule for which NaN to pick as the
result if two or more operands of a 3-operand fused multiply-add
operation are NaNs. As a result different architectures have ended up
with different rules for propagating NaNs.

QEMU currently hardcodes the NaN propagation logic into the binary
because pickNaNMulAdd() has an ifdef ladder for different targets.
We want to make the propagation rule instead be selectable at
runtime, because:
 * this will let us have multiple targets in one QEMU binary
 * the Arm FEAT_AFP architectural feature includes letting
   the guest select a NaN propagation rule at runtime

In this commit we add an enum for the propagation rule, the field in
float_status, and the corresponding getters and setters. We change
pickNaNMulAdd to honour this, but because all targets still leave
this field at its default 0 value, the fallback logic will pick the
rule type with the old ifdef ladder.

It's valid not to set a propagation rule if default_nan_mode is
enabled, because in that case there's no need to pick a NaN; all the
callers of pickNaNMulAdd() catch this case and skip calling it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-16-peter.maydell@linaro.org
---
 include/fpu/softfloat-helpers.h |  11 +++
 include/fpu/softfloat-types.h   |  55 +++++++++++
 fpu/softfloat-specialize.c.inc  | 167 ++++++++------------------------
 3 files changed, 107 insertions(+), 126 deletions(-)

diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-helpers.h
+++ b/include/fpu/softfloat-helpers.h
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
    status->float_2nan_prop_rule = rule;
 }

+static inline void set_float_3nan_prop_rule(Float3NaNPropRule rule,
+                                            float_status *status)
+{
+    status->float_3nan_prop_rule = rule;
+}
+
 static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
                                              float_status *status)
 {
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
    return status->float_2nan_prop_rule;
 }

+static inline Float3NaNPropRule get_float_3nan_prop_rule(float_status *status)
+{
+    return status->float_3nan_prop_rule;
+}
+
 static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
 {
    return status->float_infzeronan_rule;
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -XXX,XX +XXX,XX @@ this code that are retained.
 #ifndef SOFTFLOAT_TYPES_H
 #define SOFTFLOAT_TYPES_H

+#include "hw/registerfields.h"
+
 /*
  * Software IEC/IEEE floating-point types.
  */
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
    float_2nan_prop_x87,
 } Float2NaNPropRule;

+/*
+ * 3-input NaN propagation rule, for fused multiply-add. Individual
+ * architectures have different rules for which input NaN is
+ * propagated to the output when there is more than one NaN on the
+ * input.
+ *
+ * If default_nan_mode is enabled then it is valid not to set a NaN
+ * propagation rule, because the softfloat code guarantees not to try
+ * to pick a NaN to propagate in default NaN mode. When not in
+ * default-NaN mode, it is an error for the target not to set the rule
+ * in float_status if it uses a muladd, and we will assert if we need
+ * to handle an input NaN and no rule was selected.
+ *
+ * The naming scheme for Float3NaNPropRule values is:
+ *  float_3nan_prop_s_abc:
+ *    = "Prefer SNaN over QNaN, then operand A over B over C"
+ *  float_3nan_prop_abc:
+ *    = "Prefer A over B over C regardless of SNaN vs QNAN"
+ *
+ * For QEMU, the multiply-add operation is A * B + C.
+ */
+
+/*
+ * We set the Float3NaNPropRule enum values up so we can select the
+ * right value in pickNaNMulAdd in a data driven way.
+ */
+FIELD(3NAN, 1ST, 0, 2)   /* which operand is most preferred ? */
+FIELD(3NAN, 2ND, 2, 2)   /* which operand is next most preferred ? */
+FIELD(3NAN, 3RD, 4, 2)   /* which operand is least preferred ? */
+FIELD(3NAN, SNAN, 6, 1)  /* do we prefer SNaN over QNaN ? */
+
+#define PROPRULE(X, Y, Z) \
+    ((X << R_3NAN_1ST_SHIFT) | (Y << R_3NAN_2ND_SHIFT) | (Z << R_3NAN_3RD_SHIFT))
+
+typedef enum __attribute__((__packed__)) {
+    float_3nan_prop_none = 0,     /* No propagation rule specified */
+    float_3nan_prop_abc = PROPRULE(0, 1, 2),
+    float_3nan_prop_acb = PROPRULE(0, 2, 1),
+    float_3nan_prop_bac = PROPRULE(1, 0, 2),
+    float_3nan_prop_bca = PROPRULE(1, 2, 0),
+    float_3nan_prop_cab = PROPRULE(2, 0, 1),
+    float_3nan_prop_cba = PROPRULE(2, 1, 0),
+    float_3nan_prop_s_abc = float_3nan_prop_abc | R_3NAN_SNAN_MASK,
+    float_3nan_prop_s_acb = float_3nan_prop_acb | R_3NAN_SNAN_MASK,
+    float_3nan_prop_s_bac = float_3nan_prop_bac | R_3NAN_SNAN_MASK,
+    float_3nan_prop_s_bca = float_3nan_prop_bca | R_3NAN_SNAN_MASK,
+    float_3nan_prop_s_cab = float_3nan_prop_cab | R_3NAN_SNAN_MASK,
+    float_3nan_prop_s_cba = float_3nan_prop_cba | R_3NAN_SNAN_MASK,
+} Float3NaNPropRule;
+
+#undef PROPRULE
+
 /*
  * Rule for result of fused multiply-add 0 * Inf + NaN.
  * This must be a NaN, but implementations differ on whether this
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
    FloatRoundMode float_rounding_mode;
    FloatX80RoundPrec floatx80_rounding_precision;
    Float2NaNPropRule float_2nan_prop_rule;
+    Float3NaNPropRule float_3nan_prop_rule;
    FloatInfZeroNaNRule float_infzeronan_rule;
    bool tininess_before_rounding;
    /* should denormalised results go to zero and set the inexact flag? */
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
 static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
                          bool infzero, bool have_snan, float_status *status)
 {
+    FloatClass cls[3] = { a_cls, b_cls, c_cls };
+    Float3NaNPropRule rule = status->float_3nan_prop_rule;
+    int which;
+
    /*
     * We guarantee not to require the target to tell us how to
     * pick a NaN if we're always returning the default NaN.
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
        }
    }

+    if (rule == float_3nan_prop_none) {
 #if defined(TARGET_ARM)
-
-    /* This looks different from the ARM ARM pseudocode, because the ARM ARM
-     * puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
-     */
-    if (is_snan(c_cls)) {
-        return 2;
-    } else if (is_snan(a_cls)) {
-        return 0;
-    } else if (is_snan(b_cls)) {
-        return 1;
-    } else if (is_qnan(c_cls)) {
-        return 2;
-    } else if (is_qnan(a_cls)) {
-        return 0;
-    } else {
-        return 1;
-    }
+        /*
+         * This looks different from the ARM ARM pseudocode, because the ARM ARM
+         * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
+         */
+        rule = float_3nan_prop_s_cab;
 #elif defined(TARGET_MIPS)
-    if (snan_bit_is_one(status)) {
-        /* Prefer sNaN over qNaN, in the a, b, c order. */
-        if (is_snan(a_cls)) {
-            return 0;
-        } else if (is_snan(b_cls)) {
-            return 1;
-        } else if (is_snan(c_cls)) {
-            return 2;
-        } else if (is_qnan(a_cls)) {
-            return 0;
-        } else if (is_qnan(b_cls)) {
-            return 1;
+        if (snan_bit_is_one(status)) {
+            rule = float_3nan_prop_s_abc;
        } else {
-            return 2;
+            rule = float_3nan_prop_s_cab;
        }
-    } else {
-        /* Prefer sNaN over qNaN, in the c, a, b order. */
-        if (is_snan(c_cls)) {
-            return 2;
-        } else if (is_snan(a_cls)) {
-            return 0;
-        } else if (is_snan(b_cls)) {
-            return 1;
-        } else if (is_qnan(c_cls)) {
-            return 2;
-        } else if (is_qnan(a_cls)) {
-            return 0;
-        } else {
-            return 1;
-        }
-    }
 #elif defined(TARGET_LOONGARCH64)
-    /* Prefer sNaN over qNaN, in the c, a, b order. */
-    if (is_snan(c_cls)) {
-        return 2;
-    } else if (is_snan(a_cls)) {
-        return 0;
-    } else if (is_snan(b_cls)) {
-        return 1;
-    } else if (is_qnan(c_cls)) {
-        return 2;
-    } else if (is_qnan(a_cls)) {
-        return 0;
-    } else {
-        return 1;
-    }
+        rule = float_3nan_prop_s_cab;
 #elif defined(TARGET_PPC)
-    /* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
-     * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB
-     */
-    if (is_nan(a_cls)) {
-        return 0;
-    } else if (is_nan(c_cls)) {
-        return 2;
-    } else {
-        return 1;
-    }
+        /*
+         * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
+         * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB
+         */
+        rule = float_3nan_prop_acb;
 #elif defined(TARGET_S390X)
-    if (is_snan(a_cls)) {
-        return 0;
-    } else if (is_snan(b_cls)) {
-        return 1;
-    } else if (is_snan(c_cls)) {
-        return 2;
-    } else if (is_qnan(a_cls)) {
-        return 0;
-    } else if (is_qnan(b_cls)) {
-        return 1;
-    } else {
-        return 2;
-    }
+        rule = float_3nan_prop_s_abc;
 #elif defined(TARGET_SPARC)
-    /* Prefer SNaN over QNaN, order C, B, A. */
-    if (is_snan(c_cls)) {
-        return 2;
-    } else if (is_snan(b_cls)) {
-        return 1;
-    } else if (is_snan(a_cls)) {
-        return 0;
-    } else if (is_qnan(c_cls)) {
-        return 2;
-    } else if (is_qnan(b_cls)) {
-        return 1;
-    } else {
-        return 0;
-    }
+        rule = float_3nan_prop_s_cba;
 #elif defined(TARGET_XTENSA)
-    /*
-     * For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
-     * an input NaN if we have one (ie c).
-     */
-    if (status->use_first_nan) {
-        if (is_nan(a_cls)) {
-            return 0;
-        } else if (is_nan(b_cls)) {
-            return 1;
+        if (status->use_first_nan) {
+            rule = float_3nan_prop_abc;
        } else {
-            return 2;
+            rule = float_3nan_prop_cba;
        }
-    } else {
-        if (is_nan(c_cls)) {
-            return 2;
-        } else if (is_nan(b_cls)) {
-            return 1;
-        } else {
-            return 0;
-        }
-    }
 #else
-    /* A default implementation: prefer a to b to c.
-     * This is unlikely to actually match any real implementation.
-     */
-    if (is_nan(a_cls)) {
-        return 0;
-    } else if (is_nan(b_cls)) {
-        return 1;
-    } else {
-        return 2;
-    }
+        rule = float_3nan_prop_abc;
 #endif
+    }
+
+    assert(rule != float_3nan_prop_none);
+    if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
+        /* We have at least one SNaN input and should prefer it */
+        do {
+            which = rule & R_3NAN_1ST_MASK;
+            rule >>= R_3NAN_1ST_LENGTH;
+        } while (!is_snan(cls[which]));
+    } else {
+        do {
+            which = rule & R_3NAN_1ST_MASK;
+            rule >>= R_3NAN_1ST_LENGTH;
+        } while (!is_nan(cls[which]));
+    }
+    return which;
 }

 /*----------------------------------------------------------------------------
--
2.34.1
Explicitly set a rule in the softfloat tests for propagating NaNs in
the muladd case. In meson.build we put -DTARGET_ARM in fpcflags, and
so we should select here the Arm rule of float_3nan_prop_s_cab.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-17-peter.maydell@linaro.org
---
 tests/fp/fp-bench.c | 1 +
 tests/fp/fp-test.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-bench.c
+++ b/tests/fp/fp-bench.c
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
     * doesn't specify match those used by the Arm architecture.
     */
    set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
+    set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);

    f = bench_funcs[operation][precision];
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-test.c
+++ b/tests/fp/fp-test.c
@@ -XXX,XX +XXX,XX @@ void run_test(void)
     * doesn't specify match those used by the Arm architecture.
     */
    set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
+    set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);

    genCases_setLevel(test_level);
--
2.34.1
Set the Float3NaNPropRule explicitly for Arm, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-18-peter.maydell@linaro.org
---
 target/arm/cpu.c               | 5 +++++
 fpu/softfloat-specialize.c.inc | 8 +-------
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
  * * tininess-before-rounding
  * * 2-input NaN propagation prefers SNaN over QNaN, and then
  *   operand A over operand B (see FPProcessNaNs() pseudocode)
+ * * 3-input NaN propagation prefers SNaN over QNaN, and then
+ *   operand C over A over B (see FPProcessNaNs3() pseudocode,
+ *   but note that for QEMU muladd is a * b + c, whereas for
+ *   the pseudocode function the arguments are in the order c, a, b)
  * * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
  *   and the input NaN if it is signalling
  */
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
 {
    set_float_detect_tininess(float_tininess_before_rounding, s);
    set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
+    set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
 }

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
    }

    if (rule == float_3nan_prop_none) {
-#if defined(TARGET_ARM)
-        /*
-         * This looks different from the ARM ARM pseudocode, because the ARM ARM
-         * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
-         */
-        rule = float_3nan_prop_s_cab;
-#elif defined(TARGET_MIPS)
+#if defined(TARGET_MIPS)
        if (snan_bit_is_one(status)) {
            rule = float_3nan_prop_s_abc;
        } else {
--
2.34.1
Set the Float3NaNPropRule explicitly for loongarch, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-19-peter.maydell@linaro.org
---
 target/loongarch/tcg/fpu_helper.c | 1 +
 fpu/softfloat-specialize.c.inc    | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/loongarch/tcg/fpu_helper.c
+++ b/target/loongarch/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
     * case sets InvalidOp and returns the input value 'c'
     */
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
+    set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
 }

 int ieee_ex_to_loongarch(int xcpt)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
        } else {
            rule = float_3nan_prop_s_cab;
        }
-#elif defined(TARGET_LOONGARCH64)
-        rule = float_3nan_prop_s_cab;
 #elif defined(TARGET_PPC)
        /*
         * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
--
2.34.1
Set the Float3NaNPropRule explicitly for PPC, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-20-peter.maydell@linaro.org
---
 target/ppc/cpu_init.c          | 8 ++++++++
 fpu/softfloat-specialize.c.inc | 6 ------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index XXXXXXX..XXXXXXX 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
     */
    set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
    set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
+    /*
+     * NaN propagation for fused multiply-add:
+     * if fRA is a NaN return it; otherwise if fRB is a NaN return it;
+     * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB,
+     * whereas QEMU labels the operands as (a * b) + c.
+     */
+    set_float_3nan_prop_rule(float_3nan_prop_acb, &env->fp_status);
+    set_float_3nan_prop_rule(float_3nan_prop_acb, &env->vec_status);
    /*
     * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
     * to return an input NaN if we have one (ie c) rather than generating
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
        } else {
            rule = float_3nan_prop_s_cab;
        }
-#elif defined(TARGET_PPC)
-        /*
-         * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
-         * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + fRB
-         */
-        rule = float_3nan_prop_acb;
 #elif defined(TARGET_S390X)
        rule = float_3nan_prop_s_abc;
 #elif defined(TARGET_SPARC)
--
2.34.1
Set the Float3NaNPropRule explicitly for s390x, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-21-peter.maydell@linaro.org
---
 target/s390x/cpu.c             | 1 +
 fpu/softfloat-specialize.c.inc | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
        set_float_detect_tininess(float_tininess_before_rounding,
                                  &env->fpu_status);
        set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
+        set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
        set_float_infzeronan_rule(float_infzeronan_dnan_always,
                                  &env->fpu_status);
        /* fall through */
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
        } else {
            rule = float_3nan_prop_s_cab;
        }
-#elif defined(TARGET_S390X)
-        rule = float_3nan_prop_s_abc;
 #elif defined(TARGET_SPARC)
        rule = float_3nan_prop_s_cba;
 #elif defined(TARGET_XTENSA)
--
2.34.1
Set the Float3NaNPropRule explicitly for SPARC, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-22-peter.maydell@linaro.org
---
 target/sparc/cpu.c             | 2 ++
 fpu/softfloat-specialize.c.inc | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
     * the CPU state struct so it won't get zeroed on reset.
     */
    set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
+    /* For fused-multiply add, prefer SNaN over QNaN, then C->B->A */
+    set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
    /* For inf * 0 + NaN, return the input NaN */
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
        } else {
            rule = float_3nan_prop_s_cab;
        }
-#elif defined(TARGET_SPARC)
-        rule = float_3nan_prop_s_cba;
 #elif defined(TARGET_XTENSA)
        if (status->use_first_nan) {
            rule = float_3nan_prop_abc;
--
2.34.1
Set the Float3NaNPropRule explicitly for MIPS, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-23-peter.maydell@linaro.org
---
 target/mips/fpu_helper.h       | 4 ++++
 target/mips/msa.c              | 3 +++
 fpu/softfloat-specialize.c.inc | 8 +-------
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/fpu_helper.h
+++ b/target/mips/fpu_helper.h
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
 {
    bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
    FloatInfZeroNaNRule izn_rule;
+    Float3NaNPropRule nan3_rule;

    /*
     * With nan2008, SNaNs are silenced in the usual way.
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
     */
    izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
    set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
+    nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
+    set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
+
 }

 static inline void restore_fp_status(CPUMIPSState *env)
diff --git a/target/mips/msa.c b/target/mips/msa.c
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/msa.c
+++ b/target/mips/msa.c
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
    set_float_2nan_prop_rule(float_2nan_prop_s_ab,
                             &env->active_tc.msa_fp_status);

+    set_float_3nan_prop_rule(float_3nan_prop_s_cab,
+                             &env->active_tc.msa_fp_status);
+
    /* clear float_status exception flags */
    set_float_exception_flags(0, &env->active_tc.msa_fp_status);

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
    }

    if (rule == float_3nan_prop_none) {
-#if defined(TARGET_MIPS)
-        if (snan_bit_is_one(status)) {
-            rule = float_3nan_prop_s_abc;
-        } else {
-            rule = float_3nan_prop_s_cab;
62
- }
63
-#elif defined(TARGET_XTENSA)
64
+#if defined(TARGET_XTENSA)
65
if (status->use_first_nan) {
66
rule = float_3nan_prop_abc;
67
} else {
54
--
68
--
55
2.34.1
69
2.34.1
diff view generated by jsdifflib
Set the Float3NaNPropRule explicitly for xtensa, and remove the
ifdef from pickNaNMulAdd().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-24-peter.maydell@linaro.org
---
 target/xtensa/fpu_helper.c     | 2 ++
 fpu/softfloat-specialize.c.inc | 8 --------
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/xtensa/fpu_helper.c
+++ b/target/xtensa/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
     set_use_first_nan(use_first, &env->fp_status);
     set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
                              &env->fp_status);
+    set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
+                             &env->fp_status);
 }

 void HELPER(wur_fpu2k_fcr)(CPUXtensaState *env, uint32_t v)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
     }

     if (rule == float_3nan_prop_none) {
-#if defined(TARGET_XTENSA)
-        if (status->use_first_nan) {
-            rule = float_3nan_prop_abc;
-        } else {
-            rule = float_3nan_prop_cba;
-        }
-#else
         rule = float_3nan_prop_abc;
-#endif
     }

     assert(rule != float_3nan_prop_none);
--
2.34.1
New patch
Set the Float3NaNPropRule explicitly for i386.  We had no
i386-specific behaviour in the old ifdef ladder, so we were using the
default "prefer a then b then c" fallback; this is actually the
correct per-the-spec handling for i386.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-25-peter.maydell@linaro.org
---
 target/i386/tcg/fpu_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
      * there are multiple input NaNs they are selected in the order a, b, c.
      */
     set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
+    set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
 }

 static inline uint8_t save_exception_flags(CPUX86State *env)
--
2.34.1
New patch
Set the Float3NaNPropRule explicitly for HPPA, and remove the
ifdef from pickNaNMulAdd().

HPPA is the only target that was using the default branch of the
ifdef ladder (other targets either do not use muladd or set
default_nan_mode), so we can remove the ifdef fallback entirely now
(allowing the "rule not set" case to fall into the default of the
switch statement and assert).

We add a TODO note that the HPPA rule is probably wrong; this is
not a behavioural change for this refactoring.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-26-peter.maydell@linaro.org
---
 target/hppa/fpu_helper.c       | 8 ++++++++
 fpu/softfloat-specialize.c.inc | 4 ----
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/hppa/fpu_helper.c
+++ b/target/hppa/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
      * HPPA does note implement a CPU reset method at all...
      */
     set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
+    /*
+     * TODO: The HPPA architecture reference only documents its NaN
+     * propagation rule for 2-operand operations.  Testing on real hardware
+     * might be necessary to confirm whether this order for muladd is correct.
+     * Not preferring the SNaN is almost certainly incorrect as it diverges
+     * from the documented rules for 2-operand operations.
+     */
+    set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
     /* For inf * 0 + NaN, return the input NaN */
     set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
 }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
         }
     }

-    if (rule == float_3nan_prop_none) {
-        rule = float_3nan_prop_abc;
-    }
-
     assert(rule != float_3nan_prop_none);
     if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
         /* We have at least one SNaN input and should prefer it */
--
2.34.1
The use_first_nan field in float_status was an xtensa-specific way to
select at runtime from two different NaN propagation rules.  Now that
xtensa is using the target-agnostic NaN propagation rule selection
that we've just added, we can remove use_first_nan, because there is
no longer any code that reads it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-27-peter.maydell@linaro.org
---
 include/fpu/softfloat-helpers.h | 5 -----
 include/fpu/softfloat-types.h   | 1 -
 target/xtensa/fpu_helper.c      | 1 -
 3 files changed, 7 deletions(-)

diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-helpers.h
+++ b/include/fpu/softfloat-helpers.h
@@ -XXX,XX +XXX,XX @@ static inline void set_snan_bit_is_one(bool val, float_status *status)
     status->snan_bit_is_one = val;
 }

-static inline void set_use_first_nan(bool val, float_status *status)
-{
-    status->use_first_nan = val;
-}
-
 static inline void set_no_signaling_nans(bool val, float_status *status)
 {
     status->no_signaling_nans = val;
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
      * softfloat-specialize.inc.c)
      */
     bool snan_bit_is_one;
-    bool use_first_nan;
     bool no_signaling_nans;
     /* should overflowed results subtract re_bias to its exponent? */
     bool rebias_overflow;
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/xtensa/fpu_helper.c
+++ b/target/xtensa/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ static const struct {

 void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
 {
-    set_use_first_nan(use_first, &env->fp_status);
     set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
                              &env->fp_status);
     set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
--
2.34.1
New patch
Currently m68k_cpu_reset_hold() calls floatx80_default_nan(NULL)
to get the NaN bit pattern to reset the FPU registers.  This
works because it happens that our implementation of
floatx80_default_nan() doesn't actually look at the float_status
pointer except for TARGET_MIPS.  However, this isn't guaranteed,
and to be able to remove the ifdef in floatx80_default_nan()
we're going to need a real float_status here.

Rearrange m68k_cpu_reset_hold() so that we initialize env->fp_status
earlier, and thus can pass it to floatx80_default_nan().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-28-peter.maydell@linaro.org
---
 target/m68k/cpu.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
     CPUState *cs = CPU(obj);
     M68kCPUClass *mcc = M68K_CPU_GET_CLASS(obj);
     CPUM68KState *env = cpu_env(cs);
-    floatx80 nan = floatx80_default_nan(NULL);
+    floatx80 nan;
     int i;

     if (mcc->parent_phases.hold) {
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
 #else
     cpu_m68k_set_sr(env, SR_S | SR_I);
 #endif
-    for (i = 0; i < 8; i++) {
-        env->fregs[i].d = nan;
-    }
-    cpu_m68k_set_fpcr(env, 0);
     /*
      * M68000 FAMILY PROGRAMMER'S REFERENCE MANUAL
      * 3.4 FLOATING-POINT INSTRUCTION DETAILS
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
      * preceding paragraph for nonsignaling NaNs.
      */
     set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
+
+    nan = floatx80_default_nan(&env->fp_status);
+    for (i = 0; i < 8; i++) {
+        env->fregs[i].d = nan;
+    }
+    cpu_m68k_set_fpcr(env, 0);
     env->fpsr = 0;

     /* TODO: We should set PC from the interrupt vector. */
--
2.34.1
New patch
We create our 128-bit default NaN by calling parts64_default_nan()
and then adjusting the result.  We can do the same trick for creating
the floatx80 default NaN, which lets us drop a target ifdef.

floatx80 is used only by:
 i386
 m68k
 arm nwfpe old floating-point emulation support
  (which is essentially dead, especially the parts involving floatx80)
 PPC (only in the xsrqpxp instruction, which just rounds an input
  value by converting to floatx80 and back, so will never generate
  the default NaN)

The floatx80 default NaN as currently implemented is:
 m68k: sign = 0, exp = 1...1, int = 1, frac = 1....1
 i386: sign = 1, exp = 1...1, int = 1, frac = 10...0

These are the same as the parts64_default_nan for these architectures.

This is technically a possible behaviour change for arm linux-user
nwfpe emulation, because the default NaN will now have the
sign bit clear.  But we were already generating a different floatx80
default NaN from the real kernel emulation we are supposedly
following, which appears to use an all-bits-1 value:
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L267

This won't affect the only "real" use of the nwfpe emulation, which
is ancient binaries that used it as part of the old floating point
calling convention; that only uses loads and stores of 32 and 64 bit
floats, not any of the floatx80 behaviour the original hardware had.
We also get the nwfpe float64 default NaN value wrong:
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L166
so if we ever cared about this obscure corner the right fix would be
to correct that so nwfpe used its own default-NaN setting rather
than the Arm VFP one.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-29-peter.maydell@linaro.org
---
 fpu/softfloat-specialize.c.inc | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts128_silence_nan(FloatParts128 *p, float_status *status)
 floatx80 floatx80_default_nan(float_status *status)
 {
     floatx80 r;
+    /*
+     * Extrapolate from the choices made by parts64_default_nan to fill
+     * in the floatx80 format.  We assume that floatx80's explicit
+     * integer bit is always set (this is true for i386 and m68k,
+     * which are the only real users of this format).
+     */
+    FloatParts64 p64;
+    parts64_default_nan(&p64, status);

-    /* None of the targets that have snan_bit_is_one use floatx80. */
-    assert(!snan_bit_is_one(status));
-#if defined(TARGET_M68K)
-    r.low = UINT64_C(0xFFFFFFFFFFFFFFFF);
-    r.high = 0x7FFF;
-#else
-    /* X86 */
-    r.low = UINT64_C(0xC000000000000000);
-    r.high = 0xFFFF;
-#endif
+    r.high = 0x7FFF | (p64.sign << 15);
+    r.low = (1ULL << DECOMPOSED_BINARY_POINT) | p64.frac;
     return r;
 }

--
2.34.1
New patch
In target/loongarch's helper_fclass_s() and helper_fclass_d() we pass
a zero-initialized float_status struct to float32_is_quiet_nan() and
float64_is_quiet_nan(), with the cryptic comment "for
snan_bit_is_one".

This pattern appears to have been copied from target/riscv, where it
is used because the functions there do not have ready access to the
CPU state struct.  The comment presumably refers to the fact that the
main reason the is_quiet_nan() functions want the float_status is
because they want to know about the snan_bit_is_one config.

In the loongarch helpers, though, we have the CPU state struct
to hand.  Use the usual env->fp_status here.  This avoids our needing
to track that we need to update the initializer of the local
float_status structs when the core softfloat code adds new
options for targets to configure their behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-30-peter.maydell@linaro.org
---
 target/loongarch/tcg/fpu_helper.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/loongarch/tcg/fpu_helper.c
+++ b/target/loongarch/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_s(CPULoongArchState *env, uint64_t fj)
     } else if (float32_is_zero_or_denormal(f)) {
         return sign ? 1 << 4 : 1 << 8;
     } else if (float32_is_any_nan(f)) {
-        float_status s = { }; /* for snan_bit_is_one */
-        return float32_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
+        return float32_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
     } else {
         return sign ? 1 << 3 : 1 << 7;
     }
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_d(CPULoongArchState *env, uint64_t fj)
     } else if (float64_is_zero_or_denormal(f)) {
         return sign ? 1 << 4 : 1 << 8;
     } else if (float64_is_any_nan(f)) {
-        float_status s = { }; /* for snan_bit_is_one */
-        return float64_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
+        return float64_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
     } else {
         return sign ? 1 << 3 : 1 << 7;
     }
--
2.34.1
New patch
In the frem helper, we have a local float_status because we want to
execute the floatx80_div() with a custom rounding mode.  Instead of
zero-initializing the local float_status and then having to set it up
with the m68k standard behaviour (including the NaN propagation rule
and copying the rounding precision from env->fp_status), initialize
it as a complete copy of env->fp_status.  This will avoid our having
to add new code in this function for every new config knob we add
to fp_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-31-peter.maydell@linaro.org
---
 target/m68k/fpu_helper.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(frem)(CPUM68KState *env, FPReg *res, FPReg *val0, FPReg *val1)

     fp_rem = floatx80_rem(val1->d, val0->d, &env->fp_status);
     if (!floatx80_is_any_nan(fp_rem)) {
-        float_status fp_status = { };
+        /* Use local temporary fp_status to set different rounding mode */
+        float_status fp_status = env->fp_status;
         uint32_t quotient;
         int sign;

         /* Calculate quotient directly using round to nearest mode */
-        set_float_2nan_prop_rule(float_2nan_prop_ab, &fp_status);
         set_float_rounding_mode(float_round_nearest_even, &fp_status);
-        set_floatx80_rounding_precision(
-            get_floatx80_rounding_precision(&env->fp_status), &fp_status);
         fp_quot.d = floatx80_div(val1->d, val0->d, &fp_status);

         sign = extractFloatx80Sign(fp_quot.d);
--
2.34.1
In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we do the conversion
from float64 to floatx80 using a scratch float_status, because we
don't want the conversion to affect the CPU's floating point exception
status. Currently we use a zero-initialized float_status. This will
get steadily more awkward as we add config knobs to float_status
that the target must initialize. Avoid having to add any of that
configuration here by instead initializing our local float_status
from the env->fp_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-32-peter.maydell@linaro.org
---
 target/m68k/helper.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/m68k/helper.c b/target/m68k/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/m68k/helper.c
+++ b/target/m68k/helper.c
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_get_reg(CPUState *cs, GByteArray *mem_buf, int n)
     CPUM68KState *env = &cpu->env;
 
     if (n < 8) {
-        float_status s = {};
+        /* Use scratch float_status so any exceptions don't change CPU state */
+        float_status s = env->fp_status;
         return gdb_get_reg64(mem_buf, floatx80_to_float64(env->fregs[n].d, &s));
     }
     switch (n) {
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_set_reg(CPUState *cs, uint8_t *mem_buf, int n)
     CPUM68KState *env = &cpu->env;
 
     if (n < 8) {
-        float_status s = {};
+        /* Use scratch float_status so any exceptions don't change CPU state */
+        float_status s = env->fp_status;
         env->fregs[n].d = float64_to_floatx80(ldq_be_p(mem_buf), &s);
         return 8;
     }
--
2.34.1
In the helper functions flcmps and flcmpd we use a scratch float_status
so that we don't change the CPU state if the comparison raises any
floating point exception flags. Instead of zero-initializing this
scratch float_status, initialize it as a copy of env->fp_status. This
avoids the need to explicitly initialize settings like the NaN
propagation rule or others we might add to softfloat in future.

To do this we need to pass the CPU env pointer in to the helper.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-33-peter.maydell@linaro.org
---
 target/sparc/helper.h     | 4 ++--
 target/sparc/fop_helper.c | 8 ++++----
 target/sparc/translate.c  | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(fcmpd, TCG_CALL_NO_WG, i32, env, f64, f64)
 DEF_HELPER_FLAGS_3(fcmped, TCG_CALL_NO_WG, i32, env, f64, f64)
 DEF_HELPER_FLAGS_3(fcmpq, TCG_CALL_NO_WG, i32, env, i128, i128)
 DEF_HELPER_FLAGS_3(fcmpeq, TCG_CALL_NO_WG, i32, env, i128, i128)
-DEF_HELPER_FLAGS_2(flcmps, TCG_CALL_NO_RWG_SE, i32, f32, f32)
-DEF_HELPER_FLAGS_2(flcmpd, TCG_CALL_NO_RWG_SE, i32, f64, f64)
+DEF_HELPER_FLAGS_3(flcmps, TCG_CALL_NO_RWG_SE, i32, env, f32, f32)
+DEF_HELPER_FLAGS_3(flcmpd, TCG_CALL_NO_RWG_SE, i32, env, f64, f64)
 DEF_HELPER_2(raise_exception, noreturn, env, int)
 
 DEF_HELPER_FLAGS_3(faddd, TCG_CALL_NO_WG, f64, env, f64, f64)
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/fop_helper.c
+++ b/target/sparc/fop_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t helper_fcmpeq(CPUSPARCState *env, Int128 src1, Int128 src2)
     return finish_fcmp(env, r, GETPC());
 }
 
-uint32_t helper_flcmps(float32 src1, float32 src2)
+uint32_t helper_flcmps(CPUSPARCState *env, float32 src1, float32 src2)
 {
     /*
      * FLCMP never raises an exception nor modifies any FSR fields.
      * Perform the comparison with a dummy fp environment.
      */
-    float_status discard = { };
+    float_status discard = env->fp_status;
     FloatRelation r;
 
     set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
@@ -XXX,XX +XXX,XX @@ uint32_t helper_flcmps(float32 src1, float32 src2)
     g_assert_not_reached();
 }
 
-uint32_t helper_flcmpd(float64 src1, float64 src2)
+uint32_t helper_flcmpd(CPUSPARCState *env, float64 src1, float64 src2)
 {
-    float_status discard = { };
+    float_status discard = env->fp_status;
     FloatRelation r;
 
     set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPs(DisasContext *dc, arg_FLCMPs *a)
 
     src1 = gen_load_fpr_F(dc, a->rs1);
     src2 = gen_load_fpr_F(dc, a->rs2);
-    gen_helper_flcmps(cpu_fcc[a->cc], src1, src2);
+    gen_helper_flcmps(cpu_fcc[a->cc], tcg_env, src1, src2);
     return advance_pc(dc);
 }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPd(DisasContext *dc, arg_FLCMPd *a)
 
     src1 = gen_load_fpr_D(dc, a->rs1);
     src2 = gen_load_fpr_D(dc, a->rs2);
-    gen_helper_flcmpd(cpu_fcc[a->cc], src1, src2);
+    gen_helper_flcmpd(cpu_fcc[a->cc], tcg_env, src1, src2);
     return advance_pc(dc);
 }
 
--
2.34.1
New patch

In the helper_compute_fprf functions, we pass a dummy float_status
in to the is_signaling_nan() function. This is unnecessary, because
we have convenient access to the CPU env pointer here and that
is already set up with the correct values for the snan_bit_is_one
and no_signaling_nans config settings. is_signaling_nan() doesn't
ever update the fp_status with any exception flags, so there is
no reason not to use env->fp_status here.

Use env->fp_status instead of the dummy fp_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-34-peter.maydell@linaro.org
---
 target/ppc/fpu_helper.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void helper_compute_fprf_##tp(CPUPPCState *env, tp arg) \
     } else if (tp##_is_infinity(arg)) {                        \
         fprf = neg ? 0x09 << FPSCR_FPRF : 0x05 << FPSCR_FPRF;  \
     } else {                                                   \
-        float_status dummy = { };  /* snan_bit_is_one = 0 */   \
-        if (tp##_is_signaling_nan(arg, &dummy)) {              \
+        if (tp##_is_signaling_nan(arg, &env->fp_status)) {     \
             fprf = 0x00 << FPSCR_FPRF;                         \
         } else {                                               \
             fprf = 0x11 << FPSCR_FPRF;                         \
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Now that float_status has a bunch of fp parameters,
it is easier to copy an existing structure than create
one from scratch. Begin by copying the structure that
corresponds to the FPSR and make only the adjustments
required for BFloat16 semantics.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241203203949.483774-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/vec_helper.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
      * no effect on AArch32 instructions.
      */
     bool ebf = is_a64(env) && env->vfp.fpcr & FPCR_EBF;
-    *statusp = (float_status){
-        .tininess_before_rounding = float_tininess_before_rounding,
-        .float_rounding_mode = float_round_to_odd_inf,
-        .flush_to_zero = true,
-        .flush_inputs_to_zero = true,
-        .default_nan_mode = true,
-    };
+
+    *statusp = env->vfp.fp_status;
+    set_default_nan_mode(true, statusp);
 
     if (ebf) {
-        float_status *fpst = &env->vfp.fp_status;
-        set_flush_to_zero(get_flush_to_zero(fpst), statusp);
-        set_flush_inputs_to_zero(get_flush_inputs_to_zero(fpst), statusp);
-        set_float_rounding_mode(get_float_rounding_mode(fpst), statusp);
-
         /* EBF=1 needs to do a step with round-to-odd semantics */
         *oddstatusp = *statusp;
         set_float_rounding_mode(float_round_to_odd, oddstatusp);
+    } else {
+        set_flush_to_zero(true, statusp);
+        set_flush_inputs_to_zero(true, statusp);
+        set_float_rounding_mode(float_round_to_odd_inf, statusp);
     }
-
     return ebf;
 }
 
--
2.34.1
Currently we hardcode the default NaN value in parts64_default_nan()
using a compile-time ifdef ladder. This is awkward for two cases:
 * for single-QEMU-binary we can't hard-code target-specifics like this
 * for Arm FEAT_AFP the default NaN value depends on FPCR.AH
   (specifically the sign bit is different)

Add a field to float_status to specify the default NaN value; fall
back to the old ifdef behaviour if these are not set.

The default NaN value is specified by setting a uint8_t to a
pattern corresponding to the sign and upper fraction parts of
the NaN; the lower bits of the fraction are set from bit 0 of
the pattern.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-35-peter.maydell@linaro.org
---
 include/fpu/softfloat-helpers.h | 11 +++++++
 include/fpu/softfloat-types.h   | 10 ++++++
 fpu/softfloat-specialize.c.inc  | 55 ++++++++++++++++++++-------------
 3 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-helpers.h
+++ b/include/fpu/softfloat-helpers.h
@@ -XXX,XX +XXX,XX @@ static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
     status->float_infzeronan_rule = rule;
 }
 
+static inline void set_float_default_nan_pattern(uint8_t dnan_pattern,
+                                                 float_status *status)
+{
+    status->default_nan_pattern = dnan_pattern;
+}
+
 static inline void set_flush_to_zero(bool val, float_status *status)
 {
     status->flush_to_zero = val;
@@ -XXX,XX +XXX,XX @@ static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status
     return status->float_infzeronan_rule;
 }
 
+static inline uint8_t get_float_default_nan_pattern(float_status *status)
+{
+    return status->default_nan_pattern;
+}
+
 static inline bool get_flush_to_zero(float_status *status)
 {
     return status->flush_to_zero;
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index XXXXXXX..XXXXXXX 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
     /* should denormalised inputs go to zero and set the input_denormal flag? */
     bool flush_inputs_to_zero;
     bool default_nan_mode;
+    /*
+     * The pattern to use for the default NaN. Here the high bit specifies
+     * the default NaN's sign bit, and bits 6..0 specify the high bits of the
+     * fractional part. The low bits of the fractional part are copies of bit 0.
+     * The exponent of the default NaN is (as for any NaN) always all 1s.
+     * Note that a value of 0 here is not a valid NaN. The target must set
+     * this to the correct non-zero value, or we will assert when trying to
+     * create a default NaN.
+     */
+    uint8_t default_nan_pattern;
     /*
      * The flags below are not used on all specializations and may
      * constant fold away (see snan_bit_is_one()/no_signalling_nans() in
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
 {
     bool sign = 0;
     uint64_t frac;
+ uint8_t dnan_pattern = status->default_nan_pattern;
83
84
+ if (dnan_pattern == 0) {
85
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
86
- /* !snan_bit_is_one, set all bits */
87
- frac = (1ULL << DECOMPOSED_BINARY_POINT) - 1;
88
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
89
+ /* Sign bit clear, all frac bits set */
90
+ dnan_pattern = 0b01111111;
91
+#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
92
|| defined(TARGET_MICROBLAZE)
93
- /* !snan_bit_is_one, set sign and msb */
94
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
95
- sign = 1;
96
+ /* Sign bit set, most significant frac bit set */
97
+ dnan_pattern = 0b11000000;
98
#elif defined(TARGET_HPPA)
99
- /* snan_bit_is_one, set msb-1. */
100
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 2);
101
+ /* Sign bit clear, msb-1 frac bit set */
102
+ dnan_pattern = 0b00100000;
103
#elif defined(TARGET_HEXAGON)
104
- sign = 1;
105
- frac = ~0ULL;
106
+ /* Sign bit set, all frac bits set. */
107
+ dnan_pattern = 0b11111111;
108
#else
109
- /*
110
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
111
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
112
- * do not have floating-point.
113
- */
114
- if (snan_bit_is_one(status)) {
115
- /* set all bits other than msb */
116
- frac = (1ULL << (DECOMPOSED_BINARY_POINT - 1)) - 1;
117
- } else {
118
- /* set msb */
119
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
120
- }
121
+ /*
122
+ * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
123
+ * S390, SH4, TriCore, and Xtensa. Our other supported targets
124
+ * do not have floating-point.
125
+ */
126
+ if (snan_bit_is_one(status)) {
127
+ /* sign bit clear, set all frac bits other than msb */
128
+ dnan_pattern = 0b00111111;
129
+ } else {
130
+ /* sign bit clear, set frac msb */
131
+ dnan_pattern = 0b01000000;
132
+ }
133
#endif
134
+ }
135
+ assert(dnan_pattern != 0);
136
+
137
+ sign = dnan_pattern >> 7;
138
+ /*
139
+ * Place default_nan_pattern [6:0] into bits [62:56],
140
+ * and replecate bit [0] down into [55:0]
141
+ */
142
+ frac = deposit64(0, DECOMPOSED_BINARY_POINT - 7, 7, dnan_pattern);
143
+ frac = deposit64(frac, 0, DECOMPOSED_BINARY_POINT - 7, -(dnan_pattern & 1));
144
145
*p = (FloatParts64) {
146
.cls = float_class_qnan,
122
--
147
--
123
2.34.1
148
2.34.1
diff view generated by jsdifflib
New patch
Set the default NaN pattern explicitly for the tests/fp code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-36-peter.maydell@linaro.org
---
tests/fp/fp-bench.c | 1 +
tests/fp/fp-test-log2.c | 1 +
tests/fp/fp-test.c | 1 +
3 files changed, 3 insertions(+)

diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-bench.c
+++ b/tests/fp/fp-bench.c
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
+ set_float_default_nan_pattern(0b01000000, &soft_status);

f = bench_funcs[operation][precision];
g_assert(f);
diff --git a/tests/fp/fp-test-log2.c b/tests/fp/fp-test-log2.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-test-log2.c
+++ b/tests/fp/fp-test-log2.c
@@ -XXX,XX +XXX,XX @@ int main(int ac, char **av)
int i;

set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
+ set_float_default_nan_pattern(0b01000000, &qsf);
set_float_rounding_mode(float_round_nearest_even, &qsf);

test.d = 0.0;
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/fp/fp-test.c
+++ b/tests/fp/fp-test.c
@@ -XXX,XX +XXX,XX @@ void run_test(void)
*/
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
+ set_float_default_nan_pattern(0b01000000, &qsf);
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);

genCases_setLevel(test_level);
--
2.34.1
New patch
Set the default NaN pattern explicitly, and remove the ifdef from
parts64_default_nan().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-37-peter.maydell@linaro.org
---
target/microblaze/cpu.c | 2 ++
fpu/softfloat-specialize.c.inc | 3 +--
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_reset_hold(Object *obj, ResetType type)
* this architecture.
*/
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
+ /* Default NaN: sign bit set, most significant frac bit set */
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);

#if defined(CONFIG_USER_ONLY)
/* start in user mode with interrupts enabled. */
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
/* Sign bit clear, all frac bits set */
dnan_pattern = 0b01111111;
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
- || defined(TARGET_MICROBLAZE)
+#elif defined(TARGET_I386) || defined(TARGET_X86_64)
/* Sign bit set, most significant frac bit set */
dnan_pattern = 0b11000000;
#elif defined(TARGET_HPPA)
--
2.34.1
New patch
Set the default NaN pattern explicitly, and remove the ifdef from
parts64_default_nan().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-38-peter.maydell@linaro.org
---
target/i386/tcg/fpu_helper.c | 4 ++++
fpu/softfloat-specialize.c.inc | 3 ---
2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
*/
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
+ /* Default NaN: sign bit set, most significant frac bit set */
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
+ set_float_default_nan_pattern(0b11000000, &env->mmx_status);
+ set_float_default_nan_pattern(0b11000000, &env->sse_status);
}

static inline uint8_t save_exception_flags(CPUX86State *env)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
/* Sign bit clear, all frac bits set */
dnan_pattern = 0b01111111;
-#elif defined(TARGET_I386) || defined(TARGET_X86_64)
- /* Sign bit set, most significant frac bit set */
- dnan_pattern = 0b11000000;
#elif defined(TARGET_HPPA)
/* Sign bit clear, msb-1 frac bit set */
dnan_pattern = 0b00100000;
--
2.34.1
New patch
Set the default NaN pattern explicitly, and remove the ifdef from
parts64_default_nan().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-39-peter.maydell@linaro.org
---
target/hppa/fpu_helper.c | 2 ++
fpu/softfloat-specialize.c.inc | 3 ---
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/hppa/fpu_helper.c
+++ b/target/hppa/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
/* For inf * 0 + NaN, return the input NaN */
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
+ /* Default NaN: sign bit clear, msb-1 frac bit set */
+ set_float_default_nan_pattern(0b00100000, &env->fp_status);
}

void cpu_hppa_loaded_fr0(CPUHPPAState *env)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
/* Sign bit clear, all frac bits set */
dnan_pattern = 0b01111111;
-#elif defined(TARGET_HPPA)
- /* Sign bit clear, msb-1 frac bit set */
- dnan_pattern = 0b00100000;
#elif defined(TARGET_HEXAGON)
/* Sign bit set, all frac bits set. */
dnan_pattern = 0b11111111;
--
2.34.1
New patch
Set the default NaN pattern explicitly for the alpha target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-40-peter.maydell@linaro.org
---
target/alpha/cpu.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -XXX,XX +XXX,XX @@ static void alpha_cpu_initfn(Object *obj)
* operand in Fa. That is float_2nan_prop_ba.
*/
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
+ /* Default NaN: sign bit clear, msb frac bit set */
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
#if defined(CONFIG_USER_ONLY)
env->flags = ENV_FLAG_PS_USER | ENV_FLAG_FEN;
cpu_alpha_store_fpcr(env, (uint64_t)(FPCR_INVD | FPCR_DZED | FPCR_OVFD
--
2.34.1
New patch
Set the default NaN pattern explicitly for the arm target.
This includes setting it for the old linux-user nwfpe emulation.
For nwfpe, our default doesn't match the real kernel, but we
avoid making a behaviour change in this commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-41-peter.maydell@linaro.org
---
linux-user/arm/nwfpe/fpa11.c | 5 +++++
target/arm/cpu.c | 2 ++
2 files changed, 7 insertions(+)

diff --git a/linux-user/arm/nwfpe/fpa11.c b/linux-user/arm/nwfpe/fpa11.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/arm/nwfpe/fpa11.c
+++ b/linux-user/arm/nwfpe/fpa11.c
@@ -XXX,XX +XXX,XX @@ void resetFPA11(void)
* this late date.
*/
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &fpa11->fp_status);
+ /*
+ * Use the same default NaN value as Arm VFP. This doesn't match
+ * the Linux kernel's nwfpe emulation, which uses an all-1s value.
+ */
+ set_float_default_nan_pattern(0b01000000, &fpa11->fp_status);
}

void SetRoundingMode(const unsigned int opcode)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
* the pseudocode function the arguments are in the order c, a, b.
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
* and the input NaN if it is signalling
+ * * Default NaN has sign bit clear, msb frac bit set
*/
static void arm_set_default_fp_behaviours(float_status *s)
{
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
+ set_float_default_nan_pattern(0b01000000, s);
}

static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-26-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.h | 12 ++++
target/arm/tcg/translate.h | 7 ++
target/arm/tcg/neon-dp.decode | 6 +-
target/arm/tcg/gengvec.c | 36 +++++++++++
target/arm/tcg/neon_helper.c | 33 ++++++++++
target/arm/tcg/translate-neon.c | 110 +-------------------------------
6 files changed, 94 insertions(+), 110 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(neon_uqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32
DEF_HELPER_FLAGS_5(neon_uqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(neon_uqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(neon_uqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_uqshli_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_uqshli_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_uqshli_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_uqshli_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshlui_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshlui_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshlui_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(neon_sqshlui_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_4(gvec_srshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_srshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_neon_sqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);

+void gen_neon_sqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ int64_t c, uint32_t opr_sz, uint32_t max_sz);
+void gen_neon_uqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ int64_t c, uint32_t opr_sz, uint32_t max_sz);
+void gen_neon_sqshlui(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ int64_t c, uint32_t opr_sz, uint32_t max_sz);
+
void gen_gvec_shadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
void gen_gvec_uhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
diff --git a/target/arm/tcg/neon-dp.decode b/target/arm/tcg/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/neon-dp.decode
+++ b/target/arm/tcg/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b

-VQSHLU_64_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_d
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_d
VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_s
VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_h
VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_b

-VQSHL_S_64_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b

-VQSHL_U_64_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
opr_sz, max_sz, 0, fns[vece]);
}

+void gen_neon_sqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ int64_t c, uint32_t opr_sz, uint32_t max_sz)
+{
+ static gen_helper_gvec_2_ptr * const fns[] = {
+ gen_helper_neon_sqshli_b, gen_helper_neon_sqshli_h,
+ gen_helper_neon_sqshli_s, gen_helper_neon_sqshli_d,
+ };
+ tcg_debug_assert(vece <= MO_64);
+ tcg_debug_assert(c >= 0 && c <= (8 << vece));
+ tcg_gen_gvec_2_ptr(rd_ofs, rn_ofs, tcg_env, opr_sz, max_sz, c, fns[vece]);
+}
+
+void gen_neon_uqshli(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ int64_t c, uint32_t opr_sz, uint32_t max_sz)
+{
+ static gen_helper_gvec_2_ptr * const fns[] = {
+ gen_helper_neon_uqshli_b, gen_helper_neon_uqshli_h,
+ gen_helper_neon_uqshli_s, gen_helper_neon_uqshli_d,
+ };
+ tcg_debug_assert(vece <= MO_64);
+ tcg_debug_assert(c >= 0 && c <= (8 << vece));
+ tcg_gen_gvec_2_ptr(rd_ofs, rn_ofs, tcg_env, opr_sz, max_sz, c, fns[vece]);
+}
+
+void gen_neon_sqshlui(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ int64_t c, uint32_t opr_sz, uint32_t max_sz)
+{
+ static gen_helper_gvec_2_ptr * const fns[] = {
+ gen_helper_neon_sqshlui_b, gen_helper_neon_sqshlui_h,
+ gen_helper_neon_sqshlui_s, gen_helper_neon_sqshlui_d,
+ };
+ tcg_debug_assert(vece <= MO_64);
+ tcg_debug_assert(c >= 0 && c <= (8 << vece));
+ tcg_gen_gvec_2_ptr(rd_ofs, rn_ofs, tcg_env, opr_sz, max_sz, c, fns[vece]);
+}
+
void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz)
{
uint64_t max = MAKE_64BIT_MASK(0, 8 << esz);
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/neon_helper.c
+++ b/target/arm/tcg/neon_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(name)(void *vd, void *vn, void *vm, void *venv, uint32_t desc) \
clear_tail(d, opr_sz, simd_maxsz(desc)); \
}

+#define NEON_GVEC_VOP2i_ENV(name, vtype) \
+void HELPER(name)(void *vd, void *vn, void *venv, uint32_t desc) \
+{ \
+ intptr_t i, opr_sz = simd_oprsz(desc); \
+ int imm = simd_data(desc); \
+ vtype *d = vd, *n = vn; \
+ CPUARMState *env = venv; \
+ for (i = 0; i < opr_sz / sizeof(vtype); i++) { \
+ NEON_FN(d[i], n[i], imm); \
+ } \
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
+}
+
/* Pairwise operations. */
/* For 32-bit elements each segment only contains a single element, so
the elementwise and pairwise operations are the same. */
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift)
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc))
NEON_VOP_ENV(qshl_u8, neon_u8, 4)
NEON_GVEC_VOP2_ENV(neon_uqshl_b, uint8_t)
+NEON_GVEC_VOP2i_ENV(neon_uqshli_b, uint8_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc))
NEON_VOP_ENV(qshl_u16, neon_u16, 2)
NEON_GVEC_VOP2_ENV(neon_uqshl_h, uint16_t)
+NEON_GVEC_VOP2i_ENV(neon_uqshli_h, uint16_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc))
NEON_GVEC_VOP2_ENV(neon_uqshl_s, uint32_t)
+NEON_GVEC_VOP2i_ENV(neon_uqshli_s, uint32_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_uqrshl_d(src1, (int8_t)src2, false, env->vfp.qc))
NEON_GVEC_VOP2_ENV(neon_uqshl_d, uint64_t)
+NEON_GVEC_VOP2i_ENV(neon_uqshli_d, uint64_t)
#undef NEON_FN

uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift)
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift)
(dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc))
NEON_VOP_ENV(qshl_s8, neon_s8, 4)
NEON_GVEC_VOP2_ENV(neon_sqshl_b, int8_t)
+NEON_GVEC_VOP2i_ENV(neon_sqshli_b, int8_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc))
NEON_VOP_ENV(qshl_s16, neon_s16, 2)
NEON_GVEC_VOP2_ENV(neon_sqshl_h, int16_t)
+NEON_GVEC_VOP2i_ENV(neon_sqshli_h, int16_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_sqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc))
NEON_GVEC_VOP2_ENV(neon_sqshl_s, int32_t)
+NEON_GVEC_VOP2i_ENV(neon_sqshli_s, int32_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_sqrshl_d(src1, (int8_t)src2, false, env->vfp.qc))
NEON_GVEC_VOP2_ENV(neon_sqshl_d, int64_t)
+NEON_GVEC_VOP2i_ENV(neon_sqshli_d, int64_t)
#undef NEON_FN

uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
#define NEON_FN(dest, src1, src2) \
(dest = do_suqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc))
NEON_VOP_ENV(qshlu_s8, neon_s8, 4)
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_b, int8_t)
#undef NEON_FN

#define NEON_FN(dest, src1, src2) \
(dest = do_suqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc))
NEON_VOP_ENV(qshlu_s16, neon_s16, 2)
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_h, int16_t)
#undef NEON_FN

uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
return do_suqrshl_d(val, (int8_t)shift, false, env->vfp.qc);
}

+#define NEON_FN(dest, src1, src2) \
+ (dest = do_suqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc))
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_s, int32_t)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+ (dest = do_suqrshl_d(src1, (int8_t)src2, false, env->vfp.qc))
+NEON_GVEC_VOP2i_ENV(neon_sqshlui_d, int64_t)
+#undef NEON_FN
+
#define NEON_FN(dest, src1, src2) \
(dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc))
NEON_VOP_ENV(qrshl_u8, neon_u8, 4)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -XXX,XX +XXX,XX @@ DO_2SH(VRSRA_S, gen_gvec_srsra)
DO_2SH(VRSRA_U, gen_gvec_ursra)
DO_2SH(VSHR_S, gen_gvec_sshr)
DO_2SH(VSHR_U, gen_gvec_ushr)
-
-static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
- NeonGenTwo64OpEnvFn *fn)
-{
- /*
- * 2-reg-and-shift operations, size == 3 case, where the
- * function needs to be passed tcg_env.
- */
- TCGv_i64 constimm;
- int pass;
-
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
- return false;
- }
-
- /* UNDEF accesses to D16-D31 if they don't exist. */
- if (!dc_isar_feature(aa32_simd_r32, s) &&
- ((a->vd | a->vm) & 0x10)) {
- return false;
- }
-
- if ((a->vm | a->vd) & a->q) {
- return false;
- }
-
- if (!vfp_access_check(s)) {
- return true;
- }
-
- /*
- * To avoid excessive duplication of ops we implement shift
- * by immediate using the variable shift operations.
- */
- constimm = tcg_constant_i64(dup_const(a->size, a->shift));
-
- for (pass = 0; pass < a->q + 1; pass++) {
- TCGv_i64 tmp = tcg_temp_new_i64();
-
- read_neon_element64(tmp, a->vm, pass, MO_64);
- fn(tmp, tcg_env, tmp, constimm);
- write_neon_element64(tmp, a->vd, pass, MO_64);
- }
- return true;
-}
-
-static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
- NeonGenTwoOpEnvFn *fn)
-{
- /*
- * 2-reg-and-shift operations, size < 3 case, where the
- * helper needs to be passed tcg_env.
- */
- TCGv_i32 constimm, tmp;
- int pass;
-
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
- return false;
- }
-
- /* UNDEF accesses to D16-D31 if they don't exist. */
- if (!dc_isar_feature(aa32_simd_r32, s) &&
- ((a->vd | a->vm) & 0x10)) {
- return false;
- }
-
- if ((a->vm | a->vd) & a->q) {
- return false;
- }
-
- if (!vfp_access_check(s)) {
- return true;
- }
-
- /*
- * To avoid excessive duplication of ops we implement shift
- * by immediate using the variable shift operations.
- */
- constimm = tcg_constant_i32(dup_const(a->size, a->shift));
- tmp = tcg_temp_new_i32();
-
- for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
- read_neon_element32(tmp, a->vm, pass, MO_32);
- fn(tmp, tcg_env, tmp, constimm);
- write_neon_element32(tmp, a->vd, pass, MO_32);
- }
- return true;
-}
-
-#define DO_2SHIFT_ENV(INSN, FUNC) \
- static bool trans_##INSN##_64_2sh(DisasContext *s, arg_2reg_shift *a) \
- { \
- return do_2shift_env_64(s, a, gen_helper_neon_##FUNC##64); \
- } \
- static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
- { \
- static NeonGenTwoOpEnvFn * const fns[] = { \
- gen_helper_neon_##FUNC##8, \
- gen_helper_neon_##FUNC##16, \
- gen_helper_neon_##FUNC##32, \
- }; \
- assert(a->size < ARRAY_SIZE(fns)); \
- return do_2shift_env_32(s, a, fns[a->size]); \
- }
-
-DO_2SHIFT_ENV(VQSHLU, qshlu_s)
-DO_2SHIFT_ENV(VQSHL_U, qshl_u)
-DO_2SHIFT_ENV(VQSHL_S, qshl_s)
+DO_2SH(VQSHLU, gen_neon_sqshlui)
+DO_2SH(VQSHL_U, gen_neon_uqshli)
+DO_2SH(VQSHL_S, gen_neon_sqshli)

static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
NeonGenTwo64OpFn *shiftfn,
--
2.34.1

Set the default NaN pattern explicitly for loongarch.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-42-peter.maydell@linaro.org
---
target/loongarch/tcg/fpu_helper.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/loongarch/tcg/fpu_helper.c
+++ b/target/loongarch/tcg/fpu_helper.c
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
*/
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
+ /* Default NaN: sign bit clear, msb frac bit set */
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
}

int ieee_ex_to_loongarch(int xcpt)
--
2.34.1
Set the default NaN pattern explicitly for m68k.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-43-peter.maydell@linaro.org
---
 target/m68k/cpu.c              | 2 ++
 fpu/softfloat-specialize.c.inc | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
     * preceding paragraph for nonsignaling NaNs.
     */
    set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
+    /* Default NaN: sign bit clear, all frac bits set */
+    set_float_default_nan_pattern(0b01111111, &env->fp_status);

    nan = floatx80_default_nan(&env->fp_status);
    for (i = 0; i < 8; i++) {
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
    uint8_t dnan_pattern = status->default_nan_pattern;

    if (dnan_pattern == 0) {
-#if defined(TARGET_SPARC) || defined(TARGET_M68K)
+#if defined(TARGET_SPARC)
        /* Sign bit clear, all frac bits set */
        dnan_pattern = 0b01111111;
 #elif defined(TARGET_HEXAGON)
--
2.34.1
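The 8-bit patterns used throughout this series encode the default NaN compactly: bit 7 gives the sign, bits [6:0] give the top seven fraction bits, and the remaining low fraction bits follow bit 0. A standalone sketch of that expansion for float64 — illustrative only, under those assumptions; the real decoding lives in softfloat's `parts64_default_nan()`:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: expand an 8-bit default-NaN pattern into a float64 bit image.
 * Bit 7 -> sign; bits [6:0] -> top 7 fraction bits; bit 0 is replicated
 * into the remaining low fraction bits.
 */
static uint64_t default_nan64(uint8_t pattern)
{
    uint64_t sign = (uint64_t)(pattern >> 7) << 63;
    uint64_t exp  = 0x7ffULL << 52;                 /* all-ones exponent */
    uint64_t frac = (uint64_t)(pattern & 0x7f) << (52 - 7);
    if (pattern & 1) {
        frac |= (1ULL << (52 - 7)) - 1;             /* replicate bit 0 */
    }
    return sign | exp | frac;
}
```

With these assumptions, `0b01111111` (m68k, SPARC) yields 0x7FFFFFFFFFFFFFFF, the common `0b01000000` yields 0x7FF8000000000000, and the `0b00111111` legacy pattern yields 0x7FF7FFFFFFFFFFFF.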
Set the default NaN pattern explicitly for MIPS. Note that this
is our only target which currently changes the default NaN
at runtime (which it was previously doing indirectly when it
changed the snan_bit_is_one setting).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-44-peter.maydell@linaro.org
---
 target/mips/fpu_helper.h | 7 +++++++
 target/mips/msa.c        | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/fpu_helper.h
+++ b/target/mips/fpu_helper.h
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
    set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
    nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
    set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
+    /*
+     * With nan2008, the default NaN value has the sign bit clear and the
+     * frac msb set; with the older mode, the sign bit is clear, and all
+     * frac bits except the msb are set.
+     */
+    set_float_default_nan_pattern(nan2008 ? 0b01000000 : 0b00111111,
+                                  &env->active_fpu.fp_status);
 }

diff --git a/target/mips/msa.c b/target/mips/msa.c
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/msa.c
+++ b/target/mips/msa.c
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
    /* Inf * 0 + NaN returns the input NaN */
    set_float_infzeronan_rule(float_infzeronan_dnan_never,
                              &env->active_tc.msa_fp_status);
+    /* Default NaN: sign bit clear, frac msb set */
+    set_float_default_nan_pattern(0b01000000,
+                                  &env->active_tc.msa_fp_status);
 }
--
2.34.1
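MIPS is the one target where the pattern must be picked at runtime, because the FCSR nan2008 mode can change after reset. A minimal sketch of that selection, using the two pattern values from the patch above (a sketch, not the QEMU function):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pattern values from the patch: nan2008 -> sign clear, frac msb set
 * (0b01000000); legacy mode -> sign clear, all frac bits except the
 * msb set (0b00111111).
 */
static uint8_t mips_default_nan_pattern(bool nan2008)
{
    return nan2008 ? 0x40 : 0x3f;
}
```

Every other target in the series can call `set_float_default_nan_pattern()` once at reset; MIPS re-runs this selection whenever `restore_snan_bit_mode()` is invoked.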
Set the default NaN pattern explicitly for openrisc.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-45-peter.maydell@linaro.org
---
 target/openrisc/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_reset_hold(Object *obj, ResetType type)
     */
    set_float_2nan_prop_rule(float_2nan_prop_x87, &cpu->env.fp_status);

+    /* Default NaN: sign bit clear, frac msb set */
+    set_float_default_nan_pattern(0b01000000, &cpu->env.fp_status);

 #ifndef CONFIG_USER_ONLY
    cpu->env.picmr = 0x00000000;
--
2.34.1
Set the default NaN pattern explicitly for ppc.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-46-peter.maydell@linaro.org
---
 target/ppc/cpu_init.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index XXXXXXX..XXXXXXX 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);

+    /* Default NaN: sign bit clear, set frac msb */
+    set_float_default_nan_pattern(0b01000000, &env->fp_status);
+    set_float_default_nan_pattern(0b01000000, &env->vec_status);
+
    for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
        ppc_spr_t *spr = &env->spr_cb[i];
--
2.34.1
Set the default NaN pattern explicitly for sh4. Note that sh4
is one of the only three targets (the others being HPPA and
sometimes MIPS) that has snan_bit_is_one set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-47-peter.maydell@linaro.org
---
 target/sh4/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sh4/cpu.c
+++ b/target/sh4/cpu.c
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_reset_hold(Object *obj, ResetType type)
    set_flush_to_zero(1, &env->fp_status);
 #endif
    set_default_nan_mode(1, &env->fp_status);
+    /* sign bit clear, set all frac bits other than msb */
+    set_float_default_nan_pattern(0b00111111, &env->fp_status);
 }

 static void superh_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
--
2.34.1
Set the default NaN pattern explicitly for rx.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-48-peter.maydell@linaro.org
---
 target/rx/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/rx/cpu.c b/target/rx/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/rx/cpu.c
+++ b/target/rx/cpu.c
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_reset_hold(Object *obj, ResetType type)
     * then prefer dest over source", which is float_2nan_prop_s_ab.
     */
    set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
+    /* Default NaN value: sign bit clear, set frac msb */
+    set_float_default_nan_pattern(0b01000000, &env->fp_status);
 }

 static ObjectClass *rx_cpu_class_by_name(const char *cpu_model)
--
2.34.1
Set the default NaN pattern explicitly for s390x.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-49-peter.maydell@linaro.org
---
 target/s390x/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
        set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
        set_float_infzeronan_rule(float_infzeronan_dnan_always,
                                  &env->fpu_status);
+        /* Default NaN value: sign bit clear, frac msb set */
+        set_float_default_nan_pattern(0b01000000, &env->fpu_status);
        /* fall through */
    case RESET_TYPE_S390_CPU_NORMAL:
        env->psw.mask &= ~PSW_MASK_RI;
--
2.34.1
Set the default NaN pattern explicitly for SPARC, and remove
the ifdef from parts64_default_nan.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-50-peter.maydell@linaro.org
---
 target/sparc/cpu.c             | 2 ++
 fpu/softfloat-specialize.c.inc | 5 +----
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
    set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
    /* For inf * 0 + NaN, return the input NaN */
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
+    /* Default NaN value: sign bit clear, all frac bits set */
+    set_float_default_nan_pattern(0b01111111, &env->fp_status);

    cpu_exec_realizefn(cs, &local_err);
    if (local_err != NULL) {
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
    uint8_t dnan_pattern = status->default_nan_pattern;

    if (dnan_pattern == 0) {
-#if defined(TARGET_SPARC)
-        /* Sign bit clear, all frac bits set */
-        dnan_pattern = 0b01111111;
-#elif defined(TARGET_HEXAGON)
+#if defined(TARGET_HEXAGON)
        /* Sign bit set, all frac bits set. */
        dnan_pattern = 0b11111111;
 #else
--
2.34.1
Set the default NaN pattern explicitly for xtensa.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-51-peter.maydell@linaro.org
---
 target/xtensa/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/xtensa/cpu.c
+++ b/target/xtensa/cpu.c
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
    /* For inf * 0 + NaN, return the input NaN */
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
    set_no_signaling_nans(!dfpu, &env->fp_status);
+    /* Default NaN value: sign bit clear, set frac msb */
+    set_float_default_nan_pattern(0b01000000, &env->fp_status);
    xtensa_use_first_nan(env, !dfpu);
 }
--
2.34.1
}
93
94
switch (opcode) {
95
- case 0x0a: /* SHL / SLI */
96
- handle_vec_simd_shli(s, is_q, is_u, immh, immb, opcode, rn, rd);
97
- break;
98
case 0x10: /* SHRN */
99
case 0x11: /* RSHRN / SQRSHRUN */
100
if (is_u) {
101
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
102
case 0x04: /* SRSHR / URSHR (rounding) */
103
case 0x06: /* SRSRA / URSRA (accum + rounding) */
104
case 0x08: /* SRI */
105
+ case 0x0a: /* SHL / SLI */
106
unallocated_encoding(s);
107
return;
108
}
109
--
23
--
110
2.34.1
24
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Instead of copying a constant into a temporary with dupi,
use a vector constant directly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/gengvec.c | 43 ++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 24 deletions(-)

diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
{
TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec ones = tcg_temp_new_vec_matching(d);
+ TCGv_vec ones = tcg_constant_vec_matching(d, vece, 1);

tcg_gen_shri_vec(vece, t, a, sh - 1);
- tcg_gen_dupi_vec(vece, ones, 1);
tcg_gen_and_vec(vece, t, t, ones);
tcg_gen_sari_vec(vece, d, a, sh);
tcg_gen_add_vec(vece, d, d, t);
@@ -XXX,XX +XXX,XX @@ void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
{
TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec ones = tcg_temp_new_vec_matching(d);
+ TCGv_vec ones = tcg_constant_vec_matching(d, vece, 1);

tcg_gen_shri_vec(vece, t, a, shift - 1);
- tcg_gen_dupi_vec(vece, ones, 1);
tcg_gen_and_vec(vece, t, t, ones);
tcg_gen_shri_vec(vece, d, a, shift);
tcg_gen_add_vec(vece, d, d, t);
@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
{
TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec m = tcg_temp_new_vec_matching(d);
+ int64_t mi = MAKE_64BIT_MASK((8 << vece) - sh, sh);
+ TCGv_vec m = tcg_constant_vec_matching(d, vece, mi);

- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
tcg_gen_shri_vec(vece, t, a, sh);
tcg_gen_and_vec(vece, d, d, m);
tcg_gen_or_vec(vece, d, d, t);
@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
{
TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec m = tcg_temp_new_vec_matching(d);
+ TCGv_vec m = tcg_constant_vec_matching(d, vece, MAKE_64BIT_MASK(0, sh));

tcg_gen_shli_vec(vece, t, a, sh);
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
tcg_gen_and_vec(vece, d, d, m);
tcg_gen_or_vec(vece, d, d, t);
}
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
TCGv_vec rval = tcg_temp_new_vec_matching(dst);
TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec msk, max;
+ TCGv_vec max;

tcg_gen_neg_vec(vece, rsh, shift);
if (vece == MO_8) {
tcg_gen_mov_vec(lsh, shift);
} else {
- msk = tcg_temp_new_vec_matching(dst);
- tcg_gen_dupi_vec(vece, msk, 0xff);
+ TCGv_vec msk = tcg_constant_vec_matching(dst, vece, 0xff);
tcg_gen_and_vec(vece, lsh, shift, msk);
tcg_gen_and_vec(vece, rsh, rsh, msk);
}
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
tcg_gen_shlv_vec(vece, lval, src, lsh);
tcg_gen_shrv_vec(vece, rval, src, rsh);

- max = tcg_temp_new_vec_matching(dst);
- tcg_gen_dupi_vec(vece, max, 8 << vece);
-
/*
 * The choice of LT (signed) and GEU (unsigned) are biased toward
 * the instructions of the x86_64 host. For MO_8, the whole byte
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
 * have already masked to a byte and so a signed compare works.
 * Other tcg hosts have a full set of comparisons and do not care.
 */
+ max = tcg_constant_vec_matching(dst, vece, 8 << vece);
if (vece == MO_8) {
tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
+ TCGv_vec max, zero;

/*
 * Rely on the TCG guarantee that out of range shifts produce
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
if (vece == MO_8) {
tcg_gen_mov_vec(lsh, shift);
} else {
- tcg_gen_dupi_vec(vece, tmp, 0xff);
- tcg_gen_and_vec(vece, lsh, shift, tmp);
- tcg_gen_and_vec(vece, rsh, rsh, tmp);
+ TCGv_vec msk = tcg_constant_vec_matching(dst, vece, 0xff);
+ tcg_gen_and_vec(vece, lsh, shift, msk);
+ tcg_gen_and_vec(vece, rsh, rsh, msk);
}

/* Bound rsh so out of bound right shift gets -1. */
- tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
- tcg_gen_umin_vec(vece, rsh, rsh, tmp);
- tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
+ max = tcg_constant_vec_matching(dst, vece, (8 << vece) - 1);
+ tcg_gen_umin_vec(vece, rsh, rsh, max);
+ tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, max);

tcg_gen_shlv_vec(vece, lval, src, lsh);
tcg_gen_sarv_vec(vece, rval, src, rsh);
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
tcg_gen_andc_vec(vece, lval, lval, tmp);

/* Select between left and right shift. */
+ zero = tcg_constant_vec_matching(dst, vece, 0);
if (vece == MO_8) {
- tcg_gen_dupi_vec(vece, tmp, 0);
- tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
+ tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, zero, rval, lval);
} else {
- tcg_gen_dupi_vec(vece, tmp, 0x80);
- tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
+ TCGv_vec sgn = tcg_constant_vec_matching(dst, vece, 0x80);
+ tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, sgn, lval, rval);
}
}

--
2.34.1

Set the default NaN pattern explicitly for hexagon.
Remove the ifdef from parts64_default_nan(); the only
remaining unconverted targets all use the default case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-52-peter.maydell@linaro.org
---
 target/hexagon/cpu.c | 2 ++
 fpu/softfloat-specialize.c.inc | 5 -----
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_reset_hold(Object *obj, ResetType type)

set_default_nan_mode(1, &env->fp_status);
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
+ /* Default NaN value: sign bit set, all frac bits set */
+ set_float_default_nan_pattern(0b11111111, &env->fp_status);
}

static void hexagon_cpu_disas_set_info(CPUState *s, disassemble_info *info)
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
uint8_t dnan_pattern = status->default_nan_pattern;

if (dnan_pattern == 0) {
-#if defined(TARGET_HEXAGON)
- /* Sign bit set, all frac bits set. */
- dnan_pattern = 0b11111111;
-#else
/*
 * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
 * S390, SH4, TriCore, and Xtensa. Our other supported targets
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
/* sign bit clear, set frac msb */
dnan_pattern = 0b01000000;
}
-#endif
}
assert(dnan_pattern != 0);
--
2.34.1
In kvm_init_vcpu() and do_kvm_destroy_vcpu(), the return value from
kvm_ioctl(..., KVM_GET_VCPU_MMAP_SIZE, ...)
is an 'int', but we put it into a 'long' local variable mmap_size.
Coverity then complains that there might be a truncation when we copy
that value into the 'int ret' which we use for returning a value in
an error-exit codepath. This can't ever actually overflow because
the value was in an 'int' to start with, but it makes more sense
to use 'int' for mmap_size so we don't do the widen-then-narrow
sequence in the first place.

Resolves: Coverity CID 1547515
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240815131206.3231819-2-peter.maydell@linaro.org
---
 accel/kvm/kvm-all.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -XXX,XX +XXX,XX @@ int kvm_create_and_park_vcpu(CPUState *cpu)
static int do_kvm_destroy_vcpu(CPUState *cpu)
{
KVMState *s = kvm_state;
- long mmap_size;
+ int mmap_size;
int ret = 0;

trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
@@ -XXX,XX +XXX,XX @@ void kvm_destroy_vcpu(CPUState *cpu)
int kvm_init_vcpu(CPUState *cpu, Error **errp)
{
KVMState *s = kvm_state;
- long mmap_size;
+ int mmap_size;
int ret;

trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
--
2.34.1

Set the default NaN pattern explicitly for riscv.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-53-peter.maydell@linaro.org
---
 target/riscv/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type)
cs->exception_index = RISCV_EXCP_NONE;
env->load_res = -1;
set_default_nan_mode(1, &env->fp_status);
+ /* Default NaN value: sign bit clear, frac msb set */
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
env->vill = true;

#ifndef CONFIG_USER_ONLY
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode | 4 ++
 target/arm/tcg/translate-a64.c | 74 ++++++++++++----------------------
 2 files changed, 30 insertions(+), 48 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FMAXV_s 0110 1110 00 11000 01111 10 ..... ..... @rr_q1e2

FMINV_h 0.00 1110 10 11000 01111 10 ..... ..... @qrr_h
FMINV_s 0110 1110 10 11000 01111 10 ..... ..... @rr_q1e2
+
+# Floating-point Immediate
+
+FMOVI_s 0001 1110 .. 1 imm:8 100 00000 rd:5 esz=%esz_hsd
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS(FMINNMV_s, do_fp_reduction, a, gen_helper_vfp_minnums)
TRANS(FMAXV_s, do_fp_reduction, a, gen_helper_vfp_maxs)
TRANS(FMINV_s, do_fp_reduction, a, gen_helper_vfp_mins)

+/*
+ * Floating-point Immediate
+ */
+
+static bool trans_FMOVI_s(DisasContext *s, arg_FMOVI_s *a)
+{
+ switch (a->esz) {
+ case MO_32:
+ case MO_64:
+ break;
+ case MO_16:
+ if (!dc_isar_feature(aa64_fp16, s)) {
+ return false;
+ }
+ break;
+ default:
+ return false;
+ }
+ if (fp_access_check(s)) {
+ uint64_t imm = vfp_expand_imm(a->esz, a->imm);
+ write_fp_dreg(s, a->rd, tcg_constant_i64(imm));
+ }
+ return true;
+}
+
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
 * Note that it is the caller's responsibility to ensure that the
 * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
}
}
}

-/* Floating point immediate
- * 31 30 29 28 24 23 22 21 20 13 12 10 9 5 4 0
- * +---+---+---+-----------+------+---+------------+-------+------+------+
- * | M | 0 | S | 1 1 1 1 0 | type | 1 | imm8 | 1 0 0 | imm5 | Rd |
- * +---+---+---+-----------+------+---+------------+-------+------+------+
- */
-static void disas_fp_imm(DisasContext *s, uint32_t insn)
-{
- int rd = extract32(insn, 0, 5);
- int imm5 = extract32(insn, 5, 5);
- int imm8 = extract32(insn, 13, 8);
- int type = extract32(insn, 22, 2);
- int mos = extract32(insn, 29, 3);
- uint64_t imm;
- MemOp sz;
-
- if (mos || imm5) {
- unallocated_encoding(s);
- return;
- }
-
- switch (type) {
- case 0:
- sz = MO_32;
- break;
- case 1:
- sz = MO_64;
- break;
- case 3:
- sz = MO_16;
- if (dc_isar_feature(aa64_fp16, s)) {
- break;
- }
- /* fallthru */
- default:
- unallocated_encoding(s);
- return;
- }
-
- if (!fp_access_check(s)) {
- return;
- }
-
- imm = vfp_expand_imm(sz, imm8);
- write_fp_dreg(s, rd, tcg_constant_i64(imm));
-}
-
/* Handle floating point <=> fixed point conversions. Note that we can
 * also deal with fp <=> integer conversions as a special case (scale == 64)
 * OPTME: consider handling that special case specially or at least skipping
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
switch (ctz32(extract32(insn, 12, 4))) {
case 0: /* [15:12] == xxx1 */
/* Floating point immediate */
- disas_fp_imm(s, insn);
+ unallocated_encoding(s); /* in decodetree */
break;
case 1: /* [15:12] == xx10 */
/* Floating point compare */
--
2.34.1

Set the default NaN pattern explicitly for tricore.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-54-peter.maydell@linaro.org
---
 target/tricore/helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/tricore/helper.c b/target/tricore/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/tricore/helper.c
+++ b/target/tricore/helper.c
@@ -XXX,XX +XXX,XX @@ void fpu_set_state(CPUTriCoreState *env)
set_flush_to_zero(1, &env->fp_status);
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
set_default_nan_mode(1, &env->fp_status);
+ /* Default NaN pattern: sign bit clear, frac msb set */
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
}

uint32_t psw_read(CPUTriCoreState *env)
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode | 9 +++
 target/arm/tcg/translate-a64.c | 117 ++++++++++++++-------------------
 2 files changed, 59 insertions(+), 67 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FMINV_s 0110 1110 10 11000 01111 10 ..... ..... @rr_q1e2
# Floating-point Immediate

FMOVI_s 0001 1110 .. 1 imm:8 100 00000 rd:5 esz=%esz_hsd
+
+# Advanced SIMD Modified Immediate
+
+%abcdefgh 16:3 5:5
+
+FMOVI_v_h 0 q:1 00 1111 00000 ... 1111 11 ..... rd:5 %abcdefgh
+
+# MOVI, MVNI, ORR, BIC, FMOV are all intermixed via cmode.
+Vimm 0 q:1 op:1 0 1111 00000 ... cmode:4 01 ..... rd:5 %abcdefgh
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_FMOVI_s(DisasContext *s, arg_FMOVI_s *a)
return true;
}

+/*
+ * Advanced SIMD Modified Immediate
+ */
+
+static bool trans_FMOVI_v_h(DisasContext *s, arg_FMOVI_v_h *a)
+{
+ if (!dc_isar_feature(aa64_fp16, s)) {
+ return false;
+ }
+ if (fp_access_check(s)) {
+ tcg_gen_gvec_dup_imm(MO_16, vec_full_reg_offset(s, a->rd),
+ a->q ? 16 : 8, vec_full_reg_size(s),
+ vfp_expand_imm(MO_16, a->abcdefgh));
+ }
+ return true;
+}
+
+static void gen_movi(unsigned vece, uint32_t dofs, uint32_t aofs,
+ int64_t c, uint32_t oprsz, uint32_t maxsz)
+{
+ tcg_gen_gvec_dup_imm(MO_64, dofs, oprsz, maxsz, c);
+}
+
+static bool trans_Vimm(DisasContext *s, arg_Vimm *a)
+{
+ GVecGen2iFn *fn;
+
+ /* Handle decode of cmode/op here between ORR/BIC/MOVI */
+ if ((a->cmode & 1) && a->cmode < 12) {
+ /* For op=1, the imm will be inverted, so BIC becomes AND. */
+ fn = a->op ? tcg_gen_gvec_andi : tcg_gen_gvec_ori;
+ } else {
+ /* There is one unallocated cmode/op combination in this space */
+ if (a->cmode == 15 && a->op == 1 && a->q == 0) {
+ return false;
+ }
+ fn = gen_movi;
+ }
+
+ if (fp_access_check(s)) {
+ uint64_t imm = asimd_imm_const(a->abcdefgh, a->cmode, a->op);
+ gen_gvec_fn2i(s, a->q, a->rd, a->rd, imm, fn, MO_64);
+ }
+ return true;
+}
+
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
 * Note that it is the caller's responsibility to ensure that the
 * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
}
}

-/* AdvSIMD modified immediate
- * 31 30 29 28 19 18 16 15 12 11 10 9 5 4 0
- * +---+---+----+---------------------+-----+-------+----+---+-------+------+
- * | 0 | Q | op | 0 1 1 1 1 0 0 0 0 0 | abc | cmode | o2 | 1 | defgh | Rd |
- * +---+---+----+---------------------+-----+-------+----+---+-------+------+
- *
- * There are a number of operations that can be carried out here:
- * MOVI - move (shifted) imm into register
- * MVNI - move inverted (shifted) imm into register
- * ORR - bitwise OR of (shifted) imm with register
- * BIC - bitwise clear of (shifted) imm with register
- * With ARMv8.2 we also have:
- * FMOV half-precision
- */
-static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
-{
- int rd = extract32(insn, 0, 5);
- int cmode = extract32(insn, 12, 4);
- int o2 = extract32(insn, 11, 1);
- uint64_t abcdefgh = extract32(insn, 5, 5) | (extract32(insn, 16, 3) << 5);
- bool is_neg = extract32(insn, 29, 1);
- bool is_q = extract32(insn, 30, 1);
- uint64_t imm = 0;
-
- if (o2) {
- if (cmode != 0xf || is_neg) {
- unallocated_encoding(s);
- return;
- }
- /* FMOV (vector, immediate) - half-precision */
- if (!dc_isar_feature(aa64_fp16, s)) {
- unallocated_encoding(s);
- return;
- }
- imm = vfp_expand_imm(MO_16, abcdefgh);
- /* now duplicate across the lanes */
- imm = dup_const(MO_16, imm);
- } else {
- if (cmode == 0xf && is_neg && !is_q) {
- unallocated_encoding(s);
- return;
- }
- imm = asimd_imm_const(abcdefgh, cmode, is_neg);
- }
-
- if (!fp_access_check(s)) {
- return;
- }
-
- if (!((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9)) {
- /* MOVI or MVNI, with MVNI negation handled above. */
- tcg_gen_gvec_dup_imm(MO_64, vec_full_reg_offset(s, rd), is_q ? 16 : 8,
- vec_full_reg_size(s), imm);
- } else {
- /* ORR or BIC, with BIC negation to AND handled above. */
- if (is_neg) {
- gen_gvec_fn2i(s, is_q, rd, rd, imm, tcg_gen_gvec_andi, MO_64);
- } else {
- gen_gvec_fn2i(s, is_q, rd, rd, imm, tcg_gen_gvec_ori, MO_64);
- }
- }
-}
-
/*
 * Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate)
 *
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
bool is_u = extract32(insn, 29, 1);
bool is_q = extract32(insn, 30, 1);

- /* data_proc_simd[] has sent immh == 0 to disas_simd_mod_imm. */
- assert(immh != 0);
+ if (immh == 0) {
+ unallocated_encoding(s);
+ return;
+ }

switch (opcode) {
case 0x08: /* SRI */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
static const AArch64DecodeTable data_proc_simd[] = {
/* pattern , mask , fn */
{ 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
- /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
- { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
{ 0x0f000400, 0x9f800400, disas_simd_shift_imm },
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
--
2.34.1

Now that all our targets have been converted to explicitly specify
their pattern for the default NaN value we can remove the remaining
fallback code in parts64_default_nan().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-55-peter.maydell@linaro.org
---
 fpu/softfloat-specialize.c.inc | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
uint64_t frac;
uint8_t dnan_pattern = status->default_nan_pattern;

- if (dnan_pattern == 0) {
- /*
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
- * do not have floating-point.
- */
- if (snan_bit_is_one(status)) {
- /* sign bit clear, set all frac bits other than msb */
- dnan_pattern = 0b00111111;
- } else {
- /* sign bit clear, set frac msb */
- dnan_pattern = 0b01000000;
- }
- }
assert(dnan_pattern != 0);

sign = dnan_pattern >> 7;
--
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Inline pickNaNMulAdd into its only caller. This makes
4
one assert redundant with the immediately preceding IF.
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20241203203949.483774-3-richard.henderson@linaro.org
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
[PMM: keep comment from old code in new location]
6
Message-id: 20240912024114.1097832-7-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
11
---
9
target/arm/tcg/a64.decode | 5 ++
12
fpu/softfloat-parts.c.inc | 41 +++++++++++++++++++++++++-
10
target/arm/tcg/translate-a64.c | 121 +++++++++++++--------------------
13
fpu/softfloat-specialize.c.inc | 54 ----------------------------------
11
2 files changed, 53 insertions(+), 73 deletions(-)
14
2 files changed, 40 insertions(+), 55 deletions(-)
12
15
13
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
14
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/tcg/a64.decode
18
--- a/fpu/softfloat-parts.c.inc
16
+++ b/target/arm/tcg/a64.decode
19
+++ b/fpu/softfloat-parts.c.inc
17
@@ -XXX,XX +XXX,XX @@ FMADD 0001 1111 .. 0 ..... 0 ..... ..... ..... @rrrr_hsd
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
18
FMSUB 0001 1111 .. 0 ..... 1 ..... ..... ..... @rrrr_hsd
21
}
19
FNMADD 0001 1111 .. 1 ..... 0 ..... ..... ..... @rrrr_hsd
22
20
FNMSUB 0001 1111 .. 1 ..... 1 ..... ..... ..... @rrrr_hsd
23
if (s->default_nan_mode) {
24
+ /*
25
+ * We guarantee not to require the target to tell us how to
26
+ * pick a NaN if we're always returning the default NaN.
27
+ * But if we're not in default-NaN mode then the target must
28
+ * specify.
29
+ */
30
which = 3;
31
+ } else if (infzero) {
32
+ /*
33
+ * Inf * 0 + NaN -- some implementations return the
34
+ * default NaN here, and some return the input NaN.
35
+ */
36
+ switch (s->float_infzeronan_rule) {
37
+ case float_infzeronan_dnan_never:
38
+ which = 2;
39
+ break;
40
+ case float_infzeronan_dnan_always:
41
+ which = 3;
42
+ break;
43
+ case float_infzeronan_dnan_if_qnan:
44
+ which = is_qnan(c->cls) ? 3 : 2;
45
+ break;
46
+ default:
47
+ g_assert_not_reached();
48
+ }
49
} else {
50
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
51
+ FloatClass cls[3] = { a->cls, b->cls, c->cls };
52
+ Float3NaNPropRule rule = s->float_3nan_prop_rule;
21
+
53
+
22
+# Advanced SIMD Extract
54
+ assert(rule != float_3nan_prop_none);
23
+
55
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
24
+EXT_d 0010 1110 00 0 rm:5 00 imm:3 0 rn:5 rd:5
56
+ /* We have at least one SNaN input and should prefer it */
25
+EXT_q 0110 1110 00 0 rm:5 0 imm:4 0 rn:5 rd:5
57
+ do {
26
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
58
+ which = rule & R_3NAN_1ST_MASK;
59
+ rule >>= R_3NAN_1ST_LENGTH;
60
+ } while (!is_snan(cls[which]));
61
+ } else {
62
+ do {
63
+ which = rule & R_3NAN_1ST_MASK;
64
+ rule >>= R_3NAN_1ST_LENGTH;
65
+ } while (!is_nan(cls[which]));
66
+ }
67
}
68
69
if (which == 3) {
70
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
27
index XXXXXXX..XXXXXXX 100644
71
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/tcg/translate-a64.c
72
--- a/fpu/softfloat-specialize.c.inc
29
+++ b/target/arm/tcg/translate-a64.c
73
+++ b/fpu/softfloat-specialize.c.inc
30
@@ -XXX,XX +XXX,XX @@ static bool trans_FCSEL(DisasContext *s, arg_FCSEL *a)
74
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
31
return true;
32
}
33
34
+/*
35
+ * Advanced SIMD Extract
36
+ */
37
+
38
+static bool trans_EXT_d(DisasContext *s, arg_EXT_d *a)
39
+{
40
+ if (fp_access_check(s)) {
41
+ TCGv_i64 lo = read_fp_dreg(s, a->rn);
42
+ if (a->imm != 0) {
43
+ TCGv_i64 hi = read_fp_dreg(s, a->rm);
44
+ tcg_gen_extract2_i64(lo, lo, hi, a->imm * 8);
45
+ }
46
+ write_fp_dreg(s, a->rd, lo);
47
+ }
48
+ return true;
49
+}
50
+
51
+static bool trans_EXT_q(DisasContext *s, arg_EXT_q *a)
52
+{
53
+ TCGv_i64 lo, hi;
54
+ int pos = (a->imm & 7) * 8;
55
+ int elt = a->imm >> 3;
56
+
57
+ if (!fp_access_check(s)) {
58
+ return true;
59
+ }
60
+
61
+ lo = tcg_temp_new_i64();
62
+ hi = tcg_temp_new_i64();
63
+
64
+ read_vec_element(s, lo, a->rn, elt, MO_64);
65
+ elt++;
66
+ read_vec_element(s, hi, elt & 2 ? a->rm : a->rn, elt & 1, MO_64);
67
+ elt++;
68
+
69
+ if (pos != 0) {
70
+ TCGv_i64 hh = tcg_temp_new_i64();
71
+ tcg_gen_extract2_i64(lo, lo, hi, pos);
72
+ read_vec_element(s, hh, a->rm, elt & 1, MO_64);
73
+ tcg_gen_extract2_i64(hi, hi, hh, pos);
74
+ }
75
+
76
+ write_vec_element(s, lo, a->rd, 0, MO_64);
77
+ write_vec_element(s, hi, a->rd, 1, MO_64);
78
+ clear_vec_high(s, true, a->rd);
79
+ return true;
80
+}
81
+
 /*
  * Floating-point data-processing (3 source)
  */
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
     }
 }
 
-/* EXT
- *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
- * +---+---+-------------+-----+---+------+---+------+---+------+------+
- * | 0 | Q | 1 0 1 1 1 0 | op2 | 0 |  Rm  | 0 | imm4 | 0 |  Rn  |  Rd  |
- * +---+---+-------------+-----+---+------+---+------+---+------+------+
- */
-static void disas_simd_ext(DisasContext *s, uint32_t insn)
-{
-    int is_q = extract32(insn, 30, 1);
-    int op2 = extract32(insn, 22, 2);
-    int imm4 = extract32(insn, 11, 4);
-    int rm = extract32(insn, 16, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-    int pos = imm4 << 3;
-    TCGv_i64 tcg_resl, tcg_resh;
-
-    if (op2 != 0 || (!is_q && extract32(imm4, 3, 1))) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    tcg_resh = tcg_temp_new_i64();
-    tcg_resl = tcg_temp_new_i64();
-
-    /* Vd gets bits starting at pos bits into Vm:Vn. This is
-     * either extracting 128 bits from a 128:128 concatenation, or
-     * extracting 64 bits from a 64:64 concatenation.
-     */
-    if (!is_q) {
-        read_vec_element(s, tcg_resl, rn, 0, MO_64);
-        if (pos != 0) {
-            read_vec_element(s, tcg_resh, rm, 0, MO_64);
-            tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
-        }
-    } else {
-        TCGv_i64 tcg_hh;
-        typedef struct {
-            int reg;
-            int elt;
-        } EltPosns;
-        EltPosns eltposns[] = { {rn, 0}, {rn, 1}, {rm, 0}, {rm, 1} };
-        EltPosns *elt = eltposns;
-
-        if (pos >= 64) {
-            elt++;
-            pos -= 64;
-        }
-
-        read_vec_element(s, tcg_resl, elt->reg, elt->elt, MO_64);
-        elt++;
-        read_vec_element(s, tcg_resh, elt->reg, elt->elt, MO_64);
-        elt++;
-        if (pos != 0) {
-            tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
-            tcg_hh = tcg_temp_new_i64();
-            read_vec_element(s, tcg_hh, elt->reg, elt->elt, MO_64);
-            tcg_gen_extract2_i64(tcg_resh, tcg_resh, tcg_hh, pos);
-        }
-    }
-
-    write_vec_element(s, tcg_resl, rd, 0, MO_64);
-    if (is_q) {
-        write_vec_element(s, tcg_resh, rd, 1, MO_64);
-    }
-    clear_vec_high(s, is_q, rd);
-}
-
 /* TBL/TBX
  *   31  30 29         24 23 22  21 20  16 15 14 13  12  11 10 9    5 4    0
  * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x0f000400, 0x9f800400, disas_simd_shift_imm },
     { 0x0e000000, 0xbf208c00, disas_simd_tb },
     { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
-    { 0x2e000000, 0xbf208400, disas_simd_ext },
     { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
--
2.34.1

     }
 }
 
-/*----------------------------------------------------------------------------
-| Select which NaN to propagate for a three-input operation.
-| For the moment we assume that no CPU needs the 'larger significand'
-| information.
-| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
-*----------------------------------------------------------------------------*/
-static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
-                         bool infzero, bool have_snan, float_status *status)
-{
-    FloatClass cls[3] = { a_cls, b_cls, c_cls };
-    Float3NaNPropRule rule = status->float_3nan_prop_rule;
-    int which;
-
-    /*
-     * We guarantee not to require the target to tell us how to
-     * pick a NaN if we're always returning the default NaN.
-     * But if we're not in default-NaN mode then the target must
-     * specify.
-     */
-    assert(!status->default_nan_mode);
-
-    if (infzero) {
-        /*
-         * Inf * 0 + NaN -- some implementations return the default NaN here,
-         * and some return the input NaN.
-         */
-        switch (status->float_infzeronan_rule) {
-        case float_infzeronan_dnan_never:
-            return 2;
-        case float_infzeronan_dnan_always:
-            return 3;
-        case float_infzeronan_dnan_if_qnan:
-            return is_qnan(c_cls) ? 3 : 2;
-        default:
-            g_assert_not_reached();
-        }
-    }
-
-    assert(rule != float_3nan_prop_none);
-    if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
-        /* We have at least one SNaN input and should prefer it */
-        do {
-            which = rule & R_3NAN_1ST_MASK;
-            rule >>= R_3NAN_1ST_LENGTH;
-        } while (!is_snan(cls[which]));
-    } else {
-        do {
-            which = rule & R_3NAN_1ST_MASK;
-            rule >>= R_3NAN_1ST_LENGTH;
-        } while (!is_nan(cls[which]));
-    }
-    return which;
-}
-
 /*----------------------------------------------------------------------------
 | Returns 1 if the double-precision floating-point value `a' is a quiet
 | NaN; otherwise returns 0.
--
2.34.1
diff view generated by jsdifflib
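[Editor's aside, not part of the patch: the EXT behaviour the decodetree conversion above implements can be checked byte-wise on the host. This is a sketch for readers; the function name is made up and this is not QEMU code.]

```c
#include <stdint.h>
#include <string.h>

/* Byte-level model of AArch64 EXT (128-bit form): the result is bytes
 * imm..imm+15 of the 32-byte little-endian concatenation Vm:Vn, i.e.
 * "extracting 128 bits from a 128:128 concatenation" as the removed
 * decoder comment puts it.  Vn supplies the low 16 bytes. */
static void ext_q_bytes(uint8_t rd[16], const uint8_t rn[16],
                        const uint8_t rm[16], unsigned imm)
{
    uint8_t cat[32];

    memcpy(cat, rn, 16);        /* low 16 bytes come from Vn */
    memcpy(cat + 16, rm, 16);   /* high 16 bytes come from Vm */
    memcpy(rd, cat + (imm & 15), 16);
}
```

With imm == 0 the result is simply Vn, which is why the generated code above can skip the extract2 step when pos is zero.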
From: Richard Henderson <richard.henderson@linaro.org>

Remove "3" as a special case for which and simply
branch to return the desired value.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241203203949.483774-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat-parts.c.inc | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

While these functions really do return a 32-bit value,
widening the return type means that we need do less
marshalling between TCG types.

Remove NeonGenNarrowEnvFn typedef; add NeonGenOne64OpEnvFn.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240912024114.1097832-27-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h             | 22 ++++++------
 target/arm/tcg/translate.h      |  2 +-
 target/arm/tcg/neon_helper.c    | 43 ++++++++++++++---------
 target/arm/tcg/translate-a64.c  | 60 ++++++++++++++++++---------------
 target/arm/tcg/translate-neon.c | 44 ++++++++++++------------
 5 files changed, 93 insertions(+), 78 deletions(-)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(neon_qrdmulh_s32, i32, env, i32, i32)
 DEF_HELPER_4(neon_qrdmlah_s32, i32, env, s32, s32, s32)
 DEF_HELPER_4(neon_qrdmlsh_s32, i32, env, s32, s32, s32)
 
-DEF_HELPER_1(neon_narrow_u8, i32, i64)
-DEF_HELPER_1(neon_narrow_u16, i32, i64)
-DEF_HELPER_2(neon_unarrow_sat8, i32, env, i64)
-DEF_HELPER_2(neon_narrow_sat_u8, i32, env, i64)
-DEF_HELPER_2(neon_narrow_sat_s8, i32, env, i64)
-DEF_HELPER_2(neon_unarrow_sat16, i32, env, i64)
-DEF_HELPER_2(neon_narrow_sat_u16, i32, env, i64)
-DEF_HELPER_2(neon_narrow_sat_s16, i32, env, i64)
-DEF_HELPER_2(neon_unarrow_sat32, i32, env, i64)
-DEF_HELPER_2(neon_narrow_sat_u32, i32, env, i64)
-DEF_HELPER_2(neon_narrow_sat_s32, i32, env, i64)
+DEF_HELPER_1(neon_narrow_u8, i64, i64)
+DEF_HELPER_1(neon_narrow_u16, i64, i64)
+DEF_HELPER_2(neon_unarrow_sat8, i64, env, i64)
+DEF_HELPER_2(neon_narrow_sat_u8, i64, env, i64)
+DEF_HELPER_2(neon_narrow_sat_s8, i64, env, i64)
+DEF_HELPER_2(neon_unarrow_sat16, i64, env, i64)
+DEF_HELPER_2(neon_narrow_sat_u16, i64, env, i64)
+DEF_HELPER_2(neon_narrow_sat_s16, i64, env, i64)
+DEF_HELPER_2(neon_unarrow_sat32, i64, env, i64)
+DEF_HELPER_2(neon_narrow_sat_u32, i64, env, i64)
+DEF_HELPER_2(neon_narrow_sat_s32, i64, env, i64)
 DEF_HELPER_1(neon_narrow_high_u8, i32, i64)
 DEF_HELPER_1(neon_narrow_high_u16, i32, i64)
 DEF_HELPER_1(neon_narrow_round_high_u8, i32, i64)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ typedef void NeonGenThreeOpEnvFn(TCGv_i32, TCGv_env, TCGv_i32,
 typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
 typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
 typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
-typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
 typedef void NeonGenOneSingleOpFn(TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoSingleOpFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoDoubleOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
+typedef void NeonGenOne64OpEnvFn(TCGv_i64, TCGv_env, TCGv_i64);
 typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/neon_helper.c
+++ b/target/arm/tcg/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_VOP_ENV(qrdmulh_s32, neon_s32, 1)
 #undef NEON_FN
 #undef NEON_QDMULH32
 
-uint32_t HELPER(neon_narrow_u8)(uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_u8)(uint64_t x)
 {
     return (x & 0xffu) | ((x >> 8) & 0xff00u) | ((x >> 16) & 0xff0000u)
            | ((x >> 24) & 0xff000000u);
 }
 
-uint32_t HELPER(neon_narrow_u16)(uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_u16)(uint64_t x)
 {
     return (x & 0xffffu) | ((x >> 16) & 0xffff0000u);
 }
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_round_high_u16)(uint64_t x)
     return ((x >> 16) & 0xffff) | ((x >> 32) & 0xffff0000);
 }
 
-uint32_t HELPER(neon_unarrow_sat8)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_unarrow_sat8)(CPUARMState *env, uint64_t x)
 {
     uint16_t s;
     uint8_t d;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_unarrow_sat8)(CPUARMState *env, uint64_t x)
     return res;
 }
 
-uint32_t HELPER(neon_narrow_sat_u8)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_sat_u8)(CPUARMState *env, uint64_t x)
 {
     uint16_t s;
     uint8_t d;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_u8)(CPUARMState *env, uint64_t x)
     return res;
 }
 
-uint32_t HELPER(neon_narrow_sat_s8)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_sat_s8)(CPUARMState *env, uint64_t x)
 {
     int16_t s;
     uint8_t d;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_s8)(CPUARMState *env, uint64_t x)
     return res;
 }
 
-uint32_t HELPER(neon_unarrow_sat16)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_unarrow_sat16)(CPUARMState *env, uint64_t x)
 {
     uint32_t high;
     uint32_t low;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_unarrow_sat16)(CPUARMState *env, uint64_t x)
         high = 0xffff;
         SET_QC();
     }
-    return low | (high << 16);
+    return deposit32(low, 16, 16, high);
 }
 
-uint32_t HELPER(neon_narrow_sat_u16)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_sat_u16)(CPUARMState *env, uint64_t x)
 {
     uint32_t high;
     uint32_t low;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_u16)(CPUARMState *env, uint64_t x)
         high = 0xffff;
         SET_QC();
     }
-    return low | (high << 16);
+    return deposit32(low, 16, 16, high);
 }
 
-uint32_t HELPER(neon_narrow_sat_s16)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_sat_s16)(CPUARMState *env, uint64_t x)
 {
     int32_t low;
     int32_t high;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_s16)(CPUARMState *env, uint64_t x)
         high = (high >> 31) ^ 0x7fff;
         SET_QC();
     }
-    return (uint16_t)low | (high << 16);
+    return deposit32(low, 16, 16, high);
 }
 
-uint32_t HELPER(neon_unarrow_sat32)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_unarrow_sat32)(CPUARMState *env, uint64_t x)
 {
     if (x & 0x8000000000000000ull) {
         SET_QC();
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_unarrow_sat32)(CPUARMState *env, uint64_t x)
     return x;
 }
 
-uint32_t HELPER(neon_narrow_sat_u32)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_sat_u32)(CPUARMState *env, uint64_t x)
 {
     if (x > 0xffffffffu) {
         SET_QC();
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_narrow_sat_u32)(CPUARMState *env, uint64_t x)
     return x;
 }
 
-uint32_t HELPER(neon_narrow_sat_s32)(CPUARMState *env, uint64_t x)
+/* Only the low 32-bits of output are significant. */
+uint64_t HELPER(neon_narrow_sat_s32)(CPUARMState *env, uint64_t x)
 {
     if ((int64_t)x != (int32_t)x) {
         SET_QC();
-        return ((int64_t)x >> 63) ^ 0x7fffffff;
+        return (uint32_t)((int64_t)x >> 63) ^ 0x7fffffff;
     }
-    return x;
+    return (uint32_t)x;
 }
 
 uint64_t HELPER(neon_widen_u8)(uint32_t x)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
     int elements = is_scalar ? 1 : (64 / esize);
     bool round = extract32(opcode, 0, 1);
     MemOp ldop = (size + 1) | (is_u_shift ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn, tcg_rd;
-    TCGv_i32 tcg_rd_narrowed;
-    TCGv_i64 tcg_final;
+    TCGv_i64 tcg_rn, tcg_rd, tcg_final;
 
-    static NeonGenNarrowEnvFn * const signed_narrow_fns[4][2] = {
+    static NeonGenOne64OpEnvFn * const signed_narrow_fns[4][2] = {
         { gen_helper_neon_narrow_sat_s8,
           gen_helper_neon_unarrow_sat8 },
         { gen_helper_neon_narrow_sat_s16,
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
           gen_helper_neon_unarrow_sat32 },
         { NULL, NULL },
     };
-    static NeonGenNarrowEnvFn * const unsigned_narrow_fns[4] = {
+    static NeonGenOne64OpEnvFn * const unsigned_narrow_fns[4] = {
         gen_helper_neon_narrow_sat_u8,
         gen_helper_neon_narrow_sat_u16,
         gen_helper_neon_narrow_sat_u32,
         NULL
     };
-    NeonGenNarrowEnvFn *narrowfn;
+    NeonGenOne64OpEnvFn *narrowfn;
 
     int i;
 
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
 
     tcg_rn = tcg_temp_new_i64();
     tcg_rd = tcg_temp_new_i64();
-    tcg_rd_narrowed = tcg_temp_new_i32();
     tcg_final = tcg_temp_new_i64();
 
     for (i = 0; i < elements; i++) {
         read_vec_element(s, tcg_rn, rn, i, ldop);
         handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
                                 false, is_u_shift, size+1, shift);
-        narrowfn(tcg_rd_narrowed, tcg_env, tcg_rd);
-        tcg_gen_extu_i32_i64(tcg_rd, tcg_rd_narrowed);
+        narrowfn(tcg_rd, tcg_env, tcg_rd);
         if (i == 0) {
             tcg_gen_extract_i64(tcg_final, tcg_rd, 0, esize);
         } else {
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
      * in the source becomes a size element in the destination).
      */
     int pass;
-    TCGv_i32 tcg_res[2];
+    TCGv_i64 tcg_res[2];
     int destelt = is_q ? 2 : 0;
     int passes = scalar ? 1 : 2;
 
     if (scalar) {
-        tcg_res[1] = tcg_constant_i32(0);
+        tcg_res[1] = tcg_constant_i64(0);
     }
 
     for (pass = 0; pass < passes; pass++) {
         TCGv_i64 tcg_op = tcg_temp_new_i64();
-        NeonGenNarrowFn *genfn = NULL;
-        NeonGenNarrowEnvFn *genenvfn = NULL;
+        NeonGenOne64OpFn *genfn = NULL;
+        NeonGenOne64OpEnvFn *genenvfn = NULL;
 
         if (scalar) {
             read_vec_element(s, tcg_op, rn, pass, size + 1);
         } else {
             read_vec_element(s, tcg_op, rn, pass, MO_64);
         }
-        tcg_res[pass] = tcg_temp_new_i32();
+        tcg_res[pass] = tcg_temp_new_i64();
 
         switch (opcode) {
         case 0x12: /* XTN, SQXTUN */
         {
-            static NeonGenNarrowFn * const xtnfns[3] = {
+            static NeonGenOne64OpFn * const xtnfns[3] = {
                 gen_helper_neon_narrow_u8,
                 gen_helper_neon_narrow_u16,
-                tcg_gen_extrl_i64_i32,
+                tcg_gen_ext32u_i64,
             };
-            static NeonGenNarrowEnvFn * const sqxtunfns[3] = {
+            static NeonGenOne64OpEnvFn * const sqxtunfns[3] = {
                 gen_helper_neon_unarrow_sat8,
                 gen_helper_neon_unarrow_sat16,
                 gen_helper_neon_unarrow_sat32,
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
         }
         case 0x14: /* SQXTN, UQXTN */
         {
-            static NeonGenNarrowEnvFn * const fns[3][2] = {
+            static NeonGenOne64OpEnvFn * const fns[3][2] = {
                 { gen_helper_neon_narrow_sat_s8,
                   gen_helper_neon_narrow_sat_u8 },
                 { gen_helper_neon_narrow_sat_s16,
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
         case 0x16: /* FCVTN, FCVTN2 */
             /* 32 bit to 16 bit or 64 bit to 32 bit float conversion */
             if (size == 2) {
-                gen_helper_vfp_fcvtsd(tcg_res[pass], tcg_op, tcg_env);
+                TCGv_i32 tmp = tcg_temp_new_i32();
+                gen_helper_vfp_fcvtsd(tmp, tcg_op, tcg_env);
+                tcg_gen_extu_i32_i64(tcg_res[pass], tmp);
             } else {
                 TCGv_i32 tcg_lo = tcg_temp_new_i32();
                 TCGv_i32 tcg_hi = tcg_temp_new_i32();
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
                 tcg_gen_extr_i64_i32(tcg_lo, tcg_hi, tcg_op);
                 gen_helper_vfp_fcvt_f32_to_f16(tcg_lo, tcg_lo, fpst, ahp);
                 gen_helper_vfp_fcvt_f32_to_f16(tcg_hi, tcg_hi, fpst, ahp);
-                tcg_gen_deposit_i32(tcg_res[pass], tcg_lo, tcg_hi, 16, 16);
+                tcg_gen_deposit_i32(tcg_lo, tcg_lo, tcg_hi, 16, 16);
+                tcg_gen_extu_i32_i64(tcg_res[pass], tcg_lo);
             }
             break;
         case 0x36: /* BFCVTN, BFCVTN2 */
         {
             TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
-            gen_helper_bfcvt_pair(tcg_res[pass], tcg_op, fpst);
+            TCGv_i32 tmp = tcg_temp_new_i32();
+            gen_helper_bfcvt_pair(tmp, tcg_op, fpst);
+            tcg_gen_extu_i32_i64(tcg_res[pass], tmp);
         }
             break;
         case 0x56: /* FCVTXN, FCVTXN2 */
-            /* 64 bit to 32 bit float conversion
-             * with von Neumann rounding (round to odd)
-             */
-            assert(size == 2);
-            gen_helper_fcvtx_f64_to_f32(tcg_res[pass], tcg_op, tcg_env);
+            {
+                /*
+                 * 64 bit to 32 bit float conversion
+                 * with von Neumann rounding (round to odd)
+                 */
+                TCGv_i32 tmp = tcg_temp_new_i32();
+                assert(size == 2);
+                gen_helper_fcvtx_f64_to_f32(tmp, tcg_op, tcg_env);
+                tcg_gen_extu_i32_i64(tcg_res[pass], tmp);
+            }
             break;
         default:
             g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
     }
 
     for (pass = 0; pass < 2; pass++) {
-        write_vec_element_i32(s, tcg_res[pass], rd, destelt + pass, MO_32);
+        write_vec_element(s, tcg_res[pass], rd, destelt + pass, MO_32);
     }
     clear_vec_high(s, is_q, rd);
 }
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -XXX,XX +XXX,XX @@ DO_2SH(VQSHL_S, gen_neon_sqshli)
 
 static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
                                 NeonGenTwo64OpFn *shiftfn,
-                                NeonGenNarrowEnvFn *narrowfn)
+                                NeonGenOne64OpEnvFn *narrowfn)
 {
     /* 2-reg-and-shift narrowing-shift operations, size == 3 case */
-    TCGv_i64 constimm, rm1, rm2;
-    TCGv_i32 rd;
+    TCGv_i64 constimm, rm1, rm2, rd;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
     constimm = tcg_constant_i64(-a->shift);
     rm1 = tcg_temp_new_i64();
     rm2 = tcg_temp_new_i64();
-    rd = tcg_temp_new_i32();
+    rd = tcg_temp_new_i64();
 
     /* Load both inputs first to avoid potential overwrite if rm == rd */
     read_neon_element64(rm1, a->vm, 0, MO_64);
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
 
     shiftfn(rm1, rm1, constimm);
     narrowfn(rd, tcg_env, rm1);
-    write_neon_element32(rd, a->vd, 0, MO_32);
+    write_neon_element64(rd, a->vd, 0, MO_32);
 
     shiftfn(rm2, rm2, constimm);
     narrowfn(rd, tcg_env, rm2);
-    write_neon_element32(rd, a->vd, 1, MO_32);
+    write_neon_element64(rd, a->vd, 1, MO_32);
 
     return true;
 }
 
 static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
                                 NeonGenTwoOpFn *shiftfn,
-                                NeonGenNarrowEnvFn *narrowfn)
+                                NeonGenOne64OpEnvFn *narrowfn)
 {
     /* 2-reg-and-shift narrowing-shift operations, size < 3 case */
     TCGv_i32 constimm, rm1, rm2, rm3, rm4;
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
 
     tcg_gen_concat_i32_i64(rtmp, rm1, rm2);
 
-    narrowfn(rm1, tcg_env, rtmp);
-    write_neon_element32(rm1, a->vd, 0, MO_32);
+    narrowfn(rtmp, tcg_env, rtmp);
+    write_neon_element64(rtmp, a->vd, 0, MO_32);
 
     shiftfn(rm3, rm3, constimm);
     shiftfn(rm4, rm4, constimm);
 
     tcg_gen_concat_i32_i64(rtmp, rm3, rm4);
 
-    narrowfn(rm3, tcg_env, rtmp);
-    write_neon_element32(rm3, a->vd, 1, MO_32);
+    narrowfn(rtmp, tcg_env, rtmp);
+    write_neon_element64(rtmp, a->vd, 1, MO_32);
     return true;
 }
 
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
     return do_2shift_narrow_32(s, a, FUNC, NARROWFUNC); \
 }
 
-static void gen_neon_narrow_u32(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
+static void gen_neon_narrow_u32(TCGv_i64 dest, TCGv_ptr env, TCGv_i64 src)
 {
-    tcg_gen_extrl_i64_i32(dest, src);
+    tcg_gen_ext32u_i64(dest, src);
 }
 
-static void gen_neon_narrow_u16(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
+static void gen_neon_narrow_u16(TCGv_i64 dest, TCGv_ptr env, TCGv_i64 src)
 {
     gen_helper_neon_narrow_u16(dest, src);
 }
 
-static void gen_neon_narrow_u8(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
+static void gen_neon_narrow_u8(TCGv_i64 dest, TCGv_ptr env, TCGv_i64 src)
 {
     gen_helper_neon_narrow_u8(dest, src);
 }
@@ -XXX,XX +XXX,XX @@ static bool trans_VZIP(DisasContext *s, arg_2misc *a)
 }
 
 static bool do_vmovn(DisasContext *s, arg_2misc *a,
-                     NeonGenNarrowEnvFn *narrowfn)
+                     NeonGenOne64OpEnvFn *narrowfn)
 {
-    TCGv_i64 rm;
-    TCGv_i32 rd0, rd1;
+    TCGv_i64 rm, rd0, rd1;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
@@ -XXX,XX +XXX,XX @@ static bool do_vmovn(DisasContext *s, arg_2misc *a,
     }
 
     rm = tcg_temp_new_i64();
-    rd0 = tcg_temp_new_i32();
-    rd1 = tcg_temp_new_i32();
+    rd0 = tcg_temp_new_i64();
+    rd1 = tcg_temp_new_i64();
 
     read_neon_element64(rm, a->vm, 0, MO_64);
     narrowfn(rd0, tcg_env, rm);
     read_neon_element64(rm, a->vm, 1, MO_64);
     narrowfn(rd1, tcg_env, rm);
-    write_neon_element32(rd0, a->vd, 0, MO_32);
-    write_neon_element32(rd1, a->vd, 1, MO_32);
+    write_neon_element64(rd0, a->vd, 0, MO_32);
+    write_neon_element64(rd1, a->vd, 1, MO_32);
     return true;
 }
 
 #define DO_VMOVN(INSN, FUNC) \
 static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
 { \
-    static NeonGenNarrowEnvFn * const narrowfn[] = { \
+    static NeonGenOne64OpEnvFn * const narrowfn[] = { \
         FUNC##8, \
         FUNC##16, \
         FUNC##32, \
--
2.34.1

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
          * But if we're not in default-NaN mode then the target must
          * specify.
          */
-        which = 3;
+        goto default_nan;
     } else if (infzero) {
         /*
          * Inf * 0 + NaN -- some implementations return the
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
          */
         switch (s->float_infzeronan_rule) {
         case float_infzeronan_dnan_never:
-            which = 2;
             break;
         case float_infzeronan_dnan_always:
-            which = 3;
-            break;
+            goto default_nan;
         case float_infzeronan_dnan_if_qnan:
-            which = is_qnan(c->cls) ? 3 : 2;
+            if (is_qnan(c->cls)) {
+                goto default_nan;
+            }
             break;
         default:
             g_assert_not_reached();
         }
+        which = 2;
     } else {
         FloatClass cls[3] = { a->cls, b->cls, c->cls };
         Float3NaNPropRule rule = s->float_3nan_prop_rule;
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
         }
     }
 
-    if (which == 3) {
-        parts_default_nan(a, s);
-        return a;
-    }
-
     switch (which) {
     case 0:
         break;
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
         parts_silence_nan(a, s);
     }
     return a;
+
+ default_nan:
+    parts_default_nan(a, s);
+    return a;
 }
 
 /*
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Assign the pointer return value to 'a' directly,
rather than going through an intermediary index.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241203203949.483774-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat-parts.c.inc | 32 ++++++++++----------------------
 1 file changed, 10 insertions(+), 22 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

The extract2 tcg op performs the same operation
as the do_ext64 function.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c | 23 +++--------------------
 1 file changed, 3 insertions(+), 20 deletions(-)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
     }
 }
 
-static void do_ext64(DisasContext *s, TCGv_i64 tcg_left, TCGv_i64 tcg_right,
-                     int pos)
-{
-    /* Extract 64 bits from the middle of two concatenated 64 bit
-     * vector register slices left:right. The extracted bits start
-     * at 'pos' bits into the right (least significant) side.
-     * We return the result in tcg_right, and guarantee not to
-     * trash tcg_left.
-     */
-    TCGv_i64 tcg_tmp = tcg_temp_new_i64();
-    assert(pos > 0 && pos < 64);
-
-    tcg_gen_shri_i64(tcg_right, tcg_right, pos);
-    tcg_gen_shli_i64(tcg_tmp, tcg_left, 64 - pos);
-    tcg_gen_or_i64(tcg_right, tcg_right, tcg_tmp);
-}
-
 /* EXT
  *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
  * +---+---+-------------+-----+---+------+---+------+---+------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
         read_vec_element(s, tcg_resl, rn, 0, MO_64);
         if (pos != 0) {
             read_vec_element(s, tcg_resh, rm, 0, MO_64);
-            do_ext64(s, tcg_resh, tcg_resl, pos);
+            tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
         }
     } else {
         TCGv_i64 tcg_hh;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
         read_vec_element(s, tcg_resh, elt->reg, elt->elt, MO_64);
         elt++;
         if (pos != 0) {
-            do_ext64(s, tcg_resh, tcg_resl, pos);
+            tcg_gen_extract2_i64(tcg_resl, tcg_resl, tcg_resh, pos);
             tcg_hh = tcg_temp_new_i64();
             read_vec_element(s, tcg_hh, elt->reg, elt->elt, MO_64);
-            do_ext64(s, tcg_hh, tcg_resh, pos);
+            tcg_gen_extract2_i64(tcg_resh, tcg_resh, tcg_hh, pos);
         }
     }
 
--
2.34.1

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
                                             FloatPartsN *c, float_status *s,
                                             int ab_mask, int abc_mask)
 {
-    int which;
     bool infzero = (ab_mask == float_cmask_infzero);
     bool have_snan = (abc_mask & float_cmask_snan);
+    FloatPartsN *ret;
 
     if (unlikely(have_snan)) {
         float_raise(float_flag_invalid | float_flag_invalid_snan, s);
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
         default:
             g_assert_not_reached();
         }
-        which = 2;
+        ret = c;
     } else {
-        FloatClass cls[3] = { a->cls, b->cls, c->cls };
+        FloatPartsN *val[3] = { a, b, c };
         Float3NaNPropRule rule = s->float_3nan_prop_rule;
 
         assert(rule != float_3nan_prop_none);
         if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
             /* We have at least one SNaN input and should prefer it */
             do {
-                which = rule & R_3NAN_1ST_MASK;
+                ret = val[rule & R_3NAN_1ST_MASK];
                 rule >>= R_3NAN_1ST_LENGTH;
-            } while (!is_snan(cls[which]));
+            } while (!is_snan(ret->cls));
         } else {
             do {
-                which = rule & R_3NAN_1ST_MASK;
+                ret = val[rule & R_3NAN_1ST_MASK];
                 rule >>= R_3NAN_1ST_LENGTH;
-            } while (!is_nan(cls[which]));
+            } while (!is_nan(ret->cls));
         }
     }
 
-    switch (which) {
-    case 0:
-        break;
-    case 1:
-        a = b;
-        break;
-    case 2:
-        a = c;
-        break;
-    default:
-        g_assert_not_reached();
+    if (is_snan(ret->cls)) {
+        parts_silence_nan(ret, s);
     }
-    if (is_snan(a->cls)) {
-        parts_silence_nan(a, s);
-    }
-    return a;
+    return ret;
 
 default_nan:
     parts_default_nan(a, s);
--
2.34.1
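[Editor's aside, not part of the patch: the equivalence this conversion relies on is easy to check on the host. Both the removed do_ext64 shift/shift/or sequence and the extract2 op take the 64 bits starting at bit pos of a 128-bit concatenation, for 0 < pos < 64. Function names below are made up for illustration.]

```c
#include <stdint.h>

/* What the removed do_ext64() emitted: shift 'right' down by pos and
 * OR in the low bits of 'left' shifted up into the vacated space. */
static uint64_t do_ext64_model(uint64_t left, uint64_t right, unsigned pos)
{
    return (right >> pos) | (left << (64 - pos));
}

/* Semantics of tcg_gen_extract2_i64(rd, al, ah, pos): the 64 bits
 * starting at bit 'pos' of the 128-bit value ah:al. */
static uint64_t extract2_model(uint64_t al, uint64_t ah, unsigned pos)
{
    return (al >> pos) | (ah << (64 - pos));
}
```

Note the operand order swap: do_ext64(left, right, pos) corresponds to extract2(right, left, pos), which is why the call sites in the patch pass tcg_resl first.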
From: Richard Henderson <richard.henderson@linaro.org>

While all indices into val[] should be in [0-2], the mask
applied is two bits. To help static analysis see there is
no possibility of read beyond the end of the array, pad the
array to 4 entries, with the final being (implicitly) NULL.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241203203949.483774-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat-parts.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
         }
         ret = c;
     } else {
-        FloatPartsN *val[3] = { a, b, c };
+        FloatPartsN *val[R_3NAN_1ST_MASK + 1] = { a, b, c };
         Float3NaNPropRule rule = s->float_3nan_prop_rule;
 
         assert(rule != float_3nan_prop_none);
--
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240912024114.1097832-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/gengvec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
     tcg_gen_add_i32(d, d, t);
 }
 
- void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 {
     TCGv_i64 t = tcg_temp_new_i64();
 
--
2.34.1
28
32
diff view generated by jsdifflib
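The padding idea in this patch can be shown in isolation. In the sketch below, `FIELD_MASK` and `pick_val` are hypothetical stand-ins (not QEMU APIs): because the index is always masked to two bits, sizing the array as `FIELD_MASK + 1` makes every possible masked index provably in bounds, with the unused slot zero-initialized.

```c
#include <assert.h>

/* Hypothetical 2-bit field mask, standing in for R_3NAN_1ST_MASK. */
#define FIELD_MASK 3

/*
 * Only values 0..2 are produced in practice, but a static analyser
 * reasoning from "field & FIELD_MASK" must assume 0..3.  Sizing the
 * array as FIELD_MASK + 1 covers the whole masked range; the fourth
 * entry is implicitly zero.
 */
static int pick_val(int a, int b, int c, unsigned field)
{
    int val[FIELD_MASK + 1] = { a, b, c };
    return val[field & FIELD_MASK];
}
```

The QEMU patch applies the same trick to an array of `FloatPartsN *`, where the padded entry is an implicit NULL.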
From: Richard Henderson <richard.henderson@linaro.org>

This function is part of the public interface and
is not "specialized" to any target in any way.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241203203949.483774-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat.c                | 52 ++++++++++++++++++++++++++++++++++
 fpu/softfloat-specialize.c.inc | 52 ----------------------------------
 2 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
     *zExpPtr = 1 - shiftCount;
 }
 
+/*----------------------------------------------------------------------------
+| Takes two extended double-precision floating-point values `a' and `b', one
+| of which is a NaN, and returns the appropriate NaN result.  If either `a' or
+| `b' is a signaling NaN, the invalid exception is raised.
+*----------------------------------------------------------------------------*/
+
+floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
+{
+    bool aIsLargerSignificand;
+    FloatClass a_cls, b_cls;
+
+    /* This is not complete, but is good enough for pickNaN. */
+    a_cls = (!floatx80_is_any_nan(a)
+             ? float_class_normal
+             : floatx80_is_signaling_nan(a, status)
+             ? float_class_snan
+             : float_class_qnan);
+    b_cls = (!floatx80_is_any_nan(b)
+             ? float_class_normal
+             : floatx80_is_signaling_nan(b, status)
+             ? float_class_snan
+             : float_class_qnan);
+
+    if (is_snan(a_cls) || is_snan(b_cls)) {
+        float_raise(float_flag_invalid, status);
+    }
+
+    if (status->default_nan_mode) {
+        return floatx80_default_nan(status);
+    }
+
+    if (a.low < b.low) {
+        aIsLargerSignificand = 0;
+    } else if (b.low < a.low) {
+        aIsLargerSignificand = 1;
+    } else {
+        aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
+    }
+
+    if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
+        if (is_snan(b_cls)) {
+            return floatx80_silence_nan(b, status);
+        }
+        return b;
+    } else {
+        if (is_snan(a_cls)) {
+            return floatx80_silence_nan(a, status);
+        }
+        return a;
+    }
+}
+
 /*----------------------------------------------------------------------------
 | Takes an abstract floating-point value having sign `zSign', exponent `zExp',
 | and extended significand formed by the concatenation of `zSig0' and `zSig1',
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ floatx80 floatx80_silence_nan(floatx80 a, float_status *status)
     return a;
 }
 
-/*----------------------------------------------------------------------------
-| Takes two extended double-precision floating-point values `a' and `b', one
-| of which is a NaN, and returns the appropriate NaN result.  If either `a' or
-| `b' is a signaling NaN, the invalid exception is raised.
-*----------------------------------------------------------------------------*/
-
-floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
-{
-    bool aIsLargerSignificand;
-    FloatClass a_cls, b_cls;
-
-    /* This is not complete, but is good enough for pickNaN. */
-    a_cls = (!floatx80_is_any_nan(a)
-             ? float_class_normal
-             : floatx80_is_signaling_nan(a, status)
-             ? float_class_snan
-             : float_class_qnan);
-    b_cls = (!floatx80_is_any_nan(b)
-             ? float_class_normal
-             : floatx80_is_signaling_nan(b, status)
-             ? float_class_snan
-             : float_class_qnan);
-
-    if (is_snan(a_cls) || is_snan(b_cls)) {
-        float_raise(float_flag_invalid, status);
-    }
-
-    if (status->default_nan_mode) {
-        return floatx80_default_nan(status);
-    }
-
-    if (a.low < b.low) {
-        aIsLargerSignificand = 0;
-    } else if (b.low < a.low) {
-        aIsLargerSignificand = 1;
-    } else {
-        aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
-    }
-
-    if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
-        if (is_snan(b_cls)) {
-            return floatx80_silence_nan(b, status);
-        }
-        return b;
-    } else {
-        if (is_snan(a_cls)) {
-            return floatx80_silence_nan(a, status);
-        }
-        return a;
-    }
-}
-
 /*----------------------------------------------------------------------------
 | Returns 1 if the quadruple-precision floating-point value `a' is a quiet
 | NaN; otherwise returns 0.
-- 
2.34.1
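For readers following the moved function, its significand tie-break can be sketched on its own. `struct fx80` below is a simplified stand-in for `floatx80` (64-bit significand in `low`, sign and exponent in `high`), not the real QEMU type; the comparison mirrors the `aIsLargerSignificand` computation in the function body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for floatx80: significand in 'low',
 * sign/exponent bits in 'high'. */
struct fx80 {
    uint64_t low;
    uint16_t high;
};

/*
 * Mirror of the tie-break in propagateFloatx80NaN: prefer the operand
 * with the larger significand; on equal significands prefer the one
 * with the smaller sign/exponent field (i.e. the positive NaN).
 */
static bool a_is_larger(struct fx80 a, struct fx80 b)
{
    if (a.low < b.low) {
        return false;
    } else if (b.low < a.low) {
        return true;
    }
    return a.high < b.high;
}
```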
From: Richard Henderson <richard.henderson@linaro.org>

Unpacking and repacking the parts may be slightly more work
than we did before, but we get to reuse more code.  For a
code path handling exceptional values, this is an improvement.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241203203949.483774-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat.c | 43 +++++--------------------------------------
 1 file changed, 5 insertions(+), 38 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
 
 floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
 {
-    bool aIsLargerSignificand;
-    FloatClass a_cls, b_cls;
+    FloatParts128 pa, pb, *pr;
 
-    /* This is not complete, but is good enough for pickNaN. */
-    a_cls = (!floatx80_is_any_nan(a)
-             ? float_class_normal
-             : floatx80_is_signaling_nan(a, status)
-             ? float_class_snan
-             : float_class_qnan);
-    b_cls = (!floatx80_is_any_nan(b)
-             ? float_class_normal
-             : floatx80_is_signaling_nan(b, status)
-             ? float_class_snan
-             : float_class_qnan);
-
-    if (is_snan(a_cls) || is_snan(b_cls)) {
-        float_raise(float_flag_invalid, status);
-    }
-
-    if (status->default_nan_mode) {
+    if (!floatx80_unpack_canonical(&pa, a, status) ||
+        !floatx80_unpack_canonical(&pb, b, status)) {
         return floatx80_default_nan(status);
     }
 
-    if (a.low < b.low) {
-        aIsLargerSignificand = 0;
-    } else if (b.low < a.low) {
-        aIsLargerSignificand = 1;
-    } else {
-        aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
-    }
-
-    if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
-        if (is_snan(b_cls)) {
-            return floatx80_silence_nan(b, status);
-        }
-        return b;
-    } else {
-        if (is_snan(a_cls)) {
-            return floatx80_silence_nan(a, status);
-        }
-        return a;
-    }
+    pr = parts_pick_nan(&pa, &pb, status);
+    return floatx80_round_pack_canonical(pr, status);
 }
 
 /*----------------------------------------------------------------------------
-- 
2.34.1
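The shape of the rewrite — unpack both operands to a canonical form, fall back to a default result if either unpack fails, then pick and repack — can be sketched with toy types. `parse_digit`, `pick_larger`, and `DEFAULT_RESULT` below are hypothetical illustrations, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define DEFAULT_RESULT (-1)   /* plays the role of floatx80_default_nan() */

/* "Unpack" step: convert an ASCII digit to its canonical int form. */
static bool parse_digit(int *out, char c)
{
    if (c < '0' || c > '9') {
        return false;
    }
    *out = c - '0';
    return true;
}

/* Unpack both inputs, bail out on failure, then pick and "repack". */
static int pick_larger(char ca, char cb)
{
    int a, b;

    if (!parse_digit(&a, ca) || !parse_digit(&b, cb)) {
        return DEFAULT_RESULT;
    }
    return a > b ? a : b;   /* plays the role of parts_pick_nan() */
}
```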
From: Richard Henderson <richard.henderson@linaro.org>

Inline pickNaN into its only caller.  This makes one assert
redundant with the immediately preceding IF.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241203203949.483774-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat-parts.c.inc      | 82 +++++++++++++++++++++++++----
 fpu/softfloat-specialize.c.inc | 96 ----------------------------------
 2 files changed, 73 insertions(+), 105 deletions(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
 static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
                                      float_status *s)
 {
+    int cmp, which;
+
     if (is_snan(a->cls) || is_snan(b->cls)) {
         float_raise(float_flag_invalid | float_flag_invalid_snan, s);
     }
 
     if (s->default_nan_mode) {
         parts_default_nan(a, s);
-    } else {
-        int cmp = frac_cmp(a, b);
-        if (cmp == 0) {
-            cmp = a->sign < b->sign;
-        }
+        return a;
+    }
 
-        if (pickNaN(a->cls, b->cls, cmp > 0, s)) {
-            a = b;
-        }
+    cmp = frac_cmp(a, b);
+    if (cmp == 0) {
+        cmp = a->sign < b->sign;
+    }
+
+    switch (s->float_2nan_prop_rule) {
+    case float_2nan_prop_s_ab:
         if (is_snan(a->cls)) {
-            parts_silence_nan(a, s);
+            which = 0;
+        } else if (is_snan(b->cls)) {
+            which = 1;
+        } else if (is_qnan(a->cls)) {
+            which = 0;
+        } else {
+            which = 1;
         }
+        break;
+    case float_2nan_prop_s_ba:
+        if (is_snan(b->cls)) {
+            which = 1;
+        } else if (is_snan(a->cls)) {
+            which = 0;
+        } else if (is_qnan(b->cls)) {
+            which = 1;
+        } else {
+            which = 0;
+        }
+        break;
+    case float_2nan_prop_ab:
+        which = is_nan(a->cls) ? 0 : 1;
+        break;
+    case float_2nan_prop_ba:
+        which = is_nan(b->cls) ? 1 : 0;
+        break;
+    case float_2nan_prop_x87:
+        /*
+         * This implements x87 NaN propagation rules:
+         * SNaN + QNaN => return the QNaN
+         * two SNaNs => return the one with the larger significand, silenced
+         * two QNaNs => return the one with the larger significand
+         * SNaN and a non-NaN => return the SNaN, silenced
+         * QNaN and a non-NaN => return the QNaN
+         *
+         * If we get down to comparing significands and they are the same,
+         * return the NaN with the positive sign bit (if any).
+         */
+        if (is_snan(a->cls)) {
+            if (is_snan(b->cls)) {
+                which = cmp > 0 ? 0 : 1;
+            } else {
+                which = is_qnan(b->cls) ? 1 : 0;
+            }
+        } else if (is_qnan(a->cls)) {
+            if (is_snan(b->cls) || !is_qnan(b->cls)) {
+                which = 0;
+            } else {
+                which = cmp > 0 ? 0 : 1;
+            }
+        } else {
+            which = 1;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (which) {
+        a = b;
+    }
+    if (is_snan(a->cls)) {
+        parts_silence_nan(a, s);
     }
     return a;
 }
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -XXX,XX +XXX,XX @@ bool float32_is_signaling_nan(float32 a_, float_status *status)
     }
 }
 
-/*----------------------------------------------------------------------------
-| Select which NaN to propagate for a two-input operation.
-| IEEE754 doesn't specify all the details of this, so the
-| algorithm is target-specific.
-| The routine is passed various bits of information about the
-| two NaNs and should return 0 to select NaN a and 1 for NaN b.
-| Note that signalling NaNs are always squashed to quiet NaNs
-| by the caller, by calling floatXX_silence_nan() before
-| returning them.
-|
-| aIsLargerSignificand is only valid if both a and b are NaNs
-| of some kind, and is true if a has the larger significand,
-| or if both a and b have the same significand but a is
-| positive but b is negative. It is only needed for the x87
-| tie-break rule.
-*----------------------------------------------------------------------------*/
-
-static int pickNaN(FloatClass a_cls, FloatClass b_cls,
-                   bool aIsLargerSignificand, float_status *status)
-{
-    /*
-     * We guarantee not to require the target to tell us how to
-     * pick a NaN if we're always returning the default NaN.
-     * But if we're not in default-NaN mode then the target must
-     * specify via set_float_2nan_prop_rule().
-     */
-    assert(!status->default_nan_mode);
-
-    switch (status->float_2nan_prop_rule) {
-    case float_2nan_prop_s_ab:
-        if (is_snan(a_cls)) {
-            return 0;
-        } else if (is_snan(b_cls)) {
-            return 1;
-        } else if (is_qnan(a_cls)) {
-            return 0;
-        } else {
-            return 1;
-        }
-        break;
-    case float_2nan_prop_s_ba:
-        if (is_snan(b_cls)) {
-            return 1;
-        } else if (is_snan(a_cls)) {
-            return 0;
-        } else if (is_qnan(b_cls)) {
-            return 1;
-        } else {
-            return 0;
-        }
-        break;
-    case float_2nan_prop_ab:
-        if (is_nan(a_cls)) {
-            return 0;
-        } else {
-            return 1;
-        }
-        break;
-    case float_2nan_prop_ba:
-        if (is_nan(b_cls)) {
-            return 1;
-        } else {
-            return 0;
-        }
-        break;
-    case float_2nan_prop_x87:
-        /*
-         * This implements x87 NaN propagation rules:
-         * SNaN + QNaN => return the QNaN
-         * two SNaNs => return the one with the larger significand, silenced
-         * two QNaNs => return the one with the larger significand
-         * SNaN and a non-NaN => return the SNaN, silenced
-         * QNaN and a non-NaN => return the QNaN
-         *
-         * If we get down to comparing significands and they are the same,
-         * return the NaN with the positive sign bit (if any).
-         */
-        if (is_snan(a_cls)) {
-            if (is_snan(b_cls)) {
-                return aIsLargerSignificand ? 0 : 1;
-            }
-            return is_qnan(b_cls) ? 1 : 0;
-        } else if (is_qnan(a_cls)) {
-            if (is_snan(b_cls) || !is_qnan(b_cls)) {
-                return 0;
-            } else {
-                return aIsLargerSignificand ? 0 : 1;
-            }
-        } else {
214
- return 1;
215
- }
176
- default:
216
- default:
177
- g_assert_not_reached();
217
- g_assert_not_reached();
178
- }
218
- }
179
-
180
- gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
181
-}
219
-}
182
-
220
-
183
/* SHL/SLI - Vector shift left */
221
/*----------------------------------------------------------------------------
184
static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
222
| Returns 1 if the double-precision floating-point value `a' is a quiet
185
int immh, int immb, int opcode, int rn, int rd)
223
| NaN; otherwise returns 0.
186
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
187
}
188
189
switch (opcode) {
190
- case 0x08: /* SRI */
191
- if (!is_u) {
192
- unallocated_encoding(s);
193
- return;
194
- }
195
- /* fall through */
196
- case 0x00: /* SSHR / USHR */
197
- case 0x02: /* SSRA / USRA (accumulate) */
198
- case 0x04: /* SRSHR / URSHR (rounding) */
199
- case 0x06: /* SRSRA / URSRA (accum + rounding) */
200
- handle_vec_simd_shri(s, is_q, is_u, immh, immb, opcode, rn, rd);
201
- break;
202
case 0x0a: /* SHL / SLI */
203
handle_vec_simd_shli(s, is_q, is_u, immh, immb, opcode, rn, rd);
204
break;
205
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
206
handle_simd_shift_fpint_conv(s, false, is_q, is_u, immh, immb, rn, rd);
207
return;
208
default:
209
+ case 0x00: /* SSHR / USHR */
210
+ case 0x02: /* SSRA / USRA (accumulate) */
211
+ case 0x04: /* SRSHR / URSHR (rounding) */
212
+ case 0x06: /* SRSRA / URSRA (accum + rounding) */
213
+ case 0x08: /* SRI */
214
unallocated_encoding(s);
215
return;
216
}
217
--
224
--
218
2.34.1
225
2.34.1
226
227
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Remember if there was an SNaN, and use that to simplify
4
float_2nan_prop_s_{ab,ba} to only the snan component.
5
Then, fall through to the corresponding
6
float_2nan_prop_{ab,ba} case to handle any remaining
7
nans, which must be quiet.
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20241203203949.483774-10-richard.henderson@linaro.org
5
Message-id: 20240912024114.1097832-29-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
13
---
8
target/arm/tcg/a64.decode | 24 +++++
14
fpu/softfloat-parts.c.inc | 32 ++++++++++++--------------------
9
target/arm/tcg/translate-a64.c | 176 ++++++++++++++++++++++++++++++---
15
1 file changed, 12 insertions(+), 20 deletions(-)
10
2 files changed, 186 insertions(+), 14 deletions(-)
11
16
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
13
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/tcg/a64.decode
19
--- a/fpu/softfloat-parts.c.inc
15
+++ b/target/arm/tcg/a64.decode
20
+++ b/fpu/softfloat-parts.c.inc
16
@@ -XXX,XX +XXX,XX @@ SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_h
21
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
17
SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_s
22
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
18
SQSHLU_vi 0.10 11110 .... ... 01100 1 ..... ..... @q_shli_d
23
float_status *s)
19
24
{
20
+SQSHRN_v 0.00 11110 .... ... 10010 1 ..... ..... @q_shri_b
25
+ bool have_snan = false;
21
+SQSHRN_v 0.00 11110 .... ... 10010 1 ..... ..... @q_shri_h
26
int cmp, which;
22
+SQSHRN_v 0.00 11110 .... ... 10010 1 ..... ..... @q_shri_s
27
23
+
28
if (is_snan(a->cls) || is_snan(b->cls)) {
24
+UQSHRN_v 0.10 11110 .... ... 10010 1 ..... ..... @q_shri_b
29
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
25
+UQSHRN_v 0.10 11110 .... ... 10010 1 ..... ..... @q_shri_h
30
+ have_snan = true;
26
+UQSHRN_v 0.10 11110 .... ... 10010 1 ..... ..... @q_shri_s
27
+
28
+SQSHRUN_v 0.10 11110 .... ... 10000 1 ..... ..... @q_shri_b
29
+SQSHRUN_v 0.10 11110 .... ... 10000 1 ..... ..... @q_shri_h
30
+SQSHRUN_v 0.10 11110 .... ... 10000 1 ..... ..... @q_shri_s
31
+
32
+SQRSHRN_v 0.00 11110 .... ... 10011 1 ..... ..... @q_shri_b
33
+SQRSHRN_v 0.00 11110 .... ... 10011 1 ..... ..... @q_shri_h
34
+SQRSHRN_v 0.00 11110 .... ... 10011 1 ..... ..... @q_shri_s
35
+
36
+UQRSHRN_v 0.10 11110 .... ... 10011 1 ..... ..... @q_shri_b
37
+UQRSHRN_v 0.10 11110 .... ... 10011 1 ..... ..... @q_shri_h
38
+UQRSHRN_v 0.10 11110 .... ... 10011 1 ..... ..... @q_shri_s
39
+
40
+SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_b
41
+SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_h
42
+SQRSHRUN_v 0.10 11110 .... ... 10001 1 ..... ..... @q_shri_s
43
+
44
# Advanced SIMD scalar shift by immediate
45
46
@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
47
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/tcg/translate-a64.c
50
+++ b/target/arm/tcg/translate-a64.c
51
@@ -XXX,XX +XXX,XX @@ static bool do_vec_shift_imm_narrow(DisasContext *s, arg_qrri_e *a,
52
return true;
53
}
54
55
+static void gen_sqshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
56
+{
57
+ tcg_gen_sari_i64(d, s, i);
58
+ tcg_gen_ext16u_i64(d, d);
59
+ gen_helper_neon_narrow_sat_s8(d, tcg_env, d);
60
+}
61
+
62
+static void gen_sqshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
63
+{
64
+ tcg_gen_sari_i64(d, s, i);
65
+ tcg_gen_ext32u_i64(d, d);
66
+ gen_helper_neon_narrow_sat_s16(d, tcg_env, d);
67
+}
68
+
69
+static void gen_sqshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
70
+{
71
+ gen_sshr_d(d, s, i);
72
+ gen_helper_neon_narrow_sat_s32(d, tcg_env, d);
73
+}
74
+
75
+static void gen_uqshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
76
+{
77
+ tcg_gen_shri_i64(d, s, i);
78
+ gen_helper_neon_narrow_sat_u8(d, tcg_env, d);
79
+}
80
+
81
+static void gen_uqshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
82
+{
83
+ tcg_gen_shri_i64(d, s, i);
84
+ gen_helper_neon_narrow_sat_u16(d, tcg_env, d);
85
+}
86
+
87
+static void gen_uqshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
88
+{
89
+ gen_ushr_d(d, s, i);
90
+ gen_helper_neon_narrow_sat_u32(d, tcg_env, d);
91
+}
92
+
93
+static void gen_sqshrun_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
94
+{
95
+ tcg_gen_sari_i64(d, s, i);
96
+ tcg_gen_ext16u_i64(d, d);
97
+ gen_helper_neon_unarrow_sat8(d, tcg_env, d);
98
+}
99
+
100
+static void gen_sqshrun_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
101
+{
102
+ tcg_gen_sari_i64(d, s, i);
103
+ tcg_gen_ext32u_i64(d, d);
104
+ gen_helper_neon_unarrow_sat16(d, tcg_env, d);
105
+}
106
+
107
+static void gen_sqshrun_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
108
+{
109
+ gen_sshr_d(d, s, i);
110
+ gen_helper_neon_unarrow_sat32(d, tcg_env, d);
111
+}
112
+
113
+static void gen_sqrshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
114
+{
115
+ gen_srshr_bhs(d, s, i);
116
+ tcg_gen_ext16u_i64(d, d);
117
+ gen_helper_neon_narrow_sat_s8(d, tcg_env, d);
118
+}
119
+
120
+static void gen_sqrshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
121
+{
122
+ gen_srshr_bhs(d, s, i);
123
+ tcg_gen_ext32u_i64(d, d);
124
+ gen_helper_neon_narrow_sat_s16(d, tcg_env, d);
125
+}
126
+
127
+static void gen_sqrshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
128
+{
129
+ gen_srshr_d(d, s, i);
130
+ gen_helper_neon_narrow_sat_s32(d, tcg_env, d);
131
+}
132
+
133
+static void gen_uqrshrn_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
134
+{
135
+ gen_urshr_bhs(d, s, i);
136
+ gen_helper_neon_narrow_sat_u8(d, tcg_env, d);
137
+}
138
+
139
+static void gen_uqrshrn_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
140
+{
141
+ gen_urshr_bhs(d, s, i);
142
+ gen_helper_neon_narrow_sat_u16(d, tcg_env, d);
143
+}
144
+
145
+static void gen_uqrshrn_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
146
+{
147
+ gen_urshr_d(d, s, i);
148
+ gen_helper_neon_narrow_sat_u32(d, tcg_env, d);
149
+}
150
+
151
+static void gen_sqrshrun_b(TCGv_i64 d, TCGv_i64 s, int64_t i)
152
+{
153
+ gen_srshr_bhs(d, s, i);
154
+ tcg_gen_ext16u_i64(d, d);
155
+ gen_helper_neon_unarrow_sat8(d, tcg_env, d);
156
+}
157
+
158
+static void gen_sqrshrun_h(TCGv_i64 d, TCGv_i64 s, int64_t i)
159
+{
160
+ gen_srshr_bhs(d, s, i);
161
+ tcg_gen_ext32u_i64(d, d);
162
+ gen_helper_neon_unarrow_sat16(d, tcg_env, d);
163
+}
164
+
165
+static void gen_sqrshrun_s(TCGv_i64 d, TCGv_i64 s, int64_t i)
166
+{
167
+ gen_srshr_d(d, s, i);
168
+ gen_helper_neon_unarrow_sat32(d, tcg_env, d);
169
+}
170
+
171
static WideShiftImmFn * const shrn_fns[] = {
172
tcg_gen_shri_i64,
173
tcg_gen_shri_i64,
174
@@ -XXX,XX +XXX,XX @@ static WideShiftImmFn * const rshrn_fns[] = {
175
};
176
TRANS(RSHRN_v, do_vec_shift_imm_narrow, a, rshrn_fns, 0)
177
178
+static WideShiftImmFn * const sqshrn_fns[] = {
179
+ gen_sqshrn_b,
180
+ gen_sqshrn_h,
181
+ gen_sqshrn_s,
182
+};
183
+TRANS(SQSHRN_v, do_vec_shift_imm_narrow, a, sqshrn_fns, MO_SIGN)
184
+
185
+static WideShiftImmFn * const uqshrn_fns[] = {
186
+ gen_uqshrn_b,
187
+ gen_uqshrn_h,
188
+ gen_uqshrn_s,
189
+};
190
+TRANS(UQSHRN_v, do_vec_shift_imm_narrow, a, uqshrn_fns, 0)
191
+
192
+static WideShiftImmFn * const sqshrun_fns[] = {
193
+ gen_sqshrun_b,
194
+ gen_sqshrun_h,
195
+ gen_sqshrun_s,
196
+};
197
+TRANS(SQSHRUN_v, do_vec_shift_imm_narrow, a, sqshrun_fns, MO_SIGN)
198
+
199
+static WideShiftImmFn * const sqrshrn_fns[] = {
200
+ gen_sqrshrn_b,
201
+ gen_sqrshrn_h,
202
+ gen_sqrshrn_s,
203
+};
204
+TRANS(SQRSHRN_v, do_vec_shift_imm_narrow, a, sqrshrn_fns, MO_SIGN)
205
+
206
+static WideShiftImmFn * const uqrshrn_fns[] = {
207
+ gen_uqrshrn_b,
208
+ gen_uqrshrn_h,
209
+ gen_uqrshrn_s,
210
+};
211
+TRANS(UQRSHRN_v, do_vec_shift_imm_narrow, a, uqrshrn_fns, 0)
212
+
213
+static WideShiftImmFn * const sqrshrun_fns[] = {
214
+ gen_sqrshrun_b,
215
+ gen_sqrshrun_h,
216
+ gen_sqrshrun_s,
217
+};
218
+TRANS(SQRSHRUN_v, do_vec_shift_imm_narrow, a, sqrshrun_fns, MO_SIGN)
219
+
220
/*
221
* Advanced SIMD Scalar Shift by Immediate
222
*/
223
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
224
}
31
}
225
32
226
switch (opcode) {
33
if (s->default_nan_mode) {
227
- case 0x10: /* SHRN / SQSHRUN */
34
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
228
- case 0x11: /* RSHRN / SQRSHRUN */
35
229
- if (is_u) {
36
switch (s->float_2nan_prop_rule) {
230
- handle_vec_simd_sqshrn(s, false, is_q, false, true, immh, immb,
37
case float_2nan_prop_s_ab:
231
- opcode, rn, rd);
38
- if (is_snan(a->cls)) {
39
- which = 0;
40
- } else if (is_snan(b->cls)) {
41
- which = 1;
42
- } else if (is_qnan(a->cls)) {
43
- which = 0;
232
- } else {
44
- } else {
233
- unallocated_encoding(s);
45
- which = 1;
46
+ if (have_snan) {
47
+ which = is_snan(a->cls) ? 0 : 1;
48
+ break;
49
}
50
- break;
51
- case float_2nan_prop_s_ba:
52
- if (is_snan(b->cls)) {
53
- which = 1;
54
- } else if (is_snan(a->cls)) {
55
- which = 0;
56
- } else if (is_qnan(b->cls)) {
57
- which = 1;
58
- } else {
59
- which = 0;
234
- }
60
- }
235
- break;
61
- break;
236
- case 0x12: /* SQSHRN / UQSHRN */
62
+ /* fall through */
237
- case 0x13: /* SQRSHRN / UQRSHRN */
63
case float_2nan_prop_ab:
238
- handle_vec_simd_sqshrn(s, false, is_q, is_u, is_u, immh, immb,
64
which = is_nan(a->cls) ? 0 : 1;
239
- opcode, rn, rd);
65
break;
240
- break;
66
+ case float_2nan_prop_s_ba:
241
case 0x1c: /* SCVTF / UCVTF */
67
+ if (have_snan) {
242
handle_simd_shift_intfp_conv(s, false, is_q, is_u, immh, immb,
68
+ which = is_snan(b->cls) ? 1 : 0;
243
opcode, rn, rd);
69
+ break;
244
@@ -XXX,XX +XXX,XX @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
70
+ }
245
case 0x0a: /* SHL / SLI */
71
+ /* fall through */
246
case 0x0c: /* SQSHLU */
72
case float_2nan_prop_ba:
247
case 0x0e: /* SQSHL, UQSHL */
73
which = is_nan(b->cls) ? 1 : 0;
248
+ case 0x10: /* SHRN / SQSHRUN */
74
break;
249
+ case 0x11: /* RSHRN / SQRSHRUN */
250
+ case 0x12: /* SQSHRN / UQSHRN */
251
+ case 0x13: /* SQRSHRN / UQRSHRN */
252
case 0x14: /* SSHLL / USHLL */
253
unallocated_encoding(s);
254
return;
255
--
75
--
256
2.34.1
76
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This includes SSHR, USHR, SSRA, USRA, SRSHR, URSHR,
3
Move the fractional comparison to the end of the
4
SRSRA, URSRA, SRI.
4
float_2nan_prop_x87 case. This is not required for
5
any other 2nan propagation rule. Reorganize the
6
x87 case itself to break out of the switch when the
7
fractional comparison is not required.
5
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20241203203949.483774-11-richard.henderson@linaro.org
8
Message-id: 20240912024114.1097832-24-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
13
---
11
target/arm/tcg/a64.decode | 16 ++++
14
fpu/softfloat-parts.c.inc | 19 +++++++++----------
12
target/arm/tcg/translate-a64.c | 140 ++++++++++++++++-----------------
15
1 file changed, 9 insertions(+), 10 deletions(-)
13
2 files changed, 86 insertions(+), 70 deletions(-)
14
16
15
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/tcg/a64.decode
19
--- a/fpu/softfloat-parts.c.inc
18
+++ b/target/arm/tcg/a64.decode
20
+++ b/fpu/softfloat-parts.c.inc
19
@@ -XXX,XX +XXX,XX @@
21
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
20
&rri_sf rd rn imm sf
22
return a;
21
&i imm
22
&rr_e rd rn esz
23
+&rri_e rd rn imm esz
24
&rrr_e rd rn rm esz
25
&rrx_e rd rn rm idx esz
26
&rrrr_e rd rn rm ra esz
27
@@ -XXX,XX +XXX,XX @@ SHRN_v 0.00 11110 .... ... 10000 1 ..... ..... @q_shri_s
28
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_b
29
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_h
30
RSHRN_v 0.00 11110 .... ... 10001 1 ..... ..... @q_shri_s
31
+
32
+# Advanced SIMD scalar shift by immediate
33
+
34
+@shri_d .... ..... 1 ...... ..... . rn:5 rd:5 \
35
+ &rri_e esz=3 imm=%neon_rshift_i6
36
+
37
+SSHR_s 0101 11110 .... ... 00000 1 ..... ..... @shri_d
38
+USHR_s 0111 11110 .... ... 00000 1 ..... ..... @shri_d
39
+SSRA_s 0101 11110 .... ... 00010 1 ..... ..... @shri_d
40
+USRA_s 0111 11110 .... ... 00010 1 ..... ..... @shri_d
41
+SRSHR_s 0101 11110 .... ... 00100 1 ..... ..... @shri_d
42
+URSHR_s 0111 11110 .... ... 00100 1 ..... ..... @shri_d
43
+SRSRA_s 0101 11110 .... ... 00110 1 ..... ..... @shri_d
44
+URSRA_s 0111 11110 .... ... 00110 1 ..... ..... @shri_d
45
+SRI_s 0111 11110 .... ... 01000 1 ..... ..... @shri_d
46
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/tcg/translate-a64.c
49
+++ b/target/arm/tcg/translate-a64.c
50
@@ -XXX,XX +XXX,XX @@ static void gen_ushr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
51
}
23
}
52
}
24
53
25
- cmp = frac_cmp(a, b);
54
+static void gen_ssra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
26
- if (cmp == 0) {
55
+{
27
- cmp = a->sign < b->sign;
56
+ gen_sshr_d(src, src, shift);
57
+ tcg_gen_add_i64(dst, dst, src);
58
+}
59
+
60
+static void gen_usra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
61
+{
62
+ gen_ushr_d(src, src, shift);
63
+ tcg_gen_add_i64(dst, dst, src);
64
+}
65
+
66
static void gen_srshr_bhs(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
67
{
68
assert(shift >= 0 && shift <= 32);
69
@@ -XXX,XX +XXX,XX @@ static void gen_urshr_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
70
}
71
}
72
73
+static void gen_srsra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
74
+{
75
+ gen_srshr_d(src, src, shift);
76
+ tcg_gen_add_i64(dst, dst, src);
77
+}
78
+
79
+static void gen_ursra_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
80
+{
81
+ gen_urshr_d(src, src, shift);
82
+ tcg_gen_add_i64(dst, dst, src);
83
+}
84
+
85
+static void gen_sri_d(TCGv_i64 dst, TCGv_i64 src, int64_t shift)
86
+{
87
+ /* If shift is 64, dst is unchanged. */
88
+ if (shift != 64) {
89
+ tcg_gen_shri_i64(src, src, shift);
90
+ tcg_gen_deposit_i64(dst, dst, src, 0, 64 - shift);
91
+ }
92
+}
93
+
94
static bool do_vec_shift_imm_narrow(DisasContext *s, arg_qrri_e *a,
95
WideShiftImmFn * const fns[3], MemOp sign)
96
{
97
@@ -XXX,XX +XXX,XX @@ static WideShiftImmFn * const rshrn_fns[] = {
98
};
99
TRANS(RSHRN_v, do_vec_shift_imm_narrow, a, rshrn_fns, 0)
100
101
+/*
102
+ * Advanced SIMD Scalar Shift by Immediate
103
+ */
104
+
105
+static bool do_scalar_shift_imm(DisasContext *s, arg_rri_e *a,
106
+ WideShiftImmFn *fn, bool accumulate,
107
+ MemOp sign)
108
+{
109
+ if (fp_access_check(s)) {
110
+ TCGv_i64 rd = tcg_temp_new_i64();
111
+ TCGv_i64 rn = tcg_temp_new_i64();
112
+
113
+ read_vec_element(s, rn, a->rn, 0, a->esz | sign);
114
+ if (accumulate) {
115
+ read_vec_element(s, rd, a->rd, 0, a->esz | sign);
116
+ }
117
+ fn(rd, rn, a->imm);
118
+ write_fp_dreg(s, a->rd, rd);
119
+ }
120
+ return true;
121
+}
122
+
123
+TRANS(SSHR_s, do_scalar_shift_imm, a, gen_sshr_d, false, 0)
124
+TRANS(USHR_s, do_scalar_shift_imm, a, gen_ushr_d, false, 0)
125
+TRANS(SSRA_s, do_scalar_shift_imm, a, gen_ssra_d, true, 0)
126
+TRANS(USRA_s, do_scalar_shift_imm, a, gen_usra_d, true, 0)
127
+TRANS(SRSHR_s, do_scalar_shift_imm, a, gen_srshr_d, false, 0)
128
+TRANS(URSHR_s, do_scalar_shift_imm, a, gen_urshr_d, false, 0)
129
+TRANS(SRSRA_s, do_scalar_shift_imm, a, gen_srsra_d, true, 0)
130
+TRANS(URSRA_s, do_scalar_shift_imm, a, gen_ursra_d, true, 0)
131
+TRANS(SRI_s, do_scalar_shift_imm, a, gen_sri_d, true, 0)
132
+
133
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
134
* Note that it is the caller's responsibility to ensure that the
135
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
136
@@ -XXX,XX +XXX,XX @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
137
}
138
}
139
140
-/* SSHR[RA]/USHR[RA] - Scalar shift right (optional rounding/accumulate) */
141
-static void handle_scalar_simd_shri(DisasContext *s,
142
- bool is_u, int immh, int immb,
143
- int opcode, int rn, int rd)
144
-{
145
- const int size = 3;
146
- int immhb = immh << 3 | immb;
147
- int shift = 2 * (8 << size) - immhb;
148
- bool accumulate = false;
149
- bool round = false;
150
- bool insert = false;
151
- TCGv_i64 tcg_rn;
152
- TCGv_i64 tcg_rd;
153
-
154
- if (!extract32(immh, 3, 1)) {
155
- unallocated_encoding(s);
156
- return;
157
- }
28
- }
158
-
29
-
159
- if (!fp_access_check(s)) {
30
switch (s->float_2nan_prop_rule) {
160
- return;
31
case float_2nan_prop_s_ab:
161
- }
32
if (have_snan) {
162
-
33
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
163
- switch (opcode) {
34
* return the NaN with the positive sign bit (if any).
164
- case 0x02: /* SSRA / USRA (accumulate) */
35
*/
165
- accumulate = true;
36
if (is_snan(a->cls)) {
166
- break;
37
- if (is_snan(b->cls)) {
167
- case 0x04: /* SRSHR / URSHR (rounding) */
38
- which = cmp > 0 ? 0 : 1;
168
- round = true;
39
- } else {
169
- break;
40
+ if (!is_snan(b->cls)) {
170
- case 0x06: /* SRSRA / URSRA (accum + rounding) */
41
which = is_qnan(b->cls) ? 1 : 0;
171
- accumulate = round = true;
42
+ break;
172
- break;
43
}
173
- case 0x08: /* SRI */
44
} else if (is_qnan(a->cls)) {
174
- insert = true;
45
if (is_snan(b->cls) || !is_qnan(b->cls)) {
175
- break;
46
which = 0;
176
- }
47
- } else {
177
-
48
- which = cmp > 0 ? 0 : 1;
178
- tcg_rn = read_fp_dreg(s, rn);
49
+ break;
179
- tcg_rd = (accumulate || insert) ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
50
}
180
-
51
} else {
181
- if (insert) {
52
which = 1;
182
- /* shift count same as element size is valid but does nothing;
53
+ break;
183
- * special case to avoid potential shift by 64.
54
}
184
- */
55
+ cmp = frac_cmp(a, b);
185
- int esize = 8 << size;
56
+ if (cmp == 0) {
186
- if (shift != esize) {
57
+ cmp = a->sign < b->sign;
187
- tcg_gen_shri_i64(tcg_rn, tcg_rn, shift);
58
+ }
188
- tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_rn, 0, esize - shift);
59
+ which = cmp > 0 ? 0 : 1;
189
- }
190
- } else {
191
- handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
192
- accumulate, is_u, size, shift);
193
- }
194
-
195
- write_fp_dreg(s, rd, tcg_rd);
196
-}
197
-
198
/* SHL/SLI - Scalar shift left */
199
static void handle_scalar_simd_shli(DisasContext *s, bool insert,
200
int immh, int immb, int opcode,
201
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
202
}
203
204
switch (opcode) {
205
- case 0x08: /* SRI */
206
- if (!is_u) {
207
- unallocated_encoding(s);
208
- return;
209
- }
210
- /* fall through */
211
- case 0x00: /* SSHR / USHR */
212
- case 0x02: /* SSRA / USRA */
213
- case 0x04: /* SRSHR / URSHR */
214
- case 0x06: /* SRSRA / URSRA */
215
- handle_scalar_simd_shri(s, is_u, immh, immb, opcode, rn, rd);
216
- break;
217
case 0x0a: /* SHL / SLI */
218
handle_scalar_simd_shli(s, is_u, immh, immb, opcode, rn, rd);
219
break;
220
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
221
handle_simd_shift_fpint_conv(s, true, false, is_u, immh, immb, rn, rd);
222
break;
60
break;
223
default:
61
default:
224
+ case 0x00: /* SSHR / USHR */
62
g_assert_not_reached();
225
+ case 0x02: /* SSRA / USRA */
226
+ case 0x04: /* SRSHR / URSHR */
227
+ case 0x06: /* SRSRA / URSRA */
228
+ case 0x08: /* SRI */
229
unallocated_encoding(s);
230
break;
231
}
232
--
63
--
233
2.34.1
64
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
We always pass the same value for round; compute it
3
Replace the "index" selecting between A and B with a result variable
4
within common code.
4
of the proper type. This improves clarity within the function.
5
5
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20241203203949.483774-12-richard.henderson@linaro.org
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20240912024114.1097832-21-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/tcg/translate-a64.c | 32 ++++++--------------------------
11
fpu/softfloat-parts.c.inc | 28 +++++++++++++---------------
13
1 file changed, 6 insertions(+), 26 deletions(-)
12
1 file changed, 13 insertions(+), 15 deletions(-)
14
13
15
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
16
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/tcg/translate-a64.c
16
--- a/fpu/softfloat-parts.c.inc
18
+++ b/target/arm/tcg/translate-a64.c
17
+++ b/fpu/softfloat-parts.c.inc
19
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
20
* the vector and scalar code.
19
float_status *s)
21
*/
22
static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
23
- TCGv_i64 tcg_rnd, bool accumulate,
24
+ bool round, bool accumulate,
25
bool is_u, int size, int shift)
26
{
20
{
27
bool extended_result = false;
21
bool have_snan = false;
28
- bool round = tcg_rnd != NULL;
22
- int cmp, which;
29
int ext_lshift = 0;
23
+ FloatPartsN *ret;
30
TCGv_i64 tcg_src_hi;
24
+ int cmp;
31
25
32
@@ -XXX,XX +XXX,XX @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
26
if (is_snan(a->cls) || is_snan(b->cls)) {
33
27
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
34
/* Deal with the rounding step */
28
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
35
if (round) {
29
switch (s->float_2nan_prop_rule) {
36
+ TCGv_i64 tcg_rnd = tcg_constant_i64(1ull << (shift - 1));
30
case float_2nan_prop_s_ab:
37
if (extended_result) {
31
if (have_snan) {
38
TCGv_i64 tcg_zero = tcg_constant_i64(0);
32
- which = is_snan(a->cls) ? 0 : 1;
39
if (!is_u) {
33
+ ret = is_snan(a->cls) ? a : b;
40
@@ -XXX,XX +XXX,XX @@ static void handle_scalar_simd_shri(DisasContext *s,
34
break;
41
bool insert = false;
35
}
42
TCGv_i64 tcg_rn;
36
/* fall through */
43
TCGv_i64 tcg_rd;
37
case float_2nan_prop_ab:
44
- TCGv_i64 tcg_round;
38
- which = is_nan(a->cls) ? 0 : 1;
45
39
+ ret = is_nan(a->cls) ? a : b;
46
if (!extract32(immh, 3, 1)) {
47
unallocated_encoding(s);
48
@@ -XXX,XX +XXX,XX @@ static void handle_scalar_simd_shri(DisasContext *s,
49
break;
40
break;
41
case float_2nan_prop_s_ba:
42
if (have_snan) {
43
- which = is_snan(b->cls) ? 1 : 0;
44
+ ret = is_snan(b->cls) ? b : a;
45
break;
46
}
47
/* fall through */
48
case float_2nan_prop_ba:
49
- which = is_nan(b->cls) ? 1 : 0;
50
+ ret = is_nan(b->cls) ? b : a;
51
break;
52
case float_2nan_prop_x87:
53
/*
54
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
55
*/
56
if (is_snan(a->cls)) {
57
if (!is_snan(b->cls)) {
58
- which = is_qnan(b->cls) ? 1 : 0;
59
+ ret = is_qnan(b->cls) ? b : a;
60
break;
61
}
62
} else if (is_qnan(a->cls)) {
63
if (is_snan(b->cls) || !is_qnan(b->cls)) {
64
- which = 0;
65
+ ret = a;
66
break;
67
}
68
} else {
69
- which = 1;
70
+ ret = b;
71
break;
72
}
73
cmp = frac_cmp(a, b);
74
if (cmp == 0) {
75
cmp = a->sign < b->sign;
76
}
77
- which = cmp > 0 ? 0 : 1;
78
+ ret = cmp > 0 ? a : b;
79
break;
80
default:
81
g_assert_not_reached();
50
}
82
}
51
83
52
- if (round) {
84
- if (which) {
53
- tcg_round = tcg_constant_i64(1ULL << (shift - 1));
85
- a = b;
54
- } else {
86
+ if (is_snan(ret->cls)) {
55
- tcg_round = NULL;
87
+ parts_silence_nan(ret, s);
88
}
89
- if (is_snan(a->cls)) {
90
- parts_silence_nan(a, s);
56
- }
91
- }
57
-
92
- return a;
58
tcg_rn = read_fp_dreg(s, rn);
93
+ return ret;
59
tcg_rd = (accumulate || insert) ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
94
}
60
95
61
@@ -XXX,XX +XXX,XX @@ static void handle_scalar_simd_shri(DisasContext *s,
96
static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
62
tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_rn, 0, esize - shift);
63
}
64
} else {
65
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
66
+ handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
67
accumulate, is_u, size, shift);
68
}
69
70
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
71
int elements = is_scalar ? 1 : (64 / esize);
72
bool round = extract32(opcode, 0, 1);
73
MemOp ldop = (size + 1) | (is_u_shift ? 0 : MO_SIGN);
74
- TCGv_i64 tcg_rn, tcg_rd, tcg_round;
75
+ TCGv_i64 tcg_rn, tcg_rd;
76
TCGv_i32 tcg_rd_narrowed;
77
TCGv_i64 tcg_final;
78
79
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
80
tcg_rd_narrowed = tcg_temp_new_i32();
81
tcg_final = tcg_temp_new_i64();
82
83
- if (round) {
84
- tcg_round = tcg_constant_i64(1ULL << (shift - 1));
85
- } else {
86
- tcg_round = NULL;
87
- }
88
-
89
for (i = 0; i < elements; i++) {
90
read_vec_element(s, tcg_rn, rn, i, ldop);
91
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
92
+ handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
93
false, is_u_shift, size+1, shift);
94
narrowfn(tcg_rd_narrowed, tcg_env, tcg_rd);
95
tcg_gen_extu_i32_i64(tcg_rd, tcg_rd_narrowed);
96
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
97
int shift = (2 * esize) - immhb;
98
bool round = extract32(opcode, 0, 1);
99
TCGv_i64 tcg_rn, tcg_rd, tcg_final;
100
- TCGv_i64 tcg_round;
101
int i;
102
103
if (extract32(immh, 3, 1)) {
104
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
105
tcg_final = tcg_temp_new_i64();
106
read_vec_element(s, tcg_final, rd, is_q ? 1 : 0, MO_64);
107
108
- if (round) {
109
- tcg_round = tcg_constant_i64(1ULL << (shift - 1));
110
- } else {
111
- tcg_round = NULL;
112
- }
113
-
114
for (i = 0; i < elements; i++) {
115
read_vec_element(s, tcg_rn, rn, i, size+1);
116
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
117
+ handle_shri_with_rndacc(tcg_rd, tcg_rn, round,
118
false, true, size+1, shift);
119
120
tcg_gen_deposit_i64(tcg_final, tcg_final, tcg_rd, esize * i, esize);
121
--
97
--
122
2.34.1
98
2.34.1
123
99
124
100
1
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
1
From: Leif Lindholm <quic_llindhol@quicinc.com>
2
2
3
FreeBSD has a longer support cycle for stable releases (14.x EoL in 2028)
3
I'm migrating to Qualcomm's new open source email infrastructure, so
4
than OpenBSD (the 7.3 release we use is already EoL). Bug fixes are also backported,
4
update my email address, and update the mailmap to match.
5
so we can stay on 14.x for longer.
6
5
7
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
6
Signed-off-by: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
8
Message-id: 20240910-b4-move-to-freebsd-v5-2-0fb66d803c93@linaro.org
7
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Message-id: 20241205114047.1125842-1-leif.lindholm@oss.qualcomm.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
tests/functional/test_aarch64_sbsaref.py | 43 +++++++++++++++++++++++-
14
MAINTAINERS | 2 +-
13
1 file changed, 42 insertions(+), 1 deletion(-)
15
.mailmap | 5 +++--
16
2 files changed, 4 insertions(+), 3 deletions(-)
14
17
15
diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
18
diff --git a/MAINTAINERS b/MAINTAINERS
16
index XXXXXXX..XXXXXXX 100755
19
index XXXXXXX..XXXXXXX 100644
17
--- a/tests/functional/test_aarch64_sbsaref.py
20
--- a/MAINTAINERS
18
+++ b/tests/functional/test_aarch64_sbsaref.py
21
+++ b/MAINTAINERS
19
@@ -XXX,XX +XXX,XX @@
22
@@ -XXX,XX +XXX,XX @@ F: include/hw/ssi/imx_spi.h
20
#!/usr/bin/env python3
23
SBSA-REF
21
#
24
M: Radoslaw Biernacki <rad@semihalf.com>
22
-# Functional test that boots a Linux kernel and checks the console
25
M: Peter Maydell <peter.maydell@linaro.org>
23
+# Functional test that boots a kernel and checks the console
26
-R: Leif Lindholm <quic_llindhol@quicinc.com>
24
#
27
+R: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
25
# SPDX-FileCopyrightText: 2023-2024 Linaro Ltd.
28
R: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
26
# SPDX-FileContributor: Philippe Mathieu-Daudé <philmd@linaro.org>
29
L: qemu-arm@nongnu.org
27
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_openbsd73_max(self):
30
S: Maintained
28
self.boot_openbsd73("max")
31
diff --git a/.mailmap b/.mailmap
29
32
index XXXXXXX..XXXXXXX 100644
30
33
--- a/.mailmap
31
+ ASSET_FREEBSD_ISO = Asset(
34
+++ b/.mailmap
32
+ ('https://download.freebsd.org/releases/arm64/aarch64/ISO-IMAGES/'
35
@@ -XXX,XX +XXX,XX @@ Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
33
+ '14.1/FreeBSD-14.1-RELEASE-arm64-aarch64-bootonly.iso'),
36
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
34
+ '44cdbae275ef1bb6dab1d5fbb59473d4f741e1c8ea8a80fd9e906b531d6ad461')
37
James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
35
+
38
Juan Quintela <quintela@trasno.org> <quintela@redhat.com>
36
+ # This tests the whole boot chain from EFI to Userspace
39
-Leif Lindholm <quic_llindhol@quicinc.com> <leif.lindholm@linaro.org>
37
+ # We only boot a whole OS for the current top level CPU and GIC
40
-Leif Lindholm <quic_llindhol@quicinc.com> <leif@nuviainc.com>
38
+ # Other test profiles should use more minimal boots
41
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <quic_llindhol@quicinc.com>
39
+ def boot_freebsd14(self, cpu=None):
42
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif.lindholm@linaro.org>
40
+ self.fetch_firmware()
43
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif@nuviainc.com>
41
+
44
Luc Michel <luc@lmichel.fr> <luc.michel@git.antfield.fr>
42
+ img_path = self.ASSET_FREEBSD_ISO.fetch()
45
Luc Michel <luc@lmichel.fr> <luc.michel@greensocs.com>
43
+
46
Luc Michel <luc@lmichel.fr> <lmichel@kalray.eu>
44
+ self.vm.set_console()
45
+ self.vm.add_args(
46
+ "-drive", f"file={img_path},format=raw,snapshot=on",
47
+ )
48
+ if cpu:
49
+ self.vm.add_args("-cpu", cpu)
50
+
51
+ self.vm.launch()
52
+ wait_for_console_pattern(self, 'Welcome to FreeBSD!')
53
+
54
+ def test_sbsaref_freebsd14_cortex_a57(self):
55
+ self.boot_freebsd14("cortex-a57")
56
+
57
+ def test_sbsaref_freebsd14_default_cpu(self):
58
+ self.boot_freebsd14()
59
+
60
+ def test_sbsaref_freebsd14_max_pauth_off(self):
61
+ self.boot_freebsd14("max,pauth=off")
62
+
63
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
64
+ def test_sbsaref_freebsd14_max_pauth_impdef(self):
65
+ self.boot_freebsd14("max,pauth-impdef=on")
66
+
67
+ @skipUnless(os.getenv('QEMU_TEST_TIMEOUT_EXPECTED'), 'Test might timeout')
68
+ def test_sbsaref_freebsd14_max(self):
69
+ self.boot_freebsd14("max")
70
+
71
+
72
if __name__ == '__main__':
73
QemuSystemTest.main()
74
--
47
--
75
2.34.1
48
2.34.1
76
49
77
50
diff view generated by jsdifflib
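The `Asset(url, hash)` pair used by the new FreeBSD test above follows a common download-and-verify pattern. The following is a minimal stand-alone sketch of that idea only; `fetch_asset` is a hypothetical name, and QEMU's real `Asset` class in the functional-test framework has more features (shared cache directory, partial-download handling):

```python
import hashlib
import os
import urllib.request

def fetch_asset(url, sha256, cache_dir="asset-cache"):
    """Download url into cache_dir unless already cached, then verify SHA256."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, os.path.basename(url))
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != sha256:
        raise ValueError(f"checksum mismatch for {path}: got {digest}")
    return path
```

Verifying the checksum after every fetch (not just the first) means a corrupted or tampered cache entry is caught rather than silently booted.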
From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

We want to run tests using the default cpu without having to remember
which Arm core it is.

Change the Neoverse-N1 (old default) test to use the default cpu
(Neoverse-N2 at the moment).

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240910-b4-move-to-freebsd-v5-1-0fb66d803c93@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/functional/test_aarch64_sbsaref.py | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/tests/functional/test_aarch64_sbsaref.py b/tests/functional/test_aarch64_sbsaref.py
index XXXXXXX..XXXXXXX 100755
--- a/tests/functional/test_aarch64_sbsaref.py
+++ b/tests/functional/test_aarch64_sbsaref.py
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_edk2_firmware(self):
     # This tests the whole boot chain from EFI to Userspace
     # We only boot a whole OS for the current top level CPU and GIC
     # Other test profiles should use more minimal boots
-    def boot_alpine_linux(self, cpu):
+    def boot_alpine_linux(self, cpu=None):
         self.fetch_firmware()

         iso_path = self.ASSET_ALPINE_ISO.fetch()

         self.vm.set_console()
         self.vm.add_args(
-            "-cpu", cpu,
             "-drive", f"file={iso_path},media=cdrom,format=raw",
         )
+        if cpu:
+            self.vm.add_args("-cpu", cpu)

         self.vm.launch()
         wait_for_console_pattern(self, "Welcome to Alpine Linux 3.17")
@@ -XXX,XX +XXX,XX @@ def boot_alpine_linux(self, cpu):
     def test_sbsaref_alpine_linux_cortex_a57(self):
         self.boot_alpine_linux("cortex-a57")

-    def test_sbsaref_alpine_linux_neoverse_n1(self):
-        self.boot_alpine_linux("neoverse-n1")
+    def test_sbsaref_alpine_linux_default_cpu(self):
+        self.boot_alpine_linux()

     def test_sbsaref_alpine_linux_max_pauth_off(self):
         self.boot_alpine_linux("max,pauth=off")
@@ -XXX,XX +XXX,XX @@ def test_sbsaref_alpine_linux_max(self):
     # This tests the whole boot chain from EFI to Userspace
     # We only boot a whole OS for the current top level CPU and GIC
     # Other test profiles should use more minimal boots
-    def boot_openbsd73(self, cpu):
+    def boot_openbsd73(self, cpu=None):
         self.fetch_firmware()

         img_path = self.ASSET_OPENBSD_ISO.fetch()

         self.vm.set_console()
         self.vm.add_args(
-            "-cpu", cpu,
             "-drive", f"file={img_path},format=raw,snapshot=on",
         )
+        if cpu:
+            self.vm.add_args("-cpu", cpu)

         self.vm.launch()
         wait_for_console_pattern(self,
@@ -XXX,XX +XXX,XX @@ def boot_openbsd73(self, cpu):
     def test_sbsaref_openbsd73_cortex_a57(self):
         self.boot_openbsd73("cortex-a57")

-    def test_sbsaref_openbsd73_neoverse_n1(self):
-        self.boot_openbsd73("neoverse-n1")
+    def test_sbsaref_openbsd73_default_cpu(self):
+        self.boot_openbsd73()

     def test_sbsaref_openbsd73_max_pauth_off(self):
         self.boot_openbsd73("max,pauth=off")
--
2.34.1

diff view generated by jsdifflib

From: Vikram Garhwal <vikram.garhwal@bytedance.com>

Previously, the maintainer role was paused due to an inactive email
address; see commit c009d715721861984c4987bcc78b7ee183e86d75.

Signed-off-by: Vikram Garhwal <vikram.garhwal@bytedance.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Message-id: 20241204184205.12952-1-vikram.garhwal@bytedance.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/fuzz-sb16-test.c

 Xilinx CAN
 M: Francisco Iglesias <francisco.iglesias@amd.com>
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
 S: Maintained
 F: hw/net/can/xlnx-*
 F: include/hw/net/xlnx-*
@@ -XXX,XX +XXX,XX @@ F: include/hw/rx/
 CAN bus subsystem and hardware
 M: Pavel Pisa <pisa@cmp.felk.cvut.cz>
 M: Francisco Iglesias <francisco.iglesias@amd.com>
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
 S: Maintained
 W: https://canbus.pages.fel.cvut.cz/
 F: net/can/*
--
2.34.1
diff view generated by jsdifflib
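The `cpu=None` refactoring in the sbsa-ref tests above boils down to "only pass `-cpu` when the caller asked for one", so the machine's default CPU is used otherwise. A tiny stand-alone illustration of that pattern (`build_qemu_args` is a made-up helper for this sketch, not part of the test framework):

```python
def build_qemu_args(img_path, cpu=None):
    """Build an argument list the way the patched tests do: omit -cpu
    entirely when no CPU is requested, so QEMU picks the machine default."""
    args = ["-drive", f"file={img_path},format=raw,snapshot=on"]
    if cpu:
        args += ["-cpu", cpu]
    return args
```

`build_qemu_args("disk.iso")` produces no `-cpu` option at all, while `build_qemu_args("disk.iso", "max,pauth=off")` appends `-cpu max,pauth=off`; passing an explicit `None` placeholder to QEMU is never an option, which is why the tests gate on `if cpu:` rather than always emitting the flag.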